Ok, maybe a total ban is overkill, but right now streams are used even on
some hot paths like getAllAsync [1].

Another issue is that Collectors.toList() and other variants don't accept
capacity, and we end up with unnecessary reallocations of underlying arrays.

[1]
https://github.com/apache/ignite-3/blob/1d7d703ff2b18234b15a9a7b03104fbb73388edf/modules/table/src/main/java/org/apache/ignite/internal/table/KVBinaryViewImpl.java#L83

On Wed, Sep 8, 2021 at 1:06 PM Konstantin Orlov <kor...@gridgain.com> wrote:

> Hi!
>
> Agree with Ivan that it’s an overkill. Code readability and
> maintainability should have
> the same priority as the performance (with some exceptions of course).
>
> BTW the result of your benchmark looks quite strange. The performance
> penalty on
> my laptop (Core i7 9750H, 6 cores up to 4.50 GHz) is 25%, not 8 times:
>
> Benchmark                         Mode  Cnt      Score     Error   Units
> JmhIncrementBenchmark.loopSum    thrpt   10  32347.819 ± 676.548  ops/ms
> JmhIncrementBenchmark.streamSum  thrpt   10  24459.196 ± 610.152  ops/ms
>
>
> --
> Regards,
> Konstantin Orlov
>
>
> > On 8 Sep 2021, at 12:23, Ivan Bessonov <bessonov...@gmail.com> wrote:
> >
> > Hello Igniters,
> >
> > I object, banning streams is an overkill. I would argue that most of the
> > code
> > is not on hot paths and that allocations in TLAB don't create much
> pressure
> > on GC.
> >
> > Streams must be used cautiously, developers should know whether they
> > write hot methods or not. And if methods are not hot, code simplicity
> must
> > be
> > the first priority. I don't want Ignite 3 code to look like Ignite 2
> code,
> > where
> > people would iterate over Lists using explicit access by indexes,
> because it
> > saves them a single Iterator allocation. That's absurd.
> >
> > ср, 8 сент. 2021 г. в 11:43, Pavel Tupitsyn <ptupit...@apache.org>:
> >
> >> Igniters,
> >>
> >> Java streams are known to be slower and cause more GC pressure than an
> >> equivalent loop.
> >> Below is a simple filter/map/reduce scenario (code [1]):
> >>
> >> * Benchmark                                                     Mode
> Cnt
> >>    Score     Error   Units
> >>
> >> * StreamVsLoopBenchmark.loopSum                                 thrpt
>   3
> >> 7987.016 ± 293.013  ops/ms
> >> * StreamVsLoopBenchmark.loopSum:·gc.alloc.rate                  thrpt
>   3
> >>   ≈ 10⁻⁴            MB/sec
> >> * StreamVsLoopBenchmark.loopSum:·gc.count                       thrpt
>   3
> >>      ≈ 0            counts
> >>
> >> * StreamVsLoopBenchmark.streamSum                               thrpt
>   3
> >> 1060.244 ±  36.485  ops/ms
> >> * StreamVsLoopBenchmark.streamSum:·gc.alloc.rate                thrpt
>   3
> >>  315.819 ±  10.844  MB/sec
> >> * StreamVsLoopBenchmark.streamSum:·gc.count                     thrpt
>   3
> >>   55.000            counts
> >>
> >> Loop is several times faster and does not allocate at all.
> >>
> >> 1. Performance is one of the most important features of our product.
> >> 2. Most of our APIs will be on the hot path.
> >>
> >> One can argue about performance differences in real-world scenarios,
> >> but increasing GC pressure just to make the code a little bit nicer is
> >> unacceptable.
> >>
> >> I propose to ban streams usage in the codebase (except for the tests).
> >>
> >> Thoughts, objections?
> >>
> >> [1] https://gist.github.com/ptupitsyn/5934bbbf8f92ac4937e534af9386da97
> >>
> >
> >
> > --
> > Sincerely yours,
> > Ivan Bessonov
>
>

Reply via email to