> From: "Viktor Klang" <viktor.kl...@oracle.com>
> To: "Remi Forax" <fo...@univ-mlv.fr>, "Fabian Meumertzheim" <fab...@buildbuddy.io>
> Cc: "Paul Sandoz" <paul.san...@oracle.com>, "core-libs-dev" <core-libs-dev@openjdk.org>
> Sent: Wednesday, February 19, 2025 11:43:33 AM
> Subject: Re: [External] : Re: JDK-8072840: Presizing for Stream Collectors

> Might be possible to extract the "descriptive" accessors to a superinterface
> of Spliterator.
> And Gatherer could either transform an incoming such instance to an estimated
> outbound instance, plus an overload of initializer, and Collector would likely
> only need the supplier overload.

yes, something like that.

> Cheers,
> √

regards,
Rémi

> Viktor Klang
> Software Architect, Java Platform Group
> Oracle

> From: Remi Forax <fo...@univ-mlv.fr>
> Sent: Saturday, 15 February 2025 15:44
> To: Viktor Klang <viktor.kl...@oracle.com>; Fabian Meumertzheim <fab...@buildbuddy.io>
> Cc: Paul Sandoz <paul.san...@oracle.com>; core-libs-dev <core-libs-dev@openjdk.org>
> Subject: [External] : Re: JDK-8072840: Presizing for Stream Collectors

>> From: "Viktor Klang" <viktor.kl...@oracle.com>
>> To: "Paul Sandoz" <paul.san...@oracle.com>, "Fabian Meumertzheim" <fab...@buildbuddy.io>
>> Cc: "core-libs-dev" <core-libs-dev@openjdk.org>
>> Sent: Thursday, February 13, 2025 11:30:59 PM
>> Subject: Re: JDK-8072840: Presizing for Stream Collectors

>> Indeed. I hope I didn't sound discouraging about the possibility to propagate
>> the stream size information.
>> I merely want to emphasize that it may necessitate a slightly broader take on
>> the problem of propagation of stream-instance metadata, especially in the face
>> of Gatherers becoming a finalized feature.

> We already have an abstraction for propagating metadata, it's the query part of
> Spliterator (characteristics/estimateSize/comparator etc, technically all
> abstract methods that do not start with "try").
> For a Gatherer, we need a way to say if a characteristic is preserved or
> removed.
> For a collector, we need a way to have a supplier that takes a Spliterator (a
> synthetic one, not the one that powers the actual stream) so the
> characteristics can be queried.

>> It's great that you started this conversation, Fabian!

>> Cheers,
>> √

> regards,
> Rémi

>> Viktor Klang
>> Software Architect, Java Platform Group
>> Oracle

>> From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Paul Sandoz <paul.san...@oracle.com>
>> Sent: Thursday, 13 February 2025 20:18
>> To: Fabian Meumertzheim <fab...@buildbuddy.io>
>> Cc: core-libs-dev <core-libs-dev@openjdk.org>
>> Subject: Re: JDK-8072840: Presizing for Stream Collectors

>> Hi Fabian,

>> Thanks for sharing and reaching out with the idea before getting too beholden
>> to it.

>> I logged this quite a while ago. It seemed like a possible good idea at the
>> time, although I never liked the duplication of suppliers. I have become less
>> enthusiastic over time, especially so as Gatherers have been added. (Gatherer
>> is the underlying primitive we could not find when we were furiously developing
>> streams and meeting the Java 8 deadline.) My sense is if we are going to
>> address it, we need to think more broadly about Gatherers. And, Viktor being
>> the lead on Gatherers has a good take on where this might head.

>> Paul.

>> > On Feb 12, 2025, at 2:09 AM, Fabian Meumertzheim <fab...@buildbuddy.io> wrote:

>> > As an avid user of Guava's ImmutableCollections, I have been
>> > interested in ways to close the efficiency gap between the built-in
>> > `Stream#toList()` and third-party `Collector` implementations such as
>> > `ImmutableList#toImmutableList()`. I've found the biggest problem to
>> > be the lack of sizing information in `Collector`s, which led me to
>> > draft a solution to JDK-8072840:
>> > https://github.com/openjdk/jdk/pull/23461

>> > The benchmark shows pretty significant gains for sized streams that
>> > mostly reshape data (e.g. slice records or turn a list into a map by
>> > associating keys), which I've found to be a pretty common use case.

>> > Before I formally send out the PR for review, I would like to gather
>> > feedback on the design aspects of it (rather than the exact
>> > implementation). I will thus leave it in draft mode for now, but
>> > invite anyone to comment on it or on this thread.

>> > Fabian
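
For readers following the thread: the "query part" of Spliterator that Rémi mentions can already be inspected today. Below is a minimal Java sketch of that idea, not code from the PR. The class `PresizeSketch` and its methods `exactSizeHint` and `drainPresized` are hypothetical names made up for illustration. Only existing JDK calls are used (`Spliterator.getExactSizeIfKnown`, `Spliterator.forEachRemaining`). It shows how a known size could be used to presize a result container, and how an operation such as `filter` drops that information.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Spliterator;

// Hypothetical illustration only; none of these names exist in the JDK or the PR.
final class PresizeSketch {

    // Returns the exact element count as an int capacity hint,
    // or -1 when the spliterator does not report an exact size.
    static int exactSizeHint(Spliterator<?> sp) {
        long exact = sp.getExactSizeIfKnown(); // -1 unless SIZED is reported
        return (exact >= 0 && exact <= Integer.MAX_VALUE) ? (int) exact : -1;
    }

    // Drains the spliterator into an ArrayList, presized when a hint is available.
    static <T> List<T> drainPresized(Spliterator<T> sp) {
        int hint = exactSizeHint(sp);
        List<T> out = (hint >= 0) ? new ArrayList<>(hint) : new ArrayList<>();
        sp.forEachRemaining(out::add);
        return out;
    }

    public static void main(String[] args) {
        // map keeps the element count, so the size information survives.
        Spliterator<Integer> mapped =
                List.of(1, 2, 3).stream().map(x -> x * 2).spliterator();
        System.out.println("hint after map: " + exactSizeHint(mapped));
        System.out.println(drainPresized(mapped));

        // filter may drop elements, so no exact size is reported afterwards.
        Spliterator<Integer> filtered =
                List.of(1, 2, 3).stream().filter(x -> x > 1).spliterator();
        System.out.println("hint after filter: " + exactSizeHint(filtered));
        System.out.println(drainPresized(filtered));
    }
}
```

A Collector has no access to this information at the point where its supplier runs, which is the gap the thread is discussing. The sketch sidesteps that by working on the Spliterator directly.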