> On 21 Nov 2016, at 20:00, Tagir Valeev <amae...@gmail.com> wrote:
>
> Hello!
>
>> Hi Tagir,
>>
>> In the original issue i was pondering using SIZED for smaller bit sets and
>> non-SIZED for larger bit sets, since we stride over the longs themselves
>> when calculating the size (should be intrinsic to count the set bits, at
>> least on x86). Supporting both is fairly simple, but you are correct not
>> reporting SIZED would simplify things further and it may not matter in
>> practice.
>>
>> Can you log an issue? I will follow up with some performance analysis.
>
> Sure, here it is:
> https://bugs.openjdk.java.net/browse/JDK-8170159
>

Thanks!

> Probably it would be optimal not to call cardinality until it's
> actually queried. Unfortunately current stream implementation always
> queries getExactSizeIfKnown() (passing it into sink.begin()) even if
> it's not actually used (which is most of the cases).
>

The expectation is that if a spliterator reports SIZED then the exact
size can be calculated efficiently, which is not necessarily the case
for BitSet.

For sequential streams I think you have a point, but for parallel
streams the estimated size will also be used for the splitting
threshold. (A spliterator cannot report both an estimated size and an
exact size.)

Paul.
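
P.S. To make the SIZED trade-off concrete, here is a minimal sketch in Java. It is not the JDK's BitSet.stream() implementation. The class name BitSetSizedSketch and the helper bits(...) are hypothetical, and using BitSet.length() as a cheap upper-bound estimate when SIZED is not reported is an assumption made purely for illustration. It shows that getExactSizeIfKnown() returns -1 unless SIZED is reported, so cardinality() only has to be computed up front in the SIZED case.

```java
import java.util.BitSet;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.function.IntConsumer;
import java.util.stream.StreamSupport;

class BitSetSizedSketch {

    // Hypothetical helper: a spliterator over the set bits of bs.
    // If sized is true it reports SIZED and pays for cardinality() eagerly;
    // otherwise it reports only an estimate (assumption: bs.length()).
    static Spliterator.OfInt bits(BitSet bs, boolean sized) {
        int characteristics = Spliterator.ORDERED | Spliterator.DISTINCT
                | (sized ? Spliterator.SIZED : 0);
        long estimate = sized ? bs.cardinality() : bs.length();

        return new Spliterators.AbstractIntSpliterator(estimate, characteristics) {
            // Index of the next set bit, or -1 when traversal is finished.
            private int next = bs.nextSetBit(0);

            @Override
            public boolean tryAdvance(IntConsumer action) {
                if (next < 0) {
                    return false;
                }
                action.accept(next);
                // Guard against overflow of next + 1 at Integer.MAX_VALUE.
                next = (next == Integer.MAX_VALUE) ? -1 : bs.nextSetBit(next + 1);
                return true;
            }
        };
    }

    public static void main(String[] args) {
        BitSet bs = new BitSet();
        bs.set(1);
        bs.set(5);
        bs.set(64);

        // With SIZED the exact size is the cardinality; without it the
        // default getExactSizeIfKnown() reports -1 (unknown).
        System.out.println(bits(bs, true).getExactSizeIfKnown());
        System.out.println(bits(bs, false).getExactSizeIfKnown());
        System.out.println(bits(bs, false).estimateSize());

        // Traversal yields the same elements either way.
        System.out.println(StreamSupport.intStream(bits(bs, false), false).sum());
    }
}
```

In the non-SIZED variant the estimate is still available to a parallel stream for its splitting threshold, but it is no longer exact, which is the trade-off described above.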