Hello! Just for information: as finite stream size is very unlikely to exceed Long.MAX_VALUE (and even if exceeds it will take enormous CPU time to process such stream), the problem can be fixed just by tracking 128 bit sum instead of 64 bit. It's not very difficult to implement it: just two long values should be used with proper carry. Here's the sample implementation:
https://gist.github.com/amaembo/1edf2045b687a0ac9db8 Here averageInt and averageLong collectors are implemented which work exactly like ones available in JDK, but don't overflow. The final division is implemented using BigDecimal only if long overflow occurred, thus for non-overflow scenarios the overhead compared to current implementation is quite low (would be even lower had Long.compareUnsigned be intrinsified). Tracking the sum in BigInteger variable would be significantly slower and require continuous heap allocations as BigInteger is immutable. With best regards, Tagir Valeev. TFV> The following code prints -1 twice, while users would normally expect TFV> something like 9.223372036854776E18: TFV> double avg1 = Stream.of(Long.MAX_VALUE, Long.MAX_VALUE).collect( TFV> Collectors.averagingLong(Long::valueOf)); TFV> System.out.println(avg1); TFV> double avg2 = LongStream.of(Long.MAX_VALUE, Long.MAX_VALUE).average() TFV> .getAsDouble(); TFV> System.out.println(avg2); TFV> That's because in both cases internally sum is calculated in the long TFV> variable where it may silently overflow. The documentation for both TFV> methods says nothing about such possibility. I guess if it's too late TFV> to fix this behavior, then probably it should be at least properly TFV> documented? TFV> With best regards, TFV> Tagir Valeev.