freakyzoidberg opened a new pull request, #676: URL: https://github.com/apache/datasketches-java/pull/676
This pull request introduces several enhancements and build fixes to the `CountMinSketch` class, focusing on performance improvements, better validation, and code simplification. Additionally, it includes a minor change in the `CpcSketchCrossLanguageTest` test file. The most important changes are grouped below by theme.

### Performance Improvements:

* Introduced a thread-local `ByteBuffer` in `CountMinSketch` to avoid frequent allocations during conversions of `long` values to byte arrays, improving efficiency in hot paths (`longToBytes` method and related updates). [[1]](diffhunk://#diff-a27b2c7ae95edb924c40143bf39f23acc3cab7f6a03b674262849f067b6ac2d6R42-R44) [[2]](diffhunk://#diff-a27b2c7ae95edb924c40143bf39f23acc3cab7f6a03b674262849f067b6ac2d6L174-R206) [[3]](diffhunk://#diff-a27b2c7ae95edb924c40143bf39f23acc3cab7f6a03b674262849f067b6ac2d6L214-R245) [[4]](diffhunk://#diff-a27b2c7ae95edb924c40143bf39f23acc3cab7f6a03b674262849f067b6ac2d6L257-R288) [[5]](diffhunk://#diff-a27b2c7ae95edb924c40143bf39f23acc3cab7f6a03b674262849f067b6ac2d6L294-R324)

### Validation Enhancements:

* Added detailed validation checks for `numHashes` and `numBuckets` in the `CountMinSketch` constructor, including mathematical justifications and overflow prevention for array size calculations.

### Code Simplification:

* Simplified the `getEstimate` method by avoiding redundant processing of the first hash location during frequency estimation.

### Dependency Updates:

* Updated imports in `CountMinSketch` to use `org.apache.datasketches.common.Util` instead of `org.apache.datasketches.tuple.Util`.

### Test File Adjustment:

* Modified `CpcSketchCrossLanguageTest` to use `MemorySegment.ofArray` instead of `Memory.wrap` for heapifying sketches, aligning with updated memory handling practices.
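
For readers without the diff at hand, here is a minimal sketch of the two ideas described under Performance Improvements and Validation Enhancements: a per-thread reusable `ByteBuffer` for `long`-to-bytes conversion, and an overflow-safe check of the table size. It is not the code from this pull request. The class name `LongBytesSketch`, the helper `checkedTableSize`, the field name, the default big-endian byte order, and the exact bounds checked are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper class; names and bounds are illustrative, not taken from the PR.
final class LongBytesSketch {

  // One reusable 8-byte buffer per thread, so the hot path does not
  // allocate a new ByteBuffer on every conversion.
  private static final ThreadLocal<ByteBuffer> LONG_BUFFER =
      ThreadLocal.withInitial(() -> ByteBuffer.allocate(Long.BYTES));

  // Converts a long to 8 bytes using the buffer's default (big-endian) order.
  static byte[] longToBytes(final long value) {
    final ByteBuffer buf = LONG_BUFFER.get();
    buf.clear();        // reset position to 0 before reuse
    buf.putLong(value);
    // Copy out so callers never share the thread-local backing array.
    return buf.array().clone();
  }

  // Validates the dimensions and returns numHashes * numBuckets,
  // rejecting products that would overflow an int array length.
  static int checkedTableSize(final int numHashes, final int numBuckets) {
    if (numHashes < 1) {
      throw new IllegalArgumentException("numHashes must be positive: " + numHashes);
    }
    if (numBuckets < 1) {
      throw new IllegalArgumentException("numBuckets must be positive: " + numBuckets);
    }
    // Multiply in 64-bit arithmetic so the overflow is detectable.
    final long size = (long) numHashes * (long) numBuckets;
    if (size > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("numHashes * numBuckets overflows int: " + size);
    }
    return (int) size;
  }

  public static void main(final String[] args) {
    System.out.println(Arrays.toString(longToBytes(258L)));
    System.out.println(checkedTableSize(4, 100));
  }
}
```

Note that this sketch still copies the 8-byte array on each call to keep the returned value independent of the shared buffer; whether the PR copies or hands the backing array directly to the hash function is not stated in the description above.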