freakyzoidberg opened a new pull request, #679:
URL: https://github.com/apache/datasketches-java/pull/679

   This pull request adds a new test class, `HllSketchMergeOrderTest`, that 
investigates and demonstrates the order dependency of DataSketches HLL merge 
operations. The test shows that merging the same HLL sketches in different 
orders can produce different cardinality estimates, which matters for 
applications that rely on reproducible results.
   
   ### Key Additions:
   
   #### New Test for Merge Order Dependency:
   * Added the `HllSketchMergeOrderTest` class to demonstrate that merging the 
same HLL sketches in different orders (e.g., ABC, CBA, BAC) can lead to 
different cardinality estimates, showing that the merge is not always 
commutative or associative in practice. This is especially evident with 
specific data patterns such as powers of 2.
   
   #### Supporting Methods:
   * Added a `createPowersOf2Sketch` method that builds HLL sketches from 
powers-of-2 values, a pattern that is prone to triggering the order dependency.
   * Added a `mergeThreeSketches` method that merges three sketches in a 
specified order and returns the resulting cardinality estimate.
   
   #### Documentation and Findings:
   * Included detailed comments and documentation within the test class to 
explain the findings, key implications, and the mathematical expectations 
violated by the observed behavior.
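
   The mathematical expectation mentioned above can be illustrated without the 
library. In a textbook HyperLogLog, merging two sketches takes the register-wise 
maximum, which is commutative and associative, so every merge order should give 
the same registers and the same estimate. The following is a minimal sketch of 
that baseline and is not the DataSketches implementation: the class 
`MergeOrderCheck`, its methods, the hash mixer, the exponent ranges, and 
`lgK = 10` are all hypothetical choices made for illustration.

```java
import java.util.Arrays;

// Toy model only: a plain register array standing in for an HLL sketch.
// It is NOT the DataSketches HllSketch and does not reproduce its behavior.
final class MergeOrderCheck {

  // Builds a toy sketch with 2^lgK registers from the values 2^fromExp .. 2^toExp.
  // Assumes 4 <= lgK <= 20 and 0 <= fromExp <= toExp <= 62.
  static int[] powersOf2Sketch(int lgK, int fromExp, int toExp) {
    int[] regs = new int[1 << lgK];
    for (int e = fromExp; e <= toExp; e++) {
      long h = mix(1L << e);
      int bucket = (int) (h >>> (64 - lgK));            // top lgK bits pick the register
      int zeros = Long.numberOfLeadingZeros(h << lgK);  // leading zeros of the remaining bits
      int rank = Math.min(zeros, 64 - lgK) + 1;
      regs[bucket] = Math.max(regs[bucket], rank);
    }
    return regs;
  }

  // SplitMix64-style bit mixer, used here as an illustrative hash function.
  static long mix(long z) {
    z += 0x9e3779b97f4a7c15L;
    z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
    z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
    return z ^ (z >>> 31);
  }

  // Textbook merge: register-wise maximum, returned as a new array.
  static int[] merge(int[] a, int[] b) {
    int[] out = new int[a.length];
    for (int i = 0; i < a.length; i++) {
      out[i] = Math.max(a[i], b[i]);
    }
    return out;
  }

  // Left-to-right merge of three toy sketches in the order given.
  static int[] mergeThree(int[] first, int[] second, int[] third) {
    return merge(merge(first, second), third);
  }

  // Raw HyperLogLog estimate without small- or large-range corrections.
  // The alpha approximation used here is intended for m >= 128 registers.
  static double estimate(int[] regs) {
    int m = regs.length;
    double sum = 0.0;
    for (int r : regs) {
      sum += Math.scalb(1.0, -r); // 2^(-r)
    }
    double alpha = 0.7213 / (1.0 + 1.079 / m);
    return alpha * m * m / sum;
  }

  // True if several orderings and groupings yield identical registers.
  static boolean orderIndependent(int[] a, int[] b, int[] c) {
    int[] abc = mergeThree(a, b, c);
    return Arrays.equals(abc, mergeThree(c, b, a))
        && Arrays.equals(abc, mergeThree(b, a, c))
        && Arrays.equals(abc, merge(a, merge(b, c)));
  }

  public static void main(String[] args) {
    int lgK = 10;
    int[] a = powersOf2Sketch(lgK, 0, 20);
    int[] b = powersOf2Sketch(lgK, 10, 30);
    int[] c = powersOf2Sketch(lgK, 20, 40);
    System.out.println("ABC estimate: " + estimate(mergeThree(a, b, c)));
    System.out.println("CBA estimate: " + estimate(mergeThree(c, b, a)));
    System.out.println("BAC estimate: " + estimate(mergeThree(b, a, c)));
    System.out.println("order independent: " + orderIndependent(a, b, c));
  }
}
```

   In this toy model the three estimates are identical by construction, because 
the merged registers do not depend on order. The test class in this pull request 
reports that the library's merged estimates can differ across the same orderings, 
which is the gap it sets out to document.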


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@datasketches.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

