thomasrebele commented on issue #693:
URL: 
https://github.com/apache/datasketches-java/issues/693#issuecomment-3710105210

   @jmalkin, what do you mean by "Using a fixed seed across all sketches in a 
run"? Something like [lines 228 to 236 of 
ExperimentDeterministicMerge](https://github.com/thomasrebele/datasketches-java/commit/e540a85bd9021c3faeb3118bc1437311bd7a8671#diff-f15e17269b86ec14ccbe6dbc1aec506b9cd4fff5bf2be5f9cdda351447cecc92R228-R236)?
 That is indeed problematic, as the experiment shows. The [new merge 
method](https://github.com/apache/datasketches-java/commit/6caa284a0ab01fee000ee0d42dd0d919ee387aed#diff-dacbef77551b8e94e3095811486b859725948654627b5141625c453f0ecad774)
 mentions that the error bounds might be broken. The new merge method could be 
renamed to, e.g., `unsafeMerge`, so that the caller is aware of the problem.
   
   Certainly, misusing the API may lead to incorrect results. However, in the 
use case that I'm trying to support, there is a single run in which n KLL 
sketches are merged, and AFAIK the resulting sketch is never merged with 
another sketch afterwards. So the code follows the intended use of the 
algorithm. Unfortunately, the current API of the datasketches library does not 
support this use case.
   
   I've added some more 
[experiments](https://github.com/thomasrebele/datasketches-java/commit/e540a85bd9021c3faeb3118bc1437311bd7a8671):
 I mocked the RNG so that it generates the sequence 0,1,0,1,... (or its 
companion 1,0,1,0,...). The resulting errors are quite similar to the original 
KLL errors (see the entries named `alternating` in 
https://github.com/thomasrebele/datasketches-java/commit/e540a85bd9021c3faeb3118bc1437311bd7a8671).
 Interestingly, this is not the case for a sequence such as 
0,1,0,0,1,0,0,1,0,....
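
   For readers who want to reproduce the idea, here is a minimal sketch of such 
a mocked RNG. It is not the code from the linked commit: the class name 
`CyclicBitRandom` is hypothetical, and it assumes that the sketch draws its 
coin flips through `nextInt(2)` or `nextBoolean()` on a `java.util.Random`.

```java
import java.util.Random;

// Hypothetical helper, not part of datasketches-java: a Random that replays
// a fixed bit pattern in a cycle instead of producing pseudo-random values.
class CyclicBitRandom extends Random {
    private final int[] pattern; // every entry is 0 or 1
    private int pos = 0;         // index of the next bit to return

    CyclicBitRandom(int... pattern) {
        if (pattern.length == 0) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        for (int bit : pattern) {
            if (bit != 0 && bit != 1) {
                throw new IllegalArgumentException("pattern entries must be 0 or 1");
            }
        }
        this.pattern = pattern.clone();
    }

    // Returns the next bit of the pattern. Assumption: callers only ask for
    // coin flips, i.e. bound == 2; anything else is rejected.
    @Override
    public int nextInt(int bound) {
        if (bound != 2) {
            throw new IllegalArgumentException("only bound == 2 is supported");
        }
        int bit = pattern[pos];
        pos = (pos + 1) % pattern.length;
        return bit;
    }

    // Maps the same bit stream to booleans (1 -> true, 0 -> false).
    @Override
    public boolean nextBoolean() {
        return nextInt(2) == 1;
    }
}
```

   With this helper, `new CyclicBitRandom(0, 1)` yields 0,1,0,1,..., 
`new CyclicBitRandom(1, 0)` yields the companion 1,0,1,0,..., and 
`new CyclicBitRandom(0, 1, 0)` yields 0,1,0,0,1,0,0,1,0,.... How the instance 
is injected into the sketch depends on the experiment code and is not shown 
here.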


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

