tisonkun opened a new issue, #10:
URL: https://github.com/apache/datasketches-rust/issues/10

   Yeah. I think it's technically reasonable to have snapshots for tests. The 
key point here is how these snapshots get generated: can we reproduce them, 
and how can we modify/update them?
   
   In other words, how were the snapshots in datasketches-go originally 
generated?
   
   _Originally posted by @tisonkun in 
https://github.com/apache/datasketches-rust/pull/1#discussion_r2616336904_
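
   To make the "generate vs. update" question concrete, here is a minimal sketch of a snapshot check with an explicit update mode. It is not taken from any of the datasketches repositories; `check_snapshot`, `SnapshotOutcome` and the `update` flag are hypothetical names used only for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Outcome of comparing freshly generated bytes with a stored snapshot.
#[derive(Debug, PartialEq)]
pub enum SnapshotOutcome {
    /// The stored file already matched the generated bytes.
    Matched,
    /// The file was created or overwritten because `update` was true.
    Written,
    /// The stored file differs or is missing, and `update` was false.
    Mismatch,
}

/// Compares `actual` with the file at `path` (hypothetical helper).
/// With `update` set, a missing or different file is rewritten instead of reported.
pub fn check_snapshot(path: &Path, actual: &[u8], update: bool) -> io::Result<SnapshotOutcome> {
    // A missing file is treated as "no snapshot yet"; other I/O errors are propagated.
    let stored = match fs::read(path) {
        Ok(bytes) => Some(bytes),
        Err(e) if e.kind() == io::ErrorKind::NotFound => None,
        Err(e) => return Err(e),
    };
    if stored.as_deref() == Some(actual) {
        return Ok(SnapshotOutcome::Matched);
    }
    if !update {
        return Ok(SnapshotOutcome::Mismatch);
    }
    fs::write(path, actual)?;
    Ok(SnapshotOutcome::Written)
}
```

   A test could derive `update` from an environment variable (for example a hypothetical `UPDATE_SNAPSHOTS`), so that regenerating the files is an explicit, reviewable step rather than a manual copy.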
   
   ---
   
   @freakyzoidberg told me there are some "tests" which generate those files in 
the go repository 
(https://github.com/apache/datasketches-go/blob/f7bc4b1db865c2dd1be9134d8a61eeb8bc24b1c6/hll/hll_sketch_serialization_test.go#L29)
 I assumed it's similar for the other implementations.
   
   _Originally posted by @notfilippo in 
https://github.com/apache/datasketches-rust/pull/1#discussion_r2616338736_
   
   ---
   
   Great! Then at least we can reuse the Go logic to generate the Go snapshots. 
But I can see that it would require extra engineering effort, so I won't block 
this PR on this potential improvement, which aims to avoid (mysterious) binaries 
as much as possible.
   
   For the Java and C++ snapshots, perhaps @leerho and @AlexanderSaydakov can 
give some input here.
   
   _Originally posted by @tisonkun in 
https://github.com/apache/datasketches-rust/pull/1#discussion_r2616343650_
   
   ---
   
   The C++, Java and Go repos do have some tests that generate and cross-test 
the synopses from the other repos.
   
   It's very much by convention and quite manual, and as Lee hinted in the other 
thread, we didn't really think about how to scale this to more languages (very 
much M*N issue).
   
   You can find the Java HLL x-check 
[here](https://github.com/apache/datasketches-java/blob/main/src/test/java/org/apache/datasketches/hll/HllSketchCrossLanguageTest.java)
 and the C++ ones for 
[ser](https://github.com/apache/datasketches-cpp/blob/master/hll/test/hll_sketch_serialize_for_java.cpp)/[de](https://github.com/apache/datasketches-cpp/blob/master/hll/test/hll_sketch_deserialize_from_java_test.cpp).
   
   Also worth noting that not all synopses are guaranteed to be byte-for-byte 
equal. They behave the same, are logically equivalent, and are fully 
serializable/deserializable between languages, but not all of them guarantee 
deterministic generation at the raw-byte level. TL;DR: not all RNGs are seeded 
(though I suppose they could be).
   
   _Originally posted by @freakyzoidberg in 
https://github.com/apache/datasketches-rust/pull/1#discussion_r2616369040_
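
   The point about seeding can be illustrated with a toy example. The sketch below is not a real DataSketches type: `sample_and_serialize` is a hypothetical stand-in for a randomized sketch (a plain reservoir sample), and SplitMix64 is used only because it is a small, well-known seedable generator. With a fixed seed the serialized bytes are reproducible; with a different seed the output is still a valid sample of the same size, but the bytes are not expected to match.

```rust
/// SplitMix64, a small deterministic PRNG; any seedable generator would do here.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Toy stand-in for a randomized sketch (hypothetical, for illustration only):
/// reservoir-samples `k` values from `0..n` and serializes the retained
/// values as little-endian u64s.
pub fn sample_and_serialize(n: u64, k: usize, seed: u64) -> Vec<u8> {
    let mut rng = SplitMix64::new(seed);
    let mut reservoir: Vec<u64> = Vec::with_capacity(k);
    for i in 0..n {
        if reservoir.len() < k {
            // Fill the reservoir with the first k values.
            reservoir.push(i);
        } else {
            // Replace a random slot with probability roughly k / (i + 1)
            // (the modulo introduces a negligible bias, ignored in this sketch).
            let j = rng.next_u64() % (i + 1);
            if j < k as u64 {
                reservoir[j as usize] = i;
            }
        }
    }
    reservoir.iter().flat_map(|v| v.to_le_bytes()).collect()
}
```

   Calling `sample_and_serialize(1000, 8, 42)` twice yields identical bytes, so a byte-level snapshot comparison is only meaningful when the generator's seed is pinned; otherwise a cross-language test has to compare behavior (estimates, retained counts) after deserialization instead.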
   
   ---
   
   > very much M*N issue
   
   Not quite. Each language can implement its own serialized snapshots, and any 
language should leverage the existing snapshots while patching its own.
   
   We can have a shared snapshot library like shared proto definitions in other 
projects.
   
   Anyway, this is another topic, so I'll open a new issue to track it.
   
   _Originally posted by @tisonkun in 
https://github.com/apache/datasketches-rust/pull/1#discussion_r2616639771_
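
   As a rough sketch of how a shared snapshot set could be consumed: each implementation contributes its own files once, and every implementation's tests read whatever the others produced. The `<sketch>_<params>_<lang>.sk` naming convention and the helper names below are assumptions made for illustration, not an agreed layout.

```rust
/// Returns the producer language encoded in a snapshot file name such as
/// `hll4_n100_go.sk`, assuming a hypothetical `<sketch>_<params>_<lang>.sk` convention.
pub fn producer_lang(file_name: &str) -> Option<&str> {
    let stem = file_name.strip_suffix(".sk")?;
    let (_, lang) = stem.rsplit_once('_')?;
    if lang.is_empty() {
        None
    } else {
        Some(lang)
    }
}

/// Selects the snapshots one implementation should deserialize in its tests:
/// every file in the shared set that was produced by another language.
pub fn foreign_snapshots<'a>(files: &[&'a str], own_lang: &str) -> Vec<&'a str> {
    files
        .iter()
        .copied()
        .filter(|f| matches!(producer_lang(f), Some(lang) if lang != own_lang))
        .collect()
}
```

   Under this assumption, adding a new language means adding one generator and one reader over the shared directory, rather than one bespoke test per pair of languages.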
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

