tisonkun opened a new issue, #34: URL: https://github.com/apache/datasketches-rust/issues/34
Currently, we're using [`mur3`](https://crates.io/crates/mur3), which has not been updated for over 4 years. It takes a u32 seed, while the other datasketches implementations use a 64-bit seed (the default is `9001`, which does fit in a u32, though).

There is another crate, [`murmur3`](https://crates.io/crates/murmur3), which hasn't been updated for over 3 years either. Moreover, `murmur3` takes a `std::io::Read` as input, which forces an unnecessary `Result` on us that we would not have with `&[u8]` as the input.

datasketches-java implements MurmurHash3 in-tree, and perhaps we should do the same to ensure the implementations are identical:

* https://github.com/aappleby/smhasher/blob/0ff96f78/src/MurmurHash3.cpp
* https://github.com/apache/datasketches-cpp/blob/7bb979d3/common/include/MurmurHash3.h
* https://github.com/apache/datasketches-java/blob/ca1fef02/src/main/java/org/apache/datasketches/hash/MurmurHash3.java
* https://github.com/apache/datasketches-go/blob/f7bc4b1d/internal/murmur3.go

What do you think @notfilippo @freakyzoidberg @leerho @AlexanderSaydakov?

I'm a bit hesitant to choose between leveraging other hash crates (mur3, [xxhash](https://github.com/shepmaster/twox-hash)) and implementing the hashes inside datasketches-rust. If we implement them ourselves, should we even expose them as part of the public API?
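
For discussion, here is a minimal sketch of what an in-tree MurmurHash3 (x64, 128-bit variant) might look like. It follows the structure of the smhasher reference linked above, but takes a 64-bit seed and `&[u8]` input, so no `Result` is involved. The names `murmur3_x64_128`, `read_u64_le`, `mix_k1`, `mix_k2` and `fmix64` are hypothetical and not an existing datasketches-rust API. This is an illustration under those assumptions, not a reviewed implementation.

```rust
// Multiplication constants from the MurmurHash3 x64 128-bit reference.
const C1: u64 = 0x87c3_7b91_1142_53d5;
const C2: u64 = 0x4cf5_ad43_2745_937f;

/// Reads the first 8 bytes of `bytes` as a little-endian u64.
fn read_u64_le(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(buf)
}

/// Pre-mix applied to the low 8 bytes of each block.
fn mix_k1(k1: u64) -> u64 {
    k1.wrapping_mul(C1).rotate_left(31).wrapping_mul(C2)
}

/// Pre-mix applied to the high 8 bytes of each block.
fn mix_k2(k2: u64) -> u64 {
    k2.wrapping_mul(C2).rotate_left(33).wrapping_mul(C1)
}

/// Final avalanche step.
fn fmix64(mut k: u64) -> u64 {
    k ^= k >> 33;
    k = k.wrapping_mul(0xff51_afd7_ed55_8ccd);
    k ^= k >> 33;
    k = k.wrapping_mul(0xc4ce_b9fe_1a85_ec53);
    k ^= k >> 33;
    k
}

/// Hypothetical in-tree MurmurHash3 x64 128-bit over a byte slice.
/// Infallible: returns the two 64-bit halves directly.
pub fn murmur3_x64_128(data: &[u8], seed: u64) -> (u64, u64) {
    let mut h1 = seed;
    let mut h2 = seed;

    // Body: process the input in 16-byte blocks.
    let mut blocks = data.chunks_exact(16);
    for block in &mut blocks {
        h1 ^= mix_k1(read_u64_le(&block[..8]));
        h1 = h1
            .rotate_left(27)
            .wrapping_add(h2)
            .wrapping_mul(5)
            .wrapping_add(0x52dc_e729);

        h2 ^= mix_k2(read_u64_le(&block[8..]));
        h2 = h2
            .rotate_left(31)
            .wrapping_add(h1)
            .wrapping_mul(5)
            .wrapping_add(0x3849_5ab5);
    }

    // Tail: zero-pad the remaining 1..=15 bytes to a full block.
    // Mixing an all-zero word yields zero, so a tail of 8 bytes or
    // fewer leaves h2 unchanged, as in the reference switch statement.
    let tail = blocks.remainder();
    if !tail.is_empty() {
        let mut padded = [0u8; 16];
        padded[..tail.len()].copy_from_slice(tail);
        h2 ^= mix_k2(read_u64_le(&padded[8..]));
        h1 ^= mix_k1(read_u64_le(&padded[..8]));
    }

    // Finalization: fold in the length and avalanche both halves.
    let len = data.len() as u64;
    h1 ^= len;
    h2 ^= len;
    h1 = h1.wrapping_add(h2);
    h2 = h2.wrapping_add(h1);
    h1 = fmix64(h1);
    h2 = fmix64(h2);
    h1 = h1.wrapping_add(h2);
    h2 = h2.wrapping_add(h1);
    (h1, h2)
}
```

A call such as `murmur3_x64_128(b"some key", 9001)` returns the `(h1, h2)` pair without any error handling. Any real implementation would still need to be checked against the test vectors of the Java, C++ and Go versions before we rely on it.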
