save-buffer commented on PR #13487: URL: https://github.com/apache/arrow/pull/13487#issuecomment-1190686364
Weston is correct: the `key_hash` interface is only meant for internal use by hash tables. It subtly doesn't match xxh in some cases (iirc, for example, it can pad with 0's instead of having special handling for arbitrary lengths). The interface is also optimized for hashing in mini-batches and reusing as much memory as possible. In general I would suggest using `util/hashing` instead of `key_hash`: since `key_hash` is for internal use, we may want to change it at any time (e.g. we had an idea to see if we can make a really crappy hash that's very fast to compute but is "good enough" for a hash table). I can leave a review of the current code, but I'd suggest moving away from the `key_hash` interface overall.
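
To illustrate why zero padding can make a hash diverge from one that handles arbitrary lengths explicitly, here is a minimal toy sketch. It is not Arrow's `key_hash` and not xxHash: the names `_mix`, `hash_zero_padded`, and `hash_length_aware`, the mixing constant, and the 8-byte lane width are all assumptions made up for this example.

```python
import struct

MASK64 = (1 << 64) - 1
PRIME = 0x9E3779B185EBCA87  # arbitrary odd 64-bit constant, chosen only for this toy


def _mix(acc: int, lane: int) -> int:
    """Fold one 64-bit lane into the accumulator (toy mixing step)."""
    acc = (acc ^ lane) * PRIME & MASK64
    return acc ^ (acc >> 29)


def hash_zero_padded(data: bytes) -> int:
    """Pad the input with zeros to a multiple of 8 bytes, then hash whole lanes."""
    padded = data + b"\x00" * (-len(data) % 8)
    acc = 0
    for (lane,) in struct.iter_unpack("<Q", padded):
        acc = _mix(acc, lane)
    return acc


def hash_length_aware(data: bytes) -> int:
    """Same lane hashing, but fold the true input length into the result."""
    return _mix(hash_zero_padded(data), len(data))


if __name__ == "__main__":
    a, b = b"ab", b"ab\x00"
    # Both inputs pad to the same 8-byte lane, so the padded hash cannot tell them apart.
    print(hash_zero_padded(a) == hash_zero_padded(b))
    # Folding in the length separates them.
    print(hash_length_aware(a) == hash_length_aware(b))
```

Inputs that differ only in trailing zero bytes collide under the padded variant, which is harmless inside a hash table that still compares keys on a match, but it means the output cannot be expected to agree with a reference hash that treats the tail and length specially.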
