Weston Pace created ARROW-16513:
-----------------------------------
Summary: [C++] Add a compute function to hash inputs
Key: ARROW-16513
URL: https://issues.apache.org/jira/browse/ARROW-16513
Project: Apache Arrow
Issue Type: Bug
Components: C++
Reporter: Weston Pace
We have a lot of internal logic for hashing inputs, and it might be nice to
expose some of this to users (e.g.
https://stackoverflow.com/questions/72177022/how-to-get-hash-of-string-column-in-polars-or-pyarrow).
The `HashBatch` method in `key_hash.h` (not quite merged but close) is likely
to be the most performant. However, it sacrifices some uniqueness of hashes
for the sake of performance, so we should make sure to document these
trade-offs.
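
To make the request concrete, here is a minimal standalone sketch of the kind of per-row hashing a user might expect from such a function. It does not use Arrow or `HashBatch`. The names `HashValue` and `HashColumn` are hypothetical, and 64-bit FNV-1a is only a stand-in for whatever hash the internal code actually uses.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, NOT an Arrow API: 64-bit FNV-1a over the bytes of one value.
// FNV-1a is a stand-in here; the internal Arrow hashing code may work differently.
inline std::uint64_t HashValue(const std::string& value) {
  std::uint64_t hash = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
  for (unsigned char byte : value) {
    hash ^= byte;               // mix in the next byte
    hash *= 1099511628211ULL;   // multiply by the FNV 64-bit prime (wraps mod 2^64)
  }
  return hash;
}

// Hypothetical helper, NOT an Arrow API: produce one hash per row of a string column.
// Equal inputs always yield equal hashes; distinct inputs usually, but not always,
// yield distinct hashes, since any fixed-width hash can collide.
inline std::vector<std::uint64_t> HashColumn(const std::vector<std::string>& column) {
  std::vector<std::uint64_t> hashes;
  hashes.reserve(column.size());
  for (const auto& value : column) {
    hashes.push_back(HashValue(value));
  }
  return hashes;
}
```

Calling `HashColumn({"a", "a", "b"})` returns three values where the first two are equal. The caveat in the comments is the one the documentation would need to spell out: a fixed-width hash identifies equal values, but it cannot guarantee that different values get different hashes.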