js8544 opened a new issue, #38372: URL: https://github.com/apache/arrow/issues/38372
### Describe the enhancement requested

This is a follow-up to https://github.com/apache/arrow/issues/36059#issuecomment-1592037527.

We use a `MemoTable` in many places, e.g. set lookup functions, vector hash functions, the `count_distinct` aggregate function, and dictionary unification. Their performance could be improved by switching to a Swiss table. We can either:

1. Use an existing Swiss table library. This requires some work on dependency management. I recommend `absl::flat_hash_map`, since Abseil's authors originated Swiss tables and we already have `absl` in our third-party toolchain.
2. Write our own. The current one in `acero` is too customized for the join node, and it seems hard to extract a general-purpose hash table from it. If I were to write one, I would probably follow the structure of `absl`'s implementation and replace things like memory management and bit manipulation with their Arrow equivalents.

What do you think? @pitrou @westonpace

### Component(s)

C++
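
To illustrate option 1, here is a minimal sketch of a memo-table-like wrapper whose backing map is a template parameter. `SimpleMemoTable`, `GetOrInsert`, and `Get` are hypothetical names chosen for this example and are not Arrow's actual `MemoTable` API. The sketch defaults to `std::unordered_map` so that it compiles without extra dependencies; it only assumes that the map type provides `try_emplace` and `find`.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical illustration only; this is not Arrow's MemoTable.
// `Map` can be any associative container that provides try_emplace() and
// find() with the std::unordered_map signatures.
template <typename T, typename Map = std::unordered_map<T, int32_t>>
class SimpleMemoTable {
 public:
  // Returns the dense index assigned to `value`, inserting it if unseen.
  // Indices are handed out in first-seen order, starting at 0.
  int32_t GetOrInsert(const T& value) {
    const auto next_index = static_cast<int32_t>(values_.size());
    auto [it, inserted] = index_.try_emplace(value, next_index);
    if (inserted) values_.push_back(value);
    return it->second;
  }

  // Returns the index of `value`, or -1 if it was never inserted.
  int32_t Get(const T& value) const {
    auto it = index_.find(value);
    return it == index_.end() ? -1 : it->second;
  }

  // Number of distinct values seen so far.
  int32_t size() const { return static_cast<int32_t>(values_.size()); }

  // Distinct values in first-seen order.
  const std::vector<T>& values() const { return values_; }

 private:
  Map index_;              // value -> dense index
  std::vector<T> values_;  // dense index -> value
};
```

Assuming Abseil is available in the build, `SimpleMemoTable<std::string, absl::flat_hash_map<std::string, int32_t>>` should work as a drop-in instantiation, since `absl::flat_hash_map` also offers `try_emplace` and `find`. Whether this is actually faster for our workloads would need to be benchmarked.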
