jhorstmann commented on issue #790: URL: https://github.com/apache/arrow-datafusion/issues/790#issuecomment-890995232
> Collisions are handled by the fact that the entry in the hash table is a _list_ of indicies into the mutable area -- if there are multiple values in that list each entry in the mutable area needs to be checked for equality to find the correct one. I don't really see how we can to this equality checks faster than the hashmap itself could do it, but I would be very happy to be proven wrong. For optimized eq and hash handling, hashbrown has some more low-level functionality that we are not yet using: - [`from_hash(hash, eq_fn)`](https://docs.rs/hashbrown/0.11.2/hashbrown/hash_map/struct.RawEntryBuilderMut.html#method.from_hash): allows looking up an entry with a given hash and equals function, bypassing the hashbuilder of the map. This allows implementing eq on values that are not directly stored in the map, for example if only an index into another structure is part of the key. - [`insert_with_hasher(hash, key, value, hasher)`](https://docs.rs/hashbrown/0.11.2/hashbrown/hash_map/struct.RawVacantEntryMut.html#method.insert_with_hasher): Again allows bypassing the hashbuilder and looking up the hash value in some other structure. The `hasher` parameter is only called if the table needs rehashing. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org