gianm opened a new pull request #9308: Add MemoryOpenHashTable, a table similar to ByteBufferHashTable.
URL: https://github.com/apache/druid/pull/9308

MemoryOpenHashTable is similar to ByteBufferHashTable, with some key differences to improve speed and design simplicity:

1) Uses Memory rather than ByteBuffer for its backing storage.
2) Uses faster hashing and comparison routines (see HashTableUtils).
3) Capacity is always a power of two, allowing a simpler design and a more efficient implementation of findBucket.
4) Does not implement growability; instead, leaves that to its callers. The idea is that this removes the need for subclasses, while still giving callers flexibility in how to handle table-full scenarios.

The combination of these techniques can boost performance of per-segment groupBy processing on realistic queries by 30%+ (in some cases I tested, the gain was even more extreme: 2–4x when grouping by single long dimensions).

The idea is that, in time, users of ByteBufferHashTable will be migrated to this implementation.
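
To illustrate points 3 and 4, here is a minimal sketch of an open-addressing table with a power-of-two bucket count and no built-in growth. It is not the code from this pull request: the class name PowerOfTwoTableSketch, the long[] backing array (used here instead of Memory), the EMPTY sentinel, and the multiplicative hash are all assumptions made for the example. It only shows why a power-of-two capacity lets findBucket wrap with a bit mask instead of a modulo, and how a table-full result can be handed back to the caller.

```java
import java.util.Arrays;

/** Hypothetical sketch only; not Druid's MemoryOpenHashTable. */
final class PowerOfTwoTableSketch {
    // Assumption: this sentinel marks an unused bucket, so it cannot be stored as a key.
    private static final long EMPTY = Long.MIN_VALUE;

    private final long[] keys;
    private final int mask;     // numBuckets - 1, valid because numBuckets is a power of two
    private final int maxSize;  // maximum number of keys before insert reports "full"
    private int size;

    PowerOfTwoTableSketch(int numBuckets, double maxLoadFactor) {
        if (numBuckets <= 0 || Integer.bitCount(numBuckets) != 1) {
            throw new IllegalArgumentException("numBuckets must be a power of two");
        }
        this.keys = new long[numBuckets];
        Arrays.fill(this.keys, EMPTY);
        this.mask = numBuckets - 1;
        this.maxSize = (int) (numBuckets * maxLoadFactor);
    }

    /**
     * Returns the bucket that holds the key, or the first empty bucket on its
     * probe sequence. Returns -1 if every bucket is occupied by other keys.
     */
    int findBucket(long key) {
        int bucket = hash(key) & mask; // bit mask replaces a modulo operation
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[bucket] == EMPTY || keys[bucket] == key) {
                return bucket;
            }
            bucket = (bucket + 1) & mask; // linear probing with wrap-around
        }
        return -1;
    }

    /** Returns false when the table is full; the caller decides what to do next. */
    boolean insert(long key) {
        if (key == EMPTY) {
            throw new IllegalArgumentException("sentinel value cannot be used as a key");
        }
        int bucket = findBucket(key);
        if (bucket < 0) {
            return false;
        }
        if (keys[bucket] == key) {
            return true; // already present
        }
        if (size >= maxSize) {
            return false; // no growth here, by design
        }
        keys[bucket] = key;
        size++;
        return true;
    }

    boolean contains(long key) {
        int bucket = findBucket(key);
        return bucket >= 0 && keys[bucket] == key;
    }

    int size() {
        return size;
    }

    // Assumption: a simple multiplicative hash, chosen only for illustration.
    private static int hash(long key) {
        return (int) ((key * 0x9E3779B97F4A7C15L) >>> 32);
    }
}
```

When insert returns false, a caller could, for example, allocate a larger table and re-insert the existing keys, or flush the current contents and start over. Which strategy fits is left to the caller, which is the flexibility point 4 describes.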
