gianm commented on issue #9689: groupBy query: limit push down to segment scan is poor performance
URL: https://github.com/apache/druid/issues/9689#issuecomment-613646993

> it sucks that we need to implement grow-ability to save cost of zeroing out where we already have allocated all the memory or is there any other advantage?

The problem is that the memory is not initialized when we allocate it, so it has garbage in it, and therefore we need to initialize the parts we're going to use. And initialization can take a while if your buffer size is big (like a gigabyte, as some people do).

> Instead, if we reserved numBucket bytes at the start of buffer and used those for marking.

I bet it would be faster, especially if we use the Memory.clear API. I did some work to migrate some of the groupBy code to use Memory instead of ByteBuffer (#9308, #9314; see also MemoryBenchmark), partially motivated by performance. If we moved this part too, it should help with initialization speed. But growability would probably help even more. It should be relatively straightforward for AlternatingByteBufferHashTable: each time we grow, we swap into the other half of the buffer.
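To make the growability idea concrete, here is a minimal sketch of a table that starts small, initializes only the buckets it is about to use, and swaps into the other half of the buffer each time it grows. This is not Druid's AlternatingByteBufferHashTable: the class name, the 8-byte bucket layout (a 4-byte used flag plus a 4-byte int key), the hash function, and the 0.5 load factor are all assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration only; not taken from the Druid codebase.
final class AlternatingTableSketch {
    private static final int BUCKET_BYTES = 8;  // assumed layout: int used flag + int key

    private final ByteBuffer buffer;   // contents are assumed to be garbage on entry
    private final int halfBytes;       // size of each half of the buffer
    private int base;                  // start offset of the half currently in use
    private int numBuckets;
    private int size;
    private long bytesInitialized;     // bookkeeping for the demo, not needed in practice

    AlternatingTableSketch(ByteBuffer buffer, int initialBuckets) {
        this.buffer = buffer;
        this.halfBytes = buffer.capacity() / 2;
        this.base = 0;
        this.numBuckets = initialBuckets;
        clearFlags(base, numBuckets);
    }

    // Zero only the used flags of the buckets we are about to use.
    private void clearFlags(int start, int buckets) {
        for (int i = 0; i < buckets; i++) {
            buffer.putInt(start + i * BUCKET_BYTES, 0);
        }
        bytesInitialized += (long) buckets * Integer.BYTES;
    }

    // Linear probing; returns true if the key was newly inserted.
    private boolean insert(int start, int buckets, int key) {
        int i = Math.floorMod(key * 0x9E3779B9, buckets);
        while (true) {
            int offset = start + i * BUCKET_BYTES;
            if (buffer.getInt(offset) == 0) {
                buffer.putInt(offset, 1);
                buffer.putInt(offset + 4, key);
                return true;
            }
            if (buffer.getInt(offset + 4) == key) {
                return false;
            }
            i = (i + 1) % buckets;
        }
    }

    // Double the bucket count, rehashing into the other half of the buffer.
    private void grow() {
        int newBuckets = numBuckets * 2;
        if ((long) newBuckets * BUCKET_BYTES > halfBytes) {
            throw new IllegalStateException("table cannot grow further");
        }
        int newBase = (base == 0) ? halfBytes : 0;
        clearFlags(newBase, newBuckets);
        for (int i = 0; i < numBuckets; i++) {
            int offset = base + i * BUCKET_BYTES;
            if (buffer.getInt(offset) != 0) {
                insert(newBase, newBuckets, buffer.getInt(offset + 4));
            }
        }
        base = newBase;
        numBuckets = newBuckets;
    }

    boolean add(int key) {
        if ((size + 1) * 2 > numBuckets) {  // keep the load factor at or below 0.5
            grow();
        }
        if (insert(base, numBuckets, key)) {
            size++;
            return true;
        }
        return false;
    }

    boolean contains(int key) {
        int i = Math.floorMod(key * 0x9E3779B9, numBuckets);
        while (buffer.getInt(base + i * BUCKET_BYTES) != 0) {
            if (buffer.getInt(base + i * BUCKET_BYTES + 4) == key) {
                return true;
            }
            i = (i + 1) % numBuckets;
        }
        return false;
    }

    int size() { return size; }
    long bytesInitialized() { return bytesInitialized; }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(1 << 16);
        Arrays.fill(buf.array(), (byte) 0x7F);  // simulate uninitialized (garbage) memory
        AlternatingTableSketch table = new AlternatingTableSketch(buf, 4);
        for (int k = 0; k < 1000; k++) {
            table.add(k);
        }
        System.out.println(table.size() + " keys, " + table.bytesInitialized()
            + " bytes initialized out of " + buf.capacity());
    }
}
```

The point of the sketch is that initialization cost tracks the number of buckets actually used rather than the full buffer size, so a query that touches few groups never pays to clear a gigabyte-sized buffer. The trade-off is that each table can occupy at most half of the buffer.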
