[GitHub] [druid] gianm commented on issue #9689: groupBy query: limit push down to segment scan is poor performance

GitBox Tue, 14 Apr 2020 12:50:26 -0700

gianm commented on issue #9689: groupBy query: limit push down to segment scan 
is poor performance
URL: https://github.com/apache/druid/issues/9689#issuecomment-613646993
 
 
   > it sucks that we need to implement grow-ability to save the cost of zeroing 
out when we have already allocated all the memory, or is there any other 
advantage?
   
   The problem is that the memory is not initialized when we allocate it, so it 
contains garbage, and therefore we need to initialize the parts we're going to 
use. That initialization can take a while if your buffer is big (like a 
gigabyte, as some people configure).
   
   > Instead, if we reserved numBucket bytes at the start of buffer and used 
those for marking.
   
   I bet it would be faster, especially if we use the Memory.clear API. I did 
some work to migrate some of the groupBy code to use Memory instead of 
ByteBuffer (#9308, #9314; see also MemoryBenchmark) partially motivated by 
performance. If we moved this part too it should help with initialization speed.
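
   To make the quoted idea concrete, here is a minimal sketch of reserving 
numBuckets flag bytes at the start of the buffer, so that only those bytes 
need zeroing while the bucket payloads stay uninitialized. It uses plain 
java.nio.ByteBuffer to stay dependency-free; the class name, layout, and 
method names are hypothetical and are not Druid's actual code. With Memory, 
the zeroing loop would presumably become a single bulk clear over the flag 
region.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not Druid's actual hash table layout.
// Layout: [numBuckets flag bytes][numBuckets * bucketSize payload bytes]
final class FlaggedBucketTable {
    private final ByteBuffer buffer;
    private final int numBuckets;
    private final int bucketSize;

    FlaggedBucketTable(ByteBuffer buffer, int numBuckets, int bucketSize) {
        if (numBuckets < 1 || bucketSize < Long.BYTES) {
            throw new IllegalArgumentException("need >= 1 bucket of >= 8 bytes");
        }
        if (buffer.capacity() < numBuckets + (long) numBuckets * bucketSize) {
            throw new IllegalArgumentException("buffer too small");
        }
        this.buffer = buffer;
        this.numBuckets = numBuckets;
        this.bucketSize = bucketSize;
        // Zero only the flag region; payload bytes may still hold garbage.
        for (int i = 0; i < numBuckets; i++) {
            buffer.put(i, (byte) 0);
        }
    }

    boolean isUsed(int bucket) {
        checkIndex(bucket);
        return buffer.get(bucket) != 0;
    }

    void putLong(int bucket, long value) {
        checkIndex(bucket);
        buffer.putLong(payloadOffset(bucket), value);
        buffer.put(bucket, (byte) 1); // mark the bucket as used
    }

    long getLong(int bucket) {
        if (!isUsed(bucket)) {
            throw new IllegalStateException("bucket " + bucket + " is unused");
        }
        return buffer.getLong(payloadOffset(bucket));
    }

    private int payloadOffset(int bucket) {
        return numBuckets + bucket * bucketSize;
    }

    private void checkIndex(int bucket) {
        if (bucket < 0 || bucket >= numBuckets) {
            throw new IndexOutOfBoundsException("bucket " + bucket);
        }
    }
}
```

   Initialization cost here scales with numBuckets rather than with 
numBuckets * bucketSize, which is the saving the suggestion is after.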
   
   But growability would probably help even more. It should be relatively 
straightforward for AlternatingByteBufferHashTable: each time we grow we swap 
into the other half of the buffer.
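
   As a rough sketch of growing by swapping halves: the table below starts 
small in one half of the buffer and, on each grow, clears only the flags of a 
table twice the size in the other half, rehashes into it, and switches over. 
Everything here (the class, the int-key set, the 0.7 load factor, the bucket 
layout) is an assumption for illustration and is not how 
AlternatingByteBufferHashTable is actually written.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: a set of int keys with linear probing.
// Bucket layout: [1 used-flag byte][4 key bytes].
final class AlternatingHalvesIntSet {
    private static final int BUCKET_SIZE = 1 + Integer.BYTES;

    private final ByteBuffer buffer;
    private final int halfSize;
    private int base;    // start offset of the active half
    private int buckets; // bucket count of the active table
    private int size;

    AlternatingHalvesIntSet(ByteBuffer buffer, int initialBuckets) {
        this.buffer = buffer;
        this.halfSize = buffer.capacity() / 2;
        if (initialBuckets < 1 || (long) initialBuckets * BUCKET_SIZE > halfSize) {
            throw new IllegalArgumentException("bad initial bucket count");
        }
        this.base = 0;
        this.buckets = initialBuckets;
        clearFlags(base, buckets); // initialize only what we use right now
    }

    int size() { return size; }

    int bucketCount() { return buckets; }

    boolean add(int key) {
        // Keep the load factor at or below 0.7 (assumed threshold).
        if ((long) (size + 1) * 10 > (long) buckets * 7) {
            grow();
        }
        boolean added = insert(base, buckets, key);
        if (added) {
            size++;
        }
        return added;
    }

    boolean contains(int key) {
        int i = slot(key, buckets);
        while (buffer.get(base + i * BUCKET_SIZE) != 0) {
            if (buffer.getInt(base + i * BUCKET_SIZE + 1) == key) {
                return true;
            }
            i = (i + 1) % buckets;
        }
        return false;
    }

    private void grow() {
        long newBuckets = (long) buckets * 2;
        if (newBuckets * BUCKET_SIZE > halfSize) {
            throw new IllegalStateException("buffer half is full");
        }
        int newBase = (base == 0) ? halfSize : 0; // swap into the other half
        clearFlags(newBase, (int) newBuckets);
        for (int i = 0; i < buckets; i++) {
            int off = base + i * BUCKET_SIZE;
            if (buffer.get(off) != 0) {
                insert(newBase, (int) newBuckets, buffer.getInt(off + 1));
            }
        }
        base = newBase;
        buckets = (int) newBuckets;
    }

    // Returns true if the key was newly inserted, false if already present.
    private boolean insert(int tableBase, int tableBuckets, int key) {
        int i = slot(key, tableBuckets);
        while (buffer.get(tableBase + i * BUCKET_SIZE) != 0) {
            if (buffer.getInt(tableBase + i * BUCKET_SIZE + 1) == key) {
                return false;
            }
            i = (i + 1) % tableBuckets;
        }
        buffer.put(tableBase + i * BUCKET_SIZE, (byte) 1);
        buffer.putInt(tableBase + i * BUCKET_SIZE + 1, key);
        return true;
    }

    private void clearFlags(int tableBase, int tableBuckets) {
        for (int i = 0; i < tableBuckets; i++) {
            buffer.put(tableBase + i * BUCKET_SIZE, (byte) 0);
        }
    }

    private static int slot(int key, int tableBuckets) {
        return Math.floorMod(key * 0x9E3779B9, tableBuckets);
    }
}
```

   The point of the sketch is that a query touching few keys only ever 
initializes a few small tables, so the total zeroing work tracks the data 
rather than the configured buffer size.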


