sanastas commented on issue #5698: Oak: New Concurrent Key-Value Map 
URL: 
https://github.com/apache/incubator-druid/issues/5698#issuecomment-507994908
 
 
   In the meantime, we would like to share some insights we gained while experimenting 
with IncrementalIngestionBenchmark. We continued the experiments presented above 
under the “IncrementalIngestionBenchmark” title, comparing Druid’s native 
incremental index with the newly suggested OakIncrementalIndex (data kept 
off-heap). The data distribution/generation is exactly as in 
IncrementalIngestionBenchmark; we just insert more rows.
   
   This time we inserted 6 million rows (about 8GB of data) while providing 24GB of 
RAM. We chose 24GB because it is close to the lowest amount that allows Druid’s 
native IncrementalIndex to run properly. Even taking into account that in 
IncrementalIngestionBenchmark the rows are prepared before the benchmark 
measurement starts, and thus already occupy a big chunk of on-heap memory, a 3x 
memory requirement sounds like a lot… Druid’s incremental index gets all 24GB as 
on-heap memory. OakIncrementalIndex always gets 16GB on-heap and 8GB off-heap 
(24GB of RAM in total). The results can be seen in the file below. The graph 
shows throughput (operations per second), so bigger is better. 
OakIncrementalIndex performs about twice as fast.
   
   [Ingestion Throughput      6M rows (8GB data) 
ingested.pdf](https://github.com/apache/incubator-druid/files/3353954/Ingestion.Throughput.6M.rows.8GB.data.ingested.pdf)
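   To make the memory split above concrete, here is a minimal sketch of the JVM 
options that would express these two configurations. The flag values come from the 
numbers in this comment; the variable names are illustrative, and the exact 
benchmark invocation is not shown here.

```shell
# Sketch only: JVM memory settings matching the two setups described above.

# Native IncrementalIndex: all 24GB of RAM given to the Java heap.
JVM_OPTS_NATIVE="-Xms24g -Xmx24g"

# OakIncrementalIndex: 16GB Java heap plus 8GB of direct (off-heap) memory,
# 24GB of RAM in total.
JVM_OPTS_OAK="-Xms16g -Xmx16g -XX:MaxDirectMemorySize=8g"
```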
   
   
   To further stress the memory overhead requirement, we ran yet another 
experiment, this time inserting 7 million rows, which amounts to about 9GB of 
data. We gradually increased the memory budget and present the throughput as a 
function of total RAM used. Results are in the file below. OakIncrementalIndex’s 
off-heap memory requirement was always 9GB, since that is the size of the data 
written there. We started by giving 24GB of total RAM, as this is the point where 
OakIncrementalIndex was able to operate, although its throughput was very low. 
Druid’s native IncrementalIndex was unable to operate until it was allowed to use 
28GB of on-heap memory.
   
   [Ingestion 9GB data into 
Druid.pdf](https://github.com/apache/incubator-druid/files/3353955/Ingestion.9GB.data.into.Druid.pdf)
   
   
   
   Does the metadata memory overhead concern you? Also, would you be interested in 
working with bigger IncrementalIndexes, so that fewer flushes to disk (persists) 
are needed later, causing fewer merges and thus higher performance?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
