sanastas commented on issue #5698: Oak: New Concurrent Key-Value Map URL: https://github.com/apache/incubator-druid/issues/5698#issuecomment-507994908

In the meantime, we would like to share some insights we gained while playing with IncrementalIngestionBenchmark. We continued the experiments that were started and presented above under the "IncrementalIngestionBenchmark" title. We compare Druid's native incremental index with the newly suggested OakIncrementalIndex (data kept off-heap). The data distribution/generation is exactly as in IncrementalIngestionBenchmark; we just insert more rows.

This time we inserted 6 million rows (about 8GB of data) while giving 24GB of RAM. We chose 24GB because it is close to the lowest amount that allows Druid's native IncrementalIndex to run properly. Even taking into consideration that in IncrementalIngestionBenchmark the rows are prepared before the benchmark measurement starts, and thus already occupy a big chunk of on-heap memory, a 3x memory requirement sounds like a lot. Druid's incremental index gets 24GB of on-heap memory; OakIncrementalIndex always gets 16GB on-heap and 8GB off-heap (24GB of RAM in total). The results can be seen in the file below. The graph shows throughput (number of operations per second), so bigger is better. OakIncrementalIndex performs about twice as fast. [Ingestion Throughput 6M rows (8GB data) ingested.pdf](https://github.com/apache/incubator-druid/files/3353954/Ingestion.Throughput.6M.rows.8GB.data.ingested.pdf)

To stress the memory overhead requirement, we ran yet another experiment, this time inserting 7 million rows, which amounts to about 9GB of data. We gradually increased the memory budget and present the throughput as a function of the total RAM used. Results are in the file below. OakIncrementalIndex's off-heap memory requirement was always 9GB, as that is how much data is written there. We started by giving 24GB of total RAM, as this is the point where OakIncrementalIndex was able to operate, although its throughput was very low.
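For readers unfamiliar with the JVM memory split mentioned above (16GB on-heap plus 8GB off-heap): a minimal sketch of how such a budget might be configured. The `-Xmx`/`-Xms` and `-XX:MaxDirectMemorySize` flags are standard HotSpot options; the classpath and main class name below are placeholders, not the actual Druid benchmark invocation.

```shell
# Hypothetical invocation (classpath and class name are placeholders):
# 16GB heap for regular Java objects, 8GB of direct (off-heap) memory
# for Oak's buffers -- 24GB of RAM in total.
java -Xmx16g -Xms16g -XX:MaxDirectMemorySize=8g \
     -cp <benchmark-classpath> IncrementalIngestionBenchmark
```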
Druid's native IncrementalIndex was unable to operate until it was allowed 28GB of on-heap memory. [Ingestion 9GB data into Druid.pdf](https://github.com/apache/incubator-druid/files/3353955/Ingestion.9GB.data.into.Druid.pdf) Does the question of metadata memory overhead concern you? Also, would you be interested in working with bigger IncrementalIndexes, in order to later have fewer flushes to disk (persists), causing fewer merges and thus higher performance?
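The metric plotted in both PDFs is ingestion throughput, i.e. rows (operations) per second. A minimal sketch of how such a number is computed; the loop body is a stand-in for the real per-row insert into the index (not Druid's actual benchmark code):

```java
public class IngestionThroughput {
    // Throughput in rows/second, given a row count and elapsed wall time
    // in nanoseconds.
    static double throughput(long rows, long elapsedNanos) {
        return rows / (elapsedNanos / 1_000_000_000.0);
    }

    public static void main(String[] args) {
        long rows = 6_000_000L;
        long start = System.nanoTime();
        long sink = 0;
        for (long i = 0; i < rows; i++) {
            sink += i; // placeholder for index.add(row) in the real benchmark
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("rows/sec: " + throughput(rows, elapsed) + " " + sink);
    }
}
```

With this definition, ingesting 6 million rows in 3 seconds would report a throughput of 2 million rows/sec; "about twice as fast" in the graphs means roughly double this ratio for the same row count.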
