[GitHub] [incubator-druid] dclim commented on issue #7900: Develop a new BigIndexer process for running ingestion tasks

GitBox Wed, 26 Jun 2019 12:01:52 -0700

dclim commented on issue #7900: Develop a new BigIndexer process for running 
ingestion tasks
URL: 
https://github.com/apache/incubator-druid/issues/7900#issuecomment-506004288
 
 
   @jon-wei thanks! Two other things that I was thinking about that might be 
interesting, or might be out of scope for this proposal:
   
   - A GC watchdog that tries to save an indexer that's about to die 
unnecessarily. This has been alleviated somewhat by using `maxBytesInMemory` 
instead of `maxRowsInMemory`, but I think it can still be an issue because query 
heap usage is not strictly bounded and unexpected query load could cause OOM 
issues (an effect that would be compounded if you have multiple tasks running 
in the same JVM). Basically, indexing tasks often die of heap exhaustion 
unnecessarily, meaning that the data in the heap didn't _need_ to be there, and 
if a spill had been triggered, the indexing task could have continued. I'm 
thinking of some kind of watchdog thread that uses `GarbageCollectorMXBean`, 
similar to `JvmMonitor`: if a threshold on the number of GCs or on GC time in a 
given period is hit, the watchdog forces a global spill to try to save the 
indexer.
   
   
   - Some intelligence around task assignment that's better than random. An 
easy win here would be to weight known resource-light tasks (Hadoop indexing 
task, parallel master task?) lower than other tasks. We might also want to 
consider query-serving realtime tasks separately from batch jobs and optimize 
for those by distributing them evenly across the indexers, so that they're not 
all running on the same machine and sharing processing threads/buffers while 
the buffers on other machines sit unused because those machines are running 
only batch jobs. I think being able to respond to realtime queries quickly is 
typically more important than ingestion throughput. Later on, it would be 
pretty interesting to look at historical task report information to understand 
which datasources are heavy (in terms of both query and ingestion volume) and 
which are light, and to optimize further using this information.
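
To make the GC watchdog idea from the first bullet concrete, here is a minimal sketch. Everything in it is hypothetical: `GcWatchdog`, `forJvm`, and the `forceSpill` callback are illustrative names, not existing Druid classes, and the thresholds would need tuning. The only real API used is `java.lang.management.GarbageCollectorMXBean`. The GC counters are injected as suppliers so the threshold logic can be exercised without a real GC storm.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical sketch of a GC watchdog; not an existing Druid class. */
final class GcWatchdog {
    private final LongSupplier gcTimeMillis;  // cumulative GC time so far
    private final LongSupplier gcCount;       // cumulative number of collections so far
    private final long maxGcMillisPerWindow;  // assumed threshold, needs tuning
    private final long maxGcCountPerWindow;   // assumed threshold, needs tuning
    private final Runnable forceSpill;        // assumed hook that persists in-memory data
    private long lastGcMillis;
    private long lastGcCount;

    GcWatchdog(LongSupplier gcTimeMillis, LongSupplier gcCount,
               long maxGcMillisPerWindow, long maxGcCountPerWindow,
               Runnable forceSpill) {
        this.gcTimeMillis = gcTimeMillis;
        this.gcCount = gcCount;
        this.maxGcMillisPerWindow = maxGcMillisPerWindow;
        this.maxGcCountPerWindow = maxGcCountPerWindow;
        this.forceSpill = forceSpill;
        this.lastGcMillis = gcTimeMillis.getAsLong();
        this.lastGcCount = gcCount.getAsLong();
    }

    /** Builds a watchdog that reads the totals across all collectors of this JVM. */
    static GcWatchdog forJvm(long maxGcMillis, long maxGcCount, Runnable forceSpill) {
        List<GarbageCollectorMXBean> beans = ManagementFactory.getGarbageCollectorMXBeans();
        // Both getters return -1 when the value is undefined, so clamp at 0.
        return new GcWatchdog(
            () -> beans.stream().mapToLong(b -> Math.max(0, b.getCollectionTime())).sum(),
            () -> beans.stream().mapToLong(b -> Math.max(0, b.getCollectionCount())).sum(),
            maxGcMillis, maxGcCount, forceSpill);
    }

    /** Call once per window; returns true if a spill was forced. */
    synchronized boolean check() {
        long nowMillis = gcTimeMillis.getAsLong();
        long nowCount = gcCount.getAsLong();
        long deltaMillis = nowMillis - lastGcMillis;
        long deltaCount = nowCount - lastGcCount;
        lastGcMillis = nowMillis;
        lastGcCount = nowCount;
        if (deltaMillis >= maxGcMillisPerWindow || deltaCount >= maxGcCountPerWindow) {
            forceSpill.run();
            return true;
        }
        return false;
    }
}
```

One way to drive it would be a `ScheduledExecutorService` calling `watchdog::check` at a fixed rate, with the rate defining the window. Whether a forced spill can free heap quickly enough once the JVM is already thrashing is an open question this sketch does not answer.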
   
   Otherwise, I'm +1 on this proposal. I'm assuming that the new indexer will 
have enable/disable routes that toggle whether it receives new task 
assignments, similar to the MM, to support rolling updates.
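
As a rough illustration of the assignment ideas above (weighting light tasks lower, spreading realtime tasks, and skipping disabled indexers), here is a small sketch. All names (`WeightedAssigner`, `TaskKind`, `Indexer`) and the weight values are hypothetical and chosen only for the example; this is not how Druid's overlord selects workers.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of non-random task assignment; not Druid's actual strategy. */
final class WeightedAssigner {
    /** LIGHT stands for tasks that mostly wait on external work, e.g. a Hadoop indexing task. */
    enum TaskKind { REALTIME, BATCH, LIGHT }

    static final class Indexer {
        final String host;
        boolean enabled = true;  // toggled by the assumed enable/disable routes
        double load;             // sum of the weights of assigned tasks
        int realtimeTasks;       // number of query-serving realtime tasks

        Indexer(String host) {
            this.host = host;
        }
    }

    /** Assumed weights: a light task counts for a tenth of a normal one. */
    static double weight(TaskKind kind) {
        return kind == TaskKind.LIGHT ? 0.1 : 1.0;
    }

    /** Picks an enabled indexer and records the assignment; empty if none is enabled. */
    static Optional<Indexer> assign(List<Indexer> indexers, TaskKind kind) {
        // Realtime tasks are spread by realtime count first so they do not pile up
        // on one machine; everything else goes to the least-loaded indexer.
        Comparator<Indexer> order = kind == TaskKind.REALTIME
            ? Comparator.<Indexer>comparingInt(i -> i.realtimeTasks)
                        .thenComparingDouble(i -> i.load)
            : Comparator.<Indexer>comparingDouble(i -> i.load);

        Optional<Indexer> chosen = indexers.stream().filter(i -> i.enabled).min(order);
        chosen.ifPresent(i -> {
            i.load += weight(kind);
            if (kind == TaskKind.REALTIME) {
                i.realtimeTasks++;
            }
        });
        return chosen;
    }
}
```

Disabling an indexer here only stops new assignments; tasks already running on it are left alone, which is the behavior a rolling update would need.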


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
