dclim commented on issue #7900: Develop a new BigIndexer process for running ingestion tasks
URL: https://github.com/apache/incubator-druid/issues/7900#issuecomment-506004288

@jon-wei thanks! Two other things that I was thinking about that might be interesting, or might be out of scope for this proposal:

- A GC watchdog that tries to save an indexer that is about to die unnecessarily. This has been alleviated somewhat by using `maxBytesInMemory` instead of `maxRowsInMemory`, but I think it can still be an issue because query heap usage is not strictly bounded and unexpected query load could cause OOM issues (an effect that would be compounded if multiple tasks run in the same JVM). Basically, indexing tasks often die of heap exhaustion unnecessarily, meaning that the data in the heap didn't _need_ to be there, and if a spill had been triggered, the indexing task could have continued. I'm thinking of some kind of watchdog thread that uses `GarbageCollectorMXBean`, similar to `JvmMonitor`; if a threshold on the number of GCs or on GC time in a given period is hit, the watchdog forces a global spill to try to save the indexer.
- Some intelligence around task assignment that's better than random. An easy win here would be to weight known resource-light tasks (Hadoop indexing task, parallel master task?) lower than other tasks. We might also want to consider query-serving realtime tasks separately from batch jobs and optimize for those, distributing them evenly across the indexers so that they're not all running on the same machine and sharing processing threads/buffers while the buffers sit unused on other machines that are running only batch jobs. I think being able to respond to realtime queries quickly is typically more important than ingestion throughput.
Later on, it would be pretty interesting to look at historical task report information to understand which datasources are heavy (in terms of both query and ingestion volume) and which are light, and to optimize further using this information.

Otherwise, I'm +1 on this proposal, assuming that the new indexer will have enable/disable routes that toggle whether it receives new task assignments, similar to the MM, to support rolling updates.
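
To make the GC watchdog idea a bit more concrete, here is a rough sketch of what I have in mind. It is only an illustration under assumptions: `GcWatchdog`, its thresholds, and the `forceSpill` callback are hypothetical names, not existing Druid classes, and how a global spill would actually be triggered across tasks is left open. The only real APIs used are `GarbageCollectorMXBean` and the standard scheduled executor.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the class name, thresholds and callback are illustrative only.
class GcWatchdog implements Runnable {
    private final long maxGcCountPerPeriod;   // trip if this many GCs happen in one period
    private final long maxGcMillisPerPeriod;  // trip if this much GC time accrues in one period
    private final Runnable forceSpill;        // assumed hook that asks all tasks to persist
    private long lastCount = -1;              // -1 means no baseline sample yet
    private long lastMillis = -1;

    GcWatchdog(long maxGcCountPerPeriod, long maxGcMillisPerPeriod, Runnable forceSpill) {
        this.maxGcCountPerPeriod = maxGcCountPerPeriod;
        this.maxGcMillisPerPeriod = maxGcMillisPerPeriod;
        this.forceSpill = forceSpill;
    }

    // Takes cumulative GC totals and compares the delta since the previous
    // sample against the thresholds. The first sample only sets the baseline.
    synchronized boolean observe(long totalCount, long totalMillis) {
        boolean tripped = lastCount >= 0
            && (totalCount - lastCount >= maxGcCountPerPeriod
                || totalMillis - lastMillis >= maxGcMillisPerPeriod);
        lastCount = totalCount;
        lastMillis = totalMillis;
        if (tripped) {
            forceSpill.run();
        }
        return tripped;
    }

    // One polling step: sum the totals over all collectors in this JVM.
    @Override
    public void run() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            millis += Math.max(0, gc.getCollectionTime());
        }
        observe(count, millis);
    }

    // Runs the watchdog once per period on a daemon thread.
    static ScheduledExecutorService start(GcWatchdog watchdog, long periodSeconds) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "gc-watchdog");
            t.setDaemon(true);
            return t;
        });
        exec.scheduleAtFixedRate(watchdog, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return exec;
    }
}
```

Something like `GcWatchdog.start(new GcWatchdog(20, 5_000, spillAllTasks), 10)` would then check every 10 seconds, where `spillAllTasks` stands in for whatever mechanism ends up forcing the persist. The thresholds here are placeholders; sensible values would need to be tuned against real workloads.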
