AKarbas opened a new issue #9996: URL: https://github.com/apache/druid/issues/9996
Hi. I'm on Druid 0.16.0-incubating on OpenJDK 8u162-jre. I had an overly large datasource with ~170K segments, ingested from Kafka. The problem was that the data was wrongly timestamped, so each run of the ingestion task created hundreds, if not thousands, of segments. When I realized this, in an attempt to clean up, I stopped the supervisor, dropped the datasource, and issued a kill task to clean it up all the way -- all through the Druid console.

Now the kill task fails, with this as its last log line:

```
Terminating due to java.lang.OutOfMemoryError: Java heap space
```

Some logged configurations (tell me what to add):

```
druid.indexer.runner.javaOpts: -server -Xms1g -Xmx1g -XX:MaxDirectMemorySize=1536m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+ExitOnOutOfMemoryError -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager
druid.indexer.fork.property.druid.processing.buffer.sizeBytes: 268435456
druid.indexer.fork.property.druid.processing.numMergeBuffers: 2
druid.indexer.fork.property.druid.processing.numThreads: 1
```

Any ideas? I was thinking that submitting kill tasks with time chunks smaller than `1000-...` to `3000-...` could reduce the number of segments to kill in each run, but shouldn't this be handled automatically?

Lastly, if this in fact is a bug that still exists in the latest version, I'd be happy to submit a PR to fix it if you point me in the right direction.

Cheers
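For reference, the smaller-time-chunks workaround would mean building one kill-task spec per chunk and submitting them one at a time. A minimal sketch, assuming a hypothetical datasource name and a 100-year chunk size (the payload shape follows Druid's documented kill-task spec; actually submitting would be a POST to the Overlord at `/druid/indexer/v1/task`):

```python
# Sketch: split the huge 1000-01-01/3000-01-01 range into smaller kill-task
# intervals. The dataSource name "my_datasource" and the 100-year chunk size
# are illustrative assumptions, not values from the original setup.

def build_kill_tasks(data_source, start_year, end_year, chunk_years):
    """Return one Druid kill-task spec per chunk of `chunk_years` years."""
    tasks = []
    for year in range(start_year, end_year, chunk_years):
        interval = "%04d-01-01/%04d-01-01" % (year, min(year + chunk_years, end_year))
        tasks.append({
            "type": "kill",
            "dataSource": data_source,
            "interval": interval,
        })
    return tasks

# Each spec would be POSTed to the Overlord (/druid/indexer/v1/task),
# waiting for one task to complete before submitting the next.
tasks = build_kill_tasks("my_datasource", 1000, 3000, 100)
print(len(tasks))            # 20 chunks of 100 years each
print(tasks[0]["interval"])  # 1000-01-01/1100-01-01
```

Submitting the chunks sequentially keeps each task's segment count (and heap footprint) bounded, which is the manual version of what the issue suggests Druid could do automatically.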
