jon-wei commented on issue #7962: Killing hadoop ingestion task does not kill spawned Hadoop MR task

URL: https://github.com/apache/incubator-druid/issues/7962#issuecomment-511590864

@a2l007 @ankit0811 @clintropolis

I was testing the Hadoop job kill functionality with PR #8085 from @clintropolis applied, which allowed the logs of the graceful shutdown for `HadoopIndexTask` to be shown. I observed inconsistent behavior when killing a Hadoop task using the overlord shutdown API:

- Sometimes killing the task worked and the corresponding Hadoop job was killed successfully. Issuing the task shutdown right when the job is submitted seems to allow the kill to succeed; waiting until the job has been running for some time seems to lead to the condition described below.
- Other times, the kill request would not be processed until after the Hadoop job finished, and the kill would then fail because the job was already complete, as shown in the following logs:

```
[org.apache.druid.segment.writeout.TmpFileSegmentWriteOutMediumFactory@5dad4df5]
2019-07-15T21:52:13,212 INFO [Thread-24] org.apache.druid.guice.JsonConfigurator - Loaded class[class org.apache.druid.indexer.HadoopKerberosConfig] from props[druid.hadoop.security.kerberos.] as [org.apache.druid.indexer.HadoopKerberosConfig@0]
2019-07-15T21:52:13,214 INFO [Thread-24] org.apache.druid.guice.JsonConfigurator - Loaded class[class org.apache.druid.storage.hdfs.HdfsDataSegmentPusherConfig] from props[druid.storage.] as [org.apache.druid.storage.hdfs.HdfsDataSegmentPusherConfig@250027e5]
2019-07-15T21:52:13,215 INFO [Thread-24] org.apache.druid.storage.hdfs.HdfsDataSegmentPusher - Configured HDFS as deep storage
2019-07-15T21:52:13,313 INFO [Thread-24] org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2019-07-15T21:52:31,795 INFO [task-runner-0-priority-0] org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-07-15T21:52:31,796 INFO [Thread-24] org.apache.hadoop.mapred.ClientServiceDelegate - Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
2019-07-15T21:52:31,864 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1563225314625_0012 running in uber mode : false
2019-07-15T21:52:31,864 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - map 100% reduce 100%
2019-07-15T21:52:31,885 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_1563225314625_0012 completed successfully
Could not kill the job job_1563225314625_0012, as it has already succeeded.
2019-07-15T21:52:31,895 INFO [Thread-24] org.apache.druid.indexing.common.task.HadoopIndexTask - Tried killing job: [job_1563225314625_0012], status: [Fail]
2019-07-15T21:52:31,895 INFO [Thread-24] org.apache.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_wikipedia_2019-07-15T21:51:41.711Z] status changed to [FAILED].
```
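Both outcomes above are consistent with a race on the MR job's lifecycle: YARN only honors a kill while the job is still running, so a shutdown that is delayed until after the job completes can only fail, exactly as the "Could not kill the job ..., as it has already succeeded" line shows. A minimal, Hadoop-free sketch of that state logic (the class, enum, and method names here are hypothetical illustrations, not Druid or Hadoop APIs):

```java
// Sketch of the kill race observed above; all names are hypothetical.
public class KillRaceSketch {
    enum JobState { RUNNING, SUCCEEDED, KILLED }

    // Mirrors the observed YARN behavior: a job can only transition to
    // KILLED while it is still running. A kill request that lands after
    // the job has reached a terminal state is a no-op, and the caller
    // sees the kill attempt report failure.
    static JobState tryKill(JobState stateWhenKillLands) {
        if (stateWhenKillLands == JobState.RUNNING) {
            return JobState.KILLED; // early shutdown: the MR job dies
        }
        return stateWhenKillLands;  // late shutdown: job already finished
    }

    public static void main(String[] args) {
        // Kill issued right after submission, while the job is running.
        System.out.println(tryKill(JobState.RUNNING));
        // Kill processed only after the job completed successfully.
        System.out.println(tryKill(JobState.SUCCEEDED));
    }
}
```

Under this model, the nondeterminism reduces to whichever state the job is in when the delayed kill request is finally processed.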
