ankit0811 opened a new issue #6803: [Proposal] Kill Hadoop MR task on kill of ingestion task and resume ability for Hadoop ingestion tasks
URL: https://github.com/apache/incubator-druid/issues/6803

We plan to implement these features in two phases.

**Phase I**: _Implement the kill task feature_

Currently, killing a Hadoop ingestion task from the overlord UI does not kill the underlying MR job, so the job keeps running and wastes cluster resources.

We propose writing the job ID of the MR job to a file, `mapReduceJobId.json`, stored under the `taskBaseDir`. A separate file is needed because the MR job is executed in a different JVM, which has no way to communicate with the code that handles the kill request; writing the required information to a file is the only option we see. In this phase the file stores only the ID of the currently running job. When someone kills the ingestion task, the kill handler reads the ID of the running MR job (if any) from the file and issues a YARN kill command for it.

**Phase II**: _Implement the resume ability for Hadoop ingestion tasks_

The same file will be extended to store the job ID of each intermediate MR job. This lets us track which step the ingestion last completed and resume from the following step instead of starting from the beginning.
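
To make the Phase I idea concrete, here is a minimal sketch of the file handoff, not the actual Druid implementation. The class name `MapReduceJobIdFile`, the `{"jobId": ...}` layout, and the method names are all hypothetical; the proposal only fixes the file name and its location. The sketch also assumes that the YARN application ID can be derived from the MR job ID by swapping the `job_` prefix for `application_`, which should be verified against the target cluster before relying on it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not Druid code: persists the running MR job ID so that
// a kill request handled in a different JVM can find it.
final class MapReduceJobIdFile {
  static final String FILE_NAME = "mapReduceJobId.json";

  // Assumed file layout {"jobId":"job_<timestamp>_<sequence>"}; the proposal does not specify one.
  private static final Pattern JOB_ID =
      Pattern.compile("\"jobId\"\\s*:\\s*\"(job_\\d+_\\d+)\"");

  // Called from the JVM that submits the MR job: record the running job ID.
  static Path write(Path taskBaseDir, String jobId) {
    try {
      Files.createDirectories(taskBaseDir);
      Path file = taskBaseDir.resolve(FILE_NAME);
      String json = "{\"jobId\":\"" + jobId + "\"}";
      Files.write(file, json.getBytes(StandardCharsets.UTF_8));
      return file;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Called from the kill path: returns the recorded job ID, if any.
  static Optional<String> read(Path taskBaseDir) {
    Path file = taskBaseDir.resolve(FILE_NAME);
    if (!Files.exists(file)) {
      return Optional.empty();
    }
    try {
      String json = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
      Matcher m = JOB_ID.matcher(json);
      return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Builds (but does not run) a YARN kill command for the job.
  // Assumption: the application ID is the job ID with its prefix swapped.
  static List<String> killCommand(String jobId) {
    String applicationId = "application_" + jobId.substring("job_".length());
    return Arrays.asList("yarn", "application", "-kill", applicationId);
  }

  public static void main(String[] args) {
    Path taskBaseDir = Paths.get(System.getProperty("java.io.tmpdir"), "druid-task-example");
    write(taskBaseDir, "job_1546300800000_0001");
    read(taskBaseDir).ifPresent(id -> System.out.println(String.join(" ", killCommand(id))));
  }
}
```

The kill path would pass the resulting argument list to something like `ProcessBuilder`. Returning an empty `Optional` when the file is missing covers the "if any" case, where the ingestion task is killed before an MR job has started.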
