hzyangkai commented on PR #13030: URL: https://github.com/apache/dolphinscheduler/pull/13030#issuecomment-1329986034
Hi @ruanwenjun, thanks for your review.

If a shell task uses spark-submit to submit a job to YARN, the appid is printed in the log as soon as the job is submitted successfully, regardless of whether the process has exited. The implementation could be straightforward: in the thread that runs WorkerTaskExecuteRunnable#executeTask, first wait for YARN to accept the application and parse the appid from the log, then block until the process ends. However, some shell tasks are simple scripts like `echo hello`, or DataX tasks, which have no appid at all. In my opinion, we should consider these cases together instead of focusing only on how to get the appid.

When the master crashes and then the worker crashes, if the worker crashes without reaching the shutdown function, tasks of type 1 and 2 (not just the "submitting spark tasks in the shell task" case you mentioned) can be duplicated. But that just keeps the existing logic.

In my opinion, a better way is to convert all task types into type-3 tasks. The same goes for shell tasks: a shell task (which can submit not only spark jobs but also worker-local processes such as DataX) should be submitted to an external resource manager (such as YARN), so that it becomes a type-3 task. Like the spark task, once YARN accepts the shell task, we can get the appid from the log immediately.

This PR modifies spark tasks in cluster mode and keeps the behavior of other task types unchanged. Transformations of the Spark SQL task, the Flink task, the shell task, and so on should follow in the near future.
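To illustrate the "get the appid from the log" step, here is a minimal sketch of scanning a log line for the standard YARN application id format (`application_<clusterTimestamp>_<sequence>`). The class and method names are hypothetical for illustration, not existing DolphinScheduler code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extract a YARN application id from a task log line.
public class YarnAppIdExtractor {

    // Matches ids like application_1669700000000_0001, the standard YARN format.
    private static final Pattern APP_ID_PATTERN =
            Pattern.compile("application_\\d+_\\d+");

    /**
     * Returns the first YARN application id found in the given log line,
     * or null if the line contains none (e.g. output of `echo hello`).
     */
    public static String extractAppId(String logLine) {
        Matcher m = APP_ID_PATTERN.matcher(logLine);
        return m.find() ? m.group() : null;
    }
}
```

In practice the worker would run this over each log line as it is produced; the first non-null result would be reported to the master so the task can be tracked (and recovered after a crash) by its appid rather than by the local process.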
