hzyangkai commented on issue #12968: URL: https://github.com/apache/dolphinscheduler/issues/12968#issuecomment-1326996778
Hi @Radeity, thanks for your review.

The separation of submitApplication and monitorApplication relies on the submit script of the computing engine. Engines such as Flink or Spark print the appid to the log right after submitting a task to YARN, and they also provide parameters that control whether the submitting process exits immediately or blocks until the task finishes. For example, in Spark we can use "spark-submit --master yarn --deploy-mode cluster --conf spark.yarn.submit.waitAppCompletion=false". After the task is submitted to YARN, the submitting process prints the appid to the log and exits without waiting for the task to end. DS can then start a monitorApplication thread (or reuse the task-execute-thread) that polls the status of the app in YARN by appid.

For tasks submitted in Spark client mode, such as "spark-submit --master yarn --deploy-mode client" or the "spark-sql --master yarn" script, we cannot separate the submission process from the monitoring process, because the end of the client process means the end of the entire application. A wrapper is required to submit the SQL task in cluster mode.

For tasks submitted using beeline, taking Spark as an example, the task is usually submitted to a thrift server, which then runs the job in a shared YARN application, so the appid is usually shared by many jobs. Here we need to focus on how to submit the job in detached mode and get the jobid. The thrift server should probably report the jobid to clients such as beeline, which can then query the job status by jobid. As far as I know, beeline should be able to submit SQL in detached mode, but it may not print the jobid; this needs more thought in the future.
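
To make the submit/monitor split above concrete, here is a rough sketch in Python. It is not DolphinScheduler code. The function names, the `rm_url` default, and the polling parameters are all hypothetical. It assumes the appid in the submit log has the usual `application_<timestamp>_<sequence>` form, and that the YARN ResourceManager REST endpoint `/ws/v1/cluster/apps/{appid}` is reachable.

```python
import json
import re
import time
import urllib.request
from typing import Callable, Optional

# YARN application ids have the form application_<clusterTimestamp>_<sequence>.
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")

# States after which a YARN application will not change any more.
TERMINAL_STATES = {"FINISHED", "FAILED", "KILLED"}


def extract_app_id(submit_log: str) -> Optional[str]:
    """Return the first YARN application id found in the submit log, or None."""
    match = APP_ID_PATTERN.search(submit_log)
    return match.group(0) if match else None


def fetch_state_from_rm(app_id: str, rm_url: str = "http://localhost:8088") -> str:
    """Ask the ResourceManager REST API for the app state (rm_url is an assumption)."""
    url = f"{rm_url}/ws/v1/cluster/apps/{app_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["app"]["state"]


def monitor_application(
    app_id: str,
    fetch_state: Callable[[str], str] = fetch_state_from_rm,
    poll_interval_s: float = 5.0,
    max_polls: int = 1000,
) -> str:
    """Poll the app state until it is terminal or max_polls is exhausted."""
    state = "UNKNOWN"
    for attempt in range(max_polls):
        state = fetch_state(app_id)
        if state in TERMINAL_STATES:
            return state
        if attempt < max_polls - 1:
            time.sleep(poll_interval_s)
    # Not terminal yet: the caller decides whether to keep waiting or give up.
    return state
```

With this shape, the submit step only has to run the script in non-blocking mode and pass its log to `extract_app_id`. The monitor step can run in a separate thread, or be resumed later from a stored appid, by calling `monitor_application(app_id)`.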
