hzyangkai commented on PR #13030: URL: https://github.com/apache/dolphinscheduler/pull/13030#issuecomment-1329986034
Hi @ruanwenjun, thanks for your review.

If a shell task uses spark-submit to submit a job to YARN, the appid is printed in the log as soon as the job is submitted successfully, regardless of whether the process has exited. The implementation could be straightforward: in the thread that runs WorkerTaskExecuteRunnable#executeTask, first wait for YARN to accept the application and parse the appid from the log, then block until the process ends. However, some shell tasks are simple scripts like `echo hello`, or DataX tasks, which have no appid at all. In my opinion, we should consider these cases together instead of focusing only on how to get the appid.

When the master crashes and then the worker crashes, if the worker crashes without reaching the shutdown function, tasks of type 1 and 2 (not just the "submitting spark tasks in the shell task" case you mentioned) can be duplicated. But that just keeps the existing logic.

In my opinion, a better way is to convert all task types into type-3 tasks. The same goes for shell tasks: a shell task (which can submit not only spark jobs but also worker-local processes such as DataX) should be submitted to an external resource manager (such as YARN), so that it becomes a type-3 task. Like the spark task, once YARN accepts the shell task, we can get the appid from the log immediately.

This PR modifies spark tasks in cluster mode and keeps the behavior of other task types unchanged. Transformations of the Spark SQL task, the Flink task, the shell task, and so on should follow in the near future.
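To illustrate the "get the appid from the log" step, here is a minimal sketch of scanning a log line for the standard YARN application id format (`application_<clusterTimestamp>_<sequence>`). The class and method names are hypothetical for illustration, not existing DolphinScheduler code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extract a YARN application id from a task log line.
public class YarnAppIdExtractor {

    // Matches ids like application_1669700000000_0001, the standard YARN format.
    private static final Pattern APP_ID_PATTERN =
            Pattern.compile("application_\\d+_\\d+");

    /**
     * Returns the first YARN application id found in the given log line,
     * or null if the line contains none (e.g. output of `echo hello`).
     */
    public static String extractAppId(String logLine) {
        Matcher m = APP_ID_PATTERN.matcher(logLine);
        return m.find() ? m.group() : null;
    }
}
```

In practice the worker would run this over each log line as it is produced; the first non-null result would be reported to the master so the task can be tracked (and recovered after a crash) by its appid rather than by the local process.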
