[GitHub] [dolphinscheduler] hzyangkai commented on pull request #13030: [Feature-12968][Master]improvement failover process

GitBox Thu, 15 Dec 2022 19:43:50 -0800


hzyangkai commented on PR #13030:
URL: 
https://github.com/apache/dolphinscheduler/pull/13030#issuecomment-1354155901


   Adding two abstraction methods to the class AbstractTask:
   1. AbstractTask#oneAppIdPerTask: confirms that the task generates only one 
appid. This method affects fault tolerance.
     1. If the task subclass implements oneAppIdPerTask=true, it can collect 
the appid and report it when the task starts. Fault tolerance is then performed 
based on that appid. By default, AbstractYarnTask#oneAppIdPerTask=true. 
The original FlinkStreamTask implementation is not good enough because it 
confuses the appid with the jobid. Therefore FlinkStreamTask#oneAppIdPerTask=false 
for now, and the implementation of FlinkStreamTask should be changed later so 
that it can set oneAppIdPerTask=true.
     2. If the task subclass does not implement oneAppIdPerTask, the default 
setting oneAppIdPerTask=false is used, and appids are not collected when the 
task starts. When the worker crashes, the task is killed remotely over ssh 
(kill -9 processId) and then a new task is started.
   2. AbstractTask#exitAfterSubmitTask: the submitting process exits 
immediately after the task is submitted. This method is used to optimize the 
submission method and is optional. The default value is false. Currently, only 
the Spark cluster mode task sets it to true.
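
   To make the defaults above easier to follow, here is a minimal sketch of how 
the two flags could be laid out. It is not the code in this PR: the Sketch* 
class names are hypothetical, the boolean-returning signatures are assumed, and 
the inheritance between the Yarn, Flink stream and Spark sketches is also an 
assumption made only for illustration.

```java
// Hypothetical sketch only; these are NOT the real DolphinScheduler classes.
// Assumption: both flags are exposed as boolean methods that default to false.
abstract class SketchTask {
    // true when the task is known to produce exactly one appid
    boolean oneAppIdPerTask() {
        return false;
    }

    // true when the submitting process may exit right after submission
    boolean exitAfterSubmitTask() {
        return false;
    }
}

// Yarn-style tasks default to one appid per task.
abstract class SketchYarnTask extends SketchTask {
    @Override
    boolean oneAppIdPerTask() {
        return true;
    }
}

// Flink stream currently mixes up appid and jobid, so it opts out for now.
class SketchFlinkStreamTask extends SketchYarnTask {
    @Override
    boolean oneAppIdPerTask() {
        return false;
    }
}

// Spark in cluster mode is the only task that exits after submitting.
class SketchSparkClusterTask extends SketchYarnTask {
    @Override
    boolean exitAfterSubmitTask() {
        return true;
    }
}

// A task that overrides nothing keeps both defaults (false, false).
class SketchShellTask extends SketchTask {
}
```

   With this layout, a subclass only overrides a flag when it differs from its 
parent, so switching FlinkStreamTask to oneAppIdPerTask=true later would be a 
one-method change.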
   
   Test:
   Master crashes:
   1. When the master crashes and then restarts, tasks of all types rebuild 
their channel to the worker and keep running.
   Worker crashes:
   1. When the worker crashes, a task implementing oneAppIdPerTask=true keeps 
running. Otherwise, it is killed and restarted.
   Master & Worker crash:
   1. When both the master and the worker crash, a task implementing 
oneAppIdPerTask=true keeps running. Otherwise, it is killed and restarted.
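
   The three test scenarios above reduce to a small decision table, sketched 
below. The enum and method names are hypothetical and do not come from this PR; 
the sketch only encodes the outcomes listed in the tests, not the real failover 
code path.

```java
// Hypothetical names; this only mirrors the test outcomes listed above.
enum FailoverAction {
    REBUILD_CHANNEL,   // master-only crash: reconnect to the worker, task keeps running
    KEEP_RUNNING,      // worker crashed, but the task can be tracked by its appid
    KILL_AND_RESTART   // worker crashed and no appid was collected
}

final class FailoverSketch {
    private FailoverSketch() {
    }

    // workerCrashed: whether the worker that ran the task went down
    // oneAppIdPerTask: the flag reported by the task subclass
    static FailoverAction decide(boolean workerCrashed, boolean oneAppIdPerTask) {
        if (!workerCrashed) {
            // Only the master restarted; every task type just rebuilds its channel.
            return FailoverAction.REBUILD_CHANNEL;
        }
        // Worker crash, with or without a master crash, depends on the appid flag.
        return oneAppIdPerTask
                ? FailoverAction.KEEP_RUNNING
                : FailoverAction.KILL_AND_RESTART;
    }
}
```

   For example, FailoverSketch.decide(true, false) yields KILL_AND_RESTART, 
which matches the "killed and restarted" case for tasks without a collected 
appid.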
   
   


