Technoboy- commented on issue #1658: Refactor WorkerServer
URL: 
https://github.com/apache/incubator-dolphinscheduler/issues/1658#issuecomment-570015840
 
 
   > Regarding the third point, failover, my thinking is this. Before the 
MasterServer assigns a task to a WorkerServer, it first inserts the task 
into the DB to generate an id, then sends the task id to the WorkerServer 
for execution.
   > 
   > Suppose the WorkerServer dies or the network fails after the 
WorkerServer receives the task. If the MasterServer does not receive a 
task execution heartbeat from the WorkerServer within a certain period of 
time, it treats the task execution as failed and sets the task status in 
the DB to failed.
   > 
   > After that, if the network recovers and the MasterServer receives a 
heartbeat for a task that was already marked as failed, it immediately 
sends a task termination command to the WorkerServer.
   > 
   > If the user has set a number of retries, the MasterServer retries the 
task; if not, an alert is sent.
   
   Yes, very good. We should do it the way you described.
   A task passes through many different statuses in the scheduling system, 
so we have to insert it into the DB before scheduling.
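
To make the flow above concrete, here is a rough sketch of the master-side bookkeeping. It is only an illustration under assumptions, not DolphinScheduler code: the class `FailoverSketch`, its methods, the in-memory map standing in for the DB, and the choice to give each retry a new task id are all hypothetical. Commands and alerts are recorded as strings instead of being sent over the network.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the failover flow; no name here is a real DolphinScheduler API.
class FailoverSketch {
    enum Status { RUNNING, FAILED }

    // Stand-in for one task row in the DB.
    static final class TaskRecord {
        final int id;
        final String name;
        final int retriesLeft;
        Status status = Status.RUNNING;
        long lastHeartbeatMillis;

        TaskRecord(int id, String name, int retriesLeft, long now) {
            this.id = id;
            this.name = name;
            this.retriesLeft = retriesLeft;
            this.lastHeartbeatMillis = now;
        }
    }

    private final Map<Integer, TaskRecord> db = new HashMap<>(); // assumption: in-memory "DB"
    private final long heartbeatTimeoutMillis;
    private int nextId = 1;

    // Commands and alerts the master would send, recorded for inspection.
    final List<String> events = new ArrayList<>();

    FailoverSketch(long heartbeatTimeoutMillis) {
        this.heartbeatTimeoutMillis = heartbeatTimeoutMillis;
    }

    // Insert the task first to obtain an id, then dispatch only that id to a worker.
    int submit(String name, int retries, long now) {
        int id = nextId++;
        db.put(id, new TaskRecord(id, name, retries, now));
        events.add("DISPATCH " + id);
        return id;
    }

    // Mark tasks with an overdue heartbeat as failed, then retry or alert.
    void checkTimeouts(long now) {
        // Iterate over a copy because a retry inserts a new record into the map.
        for (TaskRecord t : new ArrayList<>(db.values())) {
            boolean overdue = now - t.lastHeartbeatMillis > heartbeatTimeoutMillis;
            if (t.status == Status.RUNNING && overdue) {
                t.status = Status.FAILED;
                if (t.retriesLeft > 0) {
                    // Assumption: a retry is a fresh task record with one retry fewer.
                    submit(t.name, t.retriesLeft - 1, now);
                } else {
                    events.add("ALERT " + t.id);
                }
            }
        }
    }

    // A heartbeat for a task already marked failed triggers a termination command.
    void onHeartbeat(int id, long now) {
        TaskRecord t = db.get(id);
        if (t == null) {
            return; // unknown task id, ignored in this sketch
        }
        if (t.status == Status.FAILED) {
            events.add("KILL " + id);
        } else {
            t.lastHeartbeatMillis = now;
        }
    }

    Status statusOf(int id) {
        return db.get(id).status;
    }

    public static void main(String[] args) {
        FailoverSketch master = new FailoverSketch(1000L);
        int id = master.submit("demo", 0, 0L);
        master.checkTimeouts(5000L);   // no heartbeat arrived in time
        master.onHeartbeat(id, 6000L); // late heartbeat after the network recovers
        System.out.println(master.events);
    }
}
```

Timestamps are passed in explicitly so the timeout logic can be exercised without real clocks or sleeps. A real implementation would also need to persist state changes transactionally and handle concurrent heartbeats, which this sketch leaves out.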

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
