morningman opened a new issue #2077: Awareness of Backend down when loading data
URL: https://github.com/apache/incubator-doris/issues/2077
 
 
   ## Motivation
   
   In the current implementation, if a BE crashes, a load job that is being 
executed may be unable to finish. The job is not cancelled automatically and 
can only wait for a timeout. This is mainly because, once the BE responsible 
for reporting the load result is down, the FE cannot obtain the result of the 
load job, and thus the job cannot be processed further.
   
   ## How to resolve
   I added a variable called `lastMissingHeartbeatTime` to the Backend object. 
This variable is updated every time a heartbeat to that Backend fails. 
   
   Also, before a load job is executed, it saves the current 
`lastMissingHeartbeatTime` of the related BE in its Coordinator. 
   
   I also modified the Coordinator's `join()` method. The entire waiting 
process (join) is divided into multiple rounds, with a maximum of 30 seconds 
per round. After each round of waiting, the Coordinator checks the 
`lastMissingHeartbeatTime` of the BE. If it is larger than the value saved 
before (which means the BE went down during the load process), the wait ends 
and an error result is returned. Otherwise, it continues to the next round of 
waiting.
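
   A minimal sketch of the round-based wait described above, in Java. The 
class names `BackendSketch` and `CoordinatorSketch`, the `JoinResult` enum, the 
`markDone()` method, and the latch used to signal completion are assumptions 
made for illustration only; they are not the actual Doris code. Only 
`lastMissingHeartbeatTime` and `join()` come from the description.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the Backend object; only the new field is modeled.
class BackendSketch {
    // -1 means no heartbeat has been missed so far.
    private final AtomicLong lastMissingHeartbeatTime = new AtomicLong(-1L);

    // Assumed to be called by the heartbeat handler when a heartbeat fails.
    void onHeartbeatFailed(long nowMs) {
        lastMissingHeartbeatTime.set(nowMs);
    }

    long getLastMissingHeartbeatTime() {
        return lastMissingHeartbeatTime.get();
    }
}

// Hypothetical stand-in for the Coordinator of one load job.
class CoordinatorSketch {
    enum JoinResult { FINISHED, BACKEND_DOWN, TIMEOUT, INTERRUPTED }

    // Snapshot of lastMissingHeartbeatTime per BE, taken before execution.
    private final Map<BackendSketch, Long> savedTimes = new HashMap<>();
    private final CountDownLatch done = new CountDownLatch(1);

    CoordinatorSketch(List<BackendSketch> backends) {
        for (BackendSketch be : backends) {
            savedTimes.put(be, be.getLastMissingHeartbeatTime());
        }
    }

    // Assumed to be called when the load result has been reported.
    void markDone() {
        done.countDown();
    }

    // Waits in rounds of at most roundMs; after each round, checks whether
    // any involved BE has missed a heartbeat since the snapshot was taken.
    JoinResult join(long timeoutMs, long roundMs) {
        long remainingMs = timeoutMs;
        while (remainingMs > 0) {
            long waitMs = Math.min(remainingMs, roundMs);
            try {
                if (done.await(waitMs, TimeUnit.MILLISECONDS)) {
                    return JoinResult.FINISHED;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return JoinResult.INTERRUPTED;
            }
            for (Map.Entry<BackendSketch, Long> entry : savedTimes.entrySet()) {
                if (entry.getKey().getLastMissingHeartbeatTime() > entry.getValue()) {
                    return JoinResult.BACKEND_DOWN; // BE went down during the load
                }
            }
            remainingMs -= waitMs;
        }
        return JoinResult.TIMEOUT;
    }
}
```

   With the 30-second round from the description, a call would look like 
`coordinator.join(jobTimeoutMs, 30_000L)`, where `jobTimeoutMs` is a placeholder 
for the job's own timeout. A heartbeat failure that happened before the 
snapshot does not end the wait, because only a value larger than the saved one 
counts.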
   
   ## What's next
   
   This modification only resolves the problem of a failed load job that 
cannot end for a long time: such a job can now be cancelled. But another 
problem is that, in the current loading framework, a BE's downtime is very 
likely to cause most load jobs to fail and have to be retried. This is not 
highly available at all!
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
