RexXiong commented on PR #3531:
URL: https://github.com/apache/celeborn/pull/3531#issuecomment-3532885543

   I don't think this change is quite right. For example, suppose attempts 0 and
1 have already failed, attempt 2 is still running, and attempt 3 reports a
failure. With the modified logic this returns true, but it should return false
because attempt 2 is still running. Instead, we should check how many attempts
have already failed: if the number of failed attempts has reached maxTaskFails,
it should return true.
   
   In the scenario you provided, attempt 4 is running when attempt 3 reports,
but attempts 0, 1, and 2 have already failed. That is three failures, and this
report makes four. At this point we shouldn't ignore it; we should determine
that maxTaskFails has been reached and report a fetch failure.
@leixm @turboFei
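
   A rough sketch of the counting rule described above, to make the two
scenarios concrete. This is not the actual Celeborn code: the class, enum, and
method names are hypothetical, and maxTaskFails = 4 is an assumption (it is the
default of spark.task.maxFailures).

```java
import java.util.List;

/** Illustrative only: every name here is hypothetical, not Celeborn's API. */
final class AttemptFailureSketch {

    /** Simplified attempt states, assumed for this sketch. */
    enum State { FAILED, RUNNING }

    /**
     * @param otherAttempts states of the task's other attempts, excluding the
     *                      attempt that is reporting the failure
     * @param maxTaskFails  failure limit for the task (assumed to be 4 below)
     * @return true once the failed attempts, including the reporting one,
     *         have reached maxTaskFails
     */
    static boolean shouldReportFetchFailure(List<State> otherAttempts, int maxTaskFails) {
        int failed = 1; // the reporting attempt counts as a failure itself
        for (State s : otherAttempts) {
            if (s == State.FAILED) {
                failed++;
            }
        }
        // Running attempts are not counted; only the failure total matters.
        return failed >= maxTaskFails;
    }
}
```

   With maxTaskFails assumed to be 4, the first scenario (other attempts:
failed, failed, running) counts three failures and returns false. The second
scenario (other attempts: failed, failed, failed, running) counts four and
returns true, even though one attempt is still running.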

