onebox-li commented on PR #2921:
URL: https://github.com/apache/celeborn/pull/2921#issuecomment-2595105756

   Thanks @turboFei for this work.
   We recently hit an intermittent problem. With loose speculation 
conditions, a speculative task was waiting for the `updateFileGroup` 
result when another attempt succeeded, so the task was interrupted. 
Because loading the file group failed, a fetch failure was reported to 
LifecycleManager. 
   Other newly started tasks then got the error below when they called 
`getShuffleId`:
   ```
   unexpected! there is no finished map stage associated with appShuffleId xx
   ```
   This may cause the whole job to fail. I think this PR is very helpful for 
solving this situation.

