yl09099 commented on PR #1137: URL: https://github.com/apache/incubator-uniffle/pull/1137#issuecomment-1683317647
> > I did a quick overview of this diagram, seems like that it only handles shuffle write failures, how about the shuffle read failure? In which case, some of the shuffle servers are done or unable to serving shuffle data, the shuffle read client would report `FetchFailedException` to trigger a parent stage recompute. Would you mind to elaborate a bit more on how that would be handled?
> > And by the way, I don't think it's a good idea to throw an `FetchFailedException` when writing to uniffle server has been failed for multiple times.
> > Also kindly remind of these two high level questions.

1. Read failure: someone submitted a relevant PR earlier (#787). One case is indeed not implemented yet: when a read fails, there is no way to notify the upstream to rewrite the data. I would like to implement this later.

2. Throwing `FetchFailedException` is a way to handle a read failure without intruding into Spark code. I can't think of a better way. Do you have any suggestions?
