Ngone51 commented on pull request #33451: URL: https://github.com/apache/spark/pull/33451#issuecomment-885707103
@otterc

> Though it avoids re-fetch of a corrupted block for which the cause of corruption is disk_issue, the act of finding the cause of corruption, which is by sending another message to the server, is as high as just retrying the corrupt block.

The main motivation behind the shuffle checksum project is to report the cause of data corruption to users/developers so they can debug the underlying root cause further. It isn't really intended as a performance improvement. Also note that diagnosis only happens on a corruption error, which is a corner case, so it won't have a significant impact on performance.

> I feel that this broad classification of corruption may not be that helpful to the user

These are the only causes we can identify under the current solution, and I think they are actually helpful. Without this change, people can only guess at the cause: even if we all suspect that disk issues are the most likely cause, no one can say so for sure.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
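For readers following the thread: the diagnosis message described above boils down to comparing three checksums, namely the one recorded at shuffle-write time, the one the reader computes over the fetched (corrupt) bytes, and one the server recomputes from the block as it currently sits on disk. The sketch below illustrates that classification idea in Java; all names (`CorruptionDiagnosis`, `diagnose`, the `Cause` values) are hypothetical and do not reflect Spark's actual API, and CRC32 is assumed as the checksum algorithm.

```java
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical sketch of checksum-based corruption diagnosis; not Spark's real code.
public class CorruptionDiagnosis {
    enum Cause { DISK_ISSUE, NETWORK_ISSUE, CHECKSUM_VERIFY_PASS }

    // checksumStored: checksum recorded when the shuffle block was written
    // checksumClient: checksum the reader computed over the fetched bytes
    // serverData:     block bytes as re-read from the server's disk during diagnosis
    static Cause diagnose(long checksumStored, long checksumClient, byte[] serverData) {
        long checksumServer = crc32(serverData);
        if (checksumServer != checksumStored) {
            // Data on disk no longer matches what was originally written.
            return Cause.DISK_ISSUE;
        } else if (checksumClient != checksumStored) {
            // Disk copy is intact, so the bytes must have changed in transit.
            return Cause.NETWORK_ISSUE;
        } else {
            // All checksums agree; the corruption arose elsewhere (e.g. after fetch).
            return Cause.CHECKSUM_VERIFY_PASS;
        }
    }

    static long crc32(byte[] data) {
        Checksum c = new CRC32();
        c.update(data, 0, data.length);
        return c.getValue();
    }
}
```

The key point the comment makes still holds in this sketch: the extra round trip only happens on the already-rare corruption path, and its payoff is a definite cause rather than a guess.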
