itskals commented on pull request #29082: URL: https://github.com/apache/spark/pull/29082#issuecomment-675585868
@warrenzhu25 thanks for the work. I think it would add more value if we could quantify these errors, so that people can tell whether they are really worth fixing. One very useful measure would be the "time lost", i.e. the time a task has already spent that ends up wasted. For example, say a map task ran for 1 minute to produce a partition's data, and the reducers were consuming that data. If the map output is then lost for some reason, the work done by the map task (1 minute's worth) has to be redone. If such wasted time is a significant fraction of my overall application time, then it is worth doing something to fix it; otherwise one could just ignore it. @HeartSaVioR @gengliangwang
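
To make the idea concrete, here is a minimal sketch (not part of this PR; the `WastedTimeListener` name and the counting heuristic are my own) of how one could approximate "time lost" from listener events: whenever the same (stage, partition) finishes successfully more than once, e.g. because its map output was lost and the stage was re-run, the duration of the earlier successful attempt is counted as wasted.

```scala
import scala.collection.mutable

import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Hypothetical listener that approximates "wasted time": if a (stageId, partition)
// pair completes successfully more than once, the earlier run's duration is
// treated as wasted work (an approximation; speculative duplicates would also
// be counted).
class WastedTimeListener extends SparkListener {
  // last successful duration per (stageId, partitionIndex)
  private val lastSuccess = mutable.Map.empty[(Int, Int), Long]
  private var wastedMs: Long = 0L

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = synchronized {
    if (taskEnd.taskInfo.successful) {
      val key = (taskEnd.stageId, taskEnd.taskInfo.index)
      // If this partition already succeeded before, the previous run was redone.
      lastSuccess.get(key).foreach(previousMs => wastedMs += previousMs)
      lastSuccess(key) = taskEnd.taskInfo.duration
    }
  }

  def wastedTimeMs: Long = synchronized { wastedMs }
}
```

Something like `spark.sparkContext.addSparkListener(new WastedTimeListener)` would then let an application compare the accumulated wasted time against its total runtime and decide whether the failures reported here actually matter.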
