Ngone51 commented on pull request #29331: URL: https://github.com/apache/spark/pull/29331#issuecomment-669971017
> The throughput improvement and reduced FetchFailedException is the real benefit. If we don't have HDFS, this is the only viable option.

IIUC, `FetchFailedException` is only raised when we try to fetch shuffle blocks, while `StorageLevel` only applies to RDD blocks. So I don't see how `DISK_ONLY_3` could help reduce `FetchFailedException`. And how do we get the throughput improvement by using `DISK_ONLY_3`? Higher task parallelism? Or something else?

> I think here the motivation was to try and deal with a workload with a lot of failures on the executors and avoiding a lot of recomputes more than the locality.

If that's the case, I think we should care more about shuffle data, which has only a single copy on disk. Losing shuffle data leads to a stage recompute, which is worse than the task recompute caused by losing an RDD block.

I don't object to the change here, but I just want to figure out what the real case is that it tries to improve.
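For context, a minimal sketch of what the discussed storage level would look like in user code, assuming `StorageLevel.DISK_ONLY_3` is available (it is what PR #29331 proposes; `DISK_ONLY_2` already exists in Spark):

```scala
// Sketch only: DISK_ONLY_3 assumes this PR is merged; DISK_ONLY_2 is existing API.
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object DiskOnly3Example {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("disk-only-3").getOrCreate()
    val rdd = spark.sparkContext.parallelize(1 to 1000000).map(x => (x % 100, x))

    // Each cached block is written to local disk and replicated to three
    // executors, so losing up to two of them does not force a task recompute
    // of the cached RDD.
    rdd.persist(StorageLevel.DISK_ONLY_3)
    rdd.count()

    // Note the point raised in the comment: this replication covers only RDD
    // blocks; shuffle map output still has a single copy, and losing it
    // triggers a stage recompute, which replication of RDD blocks cannot avoid.
    spark.stop()
  }
}
```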
