[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-08-13 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-520799822 Thank you @squito

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-08-09 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-519802295 @squito Yeah, it saves us a lot: in a TPC-DS 1T benchmark, 30% of queries get a 1.1x+ performance boost, and 13% get 1.2x+

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-08-07 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-519354047 @squito Index and data files are both stored on DFS; the difference is that data files are directly read from DFS,

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-07-23 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-514078853 @squito I ran into a case that cannot be handled without this PR: - On the map side, all shuffle files are

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-07-12 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-510766013 @yifeih Thank you, I understand now. But can your way (making `MapStatus` able to contain an empty location in order
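A minimal sketch of the "empty location" idea mentioned in the comment above, using simplified stand-in types rather than Spark's real `MapStatus`/`BlockManagerId`: the block-manager location becomes optional, so map output that lives on DFS is not tied to any particular executor.

```scala
// Illustrative types only -- not Spark's actual MapStatus / BlockManagerId.
case class BlockManagerId(executorId: String, host: String, port: Int)

case class MapStatusSketch(
  location: Option[BlockManagerId], // None => output is not on any executor's local disk
  blockSizes: Array[Long]
)

object MapStatusSketchDemo {
  // An executor loss only invalidates statuses whose location points at that executor.
  def invalidatedBy(status: MapStatusSketch, lostExecutorId: String): Boolean =
    status.location.exists(_.executorId == lostExecutorId)

  def main(args: Array[String]): Unit = {
    val dfsBacked = MapStatusSketch(location = None, blockSizes = Array(128L, 256L))
    println(invalidatedBy(dfsBacked, "exec-1")) // false: nothing to resubmit
  }
}
```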

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-07-09 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-509537346 @yifeih It's very interesting, but I didn't 100% get it. The new `ShuffleIO` API will not influence the existing

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-07-08 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-509251934 @squito I agree with you, but I still want to make sure I understand it right: the function

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-04-30 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-487843910 @bsidhom I agree that it would be ideal if there's a field in `ShuffleManager` indicating 'whether it can serve
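A rough sketch of the field discussed above; the trait and method names (`DfsAwareShuffleManager`, `shuffleDataSurvivesExecutorLoss`) are hypothetical and are not Spark's actual `ShuffleManager` API.

```scala
// Hypothetical capability flag: can the shuffle implementation still serve its
// data after the writing executor is lost?
trait DfsAwareShuffleManager {
  def shuffleDataSurvivesExecutorLoss: Boolean
}

object LocalDiskShuffle extends DfsAwareShuffleManager {
  override def shuffleDataSurvivesExecutorLoss: Boolean = false // files die with the executor
}

object RemoteDfsShuffle extends DfsAwareShuffleManager {
  override def shuffleDataSurvivesExecutorLoss: Boolean = true // files live on DFS
}

object ResubmitDecision {
  // The scheduler would only resubmit map tasks when the data can no longer be served.
  def shouldResubmit(manager: DfsAwareShuffleManager): Boolean =
    !manager.shuffleDataSurvivesExecutorLoss
}
```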

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-04-29 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-487484432 @liupc Thanks for the explanation.

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-04-27 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-487336835 @liupc Under remote shuffle, if certain executors are lost, we can still fetch the shuffle data from remote
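A hedged sketch of the point above: with a DFS-backed shuffle, a reduce task can resolve blocks by a stable path keyed on shuffle/map/reduce ids rather than by the writing executor, so losing an executor does not make the data unreachable. The path layout and names here are made up for illustration.

```scala
import java.nio.file.{Files, Paths}

object RemoteShuffleFetchSketch {
  // No executor id appears in the path -- only logical ids -- so the block is
  // still addressable after any executor is lost.
  def blockPath(root: String, shuffleId: Int, mapId: Int, reduceId: Int): String =
    s"$root/shuffle_${shuffleId}_${mapId}_${reduceId}.data"

  def fetchBlock(root: String, shuffleId: Int, mapId: Int, reduceId: Int): Array[Byte] =
    Files.readAllBytes(Paths.get(blockPath(root, shuffleId, mapId, reduceId)))
}
```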

[GitHub] [spark] gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost

2019-04-25 Thread GitBox
gczsjdy commented on issue #24462: [SPARK-26268][CORE] Do not resubmit tasks when executors are lost URL: https://github.com/apache/spark/pull/24462#issuecomment-486900975 @liupc Since the shuffle manager is pluggable in Spark, this 'resubmit switch' in the scheduler should also be configurable.
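A sketch of what such a switch could look like. The key `spark.shuffle.resubmitOnExecutorLoss` is invented for illustration and is not a real Spark configuration property.

```scala
object ResubmitSwitchSketch {
  // Default to the current behaviour (resubmit) unless a shuffle plugin opts out.
  def resubmitOnExecutorLoss(conf: Map[String, String]): Boolean =
    conf.getOrElse("spark.shuffle.resubmitOnExecutorLoss", "true").toBoolean

  def main(args: Array[String]): Unit = {
    // With a DFS-backed shuffle plugin the switch can be turned off, so the
    // scheduler keeps the existing map output instead of rerunning map tasks.
    val remoteShuffleConf = Map("spark.shuffle.resubmitOnExecutorLoss" -> "false")
    println(resubmitOnExecutorLoss(remoteShuffleConf)) // false
  }
}
```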