[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread scwf
Github user scwf commented on the issue: https://github.com/apache/spark/pull/15213 > actual problem is not in abortStage but rather in improper additions to failedStages Correct, I think a more accurate description for this issue is "do not add `failedStages` when abortStage

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread markhamstra
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15213 Right, but `abortStage` occurs elsewhere. "When abort stage" seems to imply that this fix is necessary for all usages of `abortStage` when the actual problem is not in `abortStage` but rather i

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread scwf
Github user scwf commented on the issue: https://github.com/apache/spark/pull/15213 Actually, `failedStages` is only added to here in Spark. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this fe

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread markhamstra
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15213 @scwf That description would actually be at least as bad since there are multiple routes to `abortStage` and this issue of adding to `failedStages` only applies to these two. I'll take another

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread markhamstra
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15213 Ok, that makes better sense. The `disallowStageRetryForTest` case doesn't worry me too much since it is only used in tests. If we can fix this case, great; else if it remains possible

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread scwf
Github user scwf commented on the issue: https://github.com/apache/spark/pull/15213 Thanks @zsxwing for explaining this. @markhamstra the issue happens in the case described in my PR description. It usually depends on multi-threaded job submission and the order of fetch failures, so I said

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15213 @markhamstra I agree this is not a race condition since there is only a single thread. The issue is that the code doesn't handle the following two corner cases: - `failedStage.failed

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread markhamstra
Github user markhamstra commented on the issue: https://github.com/apache/spark/pull/15213 This doesn't make sense to me. The DAGSchedulerEventProcessLoop runs on a single thread and processes a single event from its queue at a time. When the first CompletionEvent is run as a
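
The single-threaded processing described in this comment can be illustrated with a minimal sketch. This is not Spark's Scala implementation: the `EventLoop` class, the `record` handler, and the event tuples are hypothetical names chosen for illustration. It only shows the general pattern of one worker thread draining a queue one event at a time, so that two handlers never run concurrently.

```python
import queue
import threading


class EventLoop:
    """Hypothetical single-threaded event loop (illustrative, not Spark's API)."""

    def __init__(self, handler):
        self._queue = queue.Queue()
        self._handler = handler
        # One dedicated thread drains the queue, so handlers never overlap.
        self._thread = threading.Thread(
            target=self._run, name="event-loop", daemon=True
        )

    def start(self):
        self._thread.start()

    def post(self, event):
        # Producers on any thread only enqueue; they never run the handler.
        self._queue.put(event)

    def stop(self):
        # A None sentinel tells the loop to exit after earlier events are handled.
        self._queue.put(None)
        self._thread.join()

    def _run(self):
        while True:
            event = self._queue.get()
            if event is None:
                break
            # Exactly one event is processed at a time, in queue order.
            self._handler(event)


handled = []


def record(event):
    # Record which thread handled the event, and the event itself.
    handled.append((threading.current_thread().name, event))


loop = EventLoop(record)
loop.start()
# Assumed event shapes, for illustration only.
events = [("completion", 1), ("completion", 2), ("abort", 1)]
for ev in events:
    loop.post(ev)
loop.stop()
```

Because every event is handled on the one loop thread, any inconsistency in shared state under this pattern comes from the order in which events arrive and how each handler reacts, not from concurrent access.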

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65853 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65853/consoleFull)** for PR 15213 at commit [`1127ca1`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65829/ Test PASSed.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Merged build finished. Test PASSed.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65829 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65829/consoleFull)** for PR 15213 at commit [`d92adfc`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/15213 /cc @kayousterhout @markhamstra for review of scheduler changes.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65829 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65829/consoleFull)** for PR 15213 at commit [`d92adfc`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Merged build finished. Test PASSed.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65823/ Test PASSed.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65823 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65823/consoleFull)** for PR 15213 at commit [`1f7bd88`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Merged build finished. Test PASSed.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65822/ Test PASSed.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65822 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65822/consoleFull)** for PR 15213 at commit [`7056cd6`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65823 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65823/consoleFull)** for PR 15213 at commit [`1f7bd88`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65822 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65822/consoleFull)** for PR 15213 at commit [`7056cd6`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65818 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65818/consoleFull)** for PR 15213 at commit [`d02cf93`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65818/ Test FAILed.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15213 Merged build finished. Test FAILed.

[GitHub] spark issue #15213: [SPARK-17644] [CORE] Fix the race condition when DAGSche...

2016-09-23 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15213 **[Test build #65818 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65818/consoleFull)** for PR 15213 at commit [`d02cf93`](https://github.com/apache/spark/commit/d