[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2022-07-14 Thread Dongjoon Hyun (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566945#comment-17566945
 ] 

Dongjoon Hyun commented on SPARK-25250:
---

To sync with the comment and commit history, I removed 2.4.1 and 2.3.4 from the 
Fix Version of this JIRA.

Please note that SPARK-27590 is also fixed at Spark 3.0.0+.

 

!Screen Shot 2022-07-14 at 11.24.58 AM.png!

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 3.0.0
>
> Attachments: Screen Shot 2022-07-14 at 11.24.58 AM.png
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-12-16 Thread Wenchen Fan (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16997306#comment-16997306
 ] 

Wenchen Fan commented on SPARK-25250:
-

It's https://issues.apache.org/jira/browse/SPARK-27474

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.3.4, 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-12-11 Thread Mairtin Deady (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16993759#comment-16993759
 ] 

Mairtin Deady commented on SPARK-25250:
---

[~cloud_fan] Has another ticket been opened for this? If so can you share link 
please? I am interested to know the plans for releasing fix for this bug.

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.3.4, 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-04-16 Thread Wenchen Fan (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818707#comment-16818707
 ] 

Wenchen Fan commented on SPARK-25250:
-

good points! I'm creating another ticket for the issue, and leave this as 
resolved.

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.3.4, 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-04-15 Thread Imran Rashid (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818318#comment-16818318
 ] 

Imran Rashid commented on SPARK-25250:
--

I agree about opening a new jira.

Wenchen discussed reverting it here: https://github.com/apache/spark/pull/24359
I agree we made a major mistake in that fix. I don't care too much about how 
the commits look in git, I am fine with having a revert followed by a different 
fix, rather than rolling it into one change.



> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.3.4, 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-04-15 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16818053#comment-16818053
 ] 

Thomas Graves commented on SPARK-25250:
---

[~cloud_fan] can you please add details as to where and why this was reverted?

 

This went into multiple branches 2.4.1 has already been released so I don't 
necessarily agree with a revert here.  I would prefer to see another bug since 
its been released already.

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.3.4, 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-03-07 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786825#comment-16786825
 ] 

Apache Spark commented on SPARK-25250:
--

User 'Ngone51' has created a pull request for this issue:
https://github.com/apache/spark/pull/24007

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-03-07 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786823#comment-16786823
 ] 

Apache Spark commented on SPARK-25250:
--

User 'Ngone51' has created a pull request for this issue:
https://github.com/apache/spark/pull/24007

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-03-07 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786819#comment-16786819
 ] 

Apache Spark commented on SPARK-25250:
--

User 'Ngone51' has created a pull request for this issue:
https://github.com/apache/spark/pull/24006

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-03-07 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786817#comment-16786817
 ] 

Apache Spark commented on SPARK-25250:
--

User 'Ngone51' has created a pull request for this issue:
https://github.com/apache/spark/pull/24006

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Assignee: Parth Gandhi
>Priority: Major
> Fix For: 2.4.1, 3.0.0
>
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-02-22 Thread Parth Gandhi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775188#comment-16775188
 ] 

Parth Gandhi commented on SPARK-25250:
--

[~Ngone51] I understand that you had a proposal and we were actively discussing 
on various solutions in the PR #22806 , but however, I have been working on 
that PR tirelessly for a few months and we still have an ongoing discussion 
going on there. Any specific reasons as to why did you create your own PR for 
the same issue? WDYT [~irashid] [~cloud_fan] ?

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Priority: Major
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2019-02-22 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774957#comment-16774957
 ] 

Apache Spark commented on SPARK-25250:
--

User 'Ngone51' has created a pull request for this issue:
https://github.com/apache/spark/pull/23871

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Priority: Major
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2018-10-23 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660746#comment-16660746
 ] 

Apache Spark commented on SPARK-25250:
--

User 'pgandhi999' has created a pull request for this issue:
https://github.com/apache/spark/pull/22806

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Priority: Major
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2018-10-23 Thread Apache Spark (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16660744#comment-16660744
 ] 

Apache Spark commented on SPARK-25250:
--

User 'pgandhi999' has created a pull request for this issue:
https://github.com/apache/spark/pull/22806

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Priority: Major
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2018-09-23 Thread Parth Gandhi (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16625283#comment-16625283
 ] 

Parth Gandhi commented on SPARK-25250:
--

Currently working on this issue, will file a pull request soon. Thank you.

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Priority: Major
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-25250) Race condition with tasks running when new attempt for same stage is created leads to other task in the next attempt running on the same partition id retry multiple ti

2018-08-27 Thread Thomas Graves (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-25250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594281#comment-16594281
 ] 

Thomas Graves commented on SPARK-25250:
---

We are hitting a race condition here between the taskSetManager and the 
DAGScheduler.  

There is code in the TaskSetManager that is supposed to mark all task attempts 
for a partition completed when any of them succeed, but in this case the second 
attempt has been finished. There is also code that only starts tasks in the 
second attempt that have not yet finished, but again there is a race here 
between when the taskSetManager sends the message that the task has ended and 
when it starts a new stage attempt.

 In the example given above stage 4.0 has fetch failed, it reran the map stage, 
task 9000 for partition 9000 in stage 4.0 finishes and sends a taskEnded 
messages to DAGSCheduler, before the DAGScheduler processed that task finished, 
it calculates the tasks needed for stage 4.1 which included the task for 
partition 9000, so it runs a task for partition 9000 but it always just fails 
with commitDenied and continues to rerun that task.

when task 9000 for stage 4.0 finished the taskSetManager calls into 
sched.markPartitionCompletedInAllTaskSets but since the 4.1 stage attempt 
hadn't been created yet it didn't mark the task 2000 for that partition as 
completed and since the DAGScheduler hadn't processed the taskEnd event for it, 
when it started stage 4.1 it started a task when it didn't need to.  We need to 
figure out how to handle the race.

> Race condition with tasks running when new attempt for same stage is created 
> leads to other task in the next attempt running on the same partition id 
> retry multiple times
> --
>
> Key: SPARK-25250
> URL: https://issues.apache.org/jira/browse/SPARK-25250
> Project: Spark
>  Issue Type: Bug
>  Components: Scheduler, Spark Core
>Affects Versions: 2.3.1
>Reporter: Parth Gandhi
>Priority: Major
>
> We recently had a scenario where a race condition occurred when a task from 
> previous stage attempt just finished before new attempt for the same stage 
> was created due to fetch failure, so the new task created in the second 
> attempt on the same partition id was retrying multiple times due to 
> TaskCommitDenied Exception without realizing that the task in earlier attempt 
> was already successful.  
> For example, consider a task with partition id 9000 and index 9000 running in 
> stage 4.0. We see a fetch failure so thus, we spawn a new stage attempt 4.1. 
> Just within this timespan, the above task completes successfully, thus, 
> marking the partition id 9000 as complete for 4.0. However, as stage 4.1 has 
> not yet been created, the taskset info for that stage is not available to the 
> TaskScheduler so, naturally, the partition id 9000 has not been marked 
> completed for 4.1. Stage 4.1 now spawns task with index 2000 on the same 
> partition id 9000. This task fails due to CommitDeniedException and since, it 
> does not see the corresponding partition id as been marked successful, it 
> keeps retrying multiple times until the job finally succeeds. It doesn't 
> cause any job failures because the DAG scheduler is tracking the partitions 
> separate from the task set managers.
>  
> Steps to Reproduce:
>  # Run any large job involving shuffle operation.
>  # When the ShuffleMap stage finishes and the ResultStage begins running, 
> cause this stage to throw a fetch failure exception(Try deleting certain 
> shuffle files on any host).
>  # Observe the task attempt numbers for the next stage attempt. Please note 
> that this issue is an intermittent one, so it might not happen all the time.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org