[ https://issues.apache.org/jira/browse/SPARK-17894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eren Avsarogullari updated SPARK-17894:
---------------------------------------
    Description: 
Each TaskSetManager should have a unique name so that duplicates are not added 
to *Pool* via *SchedulableBuilder*. This problem surfaced with 
https://issues.apache.org/jira/browse/SPARK-17759; see the discussion at 
https://github.com/apache/spark/pull/15326

There is a one-to-one relationship between a stage attempt and its 
TaskSetManager, so taskSet.id, which covers both stageId and stageAttemptId, 
looks suitable for naming the TaskSetManager as well. 

*Current TaskSetManager Name* : 
{code:java} var name = "TaskSet_" + taskSet.stageId.toString{code}
*Sample*: TaskSet_0

*Proposed TaskSetManager Name* : 
{code:java} var name = "TaskSet_" + taskSet.id // stageId + "." + stageAttemptId
{code}
*Sample* : TaskSet_0.0
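
To make the difference between the two naming schemes concrete, here is a minimal sketch in Scala. It is only an illustration: SimpleTaskSet, SimplePool and NamingDemo are hypothetical stand-ins, not Spark's actual classes, and the rule that the pool rejects a name already in use is an assumption made for the demo. It shows that two attempts of the same stage share a name under the current scheme and get distinct names under the proposed one.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a task set; only the fields needed for naming.
final case class SimpleTaskSet(stageId: Int, stageAttemptId: Int) {
  // Composite id covering both the stage and the attempt, e.g. "0.1".
  val id: String = s"$stageId.$stageAttemptId"
}

// Hypothetical pool that indexes its children by name (an assumption for
// this demo, not Spark's Pool implementation).
final class SimplePool {
  private val byName = mutable.LinkedHashMap.empty[String, SimpleTaskSet]

  // Returns false when the name is already taken, i.e. a duplicate.
  def add(name: String, ts: SimpleTaskSet): Boolean =
    if (byName.contains(name)) false
    else { byName.update(name, ts); true }

  def names: List[String] = byName.keys.toList
}

object NamingDemo {
  // Current scheme: the name depends on the stage id only.
  def currentName(ts: SimpleTaskSet): String = "TaskSet_" + ts.stageId.toString

  // Proposed scheme: the name depends on stage id and stage attempt id.
  def proposedName(ts: SimpleTaskSet): String = "TaskSet_" + ts.id

  def main(args: Array[String]): Unit = {
    // Two attempts of the same stage.
    val attempts = List(SimpleTaskSet(0, 0), SimpleTaskSet(0, 1))
    val oldPool = new SimplePool
    val newPool = new SimplePool
    attempts.foreach { ts =>
      println(s"current:  ${currentName(ts)} added=${oldPool.add(currentName(ts), ts)}")
      println(s"proposed: ${proposedName(ts)} added=${newPool.add(proposedName(ts), ts)}")
    }
  }
}
```

In this sketch the second attempt of stage 0 collides with the first under the current scheme, because both are named TaskSet_0, while the proposed scheme yields TaskSet_0.0 and TaskSet_0.1.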

cc [~kayousterhout] [~markhamstra]

  was:
TaskSetManager should have unique name to avoid adding duplicate ones to *Pool* 
via *SchedulableBuilder*. This problem surfaced with 
https://issues.apache.org/jira/browse/SPARK-17759 and please find discussion: 
https://github.com/apache/spark/pull/15326

There is 1x1 relationship between Stage Attempt Id and TaskSetManager so 
taskSet.Id covering both stageId and stageAttemptId looks to be used for 
TaskSetManager as well. 

What do you think about proposed TaskSetManager Name?

*Current TaskSetManager Name* : 
{code:java} var name = "TaskSet_" + taskSet.stageId.toString{code}
*Sample*: TaskSet_0

*Proposed TaskSetManager Name* : 
{code:java} var name = "TaskSet_" + taskSet.Id (stageId + "." + stageAttemptId) 
{code}
*Sample* : TaskSet_0.0


> Uniqueness of TaskSetManager name
> ---------------------------------
>
>                 Key: SPARK-17894
>                 URL: https://issues.apache.org/jira/browse/SPARK-17894
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.1.0
>            Reporter: Eren Avsarogullari


