[ 
https://issues.apache.org/jira/browse/SPARK-19438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen resolved SPARK-19438.
-------------------------------
    Resolution: Not A Problem

> executorDataMap should be guarded by 
> CoarseGrainedSchedulerBackend.this.synchronized 
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-19438
>                 URL: https://issues.apache.org/jira/browse/SPARK-19438
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: jin xing
>
> Currently, when handling *RegisterExecutor* in *CoarseGrainedSchedulerBackend*, 
> *executorDataMap* is guarded by 
> *CoarseGrainedSchedulerBackend.this.synchronized* only when it is updated, not 
> when it is checked for an existing entry, which can leave 
> *numPendingExecutors* incorrect. 
> The code looks like this:
> {code}
>         if (executorDataMap.contains(executorId)) {
>           executorRef.send(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
>           context.reply(true)
>         } else {
>           ...
>           CoarseGrainedSchedulerBackend.this.synchronized {
>             executorDataMap.put(executorId, data)
>             if (currentExecutorIdCounter < executorId.toInt) {
>               currentExecutorIdCounter = executorId.toInt
>             }
>             if (numPendingExecutors > 0) {
>               numPendingExecutors -= 1
>               logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
>             }
>           }
> {code}
> Consider SPARK-19437 and the following scenario:
> An executor sends *RegisterExecutor* twice via *askWithRetry*, with a very 
> short interval between the two messages. Both messages could then take the 
> *else* branch, so *numPendingExecutors* would be decremented twice. 
> Currently, *askWithRetry* is used for *RegisterExecutor* only in some unit 
> tests, but it makes sense to make the handling of *RegisterExecutor* more 
> robust.
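
For illustration only, here is a minimal sketch of the pattern the reporter seems to be asking for: the duplicate check and the update happen under the same lock, so a repeated registration cannot decrement the pending count twice. The class `RegistrationSketch`, its `register` and `pending` methods, and the `String` payload are hypothetical stand-ins, not Spark's actual code, and the sketch assumes handlers could run concurrently, which may not hold in Spark given the "Not A Problem" resolution above.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the scheduler backend; not Spark's actual class.
class RegistrationSketch(initialPending: Int) {
  private val executorDataMap = mutable.HashMap.empty[String, String]
  private var numPendingExecutors = initialPending

  /** Returns true if the executor was newly registered, false for a duplicate ID. */
  def register(executorId: String, data: String): Boolean = this.synchronized {
    // The contains check and the put share one critical section,
    // so two racing calls with the same ID cannot both reach the else branch.
    if (executorDataMap.contains(executorId)) {
      false
    } else {
      executorDataMap.put(executorId, data)
      if (numPendingExecutors > 0) {
        numPendingExecutors -= 1
      }
      true
    }
  }

  /** Current number of pending executors, read under the same lock. */
  def pending: Int = this.synchronized { numPendingExecutors }
}
```

With this shape, calling `register("1", ...)` twice on an instance created with two pending executors would register once, reject the second call as a duplicate, and leave one executor pending.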



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
