[ 
https://issues.apache.org/jira/browse/SPARK-24150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-24150:
---------------------------------
    Labels: bulk-closed  (was: )

> Race condition in FsHistoryProvider
> -----------------------------------
>
>                 Key: SPARK-24150
>                 URL: https://issues.apache.org/jira/browse/SPARK-24150
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: William Montaz
>            Priority: Major
>              Labels: bulk-closed
>
> There exists a race condition in the checkLogs method between threads of 
> replayExecutor. They synchronise on the field "applications", but they 
> also reassign that field.
> The problem is that threads will eventually synchronise on different monitors 
> (because they synchronise on the different objects whose references have 
> been assigned to "applications"), which defeats the intended 
> synchronisation. The race is more likely to reproduce when 
> number_new_log_files > replayExecutor_pool_size.
> If a log disappears as a result (it is absent from the list "applications"), 
> it cannot be read from the UI, because being in the list "applications" is 
> a mandatory check to avoid getting a 404.
> Workaround:
>  * use a permanent object as a monitor on which to synchronise (or 
> synchronise on `this`)
>  * keep the field volatile for all other read accesses
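
The workaround above can be sketched in Java as follows. This is only an illustration of the locking pattern, not the actual FsHistoryProvider code (which is Scala): the class AppListing, its methods, and the runDemo driver are hypothetical names introduced here. Writers lock a final object that is never reassigned, while the list itself stays volatile so readers can take a snapshot without locking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the provider's listing state.
class AppListing {
    // Permanent monitor: the reference never changes, so all writers lock the same object.
    private final Object lock = new Object();

    // Volatile so that readers see the most recently published list without locking.
    private volatile List<String> applications = Collections.emptyList();

    /** Writer path: copy, append, and publish while holding the fixed lock. */
    void addApplication(String appId) {
        synchronized (lock) {
            List<String> updated = new ArrayList<>(applications);
            updated.add(appId);
            applications = Collections.unmodifiableList(updated);
        }
    }

    /** Reader path: lock-free read of the current immutable snapshot. */
    List<String> snapshot() {
        return applications;
    }

    /** Hypothetical driver: submits more "log files" than there are pool threads. */
    static int runDemo(int poolSize, int newLogFiles) {
        AppListing listing = new AppListing();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < newLogFiles; i++) {
            final String id = "app-" + i;
            pool.submit(() -> listing.addApplication(id));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return listing.snapshot().size();
    }
}
```

If the writers instead used synchronized (applications), each reassignment would hand later threads a different monitor, and concurrent copy-and-publish steps could overwrite one another's additions.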



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
