[ 
https://issues.apache.org/jira/browse/SPARK-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498390#comment-14498390
 ] 

Matt Cheah edited comment on SPARK-6950 at 4/16/15 5:57 PM:
------------------------------------------------------------

I found one way to reproduce this locally, but I can't confirm that this is 
what is happening in production. I had to force a race condition to trigger 
it.

Basically, the Spark Master can sometimes attempt to rebuild the UI before the 
event logging listener renames the event log file to remove the .inprogress 
extension. If the Master reaches that point before the listener renames the 
file, it never checks again whether the file has been renamed, and so never 
builds the UI. The SparkContext.stop() method requests the eventLogger to 
stop, but that request may not execute until after the Master has called 
rebuildSparkUi() for the completed application.
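
The ordering described above can be sketched as follows. This is a simplified 
illustration, not Spark's actual code: the function names, the app id, and the 
single-directory layout are all hypothetical, and the two steps are run in a 
forced order rather than on real threads. Only the .inprogress suffix comes 
from the comment itself.

```python
import os
import tempfile

IN_PROGRESS = ".inprogress"  # suffix named in the comment above


def stop_event_logger(log_dir, app_id):
    """Hypothetical stand-in for the listener's stop step: drop the suffix."""
    os.rename(os.path.join(log_dir, app_id + IN_PROGRESS),
              os.path.join(log_dir, app_id))


def rebuild_ui_once(log_dir, app_id):
    """Hypothetical stand-in for a one-shot rebuild: look once, never retry."""
    if os.path.exists(os.path.join(log_dir, app_id)):
        return "ui built"
    return "still in progress"


def run(master_first):
    """Run both steps in a forced order and return what the master saw."""
    with tempfile.TemporaryDirectory() as log_dir:
        app_id = "app-example"  # made-up id for illustration
        # The application starts out with an in-progress log file.
        open(os.path.join(log_dir, app_id + IN_PROGRESS), "w").close()
        if master_first:
            result = rebuild_ui_once(log_dir, app_id)
            stop_event_logger(log_dir, app_id)  # rename arrives too late
        else:
            stop_event_logger(log_dir, app_id)
            result = rebuild_ui_once(log_dir, app_id)
        return result
```

Calling run(master_first=True) models the failing order, where the one-shot 
check happens before the rename and its result is never revisited; 
run(master_first=False) models the order in which the UI can be rebuilt.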

However, I think the fix for SPARK-6107 will incidentally solve this issue as 
well. I'll try applying that patch.


> Spark master UI believes some applications are in progress when they are 
> actually completed
> -------------------------------------------------------------------------------------------
>
>                 Key: SPARK-6950
>                 URL: https://issues.apache.org/jira/browse/SPARK-6950
>             Project: Spark
>          Issue Type: Bug
>          Components: Web UI
>    Affects Versions: 1.3.0
>            Reporter: Matt Cheah
>
> In Spark 1.2.x, I was able to set my Spark event log directory to a 
> location other than the default, and after the job finished, I could replay 
> the UI by clicking on the appropriate link under "Completed Applications".
> Now, non-deterministically (though it seems to happen most of the time), 
> when I click on the link under "Completed Applications", I instead get a 
> webpage that says:
> Application history not found (app-20150415052927-0014)
> Application myApp is still in progress.
> I am able to view the application's UI using the Spark history server, so 
> something regressed in the Spark master code between 1.2 and 1.3, but that 
> regression does not apply in the history server use case.



