[
https://issues.apache.org/jira/browse/SPARK-6950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14498390#comment-14498390
]
Matt Cheah edited comment on SPARK-6950 at 4/16/15 5:57 PM:
------------------------------------------------------------
I found one way to reproduce this locally, but I can't confirm that it is
what is happening in production. I had to force a race condition to trigger
it.
Sometimes the Spark Master attempts to rebuild the UI before the event
logging listener has renamed the event log file to remove the .inprogress
extension. If the Master gets there first, it never checks again whether the
file has been renamed, so it never builds the UI. SparkContext.stop() asks
the eventLogger to stop, but that request may not execute until after the
Master has called rebuildSparkUi() for the completed application.
However, I think the fix for SPARK-6107 will also solve this issue as a side
effect. I'll try applying that patch.
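The ordering problem described above can be sketched in a few lines. This is
only an illustration in Python, not Spark's actual code: find_completed_log,
finish_logging, simulate and the "app-example" id are hypothetical names, and
the only detail taken from the report is that the log file carries a
.inprogress suffix until the listener renames it.

```python
import os
import tempfile

IN_PROGRESS = ".inprogress"  # suffix named in the report


def find_completed_log(log_dir, app_id):
    """Hypothetical stand-in for the Master's one-shot check.

    Returns the finished log path, or None if the renamed file is not there.
    """
    path = os.path.join(log_dir, app_id)
    return path if os.path.exists(path) else None


def finish_logging(log_dir, app_id):
    """Hypothetical stand-in for the listener's stop step: drop the suffix."""
    src = os.path.join(log_dir, app_id + IN_PROGRESS)
    os.rename(src, os.path.join(log_dir, app_id))


def simulate(master_first):
    """Force one of the two orderings and report whether the UI was built."""
    with tempfile.TemporaryDirectory() as log_dir:
        app_id = "app-example"  # made-up application id
        # While the application runs, only the .inprogress file exists.
        open(os.path.join(log_dir, app_id + IN_PROGRESS), "w").close()
        if master_first:
            # The Master looks once, before the rename, and never looks again.
            ui_built = find_completed_log(log_dir, app_id) is not None
            finish_logging(log_dir, app_id)
        else:
            # The listener renames first, so the Master's single check succeeds.
            finish_logging(log_dir, app_id)
            ui_built = find_completed_log(log_dir, app_id) is not None
        return ui_built
```

Calling simulate(True) models the bad ordering, where the single check runs
before the rename, and simulate(False) models the good one. The sketch forces
the order instead of using threads so that the outcome is repeatable.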
> Spark master UI believes some applications are in progress when they are
> actually completed
> -------------------------------------------------------------------------------------------
>
> Key: SPARK-6950
> URL: https://issues.apache.org/jira/browse/SPARK-6950
> Project: Spark
> Issue Type: Bug
> Components: Web UI
> Affects Versions: 1.3.0
> Reporter: Matt Cheah
>
> In Spark 1.2.x, I was able to set my Spark event log directory to a
> location other than the default, and after a job finished, I could replay
> the UI by clicking the appropriate link under "Completed Applications".
> Now, non-deterministically (but most of the time), when I click the link
> under "Completed Applications", I instead get a webpage that says:
> Application history not found (app-20150415052927-0014)
> Application myApp is still in progress.
> I can still view the application's UI through the Spark history server, so
> something regressed in the Spark Master code between 1.2 and 1.3, and that
> regression does not affect the history server use case.