[ https://issues.apache.org/jira/browse/SPARK-6270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053810#comment-15053810 ]
Josh Rosen commented on SPARK-6270:
-----------------------------------
While I think we should have this discussion about UI reconstruction for
long-running applications, it is orthogonal to the right solution for this
issue (SPARK-6270). The root problem here, related to the Master / cluster
manager dying, seems to be caused by a design flaw: why is the Master
responsible for serving historical UIs? The standalone history server process
should have that responsibility, since UI serving might need a lot of memory.
I think the right fix here is to remove the Master's embedded history server;
it doesn't make sense to assign history server responsibilities to the Master
when it's designed to be a very low-resource-use, high-stability,
high-resiliency service.
> Standalone Master hangs when streaming job completes and event logging is
> enabled
> ---------------------------------------------------------------------------------
>
> Key: SPARK-6270
> URL: https://issues.apache.org/jira/browse/SPARK-6270
> Project: Spark
> Issue Type: Bug
> Components: Deploy, Streaming
> Affects Versions: 1.2.0, 1.2.1, 1.3.0, 1.5.1
> Reporter: Tathagata Das
> Priority: Critical
>
> If event logging is enabled, the Spark Standalone Master tries to
> recreate the web UI of a completed Spark application from its event logs.
> However, if the event log is huge (e.g. for a Spark Streaming application),
> the Master hangs in its attempt to read it and recreate the web UI. This
> hang makes the whole standalone cluster unusable.
> The workaround is to disable event logging.
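
The workaround named in the description can be sketched as a configuration fragment. This is a minimal sketch, assuming the application picks up its settings from conf/spark-defaults.conf (an assumption; the issue does not say how the reporter's application was configured). `spark.eventLog.enabled` is Spark's standard switch for event logging.

```properties
# conf/spark-defaults.conf (assumed location; adjust to your deployment)
# Turn off event logging so the Master has no event log to replay
# when the application completes.
spark.eventLog.enabled   false
```

The same property can also be set per application, for example with `--conf spark.eventLog.enabled=false` on spark-submit. The trade-off is that with event logging off, no UI can be reconstructed for the application after it finishes.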