Some APIs (SparkStatusTracker) expose job and stage data even when the
UI is disabled. I don't think task or SQL data is exposed without the
UI, though, so the SQL listener may not even need to be installed in
that case. (The same goes for other listeners that do nothing without
the UI but may currently be installed.)

In fact I have some changes as part of another project that do just that...
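
Until something like that lands, the workaround Craig describes can be
applied by hand. Below is a minimal sketch, not anything that exists in
Spark: the helper names and the limit of 1 are hypothetical, and this
thread does not confirm whether every one of these settings accepts 0.
It only builds the `--conf` overrides for the four settings listed in
the quoted message when the UI is off.

```python
# Sketch only: helper names and the default limit are hypothetical.
# The four keys are the settings listed in the quoted message.
UI_RETENTION_KEYS = (
    "spark.sql.ui.retainedExecutions",
    "spark.ui.retainedJobs",
    "spark.ui.retainedStages",
    "spark.ui.retainedTasks",
)


def retention_overrides(ui_enabled, limit=1):
    """Return conf overrides that shrink listener retention when the UI is off."""
    if ui_enabled:
        # Leave the defaults alone when the UI is actually in use.
        return {}
    return {key: str(limit) for key in UI_RETENTION_KEYS}


def to_submit_args(conf):
    """Turn a conf dict into a flat list of spark-submit style arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    conf = {"spark.ui.enabled": "false"}
    conf.update(retention_overrides(ui_enabled=False))
    print(" ".join(to_submit_args(conf)))
```

The printed arguments can be appended to a spark-submit command line.
This only caps what the listeners retain; it does not stop them from
being installed.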

On Fri, Oct 13, 2017 at 10:36 AM, Craig Ingram <cinple....@gmail.com> wrote:
> I was recently debugging an OOM exception one of my coworkers was struggling
> with and found that `SQLListener._stageIdToStageMetrics` was the culprit.
> The UI was disabled in this case, but stats were still accumulating for
> jobs, stages, and tasks. The job my coworker was running had over 40k tasks
> in one of the stages. Does it make sense to set different defaults for the
> following settings when the UI is disabled?
>
> spark.sql.ui.retainedExecutions
> spark.ui.retainedJobs
> spark.ui.retainedStages
> spark.ui.retainedTasks
>
> There may be some other configuration settings that should change too; but
> at a minimum, these settings are all potentially problematic as they can
> grow unbounded. Is there a reason these settings are using their default
> values even when the UI is disabled? If not, it seems like we could save
> users a lot of headaches by setting these values to 0 when the UI is
> disabled.
>
> Moreover, how does this work with streaming? It seems like this problem
> would come up quite often.
>
> Thanks,
> Craig
>
>
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>



-- 
Marcelo
