I was recently debugging an OOM exception one of my coworkers was struggling with and found that `SQLListener._stageIdToStageMetrics` was the culprit. The UI was disabled in this case, but stats were still accumulating for jobs, stages, and tasks. The job my coworker was running had over 40k tasks in one of the stages. Does it make sense to set different defaults for the following settings when the UI is disabled?

spark.sql.ui.retainedExecutions
spark.ui.retainedJobs
spark.ui.retainedStages
spark.ui.retainedTasks

There may be some other configuration settings that should change too, but at a minimum, these settings are all potentially problematic as they can grow unbounded. Is there a reason these settings are using their default values even when the UI is disabled? If not, it seems like we could save users a lot of headaches by setting these values to 0 when the UI is disabled.

Moreover, how does this work with streaming? It seems like this problem would come up quite often.

Thanks,
Craig