skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740

@zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi with Spark on K8s, with 1M as the input parameter. This creates 1M tasks (an array holds them), which causes an OOM error in the DAGScheduler eventLoop thread, since this is the thread that will eventually try to submit the actual job. Of course, my JVM memory settings are set so that this reproduces; for the values, please have a look at the JIRA ticket. This could also happen in other cases where the JVM is running out of memory and at some point this thread needs to allocate more.

Btw, I can reproduce it on K8s consistently; it fails every time.

One other thing: there are other places in the code base where there is a join on a thread that will be stopped via the shutdown hook, like contextCleaner. As I mentioned above, the shutdown hook does a lot of work, e.g. the SparkContext stop() method stops a lot of things (not to mention there is a shutdown hook for Streaming as well).
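
To make the failure mode concrete, here is a minimal Java sketch of an uncaught exception handler observing an error that kills a background thread. It is not the code from the PR: the class name, the thread name, and the per-thread handler are assumptions for illustration (a driver-wide fix would more likely use a default handler), and the OutOfMemoryError is thrown directly rather than caused by a real allocation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; none of these names come from Spark.
final class UncaughtHandlerSketch {

    /**
     * Starts a thread that dies with a simulated OutOfMemoryError and
     * returns the throwable that the uncaught exception handler received
     * (or null if the handler did not run within the timeout).
     */
    static Throwable runFailingEventLoop() throws InterruptedException {
        AtomicReference<Throwable> seen = new AtomicReference<>();
        CountDownLatch handled = new CountDownLatch(1);

        Thread eventLoop = new Thread(() -> {
            // Simulated failure: a real OOM would come from an allocation,
            // e.g. building a very large array of tasks.
            throw new OutOfMemoryError("simulated: Java heap space");
        }, "hypothetical-event-loop");
        eventLoop.setDaemon(true);

        // Without a handler the thread just dies (the JVM prints a stack
        // trace) and anything waiting on its work can wait indefinitely.
        eventLoop.setUncaughtExceptionHandler((thread, error) -> {
            seen.set(error);
            handled.countDown();
            // A real driver handler would log and then terminate the JVM.
            // Exiting runs shutdown hooks, so hooks that join on threads
            // need care to avoid blocking the shutdown.
        });

        eventLoop.start();
        handled.await(5, TimeUnit.SECONDS);
        return seen.get();
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable error = runFailingEventLoop();
        System.out.println("handler saw: " + error);
    }
}
```

The sketch only records the error so that it stays runnable; whether the handler should exit or halt the JVM, and how that interacts with shutdown hooks that join on other threads, is exactly the open question in this discussion.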