skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
@zsxwing I describe how this happened in the JIRA ticket. I simply ran SparkPi on Spark on K8s with 1M as the input parameter. This creates 1M tasks (an array holds them), which causes an OOM error in the DAGScheduler event loop thread, since that is the thread that will eventually try to submit the actual job. Of course my JVM memory settings are such that this reproduces; for the values please have a look at the JIRA ticket. This could also happen in other cases where the JVM is running out of memory and at some point this thread needs to allocate more memory. Btw, I can reproduce it on K8s in a consistent manner; it fails every time.
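
For reference, here is a minimal sketch of the reproduction: essentially the stock SparkPi example, with the single argument interpreted as the number of slices, so passing 1000000 yields a one-million-task stage. The exact driver memory settings are in the JIRA ticket; the object name below is just illustrative.

```scala
import org.apache.spark.sql.SparkSession

// Sketch of the SparkPi-style reproduction: one task per slice, so a large
// `slices` value forces the DAGScheduler to materialize a huge task set.
object SparkPiRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("SparkPiRepro").getOrCreate()
    val slices = if (args.length > 0) args(0).toInt else 2   // e.g. 1000000
    val n = math.min(100000L * slices, Int.MaxValue).toInt
    val count = spark.sparkContext
      .parallelize(1 until n, slices)        // 1M partitions => 1M tasks
      .map { _ =>
        val x = math.random * 2 - 1
        val y = math.random * 2 - 1
        if (x * x + y * y <= 1) 1 else 0
      }
      .reduce(_ + _)
    println(s"Pi is roughly ${4.0 * count / (n - 1)}")
    spark.stop()
  }
}
```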
   
Another thing is that there are other places in the code base where there is a join on a thread that will be stopped via the shutdown hook, like the ContextCleaner, and as I said above the shutdown hook does a lot of work, e.g. the SparkContext stop() method stops a lot of things (not to mention there is one for Streaming as well).
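
To make the failure mode concrete, here is a minimal illustration (plain JVM APIs, not the actual patch) of what an uncaught exception handler on the driver buys you: a fatal error such as the OOM above terminates the process instead of leaving a dead event loop behind. The object name and exit code are placeholders; the real change would presumably go through Spark's own SparkUncaughtExceptionHandler.

```scala
object DriverFatalErrorHandler {
  // Illustrative sketch only: install a JVM-wide default handler so that an
  // uncaught Throwable (e.g. OutOfMemoryError in the DAGScheduler event loop
  // thread) kills the driver instead of leaving it hanging.
  def install(): Unit = {
    Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
      override def uncaughtException(t: Thread, e: Throwable): Unit = {
        System.err.println(s"Uncaught exception in thread ${t.getName}: $e")
        e.printStackTrace(System.err)
        // halt() rather than System.exit() in this sketch, so the termination
        // path cannot itself block on heavyweight shutdown hooks (e.g. joins
        // on threads like the ContextCleaner's, as mentioned above).
        Runtime.getRuntime.halt(50)  // exit code chosen arbitrarily here
      }
    })
  }
}
```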
