[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-07-19 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-513038825
 
 
   Ok, if we all agree I can open a PR to add `-XX:OnOutOfMemoryError` at the entrypoint with the old flag (although Java 11 is coming, and people are encouraged to use the latest version for security reasons, as we should in our images). @squito all executors get a handler here:
https://github.com/apache/spark/blob/453cbf3dd8df5ec4da844c93eb6000610b551541/core/src/main/scala/org/apache/spark/executor/Executor.scala#L62
and set it here:
https://github.com/apache/spark/blob/453cbf3dd8df5ec4da844c93eb6000610b551541/core/src/main/scala/org/apache/spark/executor/Executor.scala#L88
   CoarseGrainedExecutorBackend, which is used by K8s, creates an executor instance that sets it.
   @felixcheung @squito @mccheah @ifilonenko 
   The problem is https://issues.apache.org/jira/browse/SPARK-27812, which can appear due to another type of exception besides OOM (the only solution there is to add a handler, afaik).
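   For context, a minimal sketch of that handler-installation pattern applied to a driver entry point (the `DriverMain` object and the exit codes are placeholders for illustration, not Spark's actual API):
   
   ```
   object DriverMain {
     def main(args: Array[String]): Unit = {
       // Install a process-wide handler before any Spark threads start,
       // mirroring what the executor does with its uncaught exception handler.
       Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
         override def uncaughtException(t: Thread, e: Throwable): Unit = {
           System.err.println(s"Uncaught exception in thread ${t.getName}: $e")
           System.exit(if (e.isInstanceOf[OutOfMemoryError]) 52 else 50)
         }
       })
       // ... driver logic ...
     }
   }
   ```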


[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-07-12 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-510849683
 
 
   @squito @srowen @dongjoon-hyun by having a handler (as mentioned in the ticket by HenryYu) that does not run the shutdown hooks, we could also solve https://issues.apache.org/jira/browse/SPARK-27812
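   A minimal sketch of what I mean by a handler that skips the shutdown hooks (assumption: we exit via `Runtime.halt`, which, unlike `System.exit`, does not run registered hooks):
   
   ```
   Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
     override def uncaughtException(t: Thread, e: Throwable): Unit = {
       System.err.println(s"Uncaught exception in thread ${t.getName}: $e")
       // halt() terminates immediately and skips shutdown hooks, so the
       // exit path cannot block on a hook that joins this thread.
       Runtime.getRuntime.halt(1)
     }
   })
   ```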


[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-07-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-509017090
 
 
   @squito `-XX:OnOutOfMemoryError="kill -9 %p"`: I think there are more options now, see
   https://stackoverflow.com/questions/5792049/xxonoutofmemoryerror-kill-9-p-problem
   After java8u92 I use:
   
   ```
   -XX:+ExitOnOutOfMemoryError
   -XX:+CrashOnOutOfMemoryError
   ```
   Anyway, if that is the consensus, let's all agree. What about the executors? They have a shutdown handler... should they? I think this issue goes beyond K8s; it affects all deployments. Also, sometimes you may want to collect the crash report, so `CrashOnOutOfMemoryError` may be a better option in some scenarios.
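   For what it's worth, a hedged example of passing these through the standard `extraJavaOptions` confs without code changes (the class and jar names are hypothetical placeholders; in client mode the driver flag would need `--driver-java-options` instead):
   
   ```
   spark-submit \
     --conf "spark.driver.extraJavaOptions=-XX:+ExitOnOutOfMemoryError" \
     --conf "spark.executor.extraJavaOptions=-XX:+ExitOnOutOfMemoryError" \
     --class com.example.SparkPi example.jar
   ```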


[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-07-03 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-508061578
 
 
   @srowen @zsxwing should I fall back to the initial approach of clearing the shutdown hooks and exiting immediately? How should I proceed? (The only thing I haven't tried is to run the stop logic in a high-priority thread with a dedicated thread pool, just in case that works and the stop logic gets the chance to run.)
   I don't see a lot of alternatives here (still waiting for @shipilev).
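   For the untried option, something like the following is what I have in mind; a sketch only, where `stopLogic()` is a hypothetical stand-in for the actual stop body:
   
   ```
   // Run the stop logic on a dedicated, high-priority daemon thread so the
   // scheduler is more likely to give it a chance to finish during shutdown.
   val stopper = new Thread(new Runnable {
     override def run(): Unit = stopLogic() // hypothetical placeholder
   }, "high-priority-stopper")
   stopper.setDaemon(true)
   stopper.setPriority(Thread.MAX_PRIORITY)
   stopper.start()
   ```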


[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-21 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-504558356
 
 
   @squito I tried `-XX:+ExitOnOutOfMemoryError` (described in the jira), but the thing is that executors do have proper uncaught exception handling; the driver does not. So initially I tried to add something similar, so that things shut down gracefully if possible, given the JVM state, but without the shutdown hooks. If shutdown hooks are enabled we hit this issue with joins (which is 100% reproducible). Also, if you run the Pi example on minikube, Spark will get stuck, so the initial issue, where there is no uncaught exception handler, is also 100% reproducible. Regarding the interrupts, I haven't seen them work so far, but who knows, maybe I got unlucky; it also depends on the JDK version, because there are fixes, described in my last comment, that changed the interrupt behavior. I called out the openjdk folks to see if their fix relates to this; no response yet. I will try to do the join in another thread and make sure it runs with higher priority; so far there is no guarantee of this.


[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-14 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-501593451
 
 
   @zsxwing we can only fix the related issue here, i.e. just avoid the deadlock so that shutdown finishes. As for the generic case, I don't see why this thread is not interrupted; maybe because this is a special case where an uncaught exception is handled via a handler running on the very thread that caused it. I will check what the JVM does in this case, but if anyone knows more, feel free to bring them in here.
   Probably hitting this one: https://bugs.openjdk.java.net/browse/JDK-8154017, mentioned here: https://github.com/jacoco/jacoco/issues/394#issuecomment-208531845.
   The fix in there ignores all interrupts until the hooks are completed (we call exit as well), so since the uncaught exception handler executes a shutdown hook from the event loop thread, that thread cannot be interrupted. Hi @shipilev! I saw you reported that error and also discussed the related fix; any help would be great, as I don't have the rights to comment on the ticket directly.
   Another question: do we have this pattern elsewhere, like Master, Worker, or Executor, where there is already a handler? @srowen also thoughts?
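   To make the failure mode concrete, a minimal standalone sketch of the pattern as I understand it (the thread names and exit code are illustrative only):
   
   ```
   object ShutdownJoinDeadlock {
     def main(args: Array[String]): Unit = {
       val eventThread = new Thread(new Runnable {
         override def run(): Unit = throw new OutOfMemoryError("simulated")
       }, "event-loop")
       // The handler runs on eventThread itself; System.exit then blocks
       // on that thread while the shutdown hooks run.
       eventThread.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler {
         override def uncaughtException(t: Thread, e: Throwable): Unit = System.exit(1)
       })
       Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
         override def run(): Unit = {
           eventThread.interrupt() // deferred until hooks finish, per the JDK-8154017 fix
           eventThread.join()      // eventThread is stuck inside System.exit -> deadlock
         }
       }))
       eventThread.start()
     }
   }
   ```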


[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-11 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-500779047
 
 
   @zsxwing @srowen any decision on how to approach this? 


[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop(): Unit = {
     if (stopped.compareAndSet(false, true)) {
       @volatile var onStopCalled = false
       val stopper = new Thread("event-loop-stopper") {
         setDaemon(true)
         override def run(): Unit = {
           try {
             eventThread.join()
             // Call onStop after the event thread exits to make sure
             // onReceive happens before onStop.
             onStopCalled = true
             onStop()
           } catch {
             case _: InterruptedException =>
               Thread.currentThread().interrupt()
               if (!onStopCalled) {
                 // The interrupt came from `eventThread.join()`; otherwise
                 // `onStop` has already been called, so don't call it again.
                 onStop()
               }
           }
         }
       }
       stopper.start()
     } else {
       // Keep quiet to allow calling `stop` multiple times.
     }
   }
   ```
   Assuming that is safe, we let the shutdown hook proceed... are there any issues with letting that run in the background, or is there a strict order in which things should run? One concern is that it may never run unless the thread priority is high enough...
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
       at scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
       at scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
       at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
       at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
       at org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
       at org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
       at org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown Source)
       at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
       at org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
       at org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown Source)
       at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
       at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
       at org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
       at org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
       at org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
       at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
       at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
       at org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
       at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
       at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
       at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all executors
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
   ```

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop() {
   
   if (stopped.compareAndSet(false, true)) {
 @volatile var onStopCalled = false
 try {
   
   new Thread() { 
 setDaemon(true)
   new Runnable {
 override def run(): Unit = {
   try {
 eventThread.join()
   // Call onStop after the event thread exits to make sure 
onReceive happens before onStop
   onStopCalled = true
   onStop()
   } catch {
 case ie: InterruptedException =>
   Thread.currentThread().interrupt()
   if (!onStopCalled) {
 // ie is thrown from `eventThread.join()`. Otherwise, we 
should not call `onStop` since
 // it's already called.
 onStop()
   }
   }
 }
   }
   }.start()
   } else {
 // Keep quiet to allow calling `stop` multiple times.
   }
   }
   ```
   Assuming that is safe we let the shutdownhook proceed... any issues with 
that if we let that run in the background or there is a strict order for how 
things should run? One thing is that it may never run...
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
at 
scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
at 
scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown
 Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown
 Source)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
at 
org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
at 
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
at 
org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at 
http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, 
free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   19/06/07 10:31:21 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   19/06/07 10:31:22 INFO 

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop(): Unit = {
     if (stopped.compareAndSet(false, true)) {
       @volatile var onStopCalled = false
       // Run the blocking part on a separate daemon thread so the caller
       // (e.g. the shutdown hook) is not blocked by eventThread.join().
       new Thread() {
         setDaemon(true)
         override def run(): Unit = {
           try {
             eventThread.join()
             // Call onStop after the event thread exits to make sure
             // onReceive happens before onStop.
             onStopCalled = true
             onStop()
           } catch {
             case ie: InterruptedException =>
               Thread.currentThread().interrupt()
               if (!onStopCalled) {
                 // ie is thrown from `eventThread.join()`. Otherwise, we
                 // should not call `onStop` since it's already called.
                 onStop()
               }
           }
         }
       }.start()
     } else {
       // Keep quiet to allow calling `stop` multiple times.
     }
   }
   ```
   Assuming that is safe, we let the shutdown hook proceed... any issues with 
letting that run in the background, or is there a strict order in which things 
should run? One thing is that it may never run...
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
at 
scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
at 
scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown
 Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown
 Source)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
at 
org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
at 
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
at 
org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at 
http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, 
free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   19/06/07 10:31:21 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   19/06/07 10:31:22 INFO 

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop(): Unit = {
     if (stopped.compareAndSet(false, true)) {
       @volatile var onStopCalled = false
       // Run the blocking part on a separate daemon thread so the caller
       // (e.g. the shutdown hook) is not blocked by eventThread.join().
       new Thread() {
         setDaemon(true)
         override def run(): Unit = {
           try {
             eventThread.join()
             // Call onStop after the event thread exits to make sure
             // onReceive happens before onStop.
             onStopCalled = true
             onStop()
           } catch {
             case ie: InterruptedException =>
               Thread.currentThread().interrupt()
               if (!onStopCalled) {
                 // ie is thrown from `eventThread.join()`. Otherwise, we
                 // should not call `onStop` since it's already called.
                 onStop()
               }
           }
         }
       }.start()
     } else {
       // Keep quiet to allow calling `stop` multiple times.
     }
   }
   ```
   Assuming that is safe, we let the shutdown hook proceed... any issues with 
letting that run in the background, or is there a strict order in which things 
should run? One thing is that it may never run...
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
at 
scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
at 
scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown
 Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown
 Source)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
at 
org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
at 
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
at 
org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at 
http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, 
free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   19/06/07 10:31:21 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   19/06/07 10:31:22 INFO 

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop(): Unit = {
     if (stopped.compareAndSet(false, true)) {
       @volatile var onStopCalled = false
       // Run the blocking part on a separate daemon thread so the caller
       // (e.g. the shutdown hook) is not blocked by eventThread.join().
       new Thread() {
         setDaemon(true)
         override def run(): Unit = {
           try {
             eventThread.join()
             // Call onStop after the event thread exits to make sure
             // onReceive happens before onStop.
             onStopCalled = true
             onStop()
           } catch {
             case ie: InterruptedException =>
               Thread.currentThread().interrupt()
               if (!onStopCalled) {
                 // ie is thrown from `eventThread.join()`. Otherwise, we
                 // should not call `onStop` since it's already called.
                 onStop()
               }
           }
         }
       }.start()
     } else {
       // Keep quiet to allow calling `stop` multiple times.
     }
   }
   ```
   Assuming that is safe, we let the shutdown hook proceed... any issues with 
letting that run in the background, or is there a strict order in which things 
should run?
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
at 
scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
at 
scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown
 Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown
 Source)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
at 
org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
at 
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
at 
org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at 
http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, 
free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   19/06/07 10:31:21 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   19/06/07 10:31:22 INFO SparkContext: Successfully stopped 

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop(): Unit = {
     if (stopped.compareAndSet(false, true)) {
       @volatile var onStopCalled = false
       // Run the blocking part on a separate daemon thread so the caller
       // (e.g. the shutdown hook) is not blocked by eventThread.join().
       new Thread() {
         setDaemon(true)
         override def run(): Unit = {
           try {
             eventThread.join()
             // Call onStop after the event thread exits to make sure
             // onReceive happens before onStop.
             onStopCalled = true
             onStop()
           } catch {
             case ie: InterruptedException =>
               Thread.currentThread().interrupt()
               if (!onStopCalled) {
                 // ie is thrown from `eventThread.join()`. Otherwise, we
                 // should not call `onStop` since it's already called.
                 onStop()
               }
           }
         }
       }.start()
     } else {
       // Keep quiet to allow calling `stop` multiple times.
     }
   }
   ```
   Assuming that is safe, we let the shutdown hook proceed... any issues with 
letting that run in the background?
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
at 
scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
at 
scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown
 Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown
 Source)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
at 
org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
at 
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
at 
org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at 
http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, 
free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   19/06/07 10:31:21 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   19/06/07 10:31:22 INFO SparkContext: Successfully stopped SparkContext
   19/06/07 10:31:22 INFO ShutdownHookManager: 

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop(): Unit = {
     @volatile var onStopCalled = false
     // Run the blocking part on a separate daemon thread so the caller
     // (e.g. the shutdown hook) is not blocked by eventThread.join().
     new Thread() {
       setDaemon(true)
       override def run(): Unit = {
         try {
           eventThread.join()
           // Call onStop after the event thread exits to make sure
           // onReceive happens before onStop.
           onStopCalled = true
           onStop()
         } catch {
           case ie: InterruptedException =>
             Thread.currentThread().interrupt()
             if (!onStopCalled) {
               // ie is thrown from `eventThread.join()`. Otherwise, we
               // should not call `onStop` since it's already called.
               onStop()
             }
         }
       }
     }.start()
   }
   ```
   Assuming that is safe, we let the shutdown hook proceed... any issues with 
letting that run in the background?
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
at 
scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
at 
scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown
 Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown
 Source)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
at 
org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
at 
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
at 
org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at 
http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, 
free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   19/06/07 10:31:21 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   19/06/07 10:31:22 INFO SparkContext: Successfully stopped SparkContext
   19/06/07 10:31:22 INFO ShutdownHookManager: Shutdown hook called
   19/06/07 10:31:22 INFO ShutdownHookManager: Deleting directory 
/tmp/spark-fef9ec63-c71e-4859-9910-12c51a336d75
   19/06/07 

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop(): Unit = {
     @volatile var onStopCalled = false
     // Run the blocking part on a separate daemon thread so the caller
     // (e.g. the shutdown hook) is not blocked by eventThread.join().
     new Thread() {
       setDaemon(true)
       override def run(): Unit = {
         try {
           eventThread.join()
           // Call onStop after the event thread exits to make sure
           // onReceive happens before onStop.
           onStopCalled = true
           onStop()
         } catch {
           case ie: InterruptedException =>
             Thread.currentThread().interrupt()
             if (!onStopCalled) {
               // ie is thrown from `eventThread.join()`. Otherwise, we
               // should not call `onStop` since it's already called.
               onStop()
             }
         }
       }
     }.start()
   }
   ```
   Assuming that is safe... any issues with letting that run in the 
background?
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
at 
scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
at 
scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown
 Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown
 Source)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
at 
org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
at 
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
at 
org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at 
http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, 
free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   19/06/07 10:31:21 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   19/06/07 10:31:22 INFO SparkContext: Successfully stopped SparkContext
   19/06/07 10:31:22 INFO ShutdownHookManager: Shutdown hook called
   19/06/07 10:31:22 INFO ShutdownHookManager: Deleting directory 
/tmp/spark-fef9ec63-c71e-4859-9910-12c51a336d75
   19/06/07 10:31:22 INFO ShutdownHookManager: 

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499837796
 
 
   @zsxwing @srowen one solution that works is running the logic of stop in 
another thread:
   
   ```
   def stop(): Unit = {
     @volatile var onStopCalled = false
     // Run the blocking part on a separate daemon thread so the caller
     // (e.g. the shutdown hook) is not blocked by eventThread.join().
     new Thread() {
       setDaemon(true)
       override def run(): Unit = {
         try {
           eventThread.join()
           // Call onStop after the event thread exits to make sure
           // onReceive happens before onStop.
           onStopCalled = true
           onStop()
         } catch {
           case ie: InterruptedException =>
             Thread.currentThread().interrupt()
             if (!onStopCalled) {
               // ie is thrown from `eventThread.join()`. Otherwise, we
               // should not call `onStop` since it's already called.
               onStop()
             }
         }
       }
     }.start()
   }
   ```
   Assuming that is safe... any issues with letting that run in the 
background?
   ```
   19/06/07 10:31:21 ERROR SparkUncaughtExceptionHandler: Uncaught exception in 
thread Thread[dag-scheduler-event-loop,5,main]
   java.lang.OutOfMemoryError: Java heap space
at 
scala.collection.mutable.ResizableArray.ensureSize(ResizableArray.scala:106)
at 
scala.collection.mutable.ResizableArray.ensureSize$(ResizableArray.scala:96)
at scala.collection.mutable.ArrayBuffer.ensureSize(ArrayBuffer.scala:49)
at scala.collection.mutable.ArrayBuffer.$plus$eq(ArrayBuffer.scala:85)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTask(TaskSetManager.scala:264)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$2(TaskSetManager.scala:194)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1106/850290016.apply$mcVI$sp(Unknown
 Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at 
org.apache.spark.scheduler.TaskSetManager.$anonfun$addPendingTasks$1(TaskSetManager.scala:193)
at 
org.apache.spark.scheduler.TaskSetManager$$Lambda$1105/1310826901.apply$mcV$sp(Unknown
 Source)
at 
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.timeTakenMs(Utils.scala:534)
at 
org.apache.spark.scheduler.TaskSetManager.addPendingTasks(TaskSetManager.scala:192)
at 
org.apache.spark.scheduler.TaskSetManager.<init>(TaskSetManager.scala:189)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.createTaskSetManager(TaskSchedulerImpl.scala:252)
at 
org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:210)
at 
org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1233)
at 
org.apache.spark.scheduler.DAGScheduler.submitStage(DAGScheduler.scala:1084)
at 
org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:1028)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2126)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2118)
at 
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2107)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
   19/06/07 10:31:21 INFO SparkContext: Invoking stop() from shutdown hook
   19/06/07 10:31:21 INFO SparkUI: Stopped Spark web UI at 
http://spark-pi2-1559903354898-driver-svc.spark.svc:4040
   19/06/07 10:31:21 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 
spark-pi2-1559903354898-driver-svc.spark.svc:7079 in memory (size: 1765.0 B, 
free: 110.0 MiB)
   19/06/07 10:31:21 INFO KubernetesClusterSchedulerBackend: Shutting down all 
executors
   19/06/07 10:31:21 INFO 
KubernetesClusterSchedulerBackend$KubernetesDriverEndpoint: Asking each 
executor to shut down
   19/06/07 10:31:21 WARN ExecutorPodsWatchSnapshotSource: Kubernetes client 
has been closed (this is expected if the application is shutting down.)
   19/06/07 10:31:22 INFO MapOutputTrackerMasterEndpoint: 
MapOutputTrackerMasterEndpoint stopped!
   19/06/07 10:31:22 INFO MemoryStore: MemoryStore cleared
   19/06/07 10:31:22 INFO BlockManager: BlockManager stopped
   19/06/07 10:31:22 INFO BlockManagerMaster: BlockManagerMaster stopped
   19/06/07 10:31:22 INFO 
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: 
OutputCommitCoordinator stopped!
   19/06/07 10:31:22 INFO SparkContext: Successfully stopped SparkContext
   19/06/07 10:31:22 INFO ShutdownHookManager: Shutdown hook called
   19/06/07 10:31:22 INFO ShutdownHookManager: Deleting directory 
/tmp/spark-fef9ec63-c71e-4859-9910-12c51a336d75
   19/06/07 10:31:22 INFO ShutdownHookManager: 

[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner; it 
fails every time.
   
   One other thing: in the code base there are other places where we join on a 
thread that will be stopped via the shutdown hook, like contextCleaner, and as 
I mentioned above the shutdown hook does a lot of work, e.g. the SparkContext 
stop() method stops a lot of components (not to mention there is one hook for 
Streaming as well).
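   The deadlock itself reduces to a tiny standalone example (the names here are 
illustrative, not Spark's actual classes): `System.exit` blocks the calling 
thread until shutdown hooks finish, while a hook `join()`s that same thread, 
so neither side ever returns.
   ```
   object ShutdownJoinDeadlock {
     def main(args: Array[String]): Unit = {
       val eventThread: Thread = new Thread() {
         override def run(): Unit = {
           // e.g. an uncaught-exception handler calling System.exit from
           // the event loop thread itself; exit() waits for the hooks.
           System.exit(1)
         }
       }
       Runtime.getRuntime.addShutdownHook(new Thread() {
         override def run(): Unit = {
           // Mirrors a stop() that joins the event thread: join() never
           // returns because eventThread is parked inside System.exit.
           eventThread.join()
         }
       })
       eventThread.start()
     }
   }
   ```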


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner; it 
fails every time.
   
   One other thing: in the code base there are other places where we join on a 
thread that will be stopped via the shutdown hook, like contextCleaner, and as 
I mentioned above the shutdown hook does a lot of work, e.g. the SparkContext 
stop() method stops a lot of components (not to mention there is one for 
Streaming as well).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner; it 
fails every time.
   
   One other thing: in the code base there are other places where we join on a 
thread that will be stopped via the shutdown hook, like contextCleaner, and as 
I said above the shutdown hook does a lot of work, e.g. the SparkContext 
stop() method stops a lot of components (not to mention there is one for 
Streaming as well).


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner; it 
fails every time.
   One other thing: in the code base there are many places where we join on a 
thread that will be stopped via the shutdown hook, like contextCleaner.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner; it 
fails every time.
   
   One other thing: in the code base there are other places where we join on a 
thread that will be stopped via the shutdown hook, like contextCleaner, and as 
I said above the shutdown hook does a lot of work, e.g. the SparkContext 
stop() method stops a lot of components.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner; it 
fails every time.
   One other thing: in the code base there are other places where we join on a 
thread that will be stopped via the shutdown hook, like contextCleaner, and as 
I said above the shutdown hook does a lot of work, e.g. the SparkContext 
stop() method stops a lot of components.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner; it 
fails every time.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks (held in an 
array), which triggers an OOM error in the DAGScheduler eventLoop thread, 
since that is the thread that will eventually try to submit the actual job. My 
JVM memory settings are enough to reproduce it; for the values please have a 
look at the JIRA ticket. Of course this could also happen in other cases where 
the JVM is running out of memory and at some point this thread needs to 
allocate more memory. Btw I can reproduce it on K8s in a consistent manner.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks, which triggers 
an OOM error in the DAGScheduler eventLoop thread, since that is the thread 
that will eventually try to submit the actual job. My JVM memory settings are 
enough to reproduce it; for the values please have a look at the JIRA ticket. 
Of course this could also happen in other cases where the JVM is running out 
of memory and at some point this thread needs to allocate more memory. Btw I 
can reproduce it on K8s in a consistent manner.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-07 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499791740
 
 
   @zsxwing I describe how this happened in the JIRA ticket. I just ran SparkPi 
on K8s with 1M as the input parameter. This creates 1M tasks, which triggers 
an OOM error in the DAGScheduler eventLoop thread, since that is the thread 
that will eventually try to submit the actual job. My JVM memory settings are 
enough to reproduce it; for the values please have a look at the JIRA ticket. 
Of course this could also happen in other cases where the JVM is running out 
of memory and at some point this thread needs to allocate more memory.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-05 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499119354
 
 
   @srowen Ideally, yes, we want a graceful shutdown without this deadlock if 
possible. My concern is: can we actually be sure things will not lead to a 
deadlock elsewhere? We probably need to check the threads allocated in general 
and those involved in the shutdown.
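   One way to do that check (a diagnostic sketch, not part of the fix): dump 
every live thread from a shutdown hook and look for non-daemon threads or 
joins that could block the shutdown path.
   ```
   Runtime.getRuntime.addShutdownHook(new Thread() {
     override def run(): Unit = {
       // List the threads still alive when the hook fires.
       Thread.getAllStackTraces.keySet.forEach { t =>
         println(s"${t.getName} daemon=${t.isDaemon} state=${t.getState}")
       }
     }
   })
   ```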


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-05 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-499119354
 
 
   @srowen Ideally, yes, we want a graceful shutdown without this deadlock if 
possible. My concern is: can we actually be sure things will not lead to a 
deadlock elsewhere?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-04 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-498866549
 
 
   I don't see how these unit tests relate to this PR; weird.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-04 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-498840730
 
 
   @srowen @vanzin @squito @erikerlandson pls review.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught exception handler to the driver

2019-06-04 Thread GitBox
skonto edited a comment on issue #24796: [SPARK-27900][CORE] Add uncaught 
exception handler to the driver
URL: https://github.com/apache/spark/pull/24796#issuecomment-498840730
 
 
   @srowen @vanzin @squito pls review.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org