mridulm commented on PR #37779:
URL: https://github.com/apache/spark/pull/37779#issuecomment-1258716473

   I added a few debug statements, and it became clear what the issue is.
   Essentially, since we are using a `ThreadPoolExecutor`, a thrown 
exception/error does not kill the thread. Instead, the pool calls 
`ThreadPoolExecutor.afterExecute` with the cause of the failure (see 
`runWorker` for details).
   
   We should override `afterExecute` and invoke our `uncaughtExceptionHandler` 
when an exception is thrown.
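
A rough sketch of what such an override could look like, written in plain Java rather than the Scala used in `MessageLoop`. The class name `HandlerAwarePool` and the `handler` field are hypothetical and only for illustration; this is not the actual patch. It assumes tasks may arrive through `submit` (the traces below show `FutureTask.run`), in which case the failure is stored in the future and has to be unwrapped, as the `afterExecute` Javadoc suggests.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not Spark code: a fixed-size pool that reports
// task failures to a caller-supplied uncaught exception handler.
class HandlerAwarePool extends ThreadPoolExecutor {
    private final Thread.UncaughtExceptionHandler handler;

    HandlerAwarePool(int threads, Thread.UncaughtExceptionHandler handler) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        this.handler = handler;
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        Throwable failure = t;
        // A task passed to submit() is wrapped in a FutureTask, which stores the
        // failure instead of propagating it, so t is null and must be unwrapped.
        if (failure == null && r instanceof Future<?> && ((Future<?>) r).isDone()) {
            try {
                ((Future<?>) r).get();
            } catch (CancellationException e) {
                // A cancelled task is not treated as a failure here.
            } catch (ExecutionException e) {
                failure = e.getCause();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        if (failure != null) {
            handler.uncaughtException(Thread.currentThread(), failure);
        }
    }
}
```

With `execute`, `runWorker` rethrows after `afterExecute` returns, so a handler installed on the worker thread itself may also fire; whether that double reporting matters depends on the handler.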
   
   
   In `receiveLoopRunnable`, when a `Throwable` is thrown:
   
   ```
   22/09/26 17:17:12 INFO DedicatedMessageLoop: Current exceptionHandler = org.apache.spark.util.SparkUncaughtExceptionHandler@27c71f14
   22/09/26 17:17:12 INFO DedicatedMessageLoop: Thread = Thread[dispatcher-Executor,5,main]
   22/09/26 17:17:12 INFO DedicatedMessageLoop: Stack ...
   java.lang.Exception: For stack
           at org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:56)
           at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
           at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
           at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
           at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
           at java.base/java.lang.Thread.run(Thread.java:829)
   ```
   
   In `receiveLoopRunnable`'s `run`, when a `Throwable` is thrown:
   ```
   22/09/26 17:17:12 INFO DedicatedMessageLoop: Thread = Thread[dispatcher-Executor,5,main], stackTrace =
   22/09/26 17:17:12 INFO DedicatedMessageLoop:     java.base/java.lang.Thread.dumpThreads(Native Method)
   22/09/26 17:17:12 INFO DedicatedMessageLoop:     java.base/java.lang.Thread.getAllStackTraces(Thread.java:1653)
   22/09/26 17:17:12 INFO DedicatedMessageLoop:     app//org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$dumpAllStackTraces(MessageLoop.scala:70)
   22/09/26 17:17:12 INFO DedicatedMessageLoop:     app//org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:58)
   22/09/26 17:17:12 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
   22/09/26 17:17:12 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
   22/09/26 17:17:12 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   22/09/26 17:17:12 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   ```
   
   
   And finally, a few seconds after the Executor inbox failure, dumping all 
threads from a new thread shows the dispatcher thread still alive and parked 
in `LinkedBlockingQueue.take`:
   ```
   22/09/26 17:17:14 INFO DedicatedMessageLoop: Thread = Thread[dispatcher-Executor,5,main], stackTrace =
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     java.base/jdk.internal.misc.Unsafe.park(Native Method)
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:433)
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     app//org.apache.spark.rpc.netty.MessageLoop.org$apache$spark$rpc$netty$MessageLoop$$receiveLoop(MessageLoop.scala:102)
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     app//org.apache.spark.rpc.netty.MessageLoop$$anon$1.run(MessageLoop.scala:45)
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   22/09/26 17:17:14 INFO DedicatedMessageLoop:     java.base/java.lang.Thread.run(Thread.java:829)
   ```
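
For reference, a minimal sketch of how a dump like the ones above can be produced with `Thread.getAllStackTraces`, including the delayed dump from a separate thread. This is an assumption about the shape of the debug helper, not the actual `dumpAllStackTraces` code in `MessageLoop.scala`; the class name `ThreadDump`, the method `dumpLater`, and the thread name `thread-dumper` are hypothetical.

```java
import java.util.Map;

// Hypothetical debug helper, not Spark code.
class ThreadDump {
    // Formats the stack of every live thread, one frame per line.
    static String dumpAllStackTraces() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            sb.append("Thread = ").append(e.getKey()).append(", stackTrace =\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    // Prints the dump from a separate daemon thread after the given delay, to
    // see what the worker thread is doing some time after the failure.
    static Thread dumpLater(long delayMillis) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                return;
            }
            System.out.print(dumpAllStackTraces());
        }, "thread-dumper");
        t.setDaemon(true);
        t.start();
        return t;
    }
}
```

Calling `ThreadDump.dumpLater(2000)` right after the failure would print a snapshot roughly two seconds later.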


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.