clintropolis opened a new pull request #8140: fix issue with CuratorLoadQueuePeon shutting down executors it does not own
URL: https://github.com/apache/incubator-druid/pull/8140
 
 
   Fixes #8137.
   
   ### Description
   
   #7088 implemented parallel loading for `CuratorLoadQueuePeon`, but the stop method of _any_ peon incorrectly shuts down the peon executor and callback executor, which are shared by _all_ peons. This means that the coordinator will operate correctly until a server disappears for any reason, at which point that server's peon is stopped, the shared executors are terminated, and work submitted by the remaining peons fails with an exception of the form:
   
   ```
   2019-07-23T19:17:27,993 ERROR [Coordinator-Exec--0] org.apache.druid.server.coordinator.DruidCoordinator - Caught exception, ignoring so that schedule keeps going.: {class=org.apache.druid.server.coordinator.DruidCoordinator, exceptionType=class java.util.concurrent.RejectedExecutionException, exceptionMessage=Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7d63afe1 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@576bd596[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 72]}
   java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7d63afe1 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@576bd596[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 72]
        at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063) ~[?:1.8.0_192]
        at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) ~[?:1.8.0_192]
        at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326) ~[?:1.8.0_192]
        at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533) ~[?:1.8.0_192]
        at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632) ~[?:1.8.0_192]
        at org.apache.druid.server.coordinator.CuratorLoadQueuePeon.dropSegment(CuratorLoadQueuePeon.java:194) ~[classes/:?]
        at org.apache.druid.server.coordinator.helper.DruidCoordinatorCleanupUnneeded.run(DruidCoordinatorCleanupUnneeded.java:62) ~[classes/:?]
        at org.apache.druid.server.coordinator.DruidCoordinator$CoordinatorRunnable.run(DruidCoordinator.java:667) [classes/:?]
        at org.apache.druid.server.coordinator.DruidCoordinator$2.call(DruidCoordinator.java:559) [classes/:?]
        at org.apache.druid.server.coordinator.DruidCoordinator$2.call(DruidCoordinator.java:552) [classes/:?]
        at org.apache.druid.java.util.common.concurrent.ScheduledExecutors$2.run(ScheduledExecutors.java:92) [classes/:?]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_192]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_192]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_192]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_192]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_192]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
   2019-07-23T19:17:27,994 INFO [Coordinator-Exec--0] org.apache.druid.java.util.emitter.core.LoggingEmitter - Event [{"feed":"alerts","timestamp":"2019-07-23T19:17:27.994Z","service":"druid/coordinator","host":"localhost:8081","version":"","severity":"component-failure","description":"Caught exception, ignoring so that schedule keeps going.","data":{"class":"org.apache.druid.server.coordinator.DruidCoordinator","exceptionType":"java.util.concurrent.RejectedExecutionException","exceptionMessage":"Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7d63afe1 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@576bd596[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 72]","exceptionStackTrace":"java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@7d63afe1 rejected from java.util.concurrent.ScheduledThreadPoolExecutor@576bd596[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 72]\n\tat java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)\n\tat java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632)\n\tat org.apache.druid.server.coordinator.CuratorLoadQueuePeon.dropSegment(CuratorLoadQueuePeon.java:194)\n\tat org.apache.druid.server.coordinator.helper.DruidCoordinatorCleanupUnneeded.run(DruidCoordinatorCleanupUnneeded.java:62)\n\tat org.apache.druid.server.coordinator.DruidCoordinator$CoordinatorRunnable.run(DruidCoordinator.java:667)\n\tat org.apache.druid.server.coordinator.DruidCoordinator$2.call(DruidCoordinator.java:559)\n\tat org.apache.druid.server.coordinator.DruidCoordinator$2.call(DruidCoordinator.java:552)\n\tat org.apache.druid.java.util.common.concurrent.ScheduledExecutors$2.run(ScheduledExecutors.java:92)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\n\tat java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\n"}}]
   ```
   
   as described in #8137.  
   
   This PR fixes the issue by no longer shutting down the shared executors in the peon's stop method, since the peon does not own them.
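
For readers unfamiliar with the failure mode, here is a minimal, self-contained sketch of the ownership problem. It is not the actual Druid code: `SharedExecutorSketch`, `Peon`, the `shutdownOnStop` flag, and `survivorStillWorks` are hypothetical names invented for illustration. It assumes only standard `java.util.concurrent` behavior, namely that submitting to an executor after `shutdown()` throws `RejectedExecutionException` under the default rejection policy.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;

public class SharedExecutorSketch {

    /** Hypothetical stand-in for a load queue peon; not Druid's real class. */
    public static class Peon {
        // Injected by the caller and shared with other peons, so not owned here.
        private final ScheduledExecutorService processingExecutor;
        // Illustration-only switch: true mimics the bug, false mimics the fix.
        private final boolean shutdownOnStop;

        public Peon(ScheduledExecutorService processingExecutor, boolean shutdownOnStop) {
            this.processingExecutor = processingExecutor;
            this.shutdownOnStop = shutdownOnStop;
        }

        public void dropSegment(Runnable work) {
            // Throws RejectedExecutionException if the executor was shut down.
            processingExecutor.submit(work);
        }

        public void stop() {
            if (shutdownOnStop) {
                // Buggy variant: terminates an executor other peons still use.
                processingExecutor.shutdown();
            }
            // Fixed variant: leave the shared executor alone; whoever created
            // it is responsible for shutting it down.
        }
    }

    /** Returns true if a second peon can still submit work after the first stops. */
    public static boolean survivorStillWorks(boolean shutdownOnStop) {
        ScheduledExecutorService shared = Executors.newScheduledThreadPool(1);
        try {
            Peon disappearing = new Peon(shared, shutdownOnStop);
            Peon survivor = new Peon(shared, shutdownOnStop);
            disappearing.stop(); // e.g. its server went away
            try {
                survivor.dropSegment(() -> { });
                return true;
            } catch (RejectedExecutionException e) {
                return false;
            }
        } finally {
            shared.shutdownNow(); // the creator of the executor cleans it up
        }
    }

    public static void main(String[] args) {
        System.out.println("buggy stop: " + survivorStillWorks(true));  // buggy stop: false
        System.out.println("fixed stop: " + survivorStillWorks(false)); // fixed stop: true
    }
}
```

The design point is that whichever component creates an executor should be the one to shut it down; an object that merely receives one as a dependency should treat it as borrowed.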
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [x] added unit tests or modified existing tests to cover new code paths.
   - [x] been tested in a laptop test Druid cluster.
   
