clintropolis opened a new pull request #8085: fix forking task runner task 
shutdown to be more graceful
URL: https://github.com/apache/incubator-druid/pull/8085
 
 
   Fixes #7962.
   
   ### Description
   
   This PR modifies `ForkingTaskRunner.shutdown` to behave in the same manner 
as `ForkingTaskRunner.stop`, which closes the output stream to the forked task 
instead of calling destroy directly, which retains the log output on that 
stream allowing it to be fully written before terminating the task. I am unable 
to definitively determine if this means the task is not waiting for lifecycle 
stop to complete, or if it's strictly a logging issue, as I have collected 
conflicting observations.
   
   This PR _does_ seem to fix #7962 in my testing with a docker hadoop cluster, 
which lends credibility to lifecycle stop _not_ being called. However, parallel 
index tasks appear to kill all subtasks which _i think_ is done during 
lifecycle stop of the supervisor task, which would mean it's just a logging 
thing. Either way, this patch makes the task logs be complete, so we can rest 
easy _knowing_ that lifecycle stop is happening on the tasks.
   
   #### Background
   While looking into #8075 and #7962, noticed that the task logs stop abruptly 
when `ForkingTaskRunner.shutdown` is called, with an exception appearing in the 
middle-manager logs:
   
   ```
   2019-07-13T05:13:10,009 INFO [forking-task-runner-0] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Logging task 
index_wikiticker_2019-07-13T05:13:09.860Z output to: 
/var/folders/v0/zc_jt5n12d39kx7dn6xmg73w0000gn/T/persistent/task/index_wikiticker_2019-07-13T05:13:09.860Z/log
   2019-07-13T05:13:17,271 INFO [qtp345893819-70] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Shutdown 
[index_wikiticker_2019-07-13T05:13:09.860Z] because: [shut down request via 
HTTP endpoint]
   2019-07-13T05:13:17,271 INFO [qtp345893819-70] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Killing process for 
task: index_wikiticker_2019-07-13T05:13:09.860Z
   2019-07-13T05:13:17,283 INFO [forking-task-runner-0] 
org.apache.druid.indexing.common.tasklogs.FileTaskLogs - Wrote task log to: 
log/index_wikiticker_2019-07-13T05:13:09.860Z.log
   2019-07-13T05:13:17,284 INFO [forking-task-runner-0] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Exception caught during 
execution
   java.io.IOException: Stream closed
       at 
java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:170) 
~[?:1.8.0_192]
       at java.io.BufferedInputStream.read1(BufferedInputStream.java:291) 
~[?:1.8.0_192]
       at java.io.BufferedInputStream.read(BufferedInputStream.java:345) 
~[?:1.8.0_192]
       at java.io.FilterInputStream.read(FilterInputStream.java:107) 
~[?:1.8.0_192]
       at com.google.common.io.ByteStreams.copy(ByteStreams.java:175) 
~[guava-16.0.1.jar:?]
       at 
org.apache.druid.indexing.overlord.ForkingTaskRunner$1.call(ForkingTaskRunner.java:436)
 [classes/:?]
       at 
org.apache.druid.indexing.overlord.ForkingTaskRunner$1.call(ForkingTaskRunner.java:221)
 [classes/:?]
       at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_192]
       at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_192]
       at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_192]
       at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
   2019-07-13T05:13:17,289 INFO [forking-task-runner-0] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Removing task directory: 
/var/folders/v0/zc_jt5n12d39kx7dn6xmg73w0000gn/T/persistent/task/index_wikiticker_2019-07-13T05:13:09.860Z
   2019-07-13T05:13:17,339 INFO [WorkerTaskManager-NoticeHandler] 
org.apache.druid.indexing.worker.WorkerTaskManager - Job's finished. Completed 
[index_wikiticker_2019-07-13T05:13:09.860Z] with status [FAILED]
   ```
   
   and the task log output just ends, without the normal lifecycle stopping 
messaging:
   ```
   ...
   2019-07-13T05:13:17,257 INFO [task-runner-0-priority-0] 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - 
Hydrant[FireHydrant{, 
queryable=wikiticker_2015-09-12T11:00:00.000Z_2015-09-12T12:00:00.000Z_2019-07-13T05:13:09.862Z,
 count=0}] hasn't persisted yet, persisting. 
Segment[wikiticker_2015-09-12T11:00:00.000Z_2015-09-12T12:00:00.000Z_2019-07-13T05:13:09.862Z]
   2019-07-13T05:13:17,257 INFO [task-runner-0-priority-0] 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Submitting 
persist runnable for dataSource[wikiticker]
   2019-07-13T05:13:17,260 INFO [wikiticker-incremental-persist] 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - 
Segment[wikiticker_2015-09-12T08:00:00.000Z_2015-09-12T09:00:00.000Z_2019-07-13T05:13:09.862Z],
 persisting Hydrant[FireHydrant{, 
queryable=wikiticker_2015-09-12T08:00:00.000Z_2015-09-12T09:00:00.000Z_2019-07-13T05:13:09.862Z,
 count=0}]
   ```
   
   However, terminating the entire middle-manager does retain these logs for 
the tasks.
   
   middle-manager log during entire middle-manager shutdown:
   ```
   ...
   2019-07-13T05:15:32,931 INFO [forking-task-runner-1] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Logging task 
index_wikiticker_2019-07-13T05:15:32.875Z output to: 
/var/folders/v0/zc_jt5n12d39kx7dn6xmg73w0000gn/T/persistent/task/index_wikiticker_2019-07-13T05:15:32.875Z/log
   2019-07-13T05:15:40,412 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Lifecycle [module] 
running shutdown hook
   2019-07-13T05:15:40,414 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [ANNOUNCEMENTS]
   2019-07-13T05:15:40,415 INFO [Thread-49] 
org.apache.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannouncing 
[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/middlemanager', 
host='localhost', bindOnHost=false, port=-1, plaintextPort=8091, 
enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, 
nodeType='MIDDLE_MANAGER', 
services={workerNodeService=WorkerNodeService{ip='clintbook', capacity=7, 
version='0'}}}].
   2019-07-13T05:15:40,415 INFO [Thread-49] 
org.apache.druid.curator.announcement.Announcer - unannouncing 
[/druid/internal-discovery/MIDDLE_MANAGER/localhost:8091]
   2019-07-13T05:15:40,450 INFO [Thread-49] 
org.apache.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannounced 
[DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/middlemanager', 
host='localhost', bindOnHost=false, port=-1, plaintextPort=8091, 
enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, 
nodeType='MIDDLE_MANAGER', 
services={workerNodeService=WorkerNodeService{ip='clintbook', capacity=7, 
version='0'}}}].
   2019-07-13T05:15:40,450 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.curator.announcement.Announcer.stop()] on 
object[org.apache.druid.curator.announcement.Announcer@ebda593].
   2019-07-13T05:15:40,450 INFO [Thread-49] 
org.apache.druid.curator.announcement.Announcer - Stopping announcer
   2019-07-13T05:15:40,452 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [SERVER]
   2019-07-13T05:15:40,452 INFO [Thread-49] 
org.apache.druid.server.initialization.jetty.JettyServerModule - Stopping Jetty 
Server...
   2019-07-13T05:15:40,464 INFO [Thread-49] 
org.eclipse.jetty.server.AbstractConnector - Stopped 
ServerConnector@da28d03{HTTP/1.1,[http/1.1]}{0.0.0.0:8091}
   2019-07-13T05:15:40,465 INFO [Thread-49] org.eclipse.jetty.server.session - 
node0 Stopped scavenging
   2019-07-13T05:15:40,469 INFO [Thread-49] 
org.eclipse.jetty.server.handler.ContextHandler - Stopped 
o.e.j.s.ServletContextHandler@33eb6758{/,null,UNAVAILABLE}
   2019-07-13T05:15:40,477 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [NORMAL]
   2019-07-13T05:15:40,477 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.indexing.worker.WorkerTaskMonitor.stop() throws 
java.lang.Exception] on 
object[org.apache.druid.indexing.worker.WorkerTaskMonitor@12fcc71f].
   2019-07-13T05:15:40,477 INFO [Thread-49] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Unregistered listener 
[WorkerTaskManager]
   2019-07-13T05:15:40,477 INFO [Thread-49] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Closing output stream to 
task[index_wikiticker_2019-07-13T05:15:32.875Z].
   2019-07-13T05:15:40,478 INFO [Thread-49] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Waiting up to 300,000ms 
for shutdown.
   2019-07-13T05:15:40,697 INFO 
[forking-task-runner-1-[index_wikiticker_2019-07-13T05:15:32.875Z]] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Process exited with 
status[130] for task: index_wikiticker_2019-07-13T05:15:32.875Z
   2019-07-13T05:15:40,699 INFO [forking-task-runner-1] 
org.apache.druid.indexing.common.tasklogs.FileTaskLogs - Wrote task log to: 
log/index_wikiticker_2019-07-13T05:15:32.875Z.log
   2019-07-13T05:15:40,699 INFO [forking-task-runner-1] 
org.apache.druid.indexing.overlord.TaskRunnerUtils - Task 
[index_wikiticker_2019-07-13T05:15:32.875Z] status changed to [FAILED].
   2019-07-13T05:15:40,700 ERROR [forking-task-runner-1] 
com.google.common.util.concurrent.ExecutionList - RuntimeException while 
executing runnable com.google.common.util.concurrent.Futures$4@4b6e74ad with 
executor 
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService@115e899b
   java.util.concurrent.RejectedExecutionException: Task 
org.apache.druid.indexing.worker.WorkerTaskManager$$Lambda$48/474419945@7ade2f5a
 rejected from java.util.concurrent.ThreadPoolExecutor@69048cc1[Terminated, 
pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 5]
        at 
java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
 ~[?:1.8.0_192]
        at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) 
~[?:1.8.0_192]
        at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) 
~[?:1.8.0_192]
        at 
java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668)
 ~[?:1.8.0_192]
        at 
org.apache.druid.indexing.worker.WorkerTaskManager.submitNoticeToExec(WorkerTaskManager.java:180)
 ~[classes/:?]
        at 
org.apache.druid.indexing.worker.WorkerTaskManager.access$000(WorkerTaskManager.java:77)
 ~[classes/:?]
        at 
org.apache.druid.indexing.worker.WorkerTaskManager$2.onSuccess(WorkerTaskManager.java:244)
 ~[classes/:?]
        at 
org.apache.druid.indexing.worker.WorkerTaskManager$2.onSuccess(WorkerTaskManager.java:240)
 ~[classes/:?]
        at com.google.common.util.concurrent.Futures$4.run(Futures.java:1181) 
~[guava-16.0.1.jar:?]
        at 
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
 ~[guava-16.0.1.jar:?]
        at 
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
 [guava-16.0.1.jar:?]
        at 
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) 
[guava-16.0.1.jar:?]
        at 
com.google.common.util.concurrent.ListenableFutureTask.done(ListenableFutureTask.java:91)
 [guava-16.0.1.jar:?]
        at 
java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:384) 
[?:1.8.0_192]
        at java.util.concurrent.FutureTask.set(FutureTask.java:233) 
[?:1.8.0_192]
        at java.util.concurrent.FutureTask.run(FutureTask.java:274) 
[?:1.8.0_192]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_192]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_192]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]
   2019-07-13T05:15:40,701 INFO [Thread-49] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Finished stopping in 
224ms.
   2019-07-13T05:15:40,701 INFO [Thread-49] 
org.apache.druid.indexing.worker.WorkerTaskManager - Stopped WorkerTaskManager.
   2019-07-13T05:15:40,701 INFO [Thread-49] 
org.apache.druid.indexing.worker.WorkerTaskMonitor - Stopped WorkerTaskMonitor.
   2019-07-13T05:15:40,701 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.discovery.DruidLeaderClient.stop()] on 
object[org.apache.druid.discovery.DruidLeaderClient@1e7ab390].
   2019-07-13T05:15:40,701 INFO [Thread-49] 
org.apache.druid.discovery.DruidLeaderClient - Stopped.
   2019-07-13T05:15:40,701 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.curator.discovery.ServerDiscoverySelector.stop() throws 
java.io.IOException] on 
object[org.apache.druid.curator.discovery.ServerDiscoverySelector@22a6e998].
   2019-07-13T05:15:40,704 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider.stop()] on 
object[org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider@7e74a380].
   2019-07-13T05:15:40,704 INFO [Thread-49] 
org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider - stopping
   2019-07-13T05:15:40,704 INFO [Thread-49] 
org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider - stopped
   2019-07-13T05:15:40,704 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.java.util.http.client.NettyHttpClient.stop()] on 
object[org.apache.druid.java.util.http.client.NettyHttpClient@5bc28f40].
   2019-07-13T05:15:40,730 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.indexing.worker.WorkerCuratorCoordinator.stop()] on 
object[org.apache.druid.indexing.worker.WorkerCuratorCoordinator@290d10ef].
   2019-07-13T05:15:40,730 INFO [Thread-49] 
org.apache.druid.indexing.worker.WorkerCuratorCoordinator - Stopping 
WorkerCuratorCoordinator for worker[localhost:8091]
   2019-07-13T05:15:40,730 INFO [Thread-49] 
org.apache.druid.curator.announcement.Announcer - Stopping announcer
   2019-07-13T05:15:40,731 INFO [Thread-49] 
org.apache.druid.curator.announcement.Announcer - unannouncing 
[/druid/indexer/announcements/localhost:8091]
   2019-07-13T05:15:40,737 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.java.util.metrics.MonitorScheduler.stop()] on 
object[org.apache.druid.java.util.metrics.MonitorScheduler@33899f7a].
   2019-07-13T05:15:40,737 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.java.util.emitter.service.ServiceEmitter.close() throws 
java.io.IOException] on 
object[ServiceEmitter{serviceDimensions={service=druid/middlemanager, 
host=localhost:8091, version=}, 
emitter=LoggingEmitter{log=Logger{name=[org.apache.druid.java.util.emitter.core.LoggingEmitter],
 class[class org.apache.logging.slf4j.Log4jLogger]}, level=INFO}}].
   2019-07-13T05:15:40,737 INFO [Thread-49] 
org.apache.druid.java.util.emitter.core.LoggingEmitter - Close: started [false]
   2019-07-13T05:15:40,737 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.java.util.emitter.core.LoggingEmitter.close()] on 
object[LoggingEmitter{log=Logger{name=[org.apache.druid.java.util.emitter.core.LoggingEmitter],
 class[class org.apache.logging.slf4j.Log4jLogger]}, level=INFO}].
   2019-07-13T05:15:40,738 INFO [Thread-49] 
org.apache.druid.curator.CuratorModule - Stopping Curator
   2019-07-13T05:15:40,739 INFO [Curator-Framework-0] 
org.apache.curator.framework.imps.CuratorFrameworkImpl - 
backgroundOperationsLoop exiting
   2019-07-13T05:15:40,746 INFO [Thread-49] org.apache.zookeeper.ZooKeeper - 
Session: 0x100126ca33300c6 closed
   2019-07-13T05:15:40,747 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [INIT]
   2019-07-13T05:15:40,746 INFO [main-EventThread] 
org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 
0x100126ca33300c6
   2019-07-13T05:15:40,747 INFO [Thread-49] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner.stop()]
 on 
object[org.apache.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner@360bc645].
   ```
   
   and the task logs have their own lifecycle stop messaging:
   ```
   2019-07-13T05:15:40,422 INFO [wikiticker-incremental-persist] 
org.apache.druid.segment.IndexMergerV9 - Completed metric column[count] in 1 
millis.
   2019-07-13T05:15:40,423 INFO [wikiticker-incremental-persist] 
org.apache.druid.segment.IndexMergerV9 - Completed metric column[deleted] in 1 
millis.
   2019-07-13T05:15:40,425 INFO [wikiticker-incremental-persist] 
org.apache.druid.segment.IndexMergerV9 - Completed metric column[delta] in 1 
millis.
   2019-07-13T05:15:40,429 INFO [wikiticker-incremental-persist] 
org.apache.druid.segment.IndexMergerV9 - Completed metric column[user_unique] 
in 4 millis.
   2019-07-13T05:15:40,429 INFO [wikiticker-incremental-persist] 
org.apache.druid.segment.IndexMergerV9 - Completed metric columns in 11 millis.
   2019-07-13T05:15:40,443 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [SERVER]
   2019-07-13T05:15:40,443 INFO [Thread-110] 
org.apache.druid.server.initialization.jetty.JettyServerModule - Stopping Jetty 
Server...
   2019-07-13T05:15:40,455 INFO [Thread-110] 
org.eclipse.jetty.server.AbstractConnector - Stopped 
ServerConnector@12192604{HTTP/1.1,[http/1.1]}{0.0.0.0:8100}
   2019-07-13T05:15:40,456 INFO [Thread-110] org.eclipse.jetty.server.session - 
node0 Stopped scavenging
   2019-07-13T05:15:40,458 INFO [wikiticker-incremental-persist] 
org.apache.druid.segment.StringDimensionMergerV9 - Completed dim[channel] 
inverted with cardinality[37] in 28 millis.
   2019-07-13T05:15:40,459 INFO [Thread-110] 
org.eclipse.jetty.server.handler.ContextHandler - Stopped 
o.e.j.s.ServletContextHandler@703fa45{/,null,UNAVAILABLE}
   2019-07-13T05:15:40,468 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [NORMAL]
   2019-07-13T05:15:40,468 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.server.listener.announcer.ListenerResourceAnnouncer.stop()] on 
object[org.apache.druid.query.lookup.LookupResourceListenerAnnouncer@3fbe503c].
   2019-07-13T05:15:40,469 INFO [Thread-110] 
org.apache.druid.server.listener.announcer.ListenerResourceAnnouncer - 
Unannouncing start time on 
[/druid/listeners/lookups/__default/http:localhost:8100]
   2019-07-13T05:15:40,469 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.indexing.worker.executor.ExecutorLifecycle.stop() throws 
java.lang.Exception] on 
object[org.apache.druid.indexing.worker.executor.ExecutorLifecycle@62b57479].
   2019-07-13T05:15:40,470 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner.stop()] on 
object[org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner@627ff1b8].
   2019-07-13T05:15:40,471 INFO [Thread-110] 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Starting 
graceful shutdown of task[index_wikiticker_2019-07-13T05:15:32.875Z].
   2019-07-13T05:15:40,471 INFO [Thread-110] 
org.apache.druid.indexing.overlord.TaskRunnerUtils - Task 
[index_wikiticker_2019-07-13T05:15:32.875Z] status changed to [FAILED].
   2019-07-13T05:15:40,473 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.client.cache.CaffeineCache.close()] on 
object[org.apache.druid.client.cache.CaffeineCache@1877f3de].
   2019-07-13T05:15:40,473 INFO [task-runner-0-priority-0] 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting 
down...
   2019-07-13T05:15:40,475 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.discovery.DruidLeaderClient.stop()] on 
object[org.apache.druid.discovery.DruidLeaderClient@526e8108].
   2019-07-13T05:15:40,476 INFO [Thread-110] 
org.apache.druid.discovery.DruidLeaderClient - Stopped.
   2019-07-13T05:15:40,476 INFO [wikiticker-incremental-persist] 
org.apache.druid.segment.StringDimensionMergerV9 - Completed dim[cityName] 
inverted with cardinality[54] in 7 millis.
   2019-07-13T05:15:40,476 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.curator.discovery.ServerDiscoverySelector.stop() throws 
java.io.IOException] on 
object[org.apache.druid.curator.discovery.ServerDiscoverySelector@2ec99035].
   2019-07-13T05:15:40,478 INFO [parent-monitor-0] 
org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Triggering JVM 
shutdown.
   2019-07-13T05:15:40,479 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.java.util.metrics.MonitorScheduler.stop()] on 
object[org.apache.druid.java.util.metrics.MonitorScheduler@5f366587].
   2019-07-13T05:15:40,479 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.query.lookup.LookupReferencesManager.stop()] on 
object[org.apache.druid.query.lookup.LookupReferencesManager@32120956].
   2019-07-13T05:15:40,479 INFO [Thread-110] 
org.apache.druid.query.lookup.LookupReferencesManager - 
LookupExtractorFactoryContainerProvider is stopping.
   2019-07-13T05:15:40,479 INFO 
[LookupExtractorFactoryContainerProvider-MainThread] 
org.apache.druid.query.lookup.LookupReferencesManager - Lookup Management loop 
exited, Lookup notices are not handled anymore.
   2019-07-13T05:15:40,479 INFO [Thread-110] 
org.apache.druid.query.lookup.LookupReferencesManager - 
LookupExtractorFactoryContainerProvider is stopped.
   2019-07-13T05:15:40,479 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.discovery.DruidLeaderClient.stop()] on 
object[org.apache.druid.discovery.DruidLeaderClient@546394ed].
   2019-07-13T05:15:40,479 INFO [Thread-110] 
org.apache.druid.discovery.DruidLeaderClient - Stopped.
   2019-07-13T05:15:40,479 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.curator.discovery.ServerDiscoverySelector.stop() throws 
java.io.IOException] on 
object[org.apache.druid.curator.discovery.ServerDiscoverySelector@438c9aa7].
   2019-07-13T05:15:40,480 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider.stop()] on 
object[org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider@33188612].
   2019-07-13T05:15:40,480 INFO [Thread-110] 
org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider - stopping
   2019-07-13T05:15:40,480 INFO [Thread-110] 
org.apache.druid.curator.discovery.CuratorDruidNodeDiscoveryProvider - stopped
   2019-07-13T05:15:40,480 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.java.util.http.client.NettyHttpClient.stop()] on 
object[org.apache.druid.java.util.http.client.NettyHttpClient@3b42121d].
   2019-07-13T05:15:40,509 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.java.util.emitter.service.ServiceEmitter.close() throws 
java.io.IOException] on 
object[ServiceEmitter{serviceDimensions={service=druid/peon, 
host=localhost:8100, version=}, 
emitter=LoggingEmitter{log=Logger{name=[org.apache.druid.java.util.emitter.core.LoggingEmitter],
 class[class org.apache.logging.slf4j.Log4jLogger]}, level=DEBUG}}].
   2019-07-13T05:15:40,510 INFO [Thread-110] 
org.apache.druid.curator.CuratorModule - Stopping Curator
   2019-07-13T05:15:40,511 INFO [Curator-Framework-0] 
org.apache.curator.framework.imps.CuratorFrameworkImpl - 
backgroundOperationsLoop exiting
   2019-07-13T05:15:40,531 INFO [Thread-110] org.apache.zookeeper.ZooKeeper - 
Session: 0x100126ca33300c8 closed
   2019-07-13T05:15:40,531 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.java.util.emitter.core.LoggingEmitter.close()] on 
object[LoggingEmitter{log=Logger{name=[org.apache.druid.java.util.emitter.core.LoggingEmitter],
 class[class org.apache.logging.slf4j.Log4jLogger]}, level=DEBUG}].
   2019-07-13T05:15:40,531 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [INIT]
   2019-07-13T05:15:40,531 INFO [Thread-110] 
org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - 
Invoking stop method[public void 
org.apache.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner.stop()]
 on 
object[org.apache.druid.initialization.Log4jShutterDownerModule$Log4jShutterDowner@4116f66a].
   2019-07-13T05:15:40,531 INFO [main-EventThread] 
org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 
0x100126ca33300c8
   ```
   
   After modifying `ForkingTaskRunner.shutdown` to use the same mechanism to 
stop the tasks as `ForkingTaskRunner.stop`, the IOException is no longer there, 
and the task logs retain their shutdown messaging.
   
   middle-manager log:
   ```
   2019-07-13T05:22:41,865 INFO [forking-task-runner-0] 
org.apache.druid.indexing.overlord.TaskRunnerUtils - Task 
[index_wikiticker_2019-07-13T05:22:41.728Z] location changed to 
[TaskLocation{host='localhost', port=8100, tlsPort=-1}].
   2019-07-13T05:22:41,867 INFO [forking-task-runner-0] 
org.apache.druid.indexing.overlord.TaskRunnerUtils - Task 
[index_wikiticker_2019-07-13T05:22:41.728Z] status changed to [RUNNING].
   2019-07-13T05:22:41,868 INFO [forking-task-runner-0] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Logging task 
index_wikiticker_2019-07-13T05:22:41.728Z output to: 
/var/folders/v0/zc_jt5n12d39kx7dn6xmg73w0000gn/T/persistent/task/index_wikiticker_2019-07-13T05:22:41.728Z/log
   2019-07-13T05:22:49,258 INFO [qtp345893819-70] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Shutdown 
[index_wikiticker_2019-07-13T05:22:41.728Z] because: [shut down request via 
HTTP endpoint]
   2019-07-13T05:22:49,258 INFO [qtp345893819-70] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Closing output stream to 
task[index_wikiticker_2019-07-13T05:22:41.728Z].
   2019-07-13T05:22:49,347 INFO [qtp345893819-66] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Shutdown 
[index_wikiticker_2019-07-13T05:22:41.728Z] because: [shut down request via 
HTTP endpoint]
   2019-07-13T05:22:49,347 INFO [qtp345893819-66] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Closing output stream to 
task[index_wikiticker_2019-07-13T05:22:41.728Z].
   2019-07-13T05:22:49,547 INFO 
[forking-task-runner-0-[index_wikiticker_2019-07-13T05:22:41.728Z]] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Process exited with 
status[2] for task: index_wikiticker_2019-07-13T05:22:41.728Z
   2019-07-13T05:22:49,558 INFO [forking-task-runner-0] 
org.apache.druid.indexing.common.tasklogs.FileTaskLogs - Wrote task log to: 
log/index_wikiticker_2019-07-13T05:22:41.728Z.log
   2019-07-13T05:22:49,558 INFO [forking-task-runner-0] 
org.apache.druid.indexing.overlord.TaskRunnerUtils - Task 
[index_wikiticker_2019-07-13T05:22:41.728Z] status changed to [FAILED].
   2019-07-13T05:22:49,559 INFO [forking-task-runner-0] 
org.apache.druid.indexing.overlord.ForkingTaskRunner - Removing task directory: 
/var/folders/v0/zc_jt5n12d39kx7dn6xmg73w0000gn/T/persistent/task/index_wikiticker_2019-07-13T05:22:41.728Z
   2019-07-13T05:22:49,591 INFO [WorkerTaskManager-NoticeHandler] 
org.apache.druid.indexing.worker.WorkerTaskManager - Job's finished. Completed 
[index_wikiticker_2019-07-13T05:22:41.728Z] with status [FAILED]
   2019-07-13T05:23:20,800 INFO [MonitorScheduler-0] 
org.apache.druid.java.util.emitter.core.LoggingEmitter - Event 
[{"feed":"metrics","timestamp":"2019-07-13T05:23:20.792Z","service":"druid/middlemanager","host":"localhost:8091","version":"","metric":"jetty/numOpenConnections","value":2}]
   2019-07-13T05:23:21,252 ERROR [WorkerTaskManager-CompletedTasksCleaner] 
org.apache.druid.curator.discovery.ServerDiscoverySelector - No server instance 
found for [druid/overlord]
   2019-07-13T05:23:21,382 INFO [WorkerTaskManager-CompletedTasksCleaner] 
org.apache.druid.indexing.worker.WorkerTaskManager - Deleting completed 
task[index_wikiticker_2019-07-13T05:22:41.728Z] information, overlord task 
status[FAILED]
   ```
   
   and the task log has the same lifecycle stop messaging that appears when the 
entire middle-manager is terminated, and the tasks still retain the 'failed' 
status.
   
   <hr>
   
   This PR has:
   - [x] been self-reviewed.
   - [x] been tested in a test Druid cluster.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to