tanisdlj opened a new issue #10867:
URL: https://github.com/apache/druid/issues/10867


   ### Affected Version
   
   0.20.0
   
   ### Description
   
   We've seen tasks failing randomly. Didn't managed to get a pattern or cause, 
so I'm thinking might be some sort of bug.
   Cluster size: 27 Dataservers (Middlemanager + Historical), 2 Brokers, 2 
Routers, 2 Overlords, 2 Coordinator, 3 Zookeeper (in dedicated hosts).
   Might or might not be related with the other issue I created #10866 
   
   Failure log:
   ```
   2021-02-08T18:32:17,487 INFO [task-runner-0-priority-0] 
org.apache.druid.curator.announcement.Announcer - Unannouncing 
[/druid/segments/druid-dataserver-8.c.maximal-record-121815.internal:8103/druid-dataserver-8.c.maximal-record-121815.internal:8103_indexer-executor__default_tier_2021-02-08T14:26:18.432Z_7327df2e309f4740acae7d41c5a365a50]
   2021-02-08T18:32:17,600 WARN 
[[index_kafka_ssp-events-hourly_8634a3382846e76_bnnomidg]-appenderator-merge] 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Failed to 
push merged index for 
segment[ssp-events-hourly_2021-02-08T17:00:00.000Z_2021-02-08T18:00:00.000Z_2021-02-08T17:24:56.150Z_143].
   java.nio.channels.ClosedByInterruptException: null
        at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
 ~[?:1.8.0_275]
        at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:216) 
~[?:1.8.0_275]
        at org.apache.druid.io.Channels.writeFully(Channels.java:32) 
~[druid-core-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.writeout.FileWriteOutBytes.flush(FileWriteOutBytes.java:62)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.writeout.FileWriteOutBytes.flushIfNeeded(FileWriteOutBytes.java:54)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.writeout.FileWriteOutBytes.write(FileWriteOutBytes.java:86)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.data.GenericIndexedWriter$1.writeTo(GenericIndexedWriter.java:123)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.data.GenericIndexedWriter$1.writeTo(GenericIndexedWriter.java:96)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.data.GenericIndexedWriter.write(GenericIndexedWriter.java:237)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.data.BlockLayoutColumnarDoublesSerializer.add(BlockLayoutColumnarDoublesSerializer.java:95)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.DoubleColumnSerializer.serialize(DoubleColumnSerializer.java:83)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.IndexMergerV9.mergeIndexesAndWriteColumns(IndexMergerV9.java:573)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:267) 
~[druid-processing-0.20.0.jar:0.20.0]
        at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:999) 
~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:917)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:895)
 ~[druid-processing-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:769)
 ~[druid-server-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$1(AppenderatorImpl.java:660)
 ~[druid-server-0.20.0.jar:0.20.0]
        at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) 
[guava-16.0.1.jar:?]
        at 
com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861)
 [guava-16.0.1.jar:?]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_275]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_275]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
   2021-02-08T18:32:17,743 ERROR 
[[index_kafka_ssp-events-hourly_8634a3382846e76_bnnomidg]-appenderator-persist] 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Incremental 
persist failed: 
{class=org.apache.druid.segment.realtime.appenderator.AppenderatorImpl, 
segment=ssp-events-hourly_2021-02-08T17:00:00.000Z_2021-02-08T18:00:00.000Z_2021-02-08T17:24:56.150Z_194,
 dataSource=ssp-events-hourly, count=0}
   2021-02-08T18:32:17,745 ERROR 
[[index_kafka_ssp-events-hourly_8634a3382846e76_bnnomidg]-appenderator-persist] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - 
Persist failed, dying
   2021-02-08T18:32:17,760 INFO [task-runner-0-priority-0] 
org.apache.druid.curator.announcement.Announcer - Unannouncing 
[/druid/internal-discovery/PEON/druid-dataserver-8.c.maximal-record-121815.internal:8103]
   2021-02-08T18:32:17,772 INFO [task-runner-0-priority-0] 
org.apache.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannounced self 
[{"druidNode":{"service":"druid/middleManager","host":"druid-dataserver-8.c.maximal-record-121815.internal","bindOnHost":false,"plaintextPort":8103,"port":-1,"tlsPort":-1,"enablePlaintextPort":true,"enableTlsPort":false},"nodeType":"peon","services":{"dataNodeService":{"type":"dataNodeService","tier":"_default_tier","maxSize":0,"type":"indexer-executor","priority":0},"lookupNodeService":{"type":"lookupNodeService","lookupTier":"__default"}}}].
   2021-02-08T18:32:17,772 INFO [task-runner-0-priority-0] 
org.apache.druid.curator.announcement.Announcer - Unannouncing 
[/druid/announcements/druid-dataserver-8.c.maximal-record-121815.internal:8103]
   2021-02-08T18:32:17,782 ERROR [task-runner-0-priority-0] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - 
Encountered exception while running task.
   java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: 
Got Interrupted while adding to the Queue
        at 
org.apache.druid.segment.realtime.appenderator.StreamAppenderatorDriver.persist(StreamAppenderatorDriver.java:239)
 ~[druid-server-0.20.0.jar:0.20.0]
        at 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.runInternal(SeekableStreamIndexTaskRunner.java:756)
 ~[druid-indexing-service-0.20.0.jar:0.20.0]
        at 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.run(SeekableStreamIndexTaskRunner.java:267)
 [druid-indexing-service-0.20.0.jar:0.20.0]
        at 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTask.run(SeekableStreamIndexTask.java:145)
 [druid-indexing-service-0.20.0.jar:0.20.0]
        at 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:451)
 [druid-indexing-service-0.20.0.jar:0.20.0]
        at 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:423)
 [druid-indexing-service-0.20.0.jar:0.20.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_275]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_275]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_275]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
   Caused by: java.util.concurrent.RejectedExecutionException: Got Interrupted 
while adding to the Queue
        at 
org.apache.druid.java.util.common.concurrent.Execs$1.rejectedExecution(Execs.java:152)
 ~[druid-core-0.20.0.jar:0.20.0]
        at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) 
~[?:1.8.0_275]
        at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) 
~[?:1.8.0_275]
        at 
com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:484)
 ~[guava-16.0.1.jar:?]
        at 
java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
 ~[?:1.8.0_275]
        at 
com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:58)
 ~[guava-16.0.1.jar:?]
        at 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:539)
 ~[druid-server-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.realtime.appenderator.StreamAppenderatorDriver.persist(StreamAppenderatorDriver.java:231)
 ~[druid-server-0.20.0.jar:0.20.0]
        ... 9 more
   Caused by: java.lang.InterruptedException
        at java.util.concurrent.SynchronousQueue.put(SynchronousQueue.java:879) 
~[?:1.8.0_275]
        at 
org.apache.druid.java.util.common.concurrent.Execs$1.rejectedExecution(Execs.java:149)
 ~[druid-core-0.20.0.jar:0.20.0]
        at 
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) 
~[?:1.8.0_275]
        at 
java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1379) 
~[?:1.8.0_275]
        at 
com.google.common.util.concurrent.MoreExecutors$ListeningDecorator.execute(MoreExecutors.java:484)
 ~[guava-16.0.1.jar:?]
        at 
java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:134)
 ~[?:1.8.0_275]
        at 
com.google.common.util.concurrent.AbstractListeningExecutorService.submit(AbstractListeningExecutorService.java:58)
 ~[guava-16.0.1.jar:?]
        at 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:539)
 ~[druid-server-0.20.0.jar:0.20.0]
        at 
org.apache.druid.segment.realtime.appenderator.StreamAppenderatorDriver.persist(StreamAppenderatorDriver.java:231)
 ~[druid-server-0.20.0.jar:0.20.0]
        ... 9 more
   2021-02-08T18:32:17,801 INFO [parent-monitor-0] 
org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Triggering JVM 
shutdown.
   2021-02-08T18:32:17,839 INFO [Thread-60] org.apache.druid.cli.CliPeon - 
Running shutdown hook
   2021-02-08T18:32:17,840 INFO [Thread-60] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [ANNOUNCEMENTS]
   2021-02-08T18:32:17,841 INFO [Thread-60] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle 
[module] stage [SERVER]
   2021-02-08T18:32:17,854 INFO [task-runner-0-priority-0] 
org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed 
with status: {
     "id" : "index_kafka_ssp-events-hourly_8634a3382846e76_bnnomidg",
     "status" : "FAILED",
     "duration" : 14763329,
     "errorMsg" : "java.lang.RuntimeException: 
java.util.concurrent.RejectedExecutionException: Got Interrupted while a...",
     "location" : {
       "host" : null,
       "port" : -1,
       "tlsPort" : -1
     }
   }
   2021-02-08T18:32:17,863 INFO [main] org.apache.druid.cli.CliPeon - Thread 
[Thread[Thread-60,5,main]] is non daemon.
   2021-02-08T18:32:17,863 INFO [main] 
org.apache.druid.java.util.common.lifecycle.Lifecycle - Lifecycle [module] 
already stopped and stop was called. Silently skipping
   Cannot remove shutdown hook, already shutting down!
   Finished peon task
   2021-02-08T18:32:18,709 WARN [MonitorScheduler-0] 
org.apache.druid.indexing.common.stats.TaskRealtimeMetricsMonitor - [712] 
events thrown away. Possible causes: null events, events filtered out by 
transformSpec, or events outside earlyMessageRejectionPeriod / 
lateMessageRejectionPeriod.
   2021-02-08T18:32:47,938 INFO [Thread-60] 
org.eclipse.jetty.server.AbstractConnector - Stopped 
ServerConnector@1e5b33e5{HTTP/1.1, (http/1.1)}{0.0.0.0:8103}
   2021-02-08T18:32:47,939 INFO [Thread-60] org.eclipse.jetty.server.session - 
node0 Stopped scavenging
   2021-02-08T18:32:47,952 INFO [Thread-60] 
org.eclipse.jetty.server.handler.ContextHandler - Stopped 
o.e.j.s.ServletContextHandler@79e2606f{/,null,UNAVAILABLE}
   ```


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to