saifat29 edited a comment on issue #9842:
URL: https://github.com/apache/druid/issues/9842#issuecomment-678620228
I too faced the same issue on Druid v0.17.0 -
Data is ingested into Druid through Kafka realtime ingestion with rollups
enabled.
Suddenly the task failed and segments got lost.
From the below logs it is evident that there was a null exception in the
group by query.
Attaching the surrounding relevant logs-
```
2020-08-21T03:14:31,451 INFO
[[index_kafka_aggregations_95b07d728be26d3_iacokiha]-publish]
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Saved
sequence metadata to disk: [SequenceMetadata{sequenceId=2,
sequenceName='index_kafka_aggregations_95b07d728be26d3_2', assignments=[],
startOffsets={0=2073149116, 1=2073148653}, exclusiveStartPartitions=[],
endOffsets={0=2077740467, 1=2077737653}, sentinel=false, checkpointed=true}]
2020-08-21T03:14:31,451 INFO
[[index_kafka_aggregations_95b07d728be26d3_iacokiha]-publish]
org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier
- Adding SegmentHandoffCallback for dataSource[aggregations]
Segment[SegmentDescriptor{interval=2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z,
version='2020-08-21T02:00:10.526Z', partitionNumber=1}]
2020-08-21T03:15:00,801 INFO [coordinator_handoff_scheduled_0]
org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier
- Still waiting for Handoff for Segments :
[[SegmentDescriptor{interval=2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z,
version='2020-08-21T02:00:10.526Z', partitionNumber=1}]]
2020-08-21T03:16:00,784 INFO [coordinator_handoff_scheduled_0]
org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier
- Segment Handoff complete for dataSource[aggregations]
Segment[SegmentDescriptor{interval=2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z,
version='2020-08-21T02:00:10.526Z', partitionNumber=1}]
2020-08-21T03:16:00,787 INFO
[[index_kafka_aggregations_95b07d728be26d3_iacokiha]-appenderator-persist]
org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing
segment[aggregations_2020-08-21T02:00:00.000Z_2020-08-21T03:00:00.000Z_2020-08-21T02:00:10.526Z_1]
at
path[/druid/segments/middlemanager:8102/middlemanager:8102_indexer-executor__default_tier_2020-08-21T01:46:10.992Z_72a1a95bc07543efaa1fe266685d72e40]
2020-08-21T03:16:00,832 ERROR [processing-0]
org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2 -
Exception with one of the sequences!
java.lang.NullPointerException: null
at
org.apache.druid.segment.realtime.FireHydrant.getIncrementedSegment(FireHydrant.java:78)
~[druid-server-0.17.0.jar:0.17.0]
at
org.apache.druid.segment.realtime.FireHydrant.getAndIncrementSegment(FireHydrant.java:138)
~[druid-server-0.17.0.jar:0.17.0]
at
org.apache.druid.segment.realtime.appenderator.SinkQuerySegmentWalker.lambda$null$0(SinkQuerySegmentWalker.java:212)
~[druid-server-0.17.0.jar:0.17.0]
at com.google.common.collect.Iterators$8.transform(Iterators.java:794)
~[guava-16.0.1.jar:?]
at
com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
~[guava-16.0.1.jar:?]
at
org.apache.druid.query.SinkQueryRunners$1.next(SinkQueryRunners.java:56)
~[druid-processing-0.17.0.jar:0.17.0]
at
org.apache.druid.query.SinkQueryRunners$1.next(SinkQueryRunners.java:46)
~[druid-processing-0.17.0.jar:0.17.0]
at
com.google.common.collect.Iterators$7.computeNext(Iterators.java:646)
~[guava-16.0.1.jar:?]
at
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
~[guava-16.0.1.jar:?]
at
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
~[guava-16.0.1.jar:?]
at
com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:43)
~[guava-16.0.1.jar:?]
at com.google.common.collect.Iterators.addAll(Iterators.java:356)
~[guava-16.0.1.jar:?]
at com.google.common.collect.Lists.newArrayList(Lists.java:147)
~[guava-16.0.1.jar:?]
at com.google.common.collect.Lists.newArrayList(Lists.java:129)
~[guava-16.0.1.jar:?]
at
org.apache.druid.query.ChainedExecutionQueryRunner$1.make(ChainedExecutionQueryRunner.java:104)
~[druid-processing-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.BaseSequence.accumulate(BaseSequence.java:39)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.LazySequence.accumulate(LazySequence.java:40)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.SequenceWrapper.wrap(SequenceWrapper.java:55)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.query.CPUTimeMetricQueryRunner$1.wrap(CPUTimeMetricQueryRunner.java:74)
~[druid-processing-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.query.spec.SpecificSegmentQueryRunner$1.accumulate(SpecificSegmentQueryRunner.java:79)
~[druid-processing-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.WrappingSequence$1.get(WrappingSequence.java:50)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.query.spec.SpecificSegmentQueryRunner.doNamed(SpecificSegmentQueryRunner.java:163)
~[druid-processing-0.17.0.jar:0.17.0]
at
org.apache.druid.query.spec.SpecificSegmentQueryRunner.access$100(SpecificSegmentQueryRunner.java:42)
~[druid-processing-0.17.0.jar:0.17.0]
at
org.apache.druid.query.spec.SpecificSegmentQueryRunner$2.wrap(SpecificSegmentQueryRunner.java:145)
~[druid-processing-0.17.0.jar:0.17.0]
at
org.apache.druid.java.util.common.guava.WrappingSequence.accumulate(WrappingSequence.java:45)
~[druid-core-0.17.0.jar:0.17.0]
at
org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:246)
[druid-processing-0.17.0.jar:0.17.0]
at
org.apache.druid.query.groupby.epinephelinae.GroupByMergingQueryRunnerV2$1$1$1.call(GroupByMergingQueryRunnerV2.java:233)
[druid-processing-0.17.0.jar:0.17.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[?:1.8.0_232]
at
org.apache.druid.query.PrioritizedListenableFutureTask.run(PrioritizedExecutorService.java:247)
[druid-processing-0.17.0.jar:0.17.0]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_232]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
2020-08-21T03:16:00,873 ERROR
[qtp1136890657-129[groupBy_[aggregations]_17bc90cf-a1d9-4380-bcc5-a1b565b5c91a]]
org.apache.druid.server.QueryResource - Exception handling request:
{class=org.apache.druid.server.QueryResource, exceptionType=class
java.lang.RuntimeException,
exceptionMessage=java.util.concurrent.ExecutionException:
java.lang.RuntimeException: java.lang.NullPointerException,
query={"queryType":"groupBy","dataSource":{"type":"table","name":"aggregations"},"intervals":{"type":"segments","segments":[{"itvl":"2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z","ver":"2020-08-21T02:00:10.526Z","part":1},{"itvl":"2020-08-21T02:00:00.000Z/2020-08-21T03:00:00.000Z","ver":"2020-08-21T02:00:10.526Z","part":2}]},"virtualColumns":[],"filter":null,"granularity":{"type":"all"},"dimensions":[{"type":"default","dimension":"d_id","outputName":"d0","outputType":"LONG"}],"aggregations":[{"type":"doubleSum","name":"a0","fieldName":"s_dr","expression":null}],"postAggregations":[],"having":
null,"limitSpec":{"type":"NoopLimitSpec"},"context":{"applyLimitPushDown":false,"defaultTimeout":300000,"finalize":false,"fudgeTimestamp":"1597968000000","groupByOutermost":false,"groupByStrategy":"v2","maxQueuedBytes":3333333,"maxScatterGatherBytes":9223372036854775807,"queryFailTime":1597980040147,"queryId":"17bc90cf-a1d9-4380-bcc5-a1b565b5c91a","resultAsArray":true,"sqlQueryId":"a371d117-bffa-4a5f-9115-d8812eb489b0","timeout":299998},"descending":false},
peer=10.0.7.4} (java.lang.RuntimeException:
java.util.concurrent.ExecutionException: java.lang.RuntimeException:
java.lang.NullPointerException)
2020-08-21T03:16:01,000 INFO
[[index_kafka_aggregations_95b07d728be26d3_iacokiha]-appenderator-persist]
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Dropped
segment[aggregations_2020-08-21T02:00:00.000Z_2020-08-21T03:00:00.000Z_2020-08-21T02:00:10.526Z_1].
2020-08-21T03:16:16,707 INFO [parent-monitor-0]
org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Triggering JVM
shutdown.
2020-08-21T03:16:16,708 INFO [Thread-70] org.apache.druid.cli.CliPeon -
Running shutdown hook
2020-08-21T03:16:16,708 INFO [Thread-70]
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle
[module] stage [ANNOUNCEMENTS]
2020-08-21T03:16:16,709 INFO [Thread-70]
org.apache.druid.curator.announcement.Announcer - Unannouncing
[/druid/announcements/middlemanager:8102]
2020-08-21T03:16:16,722 INFO [Thread-70]
org.apache.druid.curator.announcement.Announcer - Unannouncing
[/druid/listeners/lookups/__default/http:middlemanager:8102]
2020-08-21T03:16:16,723 INFO [Thread-70]
org.apache.druid.curator.announcement.Announcer - Unannouncing
[/druid/segments/middlemanager:8102/middlemanager:8102_indexer-executor__default_tier_2020-08-21T01:46:10.992Z_72a1a95bc07543efaa1fe266685d72e40]
2020-08-21T03:16:16,724 INFO [Thread-70]
org.apache.druid.curator.announcement.Announcer - Unannouncing
[/druid/internal-discovery/PEON/middlemanager:8102]
2020-08-21T03:16:16,725 INFO [Thread-70]
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle
[module] stage [SERVER]
2020-08-21T03:16:16,733 INFO [Thread-70]
org.eclipse.jetty.server.AbstractConnector - Stopped
ServerConnector@b2da3a5{HTTP/1.1,[http/1.1]}{0.0.0.0:8102}
2020-08-21T03:16:16,734 INFO [Thread-70] org.eclipse.jetty.server.session -
node0 Stopped scavenging
2020-08-21T03:16:16,736 INFO [Thread-70]
org.eclipse.jetty.server.handler.ContextHandler - Stopped
o.e.j.s.ServletContextHandler@1b485c8b{/,null,UNAVAILABLE}
2020-08-21T03:16:16,740 INFO [Thread-70]
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle
[module] stage [NORMAL]
2020-08-21T03:16:16,741 INFO [Thread-70]
org.apache.druid.server.listener.announcer.ListenerResourceAnnouncer -
Unannouncing start time on
[/druid/listeners/lookups/__default/http:middlemanager:8102]
2020-08-21T03:16:16,743 INFO [Thread-70]
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Starting
graceful shutdown of task[index_kafka_aggregations_95b07d728be26d3_iacokiha].
2020-08-21T03:16:16,743 INFO [Thread-70]
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner -
Stopping forcefully (status: [PUBLISHING])
2020-08-21T03:16:16,750 ERROR [task-runner-0-priority-0]
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Error
while publishing segments for sequenceNumber[SequenceMetadata{sequenceId=2,
sequenceName='index_kafka_aggregations_95b07d728be26d3_2', assignments=[],
startOffsets={0=2073149116, 1=2073148653}, exclusiveStartPartitions=[],
endOffsets={0=2077740467, 1=2077737653}, sentinel=false, checkpointed=true}]
java.util.concurrent.CancellationException: Task was cancelled.
at
com.google.common.util.concurrent.AbstractFuture.cancellationExceptionWithCause(AbstractFuture.java:392)
~[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306)
~[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
~[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
~[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135)
~[guava-16.0.1.jar:?]
at com.google.common.util.concurrent.Futures$4.run(Futures.java:1170)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.AbstractFuture.cancel(AbstractFuture.java:134)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.Futures$ChainingListenableFuture.cancel(Futures.java:826)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.Futures$CombinedFuture$1.run(Futures.java:1505)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.AbstractFuture.cancel(AbstractFuture.java:134)
[guava-16.0.1.jar:?]
at
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.runInternal(SeekableStreamIndexTaskRunner.java:891)
[druid-indexing-service-0.17.0.jar:0.17.0]
at
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.run(SeekableStreamIndexTaskRunner.java:278)
[druid-indexing-service-0.17.0.jar:0.17.0]
at
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTask.run(SeekableStreamIndexTask.java:164)
[druid-indexing-service-0.17.0.jar:0.17.0]
at
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419)
[druid-indexing-service-0.17.0.jar:0.17.0]
at
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391)
[druid-indexing-service-0.17.0.jar:0.17.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[?:1.8.0_232]
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[?:1.8.0_232]
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Caused by: java.util.concurrent.CancellationException: Future.cancel() was
called.
at
com.google.common.util.concurrent.AbstractFuture$Sync.complete(AbstractFuture.java:378)
~[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.AbstractFuture$Sync.cancel(AbstractFuture.java:355)
~[guava-16.0.1.jar:?]
at
com.google.common.util.concurrent.AbstractFuture.cancel(AbstractFuture.java:131)
~[guava-16.0.1.jar:?]
... 15 more
2020-08-21T03:16:16,752 INFO [Thread-70]
org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager
- CoordinatorPollingBasicAuthenticatorCacheManager is stopping.
2020-08-21T03:16:16,752 INFO [Thread-70]
org.apache.druid.security.basic.authentication.db.cache.CoordinatorPollingBasicAuthenticatorCacheManager
- CoordinatorPollingBasicAuthenticatorCacheManager is stopped.
2020-08-21T03:16:16,752 INFO [Thread-70]
org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager
- CoordinatorPollingBasicAuthorizerCacheManager is stopping.
2020-08-21T03:16:16,752 INFO [task-runner-0-priority-0]
org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing
segment[aggregations_2020-08-21T02:00:00.000Z_2020-08-21T03:00:00.000Z_2020-08-21T02:00:10.526Z_2]
at
path[/druid/segments/middlemanager:8102/middlemanager:8102_indexer-executor__default_tier_2020-08-21T01:46:10.992Z_72a1a95bc07543efaa1fe266685d72e40]
2020-08-21T03:16:16,752 INFO [Thread-70]
org.apache.druid.security.basic.authorization.db.cache.CoordinatorPollingBasicAuthorizerCacheManager
- CoordinatorPollingBasicAuthorizerCacheManager is stopped.
2020-08-21T03:16:16,752 INFO
[LookupExtractorFactoryContainerProvider-MainThread]
org.apache.druid.query.lookup.LookupReferencesManager - Lookup Management loop
exited. Lookup notices are not handled anymore.
2020-08-21T03:16:16,753 INFO [task-runner-0-priority-0]
org.apache.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannounced self
[{"druidNode":{"service":"druid/middleManager","host":"middlemanager","bindOnHost":false,"plaintextPort":8102,"port":-1,"tlsPort":-1,"enablePlaintextPort":true,"enableTlsPort":false},"nodeType":"peon","services":{"dataNodeService":{"type":"dataNodeService","tier":"_default_tier","maxSize":0,"type":"indexer-executor","priority":0},"lookupNodeService":{"type":"lookupNodeService","lookupTier":"__default"}}}].
2020-08-21T03:16:16,771 INFO [task-runner-0-priority-0]
org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed
with status: {
"id" : "index_kafka_aggregations_95b07d728be26d3_iacokiha",
"status" : "SUCCESS",
"duration" : 5416234,
"errorMsg" : null,
"location" : {
"host" : null,
"port" : -1,
"tlsPort" : -1
}
}
2020-08-21T03:16:16,774 INFO [main] org.apache.druid.cli.CliPeon - Thread
[Thread[Thread-70,5,main]] is non daemon.
2020-08-21T03:16:16,774 INFO [main]
org.apache.druid.java.util.common.lifecycle.Lifecycle - Lifecycle [module]
already stopped and stop was called. Silently skipping
Cannot remove shutdown hook, already shutting down!
Finished peon task
2020-08-21T03:16:16,801 INFO [Curator-Framework-0]
org.apache.curator.framework.imps.CuratorFrameworkImpl -
backgroundOperationsLoop exiting
2020-08-21T03:16:16,804 INFO [Thread-70] org.apache.zookeeper.ZooKeeper -
Session: 0x1013a00d03800a8 closed
2020-08-21T03:16:16,804 INFO [main-EventThread]
org.apache.zookeeper.ClientCnxn - EventThread shut down for session:
0x1013a00d03800a8
2020-08-21T03:16:16,804 INFO [Thread-70]
org.apache.druid.java.util.common.lifecycle.Lifecycle - Stopping lifecycle
[module] stage [INIT]
```
This issue is very critical as ingestion tasks have become a gamble.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]