gianm opened a new issue, #14520:
URL: https://github.com/apache/druid/issues/14520

   After upgrading to DataSketches 4.0.0 (#14334), we are seeing a 
NullPointerException during ingestion. The stack trace is below.
   
   It looks like `union.update(sketch)` in DataSketches 3.x accepts a null 
`sketch`, but `union.union(sketch)` in DataSketches 4.x does not. Here's the 
code from DataSketches 3.3.0: 
https://github.com/apache/datasketches-java/blob/3.3.0/src/main/java/org/apache/datasketches/quantiles/DoublesUnionImpl.java#L116-L119
   
   I think we can fix this by skipping the call to `union.union(sketch)` if 
`sketch` is null. We also need to:
   
   - audit other calls to `Union#union` for possible null inputs and fix those 
too.
   - audit other sketches for similar situations (where nulls used to be 
accepted and are no longer) and fix those up if there are any.
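
A minimal sketch of the proposed guard, under stated assumptions: `FakeSketch` 
and `StrictUnion` are hypothetical stand-ins written for this example, not 
DataSketches classes. `StrictUnion.union` only mimics the 
`Objects.requireNonNull` behavior seen in the stack trace, and `fold` is an 
illustrative helper rather than the actual Druid method.

```java
import java.util.Objects;

public class NullSafeFold {

    // Hypothetical stand-in for a sketch; it only records an item count.
    static final class FakeSketch {
        final long n;

        FakeSketch(long n) {
            this.n = n;
        }
    }

    // Hypothetical stand-in for a 4.x-style union that rejects null input,
    // mirroring the Objects.requireNonNull frame in the stack trace.
    static final class StrictUnion {
        private long n = 0;

        void union(FakeSketch sketch) {
            Objects.requireNonNull(sketch);
            n += sketch.n;
        }

        long getN() {
            return n;
        }
    }

    // The proposed guard: skip the merge when there is no sketch to merge.
    static void fold(StrictUnion union, FakeSketch sketch) {
        if (sketch != null) {
            union.union(sketch);
        }
    }

    public static void main(String[] args) {
        StrictUnion union = new StrictUnion();
        fold(union, new FakeSketch(3));
        fold(union, null); // skipped instead of throwing
        fold(union, new FakeSketch(2));
        System.out.println(union.getN()); // prints 5
    }
}
```

With the guard in place, a null input leaves the union unchanged, which is the 
behavior callers appear to have relied on in 3.x. The same pattern would apply 
to any other call site found during the audit.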
   
   ```
   2023-07-03T13:59:20,448 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Exception while running task[<task>]
   java.lang.NullPointerException: null
        at java.util.Objects.requireNonNull(Objects.java:209) ~[?:?]
        at org.apache.datasketches.quantiles.DoublesUnionImpl.union(DoublesUnionImpl.java:124) ~[datasketches-java-4.0.0.jar:?]
        at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchAggregatorFactory$2.fold(DoublesSketchAggregatorFactory.java:266) ~[?:?]
        at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchAggregatorFactory$2.reset(DoublesSketchAggregatorFactory.java:259) ~[?:?]
        at org.apache.druid.segment.RowCombiningTimeAndDimsIterator.resetCombinedMetrics(RowCombiningTimeAndDimsIterator.java:249) ~[druid-processing.jar]
        at org.apache.druid.segment.RowCombiningTimeAndDimsIterator.combineToCurrentTimeAndDims(RowCombiningTimeAndDimsIterator.java:229) ~[druid-processing.jar]
        at org.apache.druid.segment.RowCombiningTimeAndDimsIterator.moveToNext(RowCombiningTimeAndDimsIterator.java:191) ~[druid-processing.jar]
        at org.apache.druid.segment.IndexMergerV9.mergeIndexesAndWriteColumns(IndexMergerV9.java:605) ~[druid-processing.jar]
        at org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:233) ~[druid-processing.jar]
        at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:1155) ~[druid-processing.jar]
        at org.apache.druid.segment.IndexMergerV9.multiphaseMerge(IndexMergerV9.java:972) ~[druid-processing.jar]
        at org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:914) ~[druid-processing.jar]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeSegmentsInSamePartition(PartialSegmentMergeTask.java:352) ~[druid-indexing-service.jar]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.mergeAndPushSegments(PartialSegmentMergeTask.java:260) ~[druid-indexing-service.jar]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialSegmentMergeTask.runTask(PartialSegmentMergeTask.java:191) ~[druid-indexing-service.jar]
        at org.apache.druid.indexing.common.task.batch.parallel.PartialGenericSegmentMergeTask.runTask(PartialGenericSegmentMergeTask.java:46) ~[druid-indexing-service.jar]
        at org.apache.druid.indexing.common.task.AbstractTask.run(AbstractTask.java:173) ~[druid-indexing-service.jar]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:477) ~[druid-indexing-service.jar]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:449) ~[druid-indexing-service.jar]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
        at java.lang.Thread.run(Thread.java:833) ~[?:?]
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

