quenlang opened a new issue #7047: ingesting data with the quantiles sketch module 
throws an exception in 0.12.3
URL: https://github.com/apache/incubator-druid/issues/7047
 
 
   @leventov @jon-wei @AlexanderSaydakov 
   I ran into an exception when using the quantiles sketch aggregation in ```druid 
0.12.3```.
   The exception is as follows:
   ```
   2019-02-10T03:46:18,582 WARN [appenderator_merge_0] 
io.druid.segment.realtime.appenderator.AppenderatorImpl - Failed to push merged 
index for 
segment[BRS_WECHAT_APPLET_MIN_BAK_2019-02-04T12:00:00.000Z_2019-02-04T18:00:00.000Z_2019-02-10T01:46:35.227Z].
   java.lang.NullPointerException
        at 
io.druid.query.aggregation.datasketches.quantiles.DoublesSketchAggregatorFactory$1.compare(DoublesSketchAggregatorFactory.java:129)
 ~[?:?]
        at 
io.druid.query.aggregation.datasketches.quantiles.DoublesSketchAggregatorFactory$1.compare(DoublesSketchAggregatorFactory.java:125)
 ~[?:?]
        at 
io.druid.query.aggregation.datasketches.quantiles.DoublesSketchObjectStrategy.compare(DoublesSketchObjectStrategy.java:37)
 ~[?:?]
        at 
io.druid.query.aggregation.datasketches.quantiles.DoublesSketchObjectStrategy.compare(DoublesSketchObjectStrategy.java:29)
 ~[?:?]
        at 
io.druid.segment.data.GenericIndexedWriter.write(GenericIndexedWriter.java:212) 
~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer.serialize(LargeColumnSupportedComplexColumnSerializer.java:100)
 ~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.IndexMergerV9.mergeIndexesAndWriteColumns(IndexMergerV9.java:462)
 ~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:209) 
~[druid-processing-0.12.3.jar:0.12.3]
        at io.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:837) 
~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:710) 
~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:688) 
~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:659)
 ~[druid-server-0.12.3.jar:0.12.3]
        at 
io.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$0(AppenderatorImpl.java:563)
 ~[druid-server-0.12.3.jar:0.12.3]
        at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) 
[guava-16.0.1.jar:?]
        at 
com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861)
 [guava-16.0.1.jar:?]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0_60]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0_60]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
   2019-02-10T03:46:18,613 ERROR [publish-0] 
io.druid.indexing.kafka.KafkaIndexTask - Error while publishing segments for 
sequence[SequenceMetadata{sequenceName='index_kafka_BRS_WECHAT_APPLET_MIN_BAK_305445d71cf9d8f_0',
 sequenceId=0, startOffsets={0=13195, 1=82987733, 2=13773273, 3=5077, 4=59584, 
5=198, 6=12182050, 7=213853278, 8=606, 9=59971, 10=22860, 11=131652, 
12=31679708, 13=156731, 14=22460033, 15=51988}, endOffsets={0=13255, 
1=83041632, 2=20172715, 3=5119, 4=59584, 5=198, 6=14445404, 7=213853439, 8=606, 
9=110911, 10=28108, 11=131652, 12=42833619, 13=174499, 14=26118960, 15=52000}, 
assignments=[], sentinel=false, checkpointed=true}]
   java.lang.NullPointerException
        at 
io.druid.query.aggregation.datasketches.quantiles.DoublesSketchAggregatorFactory$1.compare(DoublesSketchAggregatorFactory.java:129)
 ~[?:?]
        at 
io.druid.query.aggregation.datasketches.quantiles.DoublesSketchAggregatorFactory$1.compare(DoublesSketchAggregatorFactory.java:125)
 ~[?:?]
        at 
io.druid.query.aggregation.datasketches.quantiles.DoublesSketchObjectStrategy.compare(DoublesSketchObjectStrategy.java:37)
 ~[?:?]
        at 
io.druid.query.aggregation.datasketches.quantiles.DoublesSketchObjectStrategy.compare(DoublesSketchObjectStrategy.java:29)
 ~[?:?]
        at 
io.druid.segment.data.GenericIndexedWriter.write(GenericIndexedWriter.java:212) 
~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.serde.LargeColumnSupportedComplexColumnSerializer.serialize(LargeColumnSupportedComplexColumnSerializer.java:100)
 ~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.IndexMergerV9.mergeIndexesAndWriteColumns(IndexMergerV9.java:462)
 ~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:209) 
~[druid-processing-0.12.3.jar:0.12.3]
        at io.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:837) 
~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:710) 
~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:688) 
~[druid-processing-0.12.3.jar:0.12.3]
        at 
io.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:659)
 ~[druid-server-0.12.3.jar:0.12.3]
        at 
io.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$0(AppenderatorImpl.java:563)
 ~[druid-server-0.12.3.jar:0.12.3]
        at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) 
~[guava-16.0.1.jar:?]
        at 
com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861)
 [guava-16.0.1.jar:?]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0_60]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0_60]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60]
   2019-02-10T03:46:18,618 INFO [task-runner-0-priority-0] 
io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down 
immediately...
   ``` 
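The NullPointerException originates in the comparator used while sorting sketch values during segment merge (`DoublesSketchAggregatorFactory$1.compare`), which suggests at least one of the values handed to the comparator is null. As an illustration only (this is not Druid's code), a null-tolerant comparator built with the standard `Comparator.nullsFirst` wrapper orders nulls consistently instead of dereferencing them:

```java
import java.util.Arrays;
import java.util.Comparator;

public class NullSafeCompareDemo {
    public static void main(String[] args) {
        // Hypothetical sketch stand-in: Double values, some of which are null,
        // mimicking a merge where a column chunk yields a null complex value.
        Double[] values = { 3.0, null, 1.0 };

        // nullsFirst wraps the natural-order comparator so that null inputs
        // sort before non-null ones rather than triggering an NPE.
        Comparator<Double> nullSafe =
                Comparator.nullsFirst(Comparator.naturalOrder());

        Arrays.sort(values, nullSafe);
        System.out.println(Arrays.toString(values));
    }
}
```

A comparator without such null handling would throw the same kind of NullPointerException shown in the trace above as soon as it touched the null element.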
   My ingestion task spec file looks like this:
   ```
   {
       "type": "kafka",
       "dataSchema": {
           "dataSource": "BRS_WECHAT_APPLET_MIN",
           "parser": {
               "type": "string",
               "parseSpec": {
                   "format": "json",
                   "timestampSpec": {
                       "column": "timestamp",
                       "format": "millis"
                   },
                   "dimensionsSpec": {
                       "dimensions": [
                        "mp_id",
                        "application_id",
                        "instance_id",
                        "action_name",
                        "message_type",
                        "from_path",
                        "path",
                        "open_path",
                        "close_path",
                        "scene",
                        "country_id",
                        "region_id",
                        "city_id",
                        "carrier_id",
                        "error_message",
                        "error_filename",
                        "request_method",
                        "host",
                        "uri",
                        "network_type",
                        "wechat_version",
                        "route_chain",
                        "uid",
                        "http_code",
                        "system",
                        "ip",
                        "device_type",
                        "agreement_id",
                        "protocol"
                       ]
                   }
               }
           },
           "metricsSpec": [
               {
                   "name": "server_count",
                   "fieldName": "server_count",
                   "type": "longSum"
               },
               {
                   "name": "quit_count",
                   "fieldName": "quit_count",
                   "type": "longSum"
               },
               {
                   "name": "on_ready",
                   "fieldName": "on_ready",
                   "type": "doubleSum"
               },
               {
                   "name": "custom_time",
                   "fieldName": "custom_time",
                   "type": "longSum"
               },
               {
                   "name": "first_response_time",
                   "fieldName": "first_response_time",
                   "type": "doubleSum"
               },
               {
                   "name": "response_time",
                   "fieldName": "response_time",
                   "type": "doubleSum"
               },
               {
                   "name": "application_server_time",
                   "fieldName": "application_server_time",
                   "type": "doubleSum"
               },
               {
                   "name": "network_time",
                   "fieldName": "network_time",
                   "type": "doubleSum"
               },
               {
                   "name": "callback_time",
                   "fieldName": "callback_time",
                   "type": "longSum"
               },
               {
                   "name": "bytes_sent",
                   "fieldName": "bytes_sent",
                   "type": "longSum"
               },
               {
                   "name": "bytes_received",
                   "fieldName": "bytes_received",
                   "type": "longSum"
               },
               {
                   "name": "msg_error_pv",
                   "fieldName": "msg_error_pv",
                   "type": "longSum"
               },
               {
                   "name": "file_error_pv",
                   "fieldName": "file_error_pv",
                   "type": "longSum"
               },
               {
                   "name": "count",
                   "fieldName": "count",
                   "type": "longSum"
               },
               {
                   "name": "on_ready_count",
                   "fieldName": "on_ready_count",
                   "type": "longSum"
               },
               {
                   "name": "open_count",
                   "fieldName": "open_count",
                   "type": "longSum"
               },
               {
                   "name": "net_count",
                   "fieldName": "net_count",
                   "type": "longSum"
               },
               {
                   "name": "net_error_count",
                   "fieldName": "net_error_count",
                   "type": "longSum"
               },
               {
                   "name": "js_error_count",
                   "fieldName": "js_error_count",
                   "type": "longSum"
               },
               {
                   "name": "slow_count",
                   "fieldName": "slow_count",
                   "type": "longSum"
               },
               {
                   "name": "net_slow_count",
                   "fieldName": "net_slow_count",
                   "type": "longSum"
               },
               {
                   "name": "uv",
                   "fieldName": "uid",
                   "type": "thetaSketch"
               },
               {
                   "name": "on_ready_histogram",
                   "fieldName": "on_ready",
                   "type": "approxHistogramFold",
                   "resolution": 50,
                   "numBuckets": 7,
                   "lowerLimit": 0
               },
               {
                   "name": "first_response_time_histogram",
                   "fieldName": "first_response_time",
                   "type": "approxHistogramFold",
                   "resolution": 50,
                   "numBuckets": 7,
                   "lowerLimit": 0
               },
               {
                   "name": "response_time_histogram",
                   "fieldName": "response_time",
                   "type": "approxHistogramFold",
                   "resolution": 50,
                   "numBuckets": 7,
                   "lowerLimit": 0
               },
               {
                   "name": "network_time_histogram",
                   "fieldName": "network_time",
                   "type": "approxHistogramFold",
                   "resolution": 50,
                   "numBuckets": 7,
                   "lowerLimit": 0
               },
               {
                   "name": "application_server_time_histogram",
                   "fieldName": "application_server_time",
                   "type": "approxHistogramFold",
                   "resolution": 50,
                   "numBuckets": 7,
                   "lowerLimit": 0
               },
            {
                "type" : "quantilesDoublesSketch",
                "name" : "on_ready_sketch",
                "fieldName" : "on_ready",
                "k": 256
            },
            {
                "type" : "quantilesDoublesSketch",
                "name" : "first_response_time_sketch",
                "fieldName" : "first_response_time",
                "k": 256
            },
            {
                "type" : "quantilesDoublesSketch",
                "name" : "response_time_sketch",
                "fieldName" : "response_time",
                "k": 256
            },
            {
                "type" : "quantilesDoublesSketch",
                "name" : "network_time_sketch",
                "fieldName" : "network_time",
                "k": 256
            },
            {
                "type" : "quantilesDoublesSketch",
                "name" : "application_server_time_sketch",
                "fieldName" : "application_server_time",
                "k": 256
            }
           ],
           "granularitySpec": {
               "type": "uniform",
               "segmentGranularity": "SIX_HOUR",
               "queryGranularity": "MINUTE"
           }
       },
       "tuningConfig": {
           "type": "kafka",
           "maxRowsPerSegment": 2500000
       },
       "ioConfig": {
           "topic": "drd-mp-web-applet",
           "consumerProperties": {
               "bootstrap.servers": "kafka1:9092"
           },
           "taskCount": 1,
           "taskDuration": "PT6H",
           "replicas": 1,
        "useEarliestOffset": true
       }
   }
   ```
   I have no idea what is causing this exception; could you give me some advice? Thank you 
very much.
   
   
