glasser opened a new issue #7243: stringFirst/stringLast crashes at aggregation time
URL: https://github.com/apache/incubator-druid/issues/7243

### Affected Version

0.13.0-incubating (and maybe older versions)

### Description

See discussion on #5789 for background.

Set up a fresh install of Druid 0.13-incubating following the [quickstart](http://druid.io/docs/latest/tutorials/index.html).

Write this index spec:

```json
{
  "type" : "index",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "wikipedia",
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "dimensionsSpec" : {},
          "timestampSpec": {
            "column": "time",
            "format": "iso"
          }
        }
      },
      "metricsSpec": [{
        "name": "channel",
        "fieldName": "channel",
        "type": "stringFirst",
        "maxStringBytes": 100
      }],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "queryGranularity" : "hour",
        "intervals" : ["2015-09-12/2015-09-13"],
        "rollup" : true
      }
    },
    "ioConfig" : {
      "type" : "index",
      "firehose" : {
        "type" : "local",
        "baseDir" : "quickstart/tutorial/",
        "filter" : "wikiticker-2015-09-12-sampled.json.gz"
      },
      "appendToExisting" : false
    },
    "tuningConfig" : {
      "type" : "index",
      "targetPartitionSize" : 5000000,
      "maxRowsInMemory" : 1000,
      "forceExtendableShardSpecs" : true
    }
  }
}
```

and post it:

```
$ bin/post-index-task --file stringfirst-index.json
Beginning indexing data for wikipedia
Waiting up to 119s for indexing service [http://localhost:8090/] to become available. [Got: <urlopen error [Errno 61] Connection refused> ]
Task started: index_wikipedia_2019-03-12T19:07:26.507Z
Task log: http://localhost:8090/druid/indexer/v1/task/index_wikipedia_2019-03-12T19:07:26.507Z/log
Task status: http://localhost:8090/druid/indexer/v1/task/index_wikipedia_2019-03-12T19:07:26.507Z/status
Task index_wikipedia_2019-03-12T19:07:26.507Z still running...
Task index_wikipedia_2019-03-12T19:07:26.507Z still running...
Task index_wikipedia_2019-03-12T19:07:26.507Z still running...
Task index_wikipedia_2019-03-12T19:07:26.507Z still running...
Task finished with status: FAILED
```

The log reads:

```
2019-03-12T19:07:45,892 WARN [appenderator_merge_0] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Failed to push merged index for segment[wikipedia_2015-09-12T00:00:00.000Z_2015-09-13T00:00:00.000Z_2019-03-12T19:07:26.636Z].
java.lang.ClassCastException: org.apache.druid.query.aggregation.SerializablePairLongString cannot be cast to java.lang.String
	at org.apache.druid.query.aggregation.first.StringFirstAggregateCombiner.reset(StringFirstAggregateCombiner.java:35) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.RowCombiningTimeAndDimsIterator.resetCombinedMetrics(RowCombiningTimeAndDimsIterator.java:249) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.RowCombiningTimeAndDimsIterator.combineToCurrentTimeAndDims(RowCombiningTimeAndDimsIterator.java:229) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.RowCombiningTimeAndDimsIterator.moveToNext(RowCombiningTimeAndDimsIterator.java:191) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.IndexMergerV9.mergeIndexesAndWriteColumns(IndexMergerV9.java:492) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:191) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:914) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:832) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:810) ~[druid-processing-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:719) ~[druid-server-0.13.0-incubating.jar:0.13.0-incubating]
	at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$1(AppenderatorImpl.java:623) ~[druid-server-0.13.0-incubating.jar:0.13.0-incubating]
	at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) [guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) [guava-16.0.1.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_162]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_162]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
```

@gianm suggested in #5789 that the AggregateCombiner code was just not running at all and that it should always be acting on SerializablePairLongString values rather than Strings. I am not enough of an expert on aggregation to know if that's correct.
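
To make the suspected type mismatch concrete, here is a minimal, self-contained Java sketch. It does not use Druid classes: `PairLongString`, `resetAssumingString`, and `combineFirst` are hypothetical stand-ins I made up for illustration, and the "keep the earliest timestamp" rule is only my assumption about what a pair-aware combiner for `stringFirst` would do. The first method reproduces the kind of cast the stack trace points at; the second operates on the pair type, as suggested in #5789.

```java
import java.util.Arrays;
import java.util.List;

public class CombinerSketch {
    // Hypothetical stand-in for SerializablePairLongString: (timestamp, value).
    static final class PairLongString {
        final long lhs;   // timestamp in milliseconds
        final String rhs; // the string seen at that timestamp

        PairLongString(long lhs, String rhs) {
            this.lhs = lhs;
            this.rhs = rhs;
        }
    }

    // Mirrors the failing pattern: the combiner assumes the column holds a
    // String, but the stored intermediate value is actually a pair.
    static String resetAssumingString(Object columnValue) {
        return (String) columnValue; // ClassCastException when given a pair
    }

    // Pair-aware alternative (assumption): keep the pair with the smallest
    // timestamp; on ties the earlier element in the list wins.
    static PairLongString combineFirst(List<PairLongString> values) {
        PairLongString first = null;
        for (PairLongString candidate : values) {
            if (first == null || candidate.lhs < first.lhs) {
                first = candidate;
            }
        }
        return first;
    }

    public static void main(String[] args) {
        PairLongString a = new PairLongString(1442016000000L, "#en.wikipedia");
        PairLongString b = new PairLongString(1442019600000L, "#fr.wikipedia");

        try {
            resetAssumingString(a);
        } catch (ClassCastException e) {
            System.out.println("String-based reset failed: " + e.getClass().getName());
        }

        PairLongString merged = combineFirst(Arrays.asList(b, a));
        System.out.println("Pair-aware combine kept: " + merged.rhs);
    }
}
```

If this reading is right, the failure would show up only when persisted segments are merged, which would be consistent with the error appearing in `mergeAndPush` rather than while rows are first ingested.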
