jamangstangs opened a new issue, #16982:
URL: https://github.com/apache/druid/issues/16982
### Environment
- Apache Druid: 26.0.0
- Kafka: 2.7.1
### Description
I am using Kafka ingestion and submit the ingestion task as follows.
```
...
"metricsSpec": [
{
"name": "uniq_column1",
"type": "thetaSketch",
"fieldName": "uniq_column1",
"size": 16384
},
{
"name": "uniq_column2",
"type": "thetaSketch",
"fieldName": "uniq_column2",
"size": 16384
}
]
...
"tuningConfig": {
"type": "kafka",
"maxRowsPerSegment": 1000000000,
"maxTotalRows": 1000000000,
"maxBytesInMemory": -1
},
...
"granularitySpec": {
"type": "uniform",
"segmentGranularity": "HOUR",
"queryGranularity": "SECOND",
"rollup": true
}
...
"taskDuration": "PT1H"
```
When I run a segment metadata query, the thetaSketch columns come back with
both type and typeSignature reported as STRING instead of thetaSketch.
```
{
"queryType": "segmentMetadata",
"dataSource": "datasource",
"merge": true
}
```
column | typeSignature | type | errorMessage
-- | -- | -- | --
uniq_column1 | STRING | STRING | error:cannot_merge_diff_types: [thetaSketch] and [thetaSketchBuild]
uniq_column2 | STRING | STRING | error:cannot_merge_diff_types: [thetaSketch] and [thetaSketchBuild]
However, when I restrict the query intervals to exclude the real-time
ingestion range, the query returns the correct type.
```
{
"queryType": "segmentMetadata",
"dataSource": "datasource",
"merge": true,
"intervals": ["2024-08-30T04:00:00.000Z/2024-09-01T23:00:00.000Z"]
}
```
column | typeSignature | type | errorMessage
-- | -- | -- | --
uniq_column1 | COMPLEX\<thetaSketch\> | thetaSketch | null
uniq_column2 | COMPLEX\<thetaSketch\> | thetaSketch | null
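To make the comparison easier to reproduce, the two queries above can be issued from a short script. This is only a minimal sketch: the Broker address `http://localhost:8082` and the helper names `build_query`, `run_query`, `merge_errors`, and `main` are assumptions for illustration, not part of my setup. It posts the native query to `/druid/v2` and lists the columns whose `errorMessage` is set.

```python
import json
import urllib.request

# Assumption: a Broker reachable on the default port; adjust for your cluster.
BROKER_URL = "http://localhost:8082/druid/v2"


def build_query(datasource, intervals=None):
    """Build a merged segmentMetadata query, optionally limited to intervals."""
    query = {"queryType": "segmentMetadata", "dataSource": datasource, "merge": True}
    if intervals:
        query["intervals"] = list(intervals)
    return query


def run_query(query, url=BROKER_URL):
    """POST a native query to the Broker and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def merge_errors(analyses):
    """Return {column: errorMessage} for columns whose errorMessage is set."""
    errors = {}
    for analysis in analyses:
        for name, column in analysis.get("columns", {}).items():
            if column.get("errorMessage"):
                errors[name] = column["errorMessage"]
    return errors


def main():
    """Run the query with and without the real-time range and print merge errors."""
    published_only = ["2024-08-30T04:00:00.000Z/2024-09-01T23:00:00.000Z"]
    for label, intervals in (("all intervals", None), ("published only", published_only)):
        result = run_query(build_query("datasource", intervals))
        print(label, merge_errors(result))
```

Calling `main()` needs a live cluster. The other helpers have no network dependency, so `merge_errors` can also be applied to a response saved from the web console.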
I also run a Druid cluster on version 0.21.0, and the same query there
returns the correct type.
```
{
"queryType": "segmentMetadata",
"dataSource": "datasource",
"merge": true
}
```
column | type | errorMessage
-- | -- | --
uniq_column1 | thetaSketch | null
uniq_column2 | thetaSketch | null
The merge seems to fail specifically for thetaSketch columns in the
real-time ingestion range.
A similar issue was already fixed in
https://github.com/apache/druid/issues/3339, but the problem still occurs in
version 26.0.0.
Is there a solution for this, or has it been fixed in a newer version of
Druid?
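One way to narrow this down would be to rerun the segment metadata query with `merge` set to `false` and group the per-segment results by the type each segment reports. The sketch below assumes the unmerged response is a list of analyses, each with an `id` and a `columns` map. The helper names and the sample segment ids are hypothetical, and the sample only illustrates the suspected situation rather than actual output from my cluster.

```python
from collections import defaultdict


def types_by_column(analyses):
    """Map each column to {reported type: [segment ids]} from unmerged results."""
    seen = defaultdict(lambda: defaultdict(list))
    for analysis in analyses:
        for name, column in analysis.get("columns", {}).items():
            seen[name][column.get("type")].append(analysis.get("id"))
    return {name: dict(types) for name, types in seen.items()}


def conflicting_columns(analyses):
    """Return only the columns that are reported with more than one type."""
    return {n: t for n, t in types_by_column(analyses).items() if len(t) > 1}


# Hypothetical unmerged response: segment ids and values are made up.
sample = [
    {"id": "published_segment", "columns": {"uniq_column1": {"type": "thetaSketch"}}},
    {"id": "realtime_segment", "columns": {"uniq_column1": {"type": "thetaSketchBuild"}}},
]
print(conflicting_columns(sample))
```

If the real-time segments are the only ones reporting `thetaSketchBuild`, that would be consistent with the `cannot_merge_diff_types` error above.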