tijoparacka opened a new issue, #22438:
URL: https://github.com/apache/superset/issues/22438
Currently, Superset uses filtered aggregations when it creates a Druid
native query. Filtered aggregation on a string dimension performs poorly
on a large dataset. This request is to also add the filters used in the
filtered aggregations to the query-level filter.
Example: the aggregation query as currently generated
{
  "queryType": "timeseries",
  "dataSource": {
    "type": "table",
    "name": "wikipedia"
  },
  "intervals": {
    "type": "intervals",
    "intervals": [
      "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
    ]
  },
  "granularity": {
    "type": "all"
  },
  "aggregations": [
    {
      "type": "filtered",
      "aggregator": {
        "type": "longSum",
        "name": "en_cnt",
        "fieldName": "added"
      },
      "filter": {
        "type": "selector",
        "dimension": "channel",
        "value": "#en.wikipedia"
      }
    },
    {
      "type": "filtered",
      "aggregator": {
        "type": "longSum",
        "name": "ar_cnt",
        "fieldName": "added"
      },
      "filter": {
        "type": "selector",
        "dimension": "channel",
        "value": "#ar.wikipedia"
      }
    }
  ]
}
Because the query has no query-level filter, it scans all the segments
within the interval. To improve performance, the same filters need to be
added to the query-level filter, e.g.:
{
  "queryType": "timeseries",
  "dataSource": {
    "type": "table",
    "name": "wikipedia"
  },
  "intervals": {
    "type": "intervals",
    "intervals": [
      "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
    ]
  },
  "filter": {
    "type": "in",
    "dimension": "channel",
    "values": [
      "#ar.wikipedia",
      "#en.wikipedia"
    ]
  },
  "granularity": {
    "type": "all"
  },
  "aggregations": [
    {
      "type": "filtered",
      "aggregator": {
        "type": "longSum",
        "name": "a0",
        "fieldName": "added"
      },
      "filter": {
        "type": "selector",
        "dimension": "channel",
        "value": "#en.wikipedia"
      },
      "name": "a0"
    },
    {
      "type": "filtered",
      "aggregator": {
        "type": "longSum",
        "name": "a1",
        "fieldName": "added"
      },
      "filter": {
        "type": "selector",
        "dimension": "channel",
        "value": "#ar.wikipedia"
      },
      "name": "a1"
    }
  ]
}
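The rewrite from the first query to the second could be sketched roughly as
follows. This is only an illustration of the idea, not Superset or Druid code:
the helper names hoist_aggregation_filters and _combine are hypothetical, and
the sketch assumes the query is available as a plain Python dict shaped like
the JSON above. It only adds a query-level filter when every aggregator is
filtered, since an unfiltered aggregator needs all rows and would otherwise
return a different result.

```python
from typing import Any, Dict, List

Query = Dict[str, Any]


def _combine(filters: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-aggregator filters into one filter that matches any of them."""
    if len(filters) == 1:
        return filters[0]
    # Selectors with string values on one shared dimension collapse to an "in" filter.
    same_dim_selectors = (
        all(f.get("type") == "selector" for f in filters)
        and all(isinstance(f.get("value"), str) for f in filters)
        and len({f.get("dimension") for f in filters}) == 1
    )
    if same_dim_selectors:
        values = sorted({f["value"] for f in filters})
        return {"type": "in", "dimension": filters[0]["dimension"], "values": values}
    # Anything else falls back to a generic "or" over the original filters.
    return {"type": "or", "fields": list(filters)}


def hoist_aggregation_filters(query: Query) -> Query:
    """Return a shallow copy of `query` with a query-level filter derived from
    its filtered aggregators (hypothetical helper, for illustration only)."""
    result = dict(query)
    aggs = query.get("aggregations", [])
    # Only safe when *every* aggregator is filtered.
    if not aggs or any(a.get("type") != "filtered" for a in aggs):
        return result
    hoisted = _combine([a["filter"] for a in aggs])
    existing = query.get("filter")
    if existing is not None:
        # Keep any filter the query already had and narrow it further.
        hoisted = {"type": "and", "fields": [existing, hoisted]}
    result["filter"] = hoisted
    return result
```

The filtered aggregators are left in place on purpose: the query-level filter
only narrows the rows that are read, while each aggregator still needs its own
filter to separate the per-channel sums.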