a2l007 opened a new issue #6057: Broker sends sequential requests to the historical for union queries
URL: https://github.com/apache/incubator-druid/issues/6057

While analyzing groupBy queries on union datasources, we have seen that if a historical holds segments for more than one of the datasources in a query, the broker generates multiple requests to that historical, but sends them serially.

For example, running the following union query:

```
{
  "queryType": "groupBy",
  "dataSource": {
    "type": "union",
    "dataSources": [
      { "type": "table", "name": "tableA" },
      { "type": "table", "name": "tableB" },
      { "type": "table", "name": "tableC" }
    ]
  },
  "intervals": {
    "type": "LegacySegmentSpec",
    "intervals": [
      "2018-07-23T00:00:00.000Z\/2018-07-24T00:00:00.000Z"
    ]
  },
  "virtualColumns": [ ],
  "filter": null,
  "granularity": "DAY",
  "dimensions": [
    { "type": "default", "dimension": "acc_id" }
  ],
  "aggregations": [
    { "fieldName": "clicks", "name": "clicks", "type": "longSum" }
  ],
  "postAggregations": [ ],
  "having": null,
  "limitSpec": { "type": "NoopLimitSpec" },
  "descending": false
}
```

generates logs such as:

```
2018-07-24T14:50:06,500 INFO [qtp1256384385-231[groupBy_q_test_1]] com.metamx.http.client.pool.ChannelResourceFactory - Generating: https://tier .historical.foo.com:4443
2018-07-24T14:50:40,366 INFO [qtp1256384385-231[groupBy_q_test_1]] com.metamx.http.client.pool.ChannelResourceFactory - Generating: https://tier .historical.foo.com:4443
```

This historical contains segments for datasources `tableA` and `tableB`, so the broker sends two requests to it. As the timestamps show, there is a delay of roughly 34 seconds between the two requests. [BrokerServerView](https://github.com/apache/incubator-druid/blob/master/server/src/main/java/io/druid/client/BrokerServerView.java#L291) confirms this behaviour: at any instant, for a given query, there can be only a single request to a historical.

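
To put a number on the gap, the two log timestamps above can be subtracted directly. The following is a small standalone sketch, not part of Druid; the class name `LogGap` and the helper `gapMillis` are made up for illustration, and the pattern assumes the log prefix format shown in the two lines above.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class LogGap {
    // Matches the timestamp prefix of the broker log lines, e.g. 2018-07-24T14:50:06,500
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss,SSS");

    // Returns the elapsed time in milliseconds between two log timestamps.
    static long gapMillis(String first, String second) {
        LocalDateTime a = LocalDateTime.parse(first, FMT);
        LocalDateTime b = LocalDateTime.parse(second, FMT);
        return Duration.between(a, b).toMillis();
    }

    public static void main(String[] args) {
        long gap = gapMillis("2018-07-24T14:50:06,500", "2018-07-24T14:50:40,366");
        System.out.println(gap + " ms"); // 33866 ms, i.e. about 34 seconds
    }
}
```

The second connection is therefore opened almost 34 seconds after the first, which is consistent with the second request waiting for the first to finish.
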
This behaviour clearly causes query execution delays for union queries. Before I investigate whether this can be parallelized, I wanted to ask the community whether this is already a known issue with union queries or whether it is actually a bug.
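
As a rough illustration of the difference between the observed serial behaviour and a parallel alternative, here is a minimal, self-contained sketch. It is not Druid's broker code: `UnionFanOut`, `sequential`, `parallel` and `fakeRequest` are hypothetical names, and `fakeRequest` merely sleeps to stand in for an HTTP call to a historical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

public class UnionFanOut {
    // Serial dispatch: each per-datasource request starts only after the previous one returns.
    static <R> List<R> sequential(List<String> dataSources, Function<String, R> request) {
        return dataSources.stream().map(request).collect(Collectors.toList());
    }

    // Parallel dispatch: submit every per-datasource request first, then wait for all of them.
    static <R> List<R> parallel(List<String> dataSources, Function<String, R> request,
                                ExecutorService pool) {
        List<CompletableFuture<R>> futures = dataSources.stream()
            .map(ds -> CompletableFuture.supplyAsync(() -> request.apply(ds), pool))
            .collect(Collectors.toList()); // collecting here forces all submissions before any join
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    // Hypothetical stand-in for a request to a historical; sleeps to simulate query time.
    static String fakeRequest(String dataSource) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "rows-from-" + dataSource;
    }

    public static void main(String[] args) {
        List<String> dataSources = Arrays.asList("tableA", "tableB");
        ExecutorService pool = Executors.newFixedThreadPool(dataSources.size());
        try {
            long t0 = System.nanoTime();
            List<String> serial = sequential(dataSources, UnionFanOut::fakeRequest);
            long t1 = System.nanoTime();
            List<String> concurrent = parallel(dataSources, UnionFanOut::fakeRequest, pool);
            long t2 = System.nanoTime();
            System.out.println(serial + " in " + (t1 - t0) / 1_000_000 + " ms (serial)");
            System.out.println(concurrent + " in " + (t2 - t1) / 1_000_000 + " ms (parallel)");
        } finally {
            pool.shutdown();
        }
    }
}
```

Both variants return results in the same datasource order; only the wall-clock time differs. Whether the broker can safely issue concurrent requests to one historical for the same query is exactly the open question of this issue, so the sketch shows the idea rather than a proposed patch.
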