abhishekagarwal87 commented on a change in pull request #11809:
URL: https://github.com/apache/druid/pull/11809#discussion_r741613829
##########
File path: server/src/main/java/org/apache/druid/server/ClientQuerySegmentWalker.java
##########
@@ -431,6 +448,101 @@ private DataSource inlineIfNecessary(
);
}
+ /**
+ * This method returns the datasource by populating all the {@link QueryDataSource} with correct nesting level and
+ * sibling order of all the subqueries that are present.
+ * It also plumbs parent query's id and sql id in case the subqueries don't have it set by default
+ *
+ * @param dataSource Datasource whose subqueries need to be populated
+ * @param parentQueryId Parent Query's ID, can be null if do not need to update this in the subqueries
+ * @param parentSqlQueryId Parent Query's SQL Query ID, can be null if do not need to update this in the subqueries
+ * @return DataSource populated with the subqueries
+ */
+ private DataSource generateSubqueryIds(
+ DataSource dataSource,
+ @Nullable final String parentQueryId,
+ @Nullable final String parentSqlQueryId
+ )
+ {
+ Queue<DataSource> queue = new LinkedList<>();
+ queue.add(dataSource);
+
+ /*
+ Performs BFS on the datasource tree to find the nesting level, and the sibling order of the query datasource
+ */
+ Map<DataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds = new HashMap<>();
+ int level = 1;
+ while (!queue.isEmpty()) {
+ int size = queue.size();
+ int siblingOrder = 1;
+ for (int i = 0; i < size; ++i) {
+ DataSource currentDataSource = queue.poll();
+ if (currentDataSource instanceof QueryDataSource) {
+ queryDataSourceToSubqueryIds.put(currentDataSource, new Pair<>(level, siblingOrder));
Review comment:
It is not clear to me why we are not calling `insertSubQueryId` right here.
We already have the level information at this point, so we could populate the
ids for the query corresponding to this QueryDataSource immediately, instead of
saving it in a map and doing it later. Am I missing something?
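To make the suggestion concrete, here is a minimal sketch of the single-pass
variant: the id is assigned as soon as a query node is dequeued, so no
intermediate map is needed. Everything in it is an assumption for illustration:
`Node`, `assignIds`, the mutable `subQueryId` field, and the
`level.siblingOrder` id format are hypothetical and are not Druid's
`DataSource` or `insertSubQueryId` API. It also assumes the sibling order
counts only query nodes within a level.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class InlineBfsIdSketch
{
  // Hypothetical stand-in for a datasource node; not Druid's DataSource API.
  static final class Node
  {
    final String name;
    final boolean isQuery;
    final List<Node> children = new ArrayList<>();
    String subQueryId; // stays null for non-query nodes

    Node(String name, boolean isQuery)
    {
      this.name = name;
      this.isQuery = isQuery;
    }

    Node add(Node child)
    {
      children.add(child);
      return this;
    }
  }

  // Level-order traversal that assigns the id inline instead of
  // recording (level, siblingOrder) pairs in a map for a later pass.
  static void assignIds(Node root)
  {
    Queue<Node> queue = new ArrayDeque<>();
    queue.add(root);
    int level = 1;
    while (!queue.isEmpty()) {
      int size = queue.size();   // number of nodes on the current level
      int siblingOrder = 1;      // restarts on every level
      for (int i = 0; i < size; i++) {
        Node current = queue.poll();
        if (current.isQuery) {
          // Assumed id format "<level>.<siblingOrder>", for illustration only.
          current.subQueryId = level + "." + siblingOrder;
          siblingOrder++;
        }
        queue.addAll(current.children);
      }
      level++;
    }
  }
}
```

If the real datasource and query objects are immutable, writing the id in
place like this is not possible and the traversal would have to return rebuilt
nodes instead, which may be the reason the PR defers the work to a second step.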
##########
File path: processing/src/main/java/org/apache/druid/query/UnionQueryRunner.java
##########
@@ -71,19 +74,23 @@ public UnionQueryRunner(
return new MergeSequence<>(
query.getResultOrdering(),
Sequences.simple(
- Lists.transform(
- unionDataSource.getDataSources(),
- (Function<DataSource, Sequence<T>>) singleSource ->
- baseRunner.run(
- queryPlus.withQuery(
- Queries.withBaseDataSource(query, singleSource)
- // assign the subqueryId. this will be used to validate that every query servers
- // have responded per subquery in RetryQueryRunner
- .withDefaultSubQueryId()
- ),
- responseContext
- )
- )
+ IntStream.range(0, unionDataSource.getDataSources().size())
+ .mapToObj(i -> new Pair<>(i + 1, unionDataSource.getDataSources().get(i)))
+ .map(indexBaseDataSourcePair ->
+ baseRunner.run(
+ queryPlus.withQuery(Queries.withBaseDataSource(
+ query,
+ indexBaseDataSourcePair.rhs
+ ).withSubQueryId(
+ generateSubqueryId(
+ query.getSubQueryId(),
+ // toString() works since the datasource will be a TableDataSource
+ indexBaseDataSourcePair.rhs.toString(),
Review comment:
Maybe you can call `getName()` instead of `toString()`?
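A small sketch of what this could look like, under stated assumptions:
`Source`, `TableSource`, `subqueryId`, and the `parent.name.index` id format
below are hypothetical stand-ins, not the classes or the `generateSubqueryId`
signature from this PR. The hypothetical `toString()` is deliberately verbose
to show the risk being avoided: an id built from a named accessor does not
change if the debug representation ever does.

```java
final class SubqueryNameSketch
{
  // Hypothetical marker interface standing in for a datasource type.
  interface Source
  {
  }

  // Hypothetical table-backed source with an explicit name accessor.
  static final class TableSource implements Source
  {
    private final String name;

    TableSource(String name)
    {
      this.name = name;
    }

    String getName()
    {
      return name;
    }

    @Override
    public String toString()
    {
      // Debug-oriented representation; not suitable as part of an id.
      return "TableSource{name='" + name + "'}";
    }
  }

  // Builds an id from the explicit name and fails loudly when the
  // "it will always be a table" assumption does not hold.
  static String subqueryId(String parentId, Source source, int index)
  {
    if (!(source instanceof TableSource)) {
      throw new IllegalArgumentException("expected a table source, got: " + source);
    }
    String name = ((TableSource) source).getName();
    String prefix = parentId == null ? "" : parentId + ".";
    return prefix + name + "." + index;
  }
}
```

The explicit type check also turns the assumption stated in the code comment
into something that is enforced at runtime rather than only documented.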
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]