abhishekagarwal87 commented on a change in pull request #11809:
URL: https://github.com/apache/druid/pull/11809#discussion_r741613829



##########
File path: 
server/src/main/java/org/apache/druid/server/ClientQuerySegmentWalker.java
##########
@@ -431,6 +448,101 @@ private DataSource inlineIfNecessary(
         );
   }
 
+  /**
+   * This method returns the datasource by populating all the {@link QueryDataSource} with correct nesting level and
+   * sibling order of all the subqueries that are present.
+   * It also plumbs parent query's id and sql id in case the subqueries don't have it set by default
+   *
+   * @param dataSource       Datasource whose subqueries need to be populated
+   * @param parentQueryId    Parent Query's ID, can be null if do not need to update this in the subqueries
+   * @param parentSqlQueryId Parent Query's SQL Query ID, can be null if do not need to update this in the subqueries
+   * @return DataSource populated with the subqueries
+   */
+  private DataSource generateSubqueryIds(
+      DataSource dataSource,
+      @Nullable final String parentQueryId,
+      @Nullable final String parentSqlQueryId
+  )
+  {
+    Queue<DataSource> queue = new LinkedList<>();
+    queue.add(dataSource);
+
+    /*
+    Performs BFS on the datasource tree to find the nesting level, and the sibling order of the query datasource
+     */
+    Map<DataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds = new HashMap<>();
+    int level = 1;
+    while (!queue.isEmpty()) {
+      int size = queue.size();
+      int siblingOrder = 1;
+      for (int i = 0; i < size; ++i) {
+        DataSource currentDataSource = queue.poll();
+        if (currentDataSource instanceof QueryDataSource) {
+          queryDataSourceToSubqueryIds.put(currentDataSource, new Pair<>(level, siblingOrder));

Review comment:
       It is not clear to me why we are not calling `insertSubqueryIds` right here. 
We already have the level information at this point, so we could populate the ids 
for the query corresponding to this QueryDataSource immediately instead of 
saving them in a map and applying them later. Am I missing something? 
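
       For illustration, here is a minimal sketch of the two-pass shape the diff 
uses: a breadth-first pass that records `level.siblingOrder` per query node, then 
a recursive pass that rebuilds the tree. The diff's own comment gives the reason 
for the second pass (the methods on datasources and queries return new objects). 
`Node`, `computeIds` and `applyIds` are hypothetical stand-ins, not Druid classes, 
and the identity-keyed map is a choice made for this sketch only.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for an immutable datasource tree; not a Druid class.
final class Node {
  final String label;        // id carried by this node, or null if none yet
  final boolean isQuery;     // true for nodes playing the role of a QueryDataSource
  final List<Node> children;

  Node(String label, boolean isQuery, List<Node> children) {
    this.label = label;
    this.isQuery = isQuery;
    this.children = children;
  }
}

final class SubqueryIdSketch {
  // Pass 1: breadth-first walk that records "level.siblingOrder" for each query node.
  // The sibling counter restarts at every level, as in the diff under review.
  static Map<Node, String> computeIds(Node root) {
    Map<Node, String> ids = new IdentityHashMap<>();
    Queue<Node> queue = new ArrayDeque<>();
    queue.add(root);
    int level = 1;
    while (!queue.isEmpty()) {
      int size = queue.size();
      int siblingOrder = 1;
      for (int i = 0; i < size; i++) {
        Node current = queue.poll();
        if (current.isQuery) {
          ids.put(current, level + "." + siblingOrder);
          siblingOrder++;
        }
        queue.addAll(current.children);
      }
      level++;
    }
    return ids;
  }

  // Pass 2: nodes are immutable, so the tree is rebuilt bottom-up with the ids applied.
  static Node applyIds(Node node, Map<Node, String> ids) {
    List<Node> rebuilt = new ArrayList<>();
    for (Node child : node.children) {
      rebuilt.add(applyIds(child, ids));
    }
    return new Node(ids.getOrDefault(node, node.label), node.isQuery, rebuilt);
  }
}
```

       In this sketch the first pass only reads the original tree, so the map keys 
stay valid while the second pass creates the replacement nodes.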

##########
File path: processing/src/main/java/org/apache/druid/query/UnionQueryRunner.java
##########
@@ -71,19 +74,23 @@ public UnionQueryRunner(
         return new MergeSequence<>(
             query.getResultOrdering(),
             Sequences.simple(
-                Lists.transform(
-                    unionDataSource.getDataSources(),
-                    (Function<DataSource, Sequence<T>>) singleSource ->
-                        baseRunner.run(
-                            queryPlus.withQuery(
-                                Queries.withBaseDataSource(query, singleSource)
-                                       // assign the subqueryId. this will be used to validate that every query servers
-                                       // have responded per subquery in RetryQueryRunner
-                                       .withDefaultSubQueryId()
-                            ),
-                            responseContext
-                        )
-                )
+                IntStream.range(0, unionDataSource.getDataSources().size())
+                         .mapToObj(i -> new Pair<>(i + 1, unionDataSource.getDataSources().get(i)))
+                         .map(indexBaseDataSourcePair ->
+                                  baseRunner.run(
+                                      queryPlus.withQuery(Queries.withBaseDataSource(
+                                          query,
+                                          indexBaseDataSourcePair.rhs
+                                      ).withSubQueryId(
+                                          generateSubqueryId(
+                                              query.getSubQueryId(),
+                                              // toString() works since the datasource will be a TableDataSource
+                                              indexBaseDataSourcePair.rhs.toString(),

Review comment:
       Maybe you can call `getName()` here instead of relying on `toString()`? 
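
       A small sketch of the idea, assuming a table-like type that exposes its 
name through an accessor. `NamedTable` and `idFor` are hypothetical and are not 
the Druid classes; the point is only that `toString()` carries no formal 
contract, while an accessor states the intent directly.

```java
// Hypothetical stand-in for a table-backed datasource; not the Druid class.
final class NamedTable {
  private final String name;

  NamedTable(String name) {
    this.name = name;
  }

  String getName() {
    return name;
  }

  // A debug-oriented representation; its exact format may change at any time.
  @Override
  public String toString() {
    return "NamedTable{name='" + name + "'}";
  }
}

final class SubqueryIdFromName {
  // Hypothetical id format: the table name followed by a 1-based position.
  static String idFor(NamedTable table, int position) {
    return table.getName() + "." + position;
  }
}
```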

##########
File path: 
server/src/main/java/org/apache/druid/server/ClientQuerySegmentWalker.java
##########
@@ -431,6 +448,101 @@ private DataSource inlineIfNecessary(
         );
   }
 
+  /**
+   * This method returns the datasource by populating all the {@link 
QueryDataSource} with correct nesting level and
+   * sibling order of all the subqueries that are present.
+   * It also plumbs parent query's id and sql id in case the subqueries don't 
have it set by default
+   *
+   * @param dataSource       Datasource whose subqueries need to be populated
+   * @param parentQueryId    Parent Query's ID, can be null if do not need to 
update this in the subqueries
+   * @param parentSqlQueryId Parent Query's SQL Query ID, can be null if do 
not need to update this in the subqueries
+   * @return DataSource populated with the subqueries
+   */
+  private DataSource generateSubqueryIds(
+      DataSource dataSource,
+      @Nullable final String parentQueryId,
+      @Nullable final String parentSqlQueryId
+  )
+  {
+    Queue<DataSource> queue = new LinkedList<>();
+    queue.add(dataSource);
+
+    /*
+    Performs BFS on the datasource tree to find the nesting level, and the 
sibling order of the query datasource
+     */
+    Map<DataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds = new HashMap<>();

Review comment:
       ```suggestion
       Map<QueryDataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds = new HashMap<>();
       ```

##########
File path: 
server/src/main/java/org/apache/druid/server/ClientQuerySegmentWalker.java
##########
@@ -431,6 +448,101 @@ private DataSource inlineIfNecessary(
         );
   }
 
+  /**
+   * This method returns the datasource by populating all the {@link 
QueryDataSource} with correct nesting level and
+   * sibling order of all the subqueries that are present.
+   * It also plumbs parent query's id and sql id in case the subqueries don't 
have it set by default
+   *
+   * @param dataSource       Datasource whose subqueries need to be populated
+   * @param parentQueryId    Parent Query's ID, can be null if do not need to 
update this in the subqueries
+   * @param parentSqlQueryId Parent Query's SQL Query ID, can be null if do 
not need to update this in the subqueries
+   * @return DataSource populated with the subqueries
+   */
+  private DataSource generateSubqueryIds(
+      DataSource dataSource,
+      @Nullable final String parentQueryId,
+      @Nullable final String parentSqlQueryId
+  )
+  {
+    Queue<DataSource> queue = new LinkedList<>();
+    queue.add(dataSource);
+
+    /*
+    Performs BFS on the datasource tree to find the nesting level, and the 
sibling order of the query datasource
+     */
+    Map<DataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds = new 
HashMap<>();
+    int level = 1;
+    while (!queue.isEmpty()) {
+      int size = queue.size();
+      int siblingOrder = 1;
+      for (int i = 0; i < size; ++i) {
+        DataSource currentDataSource = queue.poll();
+        if (currentDataSource instanceof QueryDataSource) {
+          queryDataSourceToSubqueryIds.put(currentDataSource, new 
Pair<>(level, siblingOrder));
+          ++siblingOrder;
+        }
+        queue.addAll(currentDataSource.getChildren());
+      }
+      ++level;
+    }
+    /*
+    Returns the datasource by populating all the subqueries with the id 
generated in the map above.
+    Implemented in a separate function since the methods on datasource and 
queries return a new datasource/query
+     */
+    return insertSubqueryIds(dataSource, queryDataSourceToSubqueryIds, 
parentQueryId, parentSqlQueryId);
+  }
+
+  /**
+   * To be used in conjunction with {@code generateSubqueryIds()} method. This 
does the actual task of populating the
+   * query's id, subQueryId and sqlQueryId
+   *
+   * @param dataSource                   The datasource to be populated with 
the subqueries
+   * @param queryDataSourceToSubqueryIds Map of the datasources to their level 
and sibling order
+   * @param parentQueryId                Parent query's id
+   * @param parentSqlQueryId             Parent query's sqlQueryId
+   * @return Populates the subqueries from the map
+   */
+  private DataSource insertSubqueryIds(
+      DataSource dataSource,
+      Map<DataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds,
+      @Nullable final String parentQueryId,
+      @Nullable final String parentSqlQueryId
+  )
+  {
+    if (queryDataSourceToSubqueryIds.containsKey(dataSource)) {
+      if (dataSource instanceof QueryDataSource) { // This should always be true, done for typecasting

Review comment:
       This `instanceof` check will be redundant once you change the key type of the map to `QueryDataSource`. 

##########
File path: 
server/src/main/java/org/apache/druid/server/ClientQuerySegmentWalker.java
##########
@@ -431,6 +448,101 @@ private DataSource inlineIfNecessary(
         );
   }
 
+  /**
+   * This method returns the datasource by populating all the {@link 
QueryDataSource} with correct nesting level and
+   * sibling order of all the subqueries that are present.
+   * It also plumbs parent query's id and sql id in case the subqueries don't 
have it set by default
+   *
+   * @param dataSource       Datasource whose subqueries need to be populated
+   * @param parentQueryId    Parent Query's ID, can be null if do not need to 
update this in the subqueries
+   * @param parentSqlQueryId Parent Query's SQL Query ID, can be null if do 
not need to update this in the subqueries
+   * @return DataSource populated with the subqueries
+   */
+  private DataSource generateSubqueryIds(
+      DataSource dataSource,
+      @Nullable final String parentQueryId,
+      @Nullable final String parentSqlQueryId
+  )
+  {
+    Queue<DataSource> queue = new LinkedList<>();
+    queue.add(dataSource);
+
+    /*
+    Performs BFS on the datasource tree to find the nesting level, and the 
sibling order of the query datasource
+     */
+    Map<DataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds = new 
HashMap<>();
+    int level = 1;
+    while (!queue.isEmpty()) {
+      int size = queue.size();
+      int siblingOrder = 1;
+      for (int i = 0; i < size; ++i) {
+        DataSource currentDataSource = queue.poll();
+        if (currentDataSource instanceof QueryDataSource) {
+          queryDataSourceToSubqueryIds.put(currentDataSource, new Pair<>(level, siblingOrder));

Review comment:
       I see. Thank you for explaining that. 

##########
File path: 
server/src/main/java/org/apache/druid/server/ClientQuerySegmentWalker.java
##########
@@ -431,6 +448,101 @@ private DataSource inlineIfNecessary(
         );
   }
 
+  /**
+   * This method returns the datasource by populating all the {@link 
QueryDataSource} with correct nesting level and
+   * sibling order of all the subqueries that are present.
+   * It also plumbs parent query's id and sql id in case the subqueries don't 
have it set by default
+   *
+   * @param dataSource       Datasource whose subqueries need to be populated
+   * @param parentQueryId    Parent Query's ID, can be null if do not need to 
update this in the subqueries
+   * @param parentSqlQueryId Parent Query's SQL Query ID, can be null if do 
not need to update this in the subqueries
+   * @return DataSource populated with the subqueries
+   */
+  private DataSource generateSubqueryIds(
+      DataSource dataSource,
+      @Nullable final String parentQueryId,
+      @Nullable final String parentSqlQueryId
+  )
+  {
+    Queue<DataSource> queue = new LinkedList<>();
+    queue.add(dataSource);
+
+    /*
+    Performs BFS on the datasource tree to find the nesting level, and the 
sibling order of the query datasource
+     */
+    Map<DataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds = new 
HashMap<>();
+    int level = 1;
+    while (!queue.isEmpty()) {
+      int size = queue.size();
+      int siblingOrder = 1;
+      for (int i = 0; i < size; ++i) {
+        DataSource currentDataSource = queue.poll();
+        if (currentDataSource instanceof QueryDataSource) {
+          queryDataSourceToSubqueryIds.put(currentDataSource, new 
Pair<>(level, siblingOrder));
+          ++siblingOrder;
+        }
+        queue.addAll(currentDataSource.getChildren());
+      }
+      ++level;
+    }
+    /*
+    Returns the datasource by populating all the subqueries with the id 
generated in the map above.
+    Implemented in a separate function since the methods on datasource and 
queries return a new datasource/query
+     */
+    return insertSubqueryIds(dataSource, queryDataSourceToSubqueryIds, 
parentQueryId, parentSqlQueryId);
+  }
+
+  /**
+   * To be used in conjunction with {@code generateSubqueryIds()} method. This 
does the actual task of populating the
+   * query's id, subQueryId and sqlQueryId
+   *
+   * @param dataSource                   The datasource to be populated with 
the subqueries
+   * @param queryDataSourceToSubqueryIds Map of the datasources to their level 
and sibling order
+   * @param parentQueryId                Parent query's id
+   * @param parentSqlQueryId             Parent query's sqlQueryId
+   * @return Populates the subqueries from the map
+   */
+  private DataSource insertSubqueryIds(
+      DataSource dataSource,
+      Map<DataSource, Pair<Integer, Integer>> queryDataSourceToSubqueryIds,
+      @Nullable final String parentQueryId,
+      @Nullable final String parentSqlQueryId
+  )
+  {
+    if (queryDataSourceToSubqueryIds.containsKey(dataSource)) {
+      if (dataSource instanceof QueryDataSource) { // This should always be 
true, done for typecasting
+        QueryDataSource queryDataSource = (QueryDataSource) dataSource;
+        Pair<Integer, Integer> nestingInfo = 
queryDataSourceToSubqueryIds.get(dataSource);
+        String subQueryId = nestingInfo.lhs.toString() + "." + 
nestingInfo.rhs.toString();
+        Query<?> query = queryDataSource.getQuery();
+
+        if (StringUtils.isEmpty(query.getSubQueryId())) {
+          query = query.withSubQueryId(subQueryId);
+        }
+
+        if (StringUtils.isEmpty(query.getId()) && 
StringUtils.isNotEmpty(parentQueryId)) {
+          query = query.withId(parentQueryId);
+        }
+
+        if (StringUtils.isEmpty(query.getSqlQueryId()) && 
StringUtils.isNotEmpty(parentSqlQueryId)) {
+          query = query.withSqlQueryId(parentSqlQueryId);
+        }
+
+        queryDataSource = new QueryDataSource(query);
+        dataSource = queryDataSource;

Review comment:
       ```suggestion
           dataSource = new QueryDataSource(query);
       ```



