jihoonson commented on a change in pull request #11379:
URL: https://github.com/apache/druid/pull/11379#discussion_r665127636



##########
File path: processing/src/main/java/org/apache/druid/query/groupby/strategy/GroupByStrategyV2.java
##########
@@ -213,7 +216,48 @@ public boolean doMergeResults(final GroupByQuery query)
     context.put("finalize", false);
     context.put(GroupByQueryConfig.CTX_KEY_STRATEGY, GroupByStrategySelector.STRATEGY_V2);
     context.put(CTX_KEY_OUTERMOST, false);
-    if (query.getUniversalTimestamp() != null) {
+    Map<String, Object> timestampFieldContext = GroupByQueryHelper.findTimestampResultField(query);
+    context.putAll(timestampFieldContext);
+
+    Granularity granularity = query.getGranularity();
+    List<DimensionSpec> dimensionSpecs = query.getDimensions();
+    final String timestampResultField = (String) timestampFieldContext.get(GroupByQuery.CTX_TIMESTAMP_RESULT_FIELD);
+    final boolean hasTimestampResultField = timestampResultField != null
+                                            && query.getContextBoolean(CTX_KEY_OUTERMOST, true);
+    int timestampResultFieldIndex = 0;
+    if (hasTimestampResultField) {
+      // For SQL like "GROUP BY city_id, TIME_FLOOR(__time TO DAY)",
+      // the originally translated query has granularity=all and dimensions [d0, d1].
+      // The better plan is granularity=day and dimensions [d0],
+      // but the ResultRow structure then changes from [d0, d1] to [__time, d0].
+      // That structure must be restored to [d0, d1] (actually [d0, __time]) before postAggs are applied.
+      final Granularity timestampResultFieldGranularity
+          = (Granularity) timestampFieldContext.get(GroupByQuery.CTX_TIMESTAMP_RESULT_FIELD_GRANULARITY);
+      dimensionSpecs =
+          query.getDimensions()
+               .stream()
+               .filter(dimensionSpec -> !dimensionSpec.getOutputName().equals(timestampResultField))
+               .collect(Collectors.toList());
+      granularity = timestampResultFieldGranularity;
+      // When timestampResultField is the last dimension, sortByDimsFirst must be set to true;
+      // otherwise the downstream sorts by the row's timestamp first, which breaks the expected final ordering.
+      timestampResultFieldIndex = (int) timestampFieldContext.get(GroupByQuery.CTX_TIMESTAMP_RESULT_FIELD_INDEX);
+      if (!query.getContextSortByDimsFirst() && timestampResultFieldIndex == query.getDimensions().size() - 1) {
+        context.put(GroupByQuery.CTX_KEY_SORT_BY_DIMS_FIRST, true);
+      }
+      // When timestampResultField is the first dimension and sortByDimsFirst=true,
+      // it is effectively equivalent to sortByDimsFirst=false.
+      if (query.getContextSortByDimsFirst() && timestampResultFieldIndex == 0) {
+        context.put(GroupByQuery.CTX_KEY_SORT_BY_DIMS_FIRST, false);
+      }
+      // When hasTimestampResultField=true and timestampResultField is neither the
+      // first nor the last dimension, DefaultLimitSpec always does the reordering.
+    }
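
The dimension-filtering and sortByDimsFirst decisions in the hunk above can be sketched in isolation. The following is a minimal standalone example, not Druid code: plain strings stand in for `DimensionSpec`, and the class and method names are illustrative only.

```java
import java.util.List;
import java.util.stream.Collectors;

// Standalone sketch (hypothetical names, not Druid's API) of the decision
// logic in the patch: drop the timestamp result field from the dimension
// list, and decide whether the sortByDimsFirst flag needs flipping.
public class TimestampDimensionSketch
{
  /** Removes the timestamp result field from the dimension list, as the patch does. */
  static List<String> dropTimestampDimension(List<String> dimensions, String timestampResultField)
  {
    return dimensions.stream()
                     .filter(d -> !d.equals(timestampResultField))
                     .collect(Collectors.toList());
  }

  /**
   * Decides the effective sortByDimsFirst flag:
   * - timestamp field last: force sortByDimsFirst=true, otherwise the
   *   downstream sorts by row timestamp first and breaks the final ordering;
   * - timestamp field first with sortByDimsFirst=true: equivalent to false.
   * Returns null when the flag is left untouched (middle position, where
   * DefaultLimitSpec reorders anyway).
   */
  static Boolean effectiveSortByDimsFirst(boolean sortByDimsFirst, int timestampIndex, int dimCount)
  {
    if (!sortByDimsFirst && timestampIndex == dimCount - 1) {
      return true;
    }
    if (sortByDimsFirst && timestampIndex == 0) {
      return false;
    }
    return null;
  }

  public static void main(String[] args)
  {
    // GROUP BY city_id, TIME_FLOOR(__time TO DAY): dimensions [d0, d1],
    // where d1 is the timestamp result field at index 1 (the last position).
    List<String> dims = List.of("d0", "d1");
    System.out.println(dropTimestampDimension(dims, "d1"));      // [d0]
    System.out.println(effectiveSortByDimsFirst(false, 1, 2));   // true
    System.out.println(effectiveSortByDimsFirst(true, 0, 2));    // false
  }
}
```
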

Review comment:
       > The granularity=all and sortByDimsFirst are still unchanged in DruidQuery.toGroupByQuery, because some code such as subtotals relies on the granularity and sortByDimsFirst; in other words, the basic idea is still to treat this optimization as an internal groupBy processing improvement.
   
   In general, I don't think this should be an engine-level optimization; it should be plan-level, because the job of the Druid planner is to produce an optimal native query. Making this an engine-level optimization introduces several problems: 1) the explain result doesn't match the native query actually being executed; 2) the optimization logic must be replicated for every strategy, and that overhead grows whenever we add a new strategy; 3) the optimization code requires readers to understand the relationship between the code in the SQL planner and the code in the groupBy strategy; 4) having the optimization spread across `DruidQuery` and `GroupByStrategyV2` can couple them, which makes 2) and 3) even harder; and so on.
   
   However, I think I understand why you want to do this. The fundamental problem here seems to be that the timestamp column name is fixed as `__time` and must be the first column for merge. Can you please add a comment in the code about the tradeoffs behind this design decision? The comment should explain what the tradeoffs are, such as what code relies on what assumption, and what the benefits and potential problems of this implementation are. This explanation should be documented in both `GroupByStrategyV2` and `DruidQuery.toGroupByQuery` so that other people don't have to repeat the analysis you did for this PR.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


