kgyrtkirk commented on code in PR #16482:
URL: https://github.com/apache/druid/pull/16482#discussion_r1610984801
##########
sql/src/main/java/org/apache/druid/sql/calcite/rel/Windowing.java:
##########
@@ -270,6 +279,74 @@ public static Windowing fromCalciteStuff(
}
}
+ /**
+ * A wrapper class which stores {@link WindowGroup}
+ * along with its computed {@link WindowOperatorFactory}
+ * <p>
+ * this allows us to sort the window groups in order to optimise the order
of operators we would need to compute
+ * without losing the aggregate column name information (which is part of
the computed WindowOperatorFactory)
+ */
Review Comment:
I don't feel like this apidoc adds much detail - I believe it would be ok
even without it.
Usually I think its better to have the apidoc describe the role or mission
of the class - I believe that adding classes like this are part of an emerging
architecture.
##########
sql/src/main/java/org/apache/druid/sql/calcite/rel/Windowing.java:
##########
@@ -270,6 +279,74 @@ public static Windowing fromCalciteStuff(
}
}
+ /**
+ * A wrapper class which stores {@link WindowGroup}
+ * along with its computed {@link WindowOperatorFactory}
+ * <p>
+ * this allows us to sort the window groups in order to optimise the order
of operators we would need to compute
+ * without losing the aggregate column name information (which is part of
the computed WindowOperatorFactory)
+ */
+ private static class WindowComputationProcessor
+ {
+ private final WindowGroup group;
+ private final OperatorFactory processorOperatorFactory;
+
+ public WindowComputationProcessor(WindowGroup group, OperatorFactory
processorOperatorFactory)
+ {
+ this.group = group;
+ this.processorOperatorFactory = processorOperatorFactory;
+ }
+
+ public WindowGroup getGroup()
+ {
+ return group;
+ }
+
+ public OperatorFactory getProcessorOperatorFactory()
+ {
+ return processorOperatorFactory;
+ }
+
+ @Override
+ public boolean equals(Object o)
+ {
+ if (this == o) {
+ return true;
+ }
+ if (o == null || getClass() != o.getClass()) {
+ return false;
+ }
+ WindowComputationProcessor obj = (WindowComputationProcessor) o;
+ return Objects.equals(group, obj.group) && Objects.equals(
+ processorOperatorFactory,
+ obj.processorOperatorFactory
+ );
+ }
+
+ @Override
+ public int hashCode()
+ {
+ return Objects.hash(group, processorOperatorFactory);
+ }
+ }
+
+ /**
+ * Comparator on {@link WindowComputationProcessor}
+ * to move the empty windows to the front
+ */
+ private static final Comparator<WindowComputationProcessor>
MOVE_EMPTY_GROUPS_FIRST = (o1, o2) -> {
Review Comment:
I think usually its more practical to have the comparator inside the class
it belongs to
##########
sql/src/main/java/org/apache/druid/sql/calcite/rel/Windowing.java:
##########
@@ -232,10 +203,48 @@ public static Windowing fromCalciteStuff(
throw new ISE("No processors from Window[%s], why was this code
called?", window);
}
- ops.add(new WindowOperatorFactory(
+ windowGroupProcessors.add(new WindowComputationProcessor(group, new
WindowOperatorFactory(
processors.size() == 1 ?
processors.get(0) : new ComposingProcessor(processors.toArray(new
Processor[0]))
- ));
+ )));
+ }
+
+ // Track prior partition columns and sort columns group-to-group, so we
only insert sorts and repartitions if
+ // we really need to.
+ List<String> priorPartitionColumns = null;
+ LinkedHashSet<ColumnWithDirection> priorSortColumns = new
LinkedHashSet<>();
+
+ final RelCollation priorCollation =
partialQuery.getScan().getTraitSet().getTrait(RelCollationTraitDef.INSTANCE);
+ if (priorCollation != null) {
+ // Populate initial priorSortColumns using collation of the input to the
window operation. Allows us to skip
+ // the initial sort operator if the rows were already in the desired
order.
+ priorSortColumns = computeSortColumnsFromRelCollation(priorCollation,
sourceRowSignature);
+ }
+
+ windowGroupProcessors.sort(MOVE_EMPTY_GROUPS_FIRST);
+ ArrayList<OperatorFactory> ops = new ArrayList<>();
+ for (WindowComputationProcessor windowComputationProcessor :
windowGroupProcessors) {
Review Comment:
note: I wonder if moving all this into a separate method would make it
clearer what happens
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]