Re: [PR] IGNITE-21897 Sql. Invalid collation passthrough for sorted aggregates [ignite-3]

via GitHub Tue, 02 Apr 2024 00:43:38 -0700


korlov42 commented on code in PR #3537:
URL: https://github.com/apache/ignite-3/pull/3537#discussion_r1547317500



##########
modules/sql-engine/src/main/java/org/apache/ignite/internal/sql/engine/rel/agg/IgniteSortAggregateBase.java:
##########
@@ -77,22 +80,42 @@ default Pair<RelTraitSet, List<RelTraitSet>> 
passThroughCollation(
             RelTraitSet nodeTraits, List<RelTraitSet> inputTraits
     ) {
         RelCollation required = TraitUtils.collation(nodeTraits);
-        ImmutableBitSet requiredKeys = ImmutableBitSet.of(required.getKeys());
-        RelCollation collation;
 
-        if (getGroupSet().contains(requiredKeys)) {
-            List<RelFieldCollation> newCollationFields = new 
ArrayList<>(getGroupSet().cardinality());
-            newCollationFields.addAll(required.getFieldCollations());
-
-            ImmutableBitSet keysLeft = getGroupSet().except(requiredKeys);
+        if (getGroupSet().isEmpty()) {
+            return passThroughCollation(nodeTraits, inputTraits, 
RelCollations.EMPTY); // Erase collation for a single group.
+        } else if (required.getFieldCollations().isEmpty()) {
+            return passThroughCollation(nodeTraits, inputTraits, 
TraitUtils.createCollation(getGroupSet().asList())); // No match.
+        }
 
-            keysLeft.forEach(fieldIdx -> 
newCollationFields.add(TraitUtils.createFieldCollation(fieldIdx)));
+        ImmutableBitSet groupingColumns = 
ImmutableBitSet.range(getGroupSet().cardinality());
+        IntList prefix = maxPrefix(required.getKeys(), 
groupingColumns.asSet());
 
-            collation = RelCollations.of(newCollationFields);
-        } else {
-            collation = TraitUtils.createCollation(getGroupSet().toList());
+        if (prefix.isEmpty()) {
+            return passThroughCollation(nodeTraits, inputTraits, 
TraitUtils.createCollation(getGroupSet().asList())); // No match.
         }
 
+        // Rearrange grouping columns to satisfy required collation (as much 
as possible).
+        List<RelFieldCollation> newCollationColumns = new 
ArrayList<>(getGroupSet().cardinality());
+        Mapping mapping = 
Commons.trimmingMapping(groupingColumns.cardinality(), groupingColumns);
+
+        // Add required columns first.
+        prefix.intStream().map(mapping::getTarget)
+                .forEach(fieldIdx -> 
newCollationColumns.add(TraitUtils.createFieldCollation(fieldIdx)));
+
+        // Then add missed grouping columns.
+        groupingColumns.asList().stream()
+                .filter(not(prefix::contains))
+                .map(mapping::getTarget)
+                .forEach(fieldIdx -> 
newCollationColumns.add(TraitUtils.createFieldCollation(fieldIdx)));
+
+        return passThroughCollation(nodeTraits, inputTraits, 
RelCollations.of(newCollationColumns));
+    }
+
+    private static Pair<RelTraitSet, List<RelTraitSet>> passThroughCollation(
+            RelTraitSet nodeTraits,
+            List<RelTraitSet> inputTraits,
+            RelCollation collation
+    ) {
         return Pair.of(nodeTraits.replace(collation),
                 List.of(inputTraits.get(0).replace(collation)));
     }

Review Comment:
   in general, such propagation (when you use the same collation for input and 
output) is not correct. Here is an example:
   
   assume, I have table `T` with tuple `(a, b, c, d)`. In case of query like 
`GROUP BY b, d ORDER BY b, d`, the input collation should be `[1, 3]`, while 
the node (or, in other words, output) collation will be `[0, 1]`. Such 
transformation caused by the fact, that aggregate node always puts the grouping 
column in the very beginning of the resulting tuple



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] IGNITE-21897 Sql. Invalid collation passthrough for sorted aggregates [ignite-3]

Reply via email to