amaliujia commented on a change in pull request #2006:
URL: https://github.com/apache/calcite/pull/2006#discussion_r439784121



##########
File path: core/src/main/java/org/apache/calcite/adapter/enumerable/EnumerableMergeJoin.java
##########
@@ -222,6 +277,30 @@ public static boolean isMergeJoinSupported(JoinRelType joinType) {
     return mapping;
   }
 
+  private RelCollation extendCollation(RelCollation collation, List<Integer> keys) {
+    List<RelFieldCollation> fieldsForNewCollation = new ArrayList<>(keys.size());
+    fieldsForNewCollation.addAll(collation.getFieldCollations());
+    Set<Integer> keySet = new HashSet<>(keys);

Review comment:
       No, not for subset cases. For example, with required collation [foo.a, foo.b] 
on a join with condition `foo.a=bar.a and foo.b=bar.b and foo.c=bar.c and foo.d=bar.d`, 
passing down either [foo.a, foo.b, foo.c, foo.d] or [foo.a, foo.b, foo.d, foo.c] 
gives the correct answer (provided the corresponding collation is also pushed to the 
right join input).
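
   To make the two candidate orders concrete, here is a minimal sketch of the extension step. It is not the code in this PR: it uses plain column ordinals instead of `RelCollation`/`RelFieldCollation`, and `ExtendCollationSketch` and `extend` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ExtendCollationSketch {
  /**
   * Keeps the required collation as a prefix and appends the join keys
   * that it does not already cover, in the order the keys are listed.
   */
  static List<Integer> extend(List<Integer> required, List<Integer> joinKeys) {
    // LinkedHashSet preserves insertion order and ignores repeated keys.
    Set<Integer> result = new LinkedHashSet<>(required);
    result.addAll(joinKeys);
    return new ArrayList<>(result);
  }

  public static void main(String[] args) {
    // Assumption: columns a, b, c, d are modelled as ordinals 0, 1, 2, 3.
    List<Integer> required = Arrays.asList(0, 1);
    System.out.println(extend(required, Arrays.asList(0, 1, 2, 3)));
    System.out.println(extend(required, Arrays.asList(0, 1, 3, 2)));
  }
}
```

   Both results start with the required prefix [a, b]; they differ only in the order of the remaining keys c and d.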
   
   This is because, in the MergeJoin implementation, a pointer moves when the left 
tuple is not equal to the right tuple, and the only requirement is that the next tuple 
is at least as large as the previous ones.
   
   Using the same example:
   the left and right join inputs are both sorted by [a, b, d, c], and therefore 
both are also sorted by the required prefix [a, b]. So 
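   1) if `left_tuple.compare(right_tuple) < 0`, move the left pointer. The next left 
tuple is no smaller than the previous left tuple, and every right tuple still ahead 
is no smaller than the current right tuple, so no match is skipped (both inputs are 
sorted the same way, and both are sorted by [a, b]).
   2) the same holds for the `left_tuple.compare(right_tuple) > 0` case, moving the 
right pointer instead.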
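
   The pointer argument above can be sketched as a small standalone merge join. This is an illustration under stated assumptions, not Calcite's `EnumerableMergeJoin`: rows are plain `int[]` arrays with columns a, b, c, d at ordinals 0 to 3, and `MergeJoinSketch`, `compare` and `join` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MergeJoinSketch {
  /** Compares two rows on the given key ordinals, in the given order. */
  static int compare(int[] left, int[] right, int[] keyOrder) {
    for (int k : keyOrder) {
      int c = Integer.compare(left[k], right[k]);
      if (c != 0) {
        return c;
      }
    }
    return 0;
  }

  /** Inner merge join; both inputs are first sorted by the same keyOrder. */
  static List<String> join(List<int[]> left, List<int[]> right, int[] keyOrder) {
    Comparator<int[]> order = (x, y) -> compare(x, y, keyOrder);
    List<int[]> l = new ArrayList<>(left);
    List<int[]> r = new ArrayList<>(right);
    l.sort(order);
    r.sort(order);
    List<String> out = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < l.size() && j < r.size()) {
      int c = compare(l.get(i), r.get(j), keyOrder);
      if (c < 0) {
        i++; // left tuple is smaller: advance the left pointer
      } else if (c > 0) {
        j++; // right tuple is smaller: advance the right pointer
      } else {
        // Emit the current left row against the run of equal right rows.
        for (int k = j; k < r.size() && compare(l.get(i), r.get(k), keyOrder) == 0; k++) {
          out.add(Arrays.toString(l.get(i)) + "=" + Arrays.toString(r.get(k)));
        }
        i++;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    List<int[]> left = Arrays.asList(
        new int[] {1, 1, 2, 1}, new int[] {1, 1, 1, 2},
        new int[] {1, 2, 0, 0}, new int[] {2, 0, 5, 5});
    List<int[]> right = Arrays.asList(
        new int[] {1, 1, 1, 2}, new int[] {1, 1, 2, 1},
        new int[] {2, 0, 5, 5}, new int[] {3, 0, 0, 0});
    System.out.println(join(left, right, new int[] {0, 1, 2, 3})); // [a, b, c, d]
    System.out.println(join(left, right, new int[] {0, 1, 3, 2})); // [a, b, d, c]
  }
}
```

   With this sample data both key orders produce the same matched rows. Only the order of rows within the same (a, b) group differs, so the output still satisfies the required collation [a, b].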
   



