gortiz commented on code in PR #14797:
URL: https://github.com/apache/pinot/pull/14797#discussion_r1915363570


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/InStageStatsTreeBuilder.java:
##########
@@ -138,11 +138,11 @@ public ObjectNode visitFilter(FilterNode node, Void context) {
 
   @Override
   public ObjectNode visitJoin(JoinNode node, Void context) {
-    if (node.getJoinStrategy() == JoinNode.JoinStrategy.HASH) {
-      return recursiveCase(node, MultiStageOperator.Type.HASH_JOIN);
-    } else {
-      assert node.getJoinStrategy() == JoinNode.JoinStrategy.LOOKUP;
+    if (node.getJoinStrategy() == JoinNode.JoinStrategy.LOOKUP) {
       return recursiveCase(node, MultiStageOperator.Type.LOOKUP_JOIN);
+    } else {
+      // TODO: Consider renaming this operator type. It handles multiple join strategies.

Review Comment:
   I think we are using names here in a strange way. This is a hash operator 
because it implements the join using a hash map. The other is a lookup join 
operator because it implements it using lookup logic.
   
   In parallel, we have join strategies. One of the strategies creates logical 
partitions at query time based on the values of the columns being joined. The 
way these partitions are decided is based on hash code, so it is called hash 
strategy. In the documentation I used [Query time partition join 
strategy](https://docs.pinot.apache.org/users/user-guide-query/multi-stage-query/join-strategies/query-time-partition-join-strategy)
 because I didn't want to focus too much on the fact that it uses 
hashes.
   
   Imagine a scenario where we add sorted joins. The type of the join should be 
_sort_ and the strategy used for the distribution of its inputs may be hash.
   
   TL;DR: I think we need to distinguish between join algorithms (lookup, hash, 
sorted, nested loop) and distribution strategies (hash/partitioned, local, 
random, broadcast, etc.).
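   
   A rough sketch of that separation, to make the idea concrete. All of the 
names below (`JoinPlanSketch`, `JoinAlgorithm`, `DistributionStrategy`, 
`JoinPlan`, `statsOperatorType`) are hypothetical and do not exist in Pinot; 
this only illustrates modelling the two choices as independent dimensions, 
with the operator type reported in stats derived from the algorithm alone:

```java
// Hypothetical sketch: none of these types exist in Pinot under these names.
final class JoinPlanSketch {

  /** How a single operator matches rows from its two inputs. */
  enum JoinAlgorithm { HASH, LOOKUP, SORTED, NESTED_LOOP }

  /** How input rows are routed to operator instances before the join runs. */
  enum DistributionStrategy { HASH_PARTITIONED, LOCAL, RANDOM, BROADCAST }

  /** A join described by two independent choices. */
  static final class JoinPlan {
    final JoinAlgorithm algorithm;
    final DistributionStrategy distribution;

    JoinPlan(JoinAlgorithm algorithm, DistributionStrategy distribution) {
      this.algorithm = algorithm;
      this.distribution = distribution;
    }

    String describe() {
      return algorithm + " join over " + distribution + " inputs";
    }
  }

  /** The operator type shown in stats depends only on the algorithm. */
  static String statsOperatorType(JoinPlan plan) {
    return plan.algorithm.name() + "_JOIN";
  }

  public static void main(String[] args) {
    // The sorted-join scenario: sort algorithm, hash-distributed inputs.
    JoinPlan sorted =
        new JoinPlan(JoinAlgorithm.SORTED, DistributionStrategy.HASH_PARTITIONED);
    System.out.println(sorted.describe());
    System.out.println(statsOperatorType(sorted));
  }
}
```

   With this shape, adding a new distribution strategy would not require 
touching the stats operator types, and adding a sorted join would not require 
a new strategy.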



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

