walterddr commented on code in PR #9895:
URL: https://github.com/apache/pinot/pull/9895#discussion_r1042904694


##########
pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/HashJoinOperator.java:
##########
@@ -42,51 +43,98 @@
 
 /**
  * This basic {@code BroadcastJoinOperator} implement a basic broadcast join algorithm.
+ * This algorithm assumes that the broadcast table has to fit in memory since we are not supporting any spilling.
  *
+ * For left join and inner join,
  * <p>It takes the right table as the broadcast side and materialize a hash table. Then for each of the left table row,
  * it looks up for the corresponding row(s) from the hash table and create a joint row.
  *
  * <p>For each of the data block received from the left table, it will generate a joint data block.
  *
- * We currently support left join, inner join and semi join.
+ * For right join,
+ * We broadcast the left table and probe the hash table using right table.
+ *
+ * We currently support left join, inner join and right join.
  * The output is in the format of [left_row, right_row]
  */
 // TODO: Move inequi out of hashjoin. (https://github.com/apache/pinot/issues/9728)
 public class HashJoinOperator extends BaseOperator<TransferableBlock> {
+  private static class JoinResolver {
+    public static JoinResolver create(JoinRelType joinType, Operator<TransferableBlock> leftTableOperator,
+        Operator<TransferableBlock> rightTableOperator, KeySelector leftKeySelector, KeySelector rightKeySelector) {
+      JoinResolver resolver = new JoinResolver();
+      switch (joinType) {
+        case LEFT:
+        case INNER:
+          resolver._broadcastOperator = rightTableOperator;
+          resolver._probeOperator = leftTableOperator;
+          resolver._probeKeySelector = leftKeySelector;
+          resolver._broadcastKeySelector = rightKeySelector;
+          resolver._getLeftRow = (Object[] probeRow, Object[] broadcastRow) -> probeRow;
+          resolver._getRightRow = (Object[] probeRow, Object[] broadcastRow) -> broadcastRow;
+          break;
+        case RIGHT:
+          resolver._broadcastOperator = leftTableOperator;
+          resolver._probeOperator = rightTableOperator;
+          resolver._probeKeySelector = rightKeySelector;
+          resolver._broadcastKeySelector = leftKeySelector;
+          resolver._getLeftRow = (Object[] probeRow, Object[] broadcastRow) -> broadcastRow;
+          resolver._getRightRow = (Object[] probeRow, Object[] broadcastRow) -> probeRow;
+          break;

Review Comment:
   I feel this approach is a bit too complex, and I'm not sure it is the
   most appropriate one.
   A more straightforward way to handle this is to keep the exact same
   algorithm (hash table built on the right side) and add two rules:
   1. if LEFT JOIN || FULL OUTER, and no match is found in the right-side
      hash map, return left side + right null
   2. if RIGHT JOIN || FULL OUTER, and a right-side hash map entry has not
      been matched, return left null + right side.
   
   This way we can also address #9907 at the same time, which you will have
   to do anyway.
   For a more concrete / optimized solution we should consider rewriting
   relational correlations.
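   
   To make the two rules above concrete, here is a minimal standalone
   sketch. It is not Pinot code: `OuterJoinSketch`, `JoinType`, `join`,
   and the "join key is column 0" convention are all hypothetical names
   and assumptions chosen for illustration. It also assumes non-null join
   keys and that the right side fits in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the actual HashJoinOperator: all names here are illustrative.
class OuterJoinSketch {
  enum JoinType { INNER, LEFT, RIGHT, FULL }

  /**
   * Joins on column 0 of each row (an assumption for brevity).
   * Output rows have the layout [left columns..., right columns...].
   */
  static List<Object[]> join(JoinType type, List<Object[]> left, List<Object[]> right,
      int leftWidth, int rightWidth) {
    // Build phase: always hash the right side, regardless of join type.
    Map<Object, List<Integer>> hashTable = new HashMap<>();
    for (int i = 0; i < right.size(); i++) {
      hashTable.computeIfAbsent(right.get(i)[0], k -> new ArrayList<>()).add(i);
    }
    // Tracks which right rows were matched by at least one left row.
    boolean[] rightMatched = new boolean[right.size()];
    List<Object[]> out = new ArrayList<>();

    // Probe phase: look up each left row in the right-side hash table.
    for (Object[] leftRow : left) {
      List<Integer> matches = hashTable.get(leftRow[0]);
      if (matches != null) {
        for (int idx : matches) {
          rightMatched[idx] = true;
          out.add(concat(leftRow, right.get(idx)));
        }
      } else if (type == JoinType.LEFT || type == JoinType.FULL) {
        // Rule 1: no match found, emit left side + right nulls.
        out.add(concat(leftRow, new Object[rightWidth]));
      }
    }

    // Rule 2: after the probe side is exhausted, emit left nulls + each unmatched right row.
    if (type == JoinType.RIGHT || type == JoinType.FULL) {
      for (int i = 0; i < right.size(); i++) {
        if (!rightMatched[i]) {
          out.add(concat(new Object[leftWidth], right.get(i)));
        }
      }
    }
    return out;
  }

  // Concatenates two rows into a fresh Object[] of combined length.
  static Object[] concat(Object[] a, Object[] b) {
    Object[] row = new Object[a.length + b.length];
    System.arraycopy(a, 0, row, 0, a.length);
    System.arraycopy(b, 0, row, a.length, b.length);
    return row;
  }
}
```

   With this shape the build and probe sides never swap, so the output
   order [left_row, right_row] falls out naturally. The cost is one extra
   boolean per right row, plus a final pass that can only run once the
   left side is fully consumed.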



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

