agavra opened a new issue, #9998:
URL: https://github.com/apache/pinot/issues/9998

   Background: imagine we have the two following table schemas:
   ```
   (A) colA INT, colB STRING
   (B) col1 STRING, col2 INT
   ```
   and I want to issue `SELECT * FROM A JOIN B ON A.colA = B.col2 AND A.colB = 
B.col1`. In this case, we need to hash shuffle the rows of both tables so that 
matching rows land in the same partition. In Calcite, we generate a hash 
distribution on each side's join keys:
   ```
   leftExchange = LogicalExchange.create(leftInput, 
RelDistributions.hash(joinInfo.leftKeys));
   rightExchange = LogicalExchange.create(rightInput, 
RelDistributions.hash(joinInfo.rightKeys));
   ```
   
   When we do this, Calcite sorts the join keys passed into 
`RelDistributions.hash` into ascending order, regardless of the order in which 
we passed them. In the example above, we would have called:
   ```
   RelDistributions.hash(1, 2); // for A, because join keys are in order
   RelDistributions.hash(2, 1); // for B, because we want col2 to be the first join key
   ```
   But Calcite reorders them so that both become `[1, 2]`.
   
   This causes a problem when we hash our keys. Imagine we had the following 
rows:
   ```
   (a) colA: 1, colB: "foo"
   (b) col1: "foo", col2: 1
   ```
   If we simply use the column ordering that Calcite provides for the hash 
exchange, row (a) is hashed as `(1, "foo")` and row (b) as `("foo", 1)`, so we 
may not hash these two rows into the same partition.
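
To make the mismatch concrete, here is a minimal sketch. It assumes an order-dependent hash (`Arrays.hashCode`, chosen only for illustration and not necessarily what Pinot uses) and a hypothetical partition count of 4. The class and method names are invented for this example.

```java
import java.util.Arrays;

final class OrderDependentHashDemo {
    // Hashes the key values in the order given, then maps the hash to a partition.
    // Arrays.hashCode is order-dependent: swapping two values generally changes the result.
    static int partition(Object[] keyValues, int numPartitions) {
        return Math.floorMod(Arrays.hashCode(keyValues), numPartitions);
    }

    public static void main(String[] args) {
        // Key values read in ascending column order, as the reordered distribution dictates.
        Object[] rowA = {1, "foo"};  // (colA, colB)
        Object[] rowB = {"foo", 1};  // (col1, col2)

        System.out.println("A -> partition " + partition(rowA, 4));
        System.out.println("B -> partition " + partition(rowB, 4));
    }
}
```

With these inputs the two rows land in different partitions even though they satisfy the join condition, so the join would silently miss the match.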
   
   #9996 provides a workaround for this problem by using a hash code algorithm 
that intentionally generates collisions independent of the ordering of the 
columns (using hash code addition).
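
The idea of hash code addition can be sketched as follows. This is an illustration of the technique only, not the code from #9996; the class name, method names, and partition count are assumptions made for the example.

```java
import java.util.List;

final class AdditiveKeyHash {
    // Sums the per-value hash codes. Integer addition is commutative (also under
    // overflow), so the result does not depend on the order of the key values.
    static int hash(List<?> keyValues) {
        int h = 0;
        for (Object v : keyValues) {
            h += (v == null) ? 0 : v.hashCode();
        }
        return h;
    }

    // floorMod keeps the partition index non-negative for negative hashes.
    static int partition(List<?> keyValues, int numPartitions) {
        return Math.floorMod(hash(keyValues), numPartitions);
    }

    public static void main(String[] args) {
        List<Object> rowA = List.of(1, "foo");  // (colA, colB)
        List<Object> rowB = List.of("foo", 1);  // (col1, col2)
        System.out.println(partition(rowA, 4) == partition(rowB, 4));
    }
}
```

The trade-off is that any permutation of the same values collides by construction, for example keys `(1, 2)` and `(2, 1)`, which is why the quality of the resulting distribution matters.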
   
   This is likely an acceptable solution in the long run, so long as the 
distribution of the hash codes remains semi-random even after addition (initial 
experiments suggest that it does). A better solution, however, would be to fix 
Calcite to maintain the join key ordering so that we can just hash the exact 
keys.

