Dandandan opened a new pull request #11:
URL: https://github.com/apache/arrow-datafusion/pull/11


   This is a first (naive, but probably not that bad) implementation of the 
Cartesian join and the CROSS JOIN syntax.
   
   The left side is loaded into memory; the right side is streamed and 
combined with the buffered left side as it arrives.
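
   To illustrate the idea only, here is a minimal Python sketch of "buffer the 
left side, stream the right side". It is not the code in this PR: the names 
`cross_join`, `Row`, and `Batch` are hypothetical, and plain lists of tuples 
stand in for Arrow record batches.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, ...]   # hypothetical stand-in for one row
Batch = List[Row]       # hypothetical stand-in for a record batch


def cross_join(left_batches: Iterable[Batch],
               right_stream: Iterable[Batch]) -> Iterator[Batch]:
    """Buffer the left input, then emit one output batch per right batch."""
    # Load the whole left side into memory once.
    left_rows: List[Row] = [row for batch in left_batches for row in batch]
    # Stream the right side; every right row is paired with every left row.
    for right_batch in right_stream:
        yield [l + r for l in left_rows for r in right_batch]
```

   In this sketch each output batch holds `len(left_rows) * len(right_batch)` 
rows, so output batches grow with the size of the buffered left side.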
   
   Memory consumption could be improved. The current implementation produces 
large batches if both sides are big. This could be solved by keeping a 
"cursor" into the left side and producing the batches one by one instead of 
concatenating the result of the full Cartesian product.
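
   A rough sketch of the "cursor" idea, again in Python and again only an 
illustration under assumptions: `cross_join_with_cursor` and its `left_chunk` 
parameter are hypothetical names, and lists of tuples stand in for record 
batches. The cursor walks the buffered left side in fixed-size windows, so 
each output batch is bounded by `left_chunk * len(right_batch)` rows.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, ...]   # hypothetical stand-in for one row
Batch = List[Row]       # hypothetical stand-in for a record batch


def cross_join_with_cursor(left_rows: List[Row],
                           right_stream: Iterable[Batch],
                           left_chunk: int = 2) -> Iterator[Batch]:
    """Emit at most left_chunk * len(right_batch) rows per output batch."""
    if left_chunk < 1:
        raise ValueError("left_chunk must be at least 1")
    for right_batch in right_stream:
        cursor = 0  # current position in the buffered left side
        while cursor < len(left_rows):
            # Only a window of the left side is combined at a time.
            window = left_rows[cursor:cursor + left_chunk]
            yield [l + r for l in window for r in right_batch]
            cursor += left_chunk
```

   The total number of rows produced is unchanged; only the size of each 
individual batch is capped.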
   
   FYI @andygrove @alamb @jorgecarleitao
   
   This also makes query 9 run in DataFusion. Performance is not OK yet, but 
I believe that is not related to the cross join itself and is caused by 
another issue.
   
   



