Dandandan opened a new pull request #11: URL: https://github.com/apache/arrow-datafusion/pull/11
This is a first (naive, but probably not that bad) implementation of the cartesian join and the CROSS JOIN syntax. The left side is loaded into memory; the right side is streamed and combined with the left side.

Memory consumption could be improved. The current implementation produces large batches when both sides are big. This could be solved by keeping a "cursor" into the left side and producing the batches one by one instead of concatenating the result of the full cartesian product.

FYI @andygrove @alamb @jorgecarleitao

This also makes query 9 run in DataFusion. Performance is not OK yet, but I believe that is unrelated to the cross join itself and is caused by another issue.
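To make the "cursor" idea concrete, here is a rough sketch of how it could look. It is not the code in this PR: it uses plain `Vec`s instead of Arrow record batches, and the names `CrossJoinStream` and `cross_join` are hypothetical. The left side is held in memory, and for each streamed right batch one output batch is emitted per left row, so no output batch is ever larger than a right batch.

```rust
/// Hypothetical illustration only: rows are plain values, not Arrow batches.
struct CrossJoinStream<L, R, I>
where
    I: Iterator<Item = Vec<R>>,
{
    left: Vec<L>,            // left side, fully loaded into memory
    right: I,                // right side, streamed batch by batch
    current: Option<Vec<R>>, // right batch currently being combined
    left_idx: usize,         // the "cursor" into the left side
}

/// Hypothetical constructor for the sketch above.
fn cross_join<L, R, I>(left: Vec<L>, right: I) -> CrossJoinStream<L, R, I>
where
    I: Iterator<Item = Vec<R>>,
{
    CrossJoinStream { left, right, current: None, left_idx: 0 }
}

impl<L, R, I> Iterator for CrossJoinStream<L, R, I>
where
    L: Clone,
    R: Clone,
    I: Iterator<Item = Vec<R>>,
{
    type Item = Vec<(L, R)>;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            // Fetch a new right batch when there is none, when it is empty,
            // or when the cursor has walked past the last left row.
            let exhausted = match &self.current {
                Some(batch) => batch.is_empty() || self.left_idx >= self.left.len(),
                None => true,
            };
            if exhausted {
                // `?` ends the stream once the right side is drained.
                self.current = Some(self.right.next()?);
                self.left_idx = 0;
                continue;
            }

            // Emit one small batch: the current left row paired with every
            // row of the current right batch, then advance the cursor.
            let l = &self.left[self.left_idx];
            let batch = self.current.as_ref().unwrap();
            let out: Vec<(L, R)> = batch
                .iter()
                .map(|r| (l.clone(), r.clone()))
                .collect();
            self.left_idx += 1;
            return Some(out);
        }
    }
}
```

With a left side of `[1, 2]` and right batches `['a', 'b']` and `['c']`, this yields four batches with six rows in total. The trade-off in this sketch is many small output batches when the left side has many rows; a real implementation would probably advance the cursor over several left rows at a time to reach a target batch size.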
