Dandandan commented on issue #235:
URL: 
https://github.com/apache/arrow-datafusion/issues/235#issuecomment-830792427


   Thanks @jorgecarleitao
   
   I added an implementation of left join where unmatched left rows are 
produced at the end of the stream.
   I see a couple of possible improvements:
   
   * Use a bitmap structure instead of `Vec<bool>`. Efficiency-wise, the 
current PR should already be a large improvement, though (I don't have any 
benchmarks to prove it at the moment, but creating a new hash set for each 
batch seems likely to be quite slow).
   * Generate the unmatched rows in batches of the configured batch size. 
Currently, it generates them all at once.
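
To make the two ideas above concrete, here is a rough, standalone sketch. It is not the code in the PR: `VisitedBitmap`, `unmatched_batches`, and the `batch_size` parameter are hypothetical names made up for illustration, and it works on plain row indices rather than Arrow arrays.

```rust
/// Hypothetical bitmap with one bit per left-side row (instead of one
/// `bool`, i.e. one byte, per row as with `Vec<bool>`).
struct VisitedBitmap {
    words: Vec<u64>,
    len: usize,
}

impl VisitedBitmap {
    /// Create a bitmap for `len` left rows, all initially unmatched.
    fn new(len: usize) -> Self {
        Self {
            words: vec![0; (len + 63) / 64],
            len,
        }
    }

    /// Mark a left row as matched by at least one right row.
    fn set(&mut self, idx: usize) {
        assert!(idx < self.len, "row index out of bounds");
        self.words[idx / 64] |= 1u64 << (idx % 64);
    }

    /// Return whether a left row has been matched.
    fn get(&self, idx: usize) -> bool {
        idx < self.len && (self.words[idx / 64] >> (idx % 64)) & 1 == 1
    }

    /// Collect the indices of unmatched left rows, split into groups of at
    /// most `batch_size` indices (assumed to be the configured batch size).
    fn unmatched_batches(&self, batch_size: usize) -> Vec<Vec<usize>> {
        assert!(batch_size > 0, "batch size must be positive");
        let unmatched: Vec<usize> = (0..self.len).filter(|&i| !self.get(i)).collect();
        unmatched.chunks(batch_size).map(|c| c.to_vec()).collect()
    }
}
```

In a real operator, each group of indices would presumably be used to build one output batch (left columns taken at those indices, right columns filled with nulls) once the right side is exhausted, rather than materializing all groups up front as this sketch does.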
   
   @andygrove this also seems to fix the tests in this issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

