[ 
https://issues.apache.org/jira/browse/ARROW-10971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniël Heres updated ARROW-10971:
---------------------------------
    Description: 
Currently the left join generates a null row for every left row that has no 
match in the current right batch.

However, this is wrong: a null row should be generated only when the left row 
has no match in _any_ of the right batches.

The current implementation generates an extra (left, none) tuple for every 
right batch in which a left key has no match.

To fix it, we need to mark the keys or indexes on the left side as visited and 
scan the items once at the end to generate the rows without any match. 

  was:
Currently the left join generates a null for every row that is not present in 
the right batch.

However, this is wrong, as there should be no match in _all_ of the right 
batches.

The current implementation generates extra (left, none) tuples for every batch 
where the left side is not present. 

To fix it, we need to mark the keys or indexes on the left side as visited and 
scan the items once at the end to generate the rows without any match. 


> [Rust][DataFusion] Left Join implementation is wrong for multiple batches on 
> right side
> ---------------------------------------------------------------------------------------
>
>                 Key: ARROW-10971
>                 URL: https://issues.apache.org/jira/browse/ARROW-10971
>             Project: Apache Arrow
>          Issue Type: Bug
>            Reporter: Daniël Heres
>            Priority: Blocker
>
> Currently the left join generates a null row for every left row that has no 
> match in the current right batch.
> However, this is wrong: a null row should be generated only when the left 
> row has no match in _any_ of the right batches.
> The current implementation generates an extra (left, none) tuple for every 
> right batch in which a left key has no match.
> To fix it, we need to mark the keys or indexes on the left side as visited 
> and scan the items once at the end to generate the rows without any match. 
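
The fix described above can be sketched roughly as follows. This is a minimal
illustration, not the DataFusion implementation: it uses plain `Vec` and
`HashMap` instead of Arrow record batches, and the names `left_join_batches`,
`Row` and `visited` are hypothetical. It assumes the left side is fully
collected before the right batches are probed.

```rust
use std::collections::HashMap;

/// One output row: the left key and the matching right key, if any.
/// (Hypothetical type for illustration; real rows would carry columns.)
type Row = (i32, Option<i32>);

/// Left join where the right side arrives as several batches.
/// Unmatched left rows are emitted once, after all batches are processed.
fn left_join_batches(left: &[i32], right_batches: &[Vec<i32>]) -> Vec<Row> {
    // Build side: map each key to the left row indexes that hold it.
    let mut index: HashMap<i32, Vec<usize>> = HashMap::new();
    for (i, key) in left.iter().enumerate() {
        index.entry(*key).or_default().push(i);
    }

    // One flag per left row, set when any right batch matches it.
    let mut visited = vec![false; left.len()];
    let mut out = Vec::new();

    // Probe side: emit matches only; do NOT emit (left, None) per batch.
    for batch in right_batches {
        for r in batch {
            if let Some(rows) = index.get(r) {
                for &i in rows {
                    visited[i] = true;
                    out.push((left[i], Some(*r)));
                }
            }
        }
    }

    // Final scan: emit each left row that was never matched, exactly once.
    for (i, seen) in visited.iter().enumerate() {
        if !*seen {
            out.push((left[i], None));
        }
    }
    out
}
```

With left keys 1, 2, 3 and two right batches holding 1 and 2 respectively,
only key 3 ends up with a null right side. Emitting unmatched rows per batch
would instead produce extra (left, none) rows for keys 1 and 2 as well.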



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
