[jira] [Updated] (ARROW-12266) [Rust][DataFusion] Fix null handling hash join

Jira Wed, 07 Apr 2021 12:07:04 -0700


     [ 
https://issues.apache.org/jira/browse/ARROW-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Daniël Heres updated ARROW-12266:
---------------------------------
    Description: 
Improve null handling of 

SELECT id1, id2 FROM (SELECT null AS id1) t1
 INNER JOIN (SELECT 0 AS id2) t2 ON id1 = id2

> NULL, NULL

(should be empty result set)

We should filter out null join keys beforehand to make this result correct. 
This can also improve performance, as the non-null filter can be pushed down: 
the data set gets smaller, the join no longer has to deal with nullable data, 
and entire files can be skipped when they only contain nulls.

  was:
Improve null handling of 


SELECT id1, id2 FROM (SELECT null AS id1) t1
LEFT JOIN (SELECT 0 AS id2) t2 ON id1 = id2

> NULL, NULL

(should be empty result set)

We should filter out null join keys beforehand to make this result correct. 
This can also improve performance, as the non-null filter can be pushed down: 
the data set gets smaller, the join no longer has to deal with nullable data, 
and entire files can be skipped when they only contain nulls.


> [Rust][DataFusion] Fix null handling hash join
> ----------------------------------------------
>
>                 Key: ARROW-12266
>                 URL: https://issues.apache.org/jira/browse/ARROW-12266
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Rust - DataFusion
>            Reporter: Daniël Heres
>            Assignee: Daniël Heres
>            Priority: Major
>
> Improve null handling of 
> SELECT id1, id2 FROM (SELECT null AS id1) t1
>  INNER JOIN (SELECT 0 AS id2) t2 ON id1 = id2
> > NULL, NULL
> (should be empty result set)
> We should filter out null join keys beforehand to make this result correct. 
> This can also improve performance, as the non-null filter can be pushed 
> down: the data set gets smaller, the join no longer has to deal with 
> nullable data, and entire files can be skipped when they only contain nulls.
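
The idea of filtering null keys before joining can be illustrated with a minimal, standalone sketch. It does not use DataFusion's actual hash join code or API: `inner_hash_join` is a hypothetical helper, `Option<i64>` stands in for a nullable key column, and `None` stands in for SQL NULL. Rows with a null key are dropped on both the build and probe sides, because `NULL = x` is never true in an equi-join condition.

```rust
use std::collections::HashMap;

/// Inner equi-join on a single nullable integer key (illustrative only).
/// Returns (left_row_index, right_row_index) pairs for matching rows.
/// `None` models SQL NULL; this is not DataFusion's implementation.
fn inner_hash_join(left: &[Option<i64>], right: &[Option<i64>]) -> Vec<(usize, usize)> {
    // Build side: index only non-null keys, since NULL never equals anything.
    let mut table: HashMap<i64, Vec<usize>> = HashMap::new();
    for (i, key) in left.iter().enumerate() {
        if let Some(k) = key {
            table.entry(*k).or_default().push(i);
        }
    }

    // Probe side: skip null keys for the same reason.
    let mut matches = Vec::new();
    for (j, key) in right.iter().enumerate() {
        if let Some(k) = key {
            if let Some(rows) = table.get(k) {
                for &i in rows {
                    matches.push((i, j));
                }
            }
        }
    }
    matches
}
```

Under these assumptions, joining a left side of `[None]` with a right side of `[Some(0)]` yields no pairs, which matches the empty result set expected for the query in the description.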



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
