[
https://issues.apache.org/jira/browse/ARROW-12266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniël Heres updated ARROW-12266:
---------------------------------
Description:
Improve null handling in the hash join. The query
SELECT id1, id2 FROM (SELECT null AS id1) t1
INNER JOIN (SELECT 0 AS id2) t2 ON id1 = id2
currently returns
> NULL, NULL
(should be an empty result set)
We should filter out null join keys before joining to make this result correct.
This can also be more efficient, because the non-null filter can be pushed down:
the data set gets smaller, the join no longer has to deal with nullable data,
and entire files could even be skipped when they contain only nulls.
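The idea of dropping null keys before the join can be sketched in plain Rust. This is only an illustration under stated assumptions: `inner_join_skip_nulls` is a hypothetical helper, not DataFusion's hash join or any Arrow API, and nullable key columns are modelled as slices of `Option<i64>`.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of DataFusion): inner equi-join on nullable keys.
/// Returns (left row index, right row index) pairs whose keys are equal and non-null.
fn inner_join_skip_nulls(left: &[Option<i64>], right: &[Option<i64>]) -> Vec<(usize, usize)> {
    // Build side: index only non-null keys, because `NULL = x` is never true in SQL.
    let mut build: HashMap<i64, Vec<usize>> = HashMap::new();
    for (i, key) in left.iter().enumerate() {
        if let Some(k) = key {
            build.entry(*k).or_default().push(i);
        }
    }

    // Probe side: skip null keys as well, then look up the matching build rows.
    let mut matches = Vec::new();
    for (j, key) in right.iter().enumerate() {
        if let Some(k) = key {
            if let Some(rows) = build.get(k) {
                for &i in rows {
                    matches.push((i, j));
                }
            }
        }
    }
    matches
}
```

With the inputs from the query above, `inner_join_skip_nulls(&[None], &[Some(0)])` yields no matches, which corresponds to the expected empty result set.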
was:
Improve null handling of
SELECT id1, id2 FROM (SELECT null AS id1) t1
LEFT JOIN (SELECT 0 AS id2) t2 ON id1 = id2
> NULL, NULL
(should be empty result set)
We should filter beforehand to make this result correct. Also this can make
things more efficient as the non-null filter can be pushed down which can lead
to efficiency gains (making data-set smaller, not having to deal with nullable
data, or even entire files could be skipped when they only contain nulls).
> [Rust][DataFusion] Fix null handling hash join
> ----------------------------------------------
>
> Key: ARROW-12266
> URL: https://issues.apache.org/jira/browse/ARROW-12266
> Project: Apache Arrow
> Issue Type: Bug
> Components: Rust - DataFusion
> Reporter: Daniël Heres
> Assignee: Daniël Heres
> Priority: Major
>
> Improve null handling in the hash join. The query
> SELECT id1, id2 FROM (SELECT null AS id1) t1
> INNER JOIN (SELECT 0 AS id2) t2 ON id1 = id2
> currently returns
> > NULL, NULL
> (should be an empty result set)
> We should filter out null join keys before joining to make this result
> correct. This can also be more efficient, because the non-null filter can be
> pushed down: the data set gets smaller, the join no longer has to deal with
> nullable data, and entire files could even be skipped when they contain only
> nulls.