[
https://issues.apache.org/jira/browse/PHOENIX-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maryann Xue resolved PHOENIX-83.
--------------------------------
Resolution: Not A Problem
Strictly speaking, the where clause should be executed after the join
operation, which means the predicates in the where clause should be evaluated
against the join result, instead of against the pre-join individual tables.
However, in a lot of cases, like independent predicates on both tables in a
inner join, or on the left table of a left outer join, the result would be the
same if we apply the predicates in an earlier stage on corresponding tables.
And in Phoenix, we have such optimizations already.
The second case you listed there is a bit tricky, since any comparison
expression (except "is null" test) would return false here when evaluating on a
null value in the join result, so does the predicate "test_join2.int_column >
50" in your query.
> select with left join when the where condition has right tables, the result
> is not expected
> -------------------------------------------------------------------------------------------
>
> Key: PHOENIX-83
> URL: https://issues.apache.org/jira/browse/PHOENIX-83
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: jay wong
> Assignee: Maryann Xue
> Labels: patch
> Fix For: 3.0.0
>
>
> I have two table.
> select * from test_join1;
> | VARCHAR_PK | CHAR_PK | INT_COLUMN | INT_COLUMN_B |
> | Aarchar0 | A0 | 50 | 50 |
> | Aarchar1 | A1 | 51 | 51 |
> select * from test_join2;
> | VARCHAR_PK | CHAR_PK | INT_COLUMN | INT_COLUMN_B |
> | Barchar1 | B1 | 51 | 51 |
> | Barchar2 | B2 | 52 | 52 |
> | Barchar3 | B3 | 53 | 53 |
> | Barchar4 | B4 | 54 | 54 |
> | Barchar5 | B5 | 55 | 55 |
> first query:
> select test_join1.varchar_pk,sum(test_join2.int_column),
> min(test_join2.int_column_b) from test_join1 left join test_join2 on
> test_join1.int_column = test_join2.int_column where test_join1.int_column >
> 40 and test_join1.int_column < 100 GROUP BY TEST_JOIN1.VARCHAR_PK;
> | VARCHAR_PK | SUM(TEST_JOIN2.INT_COLUMN(PROJECTED[1])) |
> MIN(TEST_JOIN2.INT_COLUMN_B(PROJECTED[2])) |
> | Aarchar0 | null | null
> |
> | Aarchar1 | 51 | 51
> |
> the result is expected。
> but when query with
> select test_join1.varchar_pk,sum(test_join2.int_column),
> min(test_join2.int_column_b) from test_join1 left join test_join2 on
> test_join1.int_column = test_join2.int_column where test_join2.int_column >
> 50 and test_join1.int_column < 53 GROUP BY TEST_JOIN1.VARCHAR_PK;
> | VARCHAR_PK | SUM(TEST_JOIN2.INT_COLUMN(PROJECTED[1])) |
> MIN(TEST_JOIN2.INT_COLUMN_B(PROJECTED[2])) |
> | Aarchar1 | 51 | 51
> |
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)