[ 
https://issues.apache.org/jira/browse/PHOENIX-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maryann Xue resolved PHOENIX-83.
--------------------------------

    Resolution: Not A Problem

Strictly speaking, the where clause should be executed after the join 
operation, which means the predicates in the where clause should be evaluated 
against the join result, instead of against the pre-join individual tables.
However, in a lot of cases, like independent predicates on both tables in a 
inner join, or on the left table of a left outer join, the result would be the 
same if we apply the predicates in an earlier stage on corresponding tables. 
And in Phoenix, we have such optimizations already.

The second case you listed there is a bit tricky, since any comparison 
expression (except "is null" test) would return false here when evaluating on a 
null value in the join result, so does the predicate "test_join2.int_column > 
50" in your query.

> select with left join when the where condition has right tables, the result 
> is not expected
> -------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-83
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-83
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: jay wong
>            Assignee: Maryann Xue
>              Labels: patch
>             Fix For: 3.0.0
>
>
> I have two table.
>  select * from test_join1;
> | VARCHAR_PK | CHAR_PK | INT_COLUMN | INT_COLUMN_B |
> | Aarchar0   | A0      | 50         | 50           |
> | Aarchar1   | A1      | 51         | 51           |
> select * from test_join2;
> | VARCHAR_PK | CHAR_PK | INT_COLUMN | INT_COLUMN_B |
> | Barchar1   | B1      | 51         | 51           |
> | Barchar2   | B2      | 52         | 52           |
> | Barchar3   | B3      | 53         | 53           |
> | Barchar4   | B4      | 54         | 54           |
> | Barchar5   | B5      | 55         | 55           |
> first query:
> select test_join1.varchar_pk,sum(test_join2.int_column), 
> min(test_join2.int_column_b) from test_join1 left join test_join2 on 
> test_join1.int_column = test_join2.int_column where test_join1.int_column > 
> 40 and test_join1.int_column < 100    GROUP BY TEST_JOIN1.VARCHAR_PK;
> | VARCHAR_PK | SUM(TEST_JOIN2.INT_COLUMN(PROJECTED[1])) | 
> MIN(TEST_JOIN2.INT_COLUMN_B(PROJECTED[2])) |
> | Aarchar0   | null                                     | null                
>                        |
> | Aarchar1   | 51                                       | 51                  
>                        |
> the result is expected。
> but when query with 
> select test_join1.varchar_pk,sum(test_join2.int_column), 
> min(test_join2.int_column_b) from test_join1 left join test_join2 on 
> test_join1.int_column = test_join2.int_column where test_join2.int_column > 
> 50 and test_join1.int_column < 53    GROUP BY TEST_JOIN1.VARCHAR_PK;
> | VARCHAR_PK | SUM(TEST_JOIN2.INT_COLUMN(PROJECTED[1])) | 
> MIN(TEST_JOIN2.INT_COLUMN_B(PROJECTED[2])) |
> | Aarchar1   | 51                                       | 51                  
>                        |



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to