[
https://issues.apache.org/jira/browse/HIVE-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977003#comment-14977003
]
Bolke de Bruin commented on HIVE-12216:
---------------------------------------
Update: we have noticed this mainly happens with large datasets and smaller
datasets do not seem to affected. Also we see a similar issue that we will
report separately.
> WHERE on the FROM table not (always) working when JOIN are present
> ------------------------------------------------------------------
>
> Key: HIVE-12216
> URL: https://issues.apache.org/jira/browse/HIVE-12216
> Project: Hive
> Issue Type: Bug
> Components: hpl/sql, Parser, Query Processor, SQL
> Affects Versions: 1.1.0, 1.2.1
> Environment: CDH 5.4.7 HDP2.3.2 MR TEZ
> Reporter: Bolke de Bruin
> Priority: Blocker
>
> In case we use a where clause in a state where also joins are present, the
> clauses are not (always) respected. We have been able to reproduce this issue
> consistently with Hive 1.1.0 on MR, Hive 1.2.1 on Tez (MR Fails here).
> So fo the below query we *do* get results back like:
> 'gs.i_s_c = 23' (and this goes for all clauses!)
> CREATE TABLE tmp_hub_and_sats AS
> SELECT
> f.dt,
> f.t_c,
> sum(f.transaction_amount) as amount,
> sum(f.amount_euro) amount_euro,
> IF(f.org_grid is null, f.org_cust, f.org_grid) as org,
> IF(f.org_grid is null, 0, 1) as is_org_grid,
> IF(f.org_up is null, if(f.org_grid is null, f.org_cust, f.org_grid),
> f.org_up) as org_up,
> IF(f.to_grid is null, f.to_cust, f.to_grid) to,
> IF(f.to_grid is null, 0, 1) as is_to_grid,
> IF(f.to_up is null, if(f.to_grid is null, f.to_cust, f.to_grid), f.to_up)
> as to_up,
> gh.i_g_c as customer_code_hub,
> gs.i_g_c as customer_code_satellite
> from x_grid_orders f
> LEFT OUTER JOIN
> grid.grid gh
> ON f.org_grid = gh.hashed_gridid
> LEFT OUTER JOIN
> grid.grid gs
> ON f.to_grid = gs.hashed_gridid
> where
> IF(f.org_up is null, f.org_cust, f.org_up) <> IF(f.to_up is null, f.to_cust,
> f.to_up)
> AND
> (substring(gh.i_g_c, 1, 2) <> "06" or gh.i_g_c is null)
> AND
> (substring(gs.i_g_c, 1, 2) <> "06" or gs.i_g_c is null)
> AND
> (gh.i_s_c <> "23" or gh.i_s_c is null)
> AND
> (gs.i_s_c <> "23" or gs.i_s_c is null)
> group by
> f.dt,
> f.t_c,
> IF(f.org_grid is null, f.org_cust, f.org_grid),
> IF(f.org_grid is null, 0, 1),
> IF(f.org_up is null, if(f.org_grid is null, f.org_cust, f.org_grid),
> f.org_up),
> IF(f.to_grid is null, f.to_cust, f.to_grid),
> IF(f.to_grid is null, 0, 1),
> IF(f.to_up is null, if(f.to_grid is null, f.to_cust, f.to_grid), f.to_up),
> gh.i_g_c,
> gs.i_g_c
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)