[jira] [Updated] (HIVE-7346) Wrong results caused by hive ppd under specific join condition

2014-07-07 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7346:
---

   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Navis!

 Wrong results caused by hive ppd under specific join condition
 --

 Key: HIVE-7346
 URL: https://issues.apache.org/jira/browse/HIVE-7346
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.0, 0.13.1
Reporter: dima machlin
Assignee: Navis
 Fix For: 0.14.0

 Attachments: HIVE-7346.1.patch.txt


 Assuming two tables :
 {code:sql} t1(id1 string,id2 string) , t2 (id string,d int) {code}
 t1 contains 1 row : 'a','a'
 t2 contains 1 row : 'a',2
 The following query : 
 {code:sql} select a.*,b.d d1,c.d d2
 from t1 a join t2 b on (a.id1=b.id)
 join t2 c on (a.id2=b.id)
 where b.d =1 and c.d=1 {code}
 Returns 0 rows as expected because t2.d = 2
 Wrapping this query, like so : 
 {code:sql} select * from (
 select a.*,b.d d1,c.d d2
 from t1 a join t2 b on (a.id1=b.id)
 join t2 c on (a.id2=b.id)
 where b.d =1 and c.d=1
 ) z where d11 or d21 {code}
 Where another filter was add on the columns causes the plan to lack the 
 filter of the =1 and return a single row - *Wrong Results*.
 The plan is : 
 {code:sql}
 ABSTRACT SYNTAX TREE:
   (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN 
 (TOK_TABREF (TOK_TABNAME t1) a) (TOK_TABREF (TOK_TABNAME t2) b) (= (. 
 (TOK_TABLE_OR_COL a) id1) (. (TOK_TABLE_OR_COL b) id))) (TOK_TABREF 
 (TOK_TABNAME t2) c) (= (. (TOK_TABLE_OR_COL a) id2) (. (TOK_TABLE_OR_COL b) 
 id (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT 
 (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (. 
 (TOK_TABLE_OR_COL b) d) d1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL c) d) d2)) 
 (TOK_WHERE (and (= (. (TOK_TABLE_OR_COL b) d) 1) (= (. (TOK_TABLE_OR_COL c) 
 d) 1) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) 
 (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or ( (TOK_TABLE_OR_COL 
 d1) 1) ( (TOK_TABLE_OR_COL d2) 1)
 STAGE DEPENDENCIES:
   Stage-7 is a root stage
   Stage-5 depends on stages: Stage-7
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-7
 Map Reduce Local Work
   Alias - Map Local Tables:
 z:b 
   Fetch Operator
 limit: -1
 z:c 
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 z:b 
   TableScan
 alias: b
 HashTable Sink Operator
   condition expressions:
 0 {id1} {id2}
 1 {id} {d}
   handleSkewJoin: false
   keys:
 0 [Column[id1]]
 1 [Column[id]]
   Position of Big Table: 0
 z:c 
   TableScan
 alias: c
 HashTable Sink Operator
   condition expressions:
 0 {_col5} {_col0} {_col1}
 1 {d}
   handleSkewJoin: false
   keys:
 0 []
 1 []
   Position of Big Table: 0
   Stage: Stage-5
 Map Reduce
   Alias - Map Operator Tree:
 z:a 
   TableScan
 alias: a
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {id1} {id2}
 1 {id} {d}
   handleSkewJoin: false
   keys:
 0 [Column[id1]]
 1 [Column[id]]
   outputColumnNames: _col0, _col1, _col4, _col5
   Position of Big Table: 0
   Filter Operator
 predicate:
 expr: (_col1 = _col4)
 type: boolean
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col5} {_col0} {_col1}
 1 {d}
   handleSkewJoin: false
   keys:
 0 []
 1 []
   outputColumnNames: _col1, _col4, _col5, _col9
   Position of Big Table: 0
   Filter Operator
 predicate:
 expr: ((_col1  1) or (_col9  1))
 type: boolean
 Select Operator
   expressions:
 expr: _col4
 type: string
 expr: _col5
 type: string
 expr: _col1
 

[jira] [Updated] (HIVE-7346) Wrong results caused by hive ppd under specific join condition

2014-07-03 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7346:


Attachment: HIVE-7346.1.patch.txt

 Wrong results caused by hive ppd under specific join condition
 --

 Key: HIVE-7346
 URL: https://issues.apache.org/jira/browse/HIVE-7346
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0
Reporter: dima machlin
Assignee: Navis
 Attachments: HIVE-7346.1.patch.txt


 Assuming two tables :
 {code:sql} t1(id1 string,id2 string) , t2 (id string,d int) {code}
 t1 contains 1 row : 'a','a'
 t2 contains 1 row : 'a',2
 The following query : 
 {code:sql} select a.*,b.d d1,c.d d2
 from t1 a join t2 b on (a.id1=b.id)
 join t2 c on (a.id2=b.id)
 where b.d =1 and c.d=1 {code}
 Returns 0 rows as expected because t2.d = 2
 Wrapping this query, like so : 
 {code:sql} select * from (
 select a.*,b.d d1,c.d d2
 from t1 a join t2 b on (a.id1=b.id)
 join t2 c on (a.id2=b.id)
 where b.d =1 and c.d=1
 ) z where d11 or d21 {code}
 Where another filter was add on the columns causes the plan to lack the 
 filter of the =1 and return a single row - *Wrong Results*.
 The plan is : 
 {code:sql}
 ABSTRACT SYNTAX TREE:
   (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN 
 (TOK_TABREF (TOK_TABNAME t1) a) (TOK_TABREF (TOK_TABNAME t2) b) (= (. 
 (TOK_TABLE_OR_COL a) id1) (. (TOK_TABLE_OR_COL b) id))) (TOK_TABREF 
 (TOK_TABNAME t2) c) (= (. (TOK_TABLE_OR_COL a) id2) (. (TOK_TABLE_OR_COL b) 
 id (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT 
 (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (. 
 (TOK_TABLE_OR_COL b) d) d1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL c) d) d2)) 
 (TOK_WHERE (and (= (. (TOK_TABLE_OR_COL b) d) 1) (= (. (TOK_TABLE_OR_COL c) 
 d) 1) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) 
 (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or ( (TOK_TABLE_OR_COL 
 d1) 1) ( (TOK_TABLE_OR_COL d2) 1)
 STAGE DEPENDENCIES:
   Stage-7 is a root stage
   Stage-5 depends on stages: Stage-7
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-7
 Map Reduce Local Work
   Alias - Map Local Tables:
 z:b 
   Fetch Operator
 limit: -1
 z:c 
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 z:b 
   TableScan
 alias: b
 HashTable Sink Operator
   condition expressions:
 0 {id1} {id2}
 1 {id} {d}
   handleSkewJoin: false
   keys:
 0 [Column[id1]]
 1 [Column[id]]
   Position of Big Table: 0
 z:c 
   TableScan
 alias: c
 HashTable Sink Operator
   condition expressions:
 0 {_col5} {_col0} {_col1}
 1 {d}
   handleSkewJoin: false
   keys:
 0 []
 1 []
   Position of Big Table: 0
   Stage: Stage-5
 Map Reduce
   Alias - Map Operator Tree:
 z:a 
   TableScan
 alias: a
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {id1} {id2}
 1 {id} {d}
   handleSkewJoin: false
   keys:
 0 [Column[id1]]
 1 [Column[id]]
   outputColumnNames: _col0, _col1, _col4, _col5
   Position of Big Table: 0
   Filter Operator
 predicate:
 expr: (_col1 = _col4)
 type: boolean
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col5} {_col0} {_col1}
 1 {d}
   handleSkewJoin: false
   keys:
 0 []
 1 []
   outputColumnNames: _col1, _col4, _col5, _col9
   Position of Big Table: 0
   Filter Operator
 predicate:
 expr: ((_col1  1) or (_col9  1))
 type: boolean
 Select Operator
   expressions:
 expr: _col4
 type: string
 expr: _col5
 type: string
 expr: _col1
 type: int
 expr: _col9
 type: int
   outputColumnNames: _col0, _col1, _col2, _col3

[jira] [Updated] (HIVE-7346) Wrong results caused by hive ppd under specific join condition

2014-07-03 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-7346:


Affects Version/s: 0.13.0
   0.13.1
   Status: Patch Available  (was: Open)

 Wrong results caused by hive ppd under specific join condition
 --

 Key: HIVE-7346
 URL: https://issues.apache.org/jira/browse/HIVE-7346
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.13.0, 0.12.0
Reporter: dima machlin
Assignee: Navis
 Attachments: HIVE-7346.1.patch.txt


 Assuming two tables :
 {code:sql} t1(id1 string,id2 string) , t2 (id string,d int) {code}
 t1 contains 1 row : 'a','a'
 t2 contains 1 row : 'a',2
 The following query : 
 {code:sql} select a.*,b.d d1,c.d d2
 from t1 a join t2 b on (a.id1=b.id)
 join t2 c on (a.id2=b.id)
 where b.d =1 and c.d=1 {code}
 Returns 0 rows as expected because t2.d = 2
 Wrapping this query, like so : 
 {code:sql} select * from (
 select a.*,b.d d1,c.d d2
 from t1 a join t2 b on (a.id1=b.id)
 join t2 c on (a.id2=b.id)
 where b.d =1 and c.d=1
 ) z where d11 or d21 {code}
 Where another filter was add on the columns causes the plan to lack the 
 filter of the =1 and return a single row - *Wrong Results*.
 The plan is : 
 {code:sql}
 ABSTRACT SYNTAX TREE:
   (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN 
 (TOK_TABREF (TOK_TABNAME t1) a) (TOK_TABREF (TOK_TABNAME t2) b) (= (. 
 (TOK_TABLE_OR_COL a) id1) (. (TOK_TABLE_OR_COL b) id))) (TOK_TABREF 
 (TOK_TABNAME t2) c) (= (. (TOK_TABLE_OR_COL a) id2) (. (TOK_TABLE_OR_COL b) 
 id (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT 
 (TOK_SELEXPR (TOK_ALLCOLREF (TOK_TABNAME a))) (TOK_SELEXPR (. 
 (TOK_TABLE_OR_COL b) d) d1) (TOK_SELEXPR (. (TOK_TABLE_OR_COL c) d) d2)) 
 (TOK_WHERE (and (= (. (TOK_TABLE_OR_COL b) d) 1) (= (. (TOK_TABLE_OR_COL c) 
 d) 1) z)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) 
 (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF)) (TOK_WHERE (or ( (TOK_TABLE_OR_COL 
 d1) 1) ( (TOK_TABLE_OR_COL d2) 1)
 STAGE DEPENDENCIES:
   Stage-7 is a root stage
   Stage-5 depends on stages: Stage-7
   Stage-0 is a root stage
 STAGE PLANS:
   Stage: Stage-7
 Map Reduce Local Work
   Alias - Map Local Tables:
 z:b 
   Fetch Operator
 limit: -1
 z:c 
   Fetch Operator
 limit: -1
   Alias - Map Local Operator Tree:
 z:b 
   TableScan
 alias: b
 HashTable Sink Operator
   condition expressions:
 0 {id1} {id2}
 1 {id} {d}
   handleSkewJoin: false
   keys:
 0 [Column[id1]]
 1 [Column[id]]
   Position of Big Table: 0
 z:c 
   TableScan
 alias: c
 HashTable Sink Operator
   condition expressions:
 0 {_col5} {_col0} {_col1}
 1 {d}
   handleSkewJoin: false
   keys:
 0 []
 1 []
   Position of Big Table: 0
   Stage: Stage-5
 Map Reduce
   Alias - Map Operator Tree:
 z:a 
   TableScan
 alias: a
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {id1} {id2}
 1 {id} {d}
   handleSkewJoin: false
   keys:
 0 [Column[id1]]
 1 [Column[id]]
   outputColumnNames: _col0, _col1, _col4, _col5
   Position of Big Table: 0
   Filter Operator
 predicate:
 expr: (_col1 = _col4)
 type: boolean
 Map Join Operator
   condition map:
Inner Join 0 to 1
   condition expressions:
 0 {_col5} {_col0} {_col1}
 1 {d}
   handleSkewJoin: false
   keys:
 0 []
 1 []
   outputColumnNames: _col1, _col4, _col5, _col9
   Position of Big Table: 0
   Filter Operator
 predicate:
 expr: ((_col1  1) or (_col9  1))
 type: boolean
 Select Operator
   expressions:
 expr: _col4
 type: string
 expr: _col5
 type: string
 expr: _col1
 type: int
 expr: _col9