[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-12-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

The test failures are unrelated. Pushed to master; thanks for reviewing, [~ashutoshc]!

> Hive: Upcasting types should not obscure stats (min/max/ndv)
> ------------------------------------------------------------
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15122.03.patch, HIVE-15122.patch
>
>
> A UDFToLong breaks PK/FK inferences and triggers mis-estimation of joins in 
> LLAP.
> Snippet from the bad plan.
> {code}
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       DagId: hive_20161031222730_a700058f-78eb-40d6-a67d-43add60a50e2:6
>       Edges:
>         Map 2 <- Map 1 (BROADCAST_EDGE)
>         Map 3 <- Map 2 (BROADCAST_EDGE)
>         Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE), Map 7 (CUSTOM_SIMPLE_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
>         Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
>         Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
>       DagName:
>       Vertices:
>         Map 1
>             Map Operator Tree:
>                 TableScan
>                   alias: supplier
>                   filterExpr: (s_suppkey is not null and s_nationkey is not null) (type: boolean)
>                   Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
>                   Filter Operator
>                     predicate: (s_suppkey is not null and s_nationkey is not null) (type: boolean)
>                     Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
>                     Select Operator
>                       expressions: s_suppkey (type: bigint), s_nationkey (type: bigint)
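
For readers following the thread, here is a minimal sketch of the idea in the issue title, written in Python for brevity. It is not Hive's code and not the attached patch: `ColStats`, `stats_after_cast`, `looks_like_pk` and the min/max values are hypothetical names and numbers chosen for illustration; only the row count of 1000 comes from the plan above. The assumption is that a widening integer cast (for example int to bigint) maps distinct values to distinct values and keeps their order, so min, max and ndv can be carried over unchanged, while a planner that discards them loses the signal that a join key is unique.

```python
from dataclasses import dataclass
from typing import Optional

# Byte widths of the integer types; a cast to an equal or larger width is widening.
INT_WIDTH = {"tinyint": 1, "smallint": 2, "int": 4, "bigint": 8}


@dataclass(frozen=True)
class ColStats:
    min_val: int
    max_val: int
    ndv: int  # number of distinct values


def stats_after_cast(stats: ColStats, src: str, dst: str) -> Optional[ColStats]:
    """Return stats for CAST(col AS dst), or None when they cannot be derived."""
    if src in INT_WIDTH and dst in INT_WIDTH and INT_WIDTH[dst] >= INT_WIDTH[src]:
        # Widening is injective and order-preserving: min, max and ndv carry over.
        return stats
    # Narrowing may overflow or collapse distinct values, so give up here.
    return None


def looks_like_pk(stats: Optional[ColStats], num_rows: int) -> bool:
    """Hypothetical PK test: every row carries a distinct value."""
    return stats is not None and stats.ndv >= num_rows


# Illustrative stats for a 1000-row key column (min/max are made up).
s_suppkey = ColStats(min_val=1, max_val=1000, ndv=1000)
widened = stats_after_cast(s_suppkey, "int", "bigint")

print(looks_like_pk(widened, 1000))  # True: stats survive the upcast
print(looks_like_pk(None, 1000))     # False: stats dropped, PK inference fails
```

When the stats are dropped, as in the second call, the key side of the join no longer looks unique, which is the kind of PK/FK inference failure the description reports.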

[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-12-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
Attachment: (was: HIVE-15122.01.patch)


[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-12-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
Attachment: HIVE-15122.03.patch


[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-12-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
Attachment: (was: HIVE-15122.02.patch)


[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-12-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
Attachment: HIVE-15122.02.patch


[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-12-16 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
Attachment: HIVE-15122.01.patch


[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-12-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
Attachment: HIVE-15122.patch


[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)

2016-12-15 Thread Jesus Camacho Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
Status: Patch Available  (was: In Progress)
