[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)
[ https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
    Resolution: Fixed
    Fix Version/s: 2.2.0
    Status: Resolved (was: Patch Available)

Fails are unrelated. Pushed to master, thanks for reviewing [~ashutoshc]!

> Hive: Upcasting types should not obscure stats (min/max/ndv)
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15122.03.patch, HIVE-15122.patch
>
>
> A UDFToLong breaks PK/FK inferences and triggers mis-estimation of joins in LLAP.
> Snippet from the bad plan.
> {code}
> | STAGE PLANS:
> |   Stage: Stage-1
> |     Tez
> |       DagId: hive_20161031222730_a700058f-78eb-40d6-a67d-43add60a50e2:6
> |       Edges:
> |         Map 2 <- Map 1 (BROADCAST_EDGE)
> |         Map 3 <- Map 2 (BROADCAST_EDGE)
> |         Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE), Map 7 (CUSTOM_SIMPLE_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
> |         Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
> |         Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
> |       DagName:
> |       Vertices:
> |         Map 1
> |             Map Operator Tree:
> |                 TableScan
> |                   alias: supplier
> |                   filterExpr: (s_suppkey is not null and s_nationkey is not null) (type: boolean)
> |                   Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
> |                   Filter Operator
> |                     predicate: (s_suppkey is not null and s_nationkey is not null) (type: boolean)
> |                     Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
> |                     Select Operator
[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)
[ https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
    Attachment: (was: HIVE-15122.01.patch)

> Hive: Upcasting types should not obscure stats (min/max/ndv)
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-15122.02.patch, HIVE-15122.patch
>
>
> A UDFToLong breaks PK/FK inferences and triggers mis-estimation of joins in LLAP.
> Snippet from the bad plan.
> {code}
> | STAGE PLANS:
> |   Stage: Stage-1
> |     Tez
> |       DagId: hive_20161031222730_a700058f-78eb-40d6-a67d-43add60a50e2:6
> |       Edges:
> |         Map 2 <- Map 1 (BROADCAST_EDGE)
> |         Map 3 <- Map 2 (BROADCAST_EDGE)
> |         Reducer 4 <- Map 3 (CUSTOM_SIMPLE_EDGE), Map 7 (CUSTOM_SIMPLE_EDGE), Map 8 (BROADCAST_EDGE), Map 9 (BROADCAST_EDGE)
> |         Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
> |         Reducer 6 <- Reducer 5 (SIMPLE_EDGE)
> |       DagName:
> |       Vertices:
> |         Map 1
> |             Map Operator Tree:
> |                 TableScan
> |                   alias: supplier
> |                   filterExpr: (s_suppkey is not null and s_nationkey is not null) (type: boolean)
> |                   Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
> |                   Filter Operator
> |                     predicate: (s_suppkey is not null and s_nationkey is not null) (type: boolean)
> |                     Statistics: Num rows: 1000 Data size: 16000 Basic stats: COMPLETE Column stats: COMPLETE
> |                     Select Operator
> |                       expressions: s_suppkey (type: bigint), s_nationkey (type: bigint)
[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)
[ https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
    Attachment: HIVE-15122.03.patch

> Hive: Upcasting types should not obscure stats (min/max/ndv)
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-15122.03.patch, HIVE-15122.patch
[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)
[ https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
    Attachment: (was: HIVE-15122.02.patch)

> Hive: Upcasting types should not obscure stats (min/max/ndv)
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-15122.03.patch, HIVE-15122.patch
[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)
[ https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
    Attachment: HIVE-15122.02.patch

> Hive: Upcasting types should not obscure stats (min/max/ndv)
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-15122.02.patch, HIVE-15122.patch
[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)
[ https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
    Attachment: HIVE-15122.01.patch

> Hive: Upcasting types should not obscure stats (min/max/ndv)
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-15122.01.patch, HIVE-15122.patch
[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)
[ https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
    Attachment: HIVE-15122.patch

> Hive: Upcasting types should not obscure stats (min/max/ndv)
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-15122.patch
[jira] [Updated] (HIVE-15122) Hive: Upcasting types should not obscure stats (min/max/ndv)
[ https://issues.apache.org/jira/browse/HIVE-15122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jesus Camacho Rodriguez updated HIVE-15122:
---
    Status: Patch Available (was: In Progress)

> Hive: Upcasting types should not obscure stats (min/max/ndv)
>
>                 Key: HIVE-15122
>                 URL: https://issues.apache.org/jira/browse/HIVE-15122
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-15122.patch
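
Illustrative note on the issue title: a widening cast such as int to bigint is injective and order-preserving, so the min, max, and number of distinct values (ndv) of the source column remain valid for the cast column. The sketch below checks that property on a small array. It is not the HIVE-15122 patch and does not use Hive's statistics classes; `UpcastStatsSketch`, `ColStats`, `fromInts`, `fromLongs`, and `upcastToBigint` are hypothetical names introduced only for this example.

```java
import java.util.Arrays;

public class UpcastStatsSketch {

    /** Hypothetical, simplified column statistics (not Hive's own classes). */
    static final class ColStats {
        final String type;
        final long min;
        final long max;
        final long ndv;

        ColStats(String type, long min, long max, long ndv) {
            this.type = type;
            this.min = min;
            this.max = max;
            this.ndv = ndv;
        }

        @Override
        public String toString() {
            return type + " min=" + min + " max=" + max + " ndv=" + ndv;
        }
    }

    /** Computes stats directly from a non-empty int column. */
    static ColStats fromInts(int[] values) {
        long min = Arrays.stream(values).min().getAsInt();
        long max = Arrays.stream(values).max().getAsInt();
        long ndv = Arrays.stream(values).distinct().count();
        return new ColStats("int", min, max, ndv);
    }

    /** Computes stats directly from a non-empty bigint column. */
    static ColStats fromLongs(long[] values) {
        long min = Arrays.stream(values).min().getAsLong();
        long max = Arrays.stream(values).max().getAsLong();
        long ndv = Arrays.stream(values).distinct().count();
        return new ColStats("bigint", min, max, ndv);
    }

    /** Propagates stats through a widening cast: only the type label changes. */
    static ColStats upcastToBigint(ColStats in) {
        return new ColStats("bigint", in.min, in.max, in.ndv);
    }

    public static void main(String[] args) {
        int[] keys = {7, 3, 3, 42, 7};

        // Stats carried through the cast without touching the data.
        ColStats propagated = upcastToBigint(fromInts(keys));

        // Stats recomputed after actually widening every value.
        long[] widened = Arrays.stream(keys).asLongStream().toArray();
        ColStats recomputed = fromLongs(widened);

        System.out.println(propagated);  // bigint min=3 max=42 ndv=3
        System.out.println(recomputed);  // bigint min=3 max=42 ndv=3
    }
}
```

The two printed lines agree, which is the property the title asks the optimizer to rely on. A narrowing or lossy cast would not have this guarantee, so the same shortcut would not apply there.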