[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Resolution: Fixed Fix Version/s: 4.0.0 Status: Resolved (was: Patch Available) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Fix For: 4.0.0 > > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Open (was: Patch Available) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Attachment: (was: HIVE-21539.4.patch) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Attachment: HIVE-21539.4.patch > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Patch Available (was: Open) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Open (was: Patch Available) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Patch Available (was: Open) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Open (was: Patch Available) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Patch Available (was: Open) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Attachment: HIVE-21539.3.patch > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch, > HIVE-21539.3.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Attachment: HIVE-21539.2.patch > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > | columns.comments
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Patch Available (was: Open) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > | columns.comments
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Open (was: Patch Available) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch, HIVE-21539.2.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > | columns.comments
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Status: Patch Available (was: Open) > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > | columns.comments | > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-21539: --- Attachment: HIVE-21539.1.patch > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Assignee: Vineet Garg >Priority: Major > Attachments: HIVE-21539.1.patch > > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > | columns.comments | > |
[jira] [Updated] (HIVE-21539) GroupBy + where clause on same column results in incorrect query rewrite
[ https://issues.apache.org/jira/browse/HIVE-21539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gopal V updated HIVE-21539: --- Component/s: (was: HiveServer2) CBO > GroupBy + where clause on same column results in incorrect query rewrite > > > Key: HIVE-21539 > URL: https://issues.apache.org/jira/browse/HIVE-21539 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 4.0.0 >Reporter: anishek >Priority: Major > > {code} > create table a (i int, j string); > insert into a values ( 1, 'a'),(2,'b'); > explain extended select min(j) from a where j='a' group by j; > ++ > | Explain | > ++ > | OPTIMIZED SQL: SELECT MIN(TRUE) AS `_o__c0`| > | FROM `default`.`a` | > | WHERE `j` = 'a'| > | GROUP BY TRUE | > | STAGE DEPENDENCIES:| > | Stage-1 is a root stage | > | Stage-0 depends on stages: Stage-1 | > || > | STAGE PLANS: | > | Stage: Stage-1 | > | Tez| > | DagId: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Edges: | > | Reducer 2 <- Map 1 (SIMPLE_EDGE) | > | DagName: > anagarwal_20190318153535_25c1f460-1986-475e-9995-9f6342029dd8:11 | > | Vertices:| > | Map 1 | > | Map Operator Tree: | > | TableScan | > | alias: a | > | filterExpr: (j = 'a') (type: boolean) | > | Statistics: Num rows: 2 Data size: 170 Basic stats: > COMPLETE Column stats: COMPLETE | > | GatherStats: false | > | Filter Operator | > | isSamplingPred: false | > | predicate: (j = 'a') (type: boolean) | > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Select Operator| > | Statistics: Num rows: 1 Data size: 85 Basic stats: > COMPLETE Column stats: COMPLETE | > | Group By Operator| > | aggregations: min(true)| > | keys: true (type: boolean) | > | mode: hash | > | outputColumnNames: _col0, _col1 | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | Reduce Output Operator | > | key expressions: _col0 (type: boolean) | > | null sort order: a | > | sort order: +| > | Map-reduce partition columns: _col0 (type: > boolean) | > | Statistics: Num rows: 1 Data size: 8 Basic stats: > COMPLETE Column stats: COMPLETE | > | tag: -1 | > | value expressions: _col1 (type: boolean) | > | auto parallelism: true | > | Path -> Alias: | > | hdfs://localhost:9000/tmp/hive/warehouse/a [a] | > | Path -> Partition: | > | hdfs://localhost:9000/tmp/hive/warehouse/a | > | Partition | > | base file name: a| > | input format: org.apache.hadoop.mapred.TextInputFormat | > | output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat | > | properties: | > | COLUMN_STATS_ACCURATE > {"BASIC_STATS":"true","COLUMN_STATS":{"i":"true","j":"true"}} | > | bucket_count -1| > | bucketing_version 2| > | column.name.delimiter ,| > | columns i,j| > | columns.comments | > | columns.types int:string | > |