[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Resolution: Fixed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Fix For: 3.0.0 > > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) >
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Attachment: HIVE-16793.6.patch > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > >
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Open (was: Patch Available) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] >
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Patch Available (was: Open) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch, HIVE-16793.6.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > >
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Attachment: HIVE-16793.5.patch Latest patch adds a config param {{hive.optimize.remove.sq_count_check}} to enable this optimization. Since this optimization caters to a very specific case but could have adverse effects (join reordering, joins not merging) we have decided to disable this optimization by default > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) >
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Open (was: Patch Available) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Patch Available (was: Open) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch, HIVE-16793.5.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] >
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Patch Available (was: Open) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Attachment: HIVE-16793.4.patch > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch, HIVE-16793.4.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Open (was: Patch Available) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator [SEL_59]
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Patch Available (was: Open) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator [SEL_59]
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Attachment: HIVE-16793.3.patch > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator [SEL_59] (rows=2
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Open (was: Patch Available) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.3.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator [SEL_59]
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Attachment: (was: HIVE-16793.2.patch) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator [SEL_59] (rows=2 width=621)
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Attachment: HIVE-16793.2.patch > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.2.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator [SEL_59] (rows=2
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Patch Available (was: Open) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.2.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator [SEL_59]
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Status: Open (was: Patch Available) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch, > HIVE-16793.2.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type"] > <-Select Operator [SEL_59]
[jira] [Updated] (HIVE-16793) Scalar sub-query: sq_count_check not required if gby keys are constant
[ https://issues.apache.org/jira/browse/HIVE-16793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-16793: --- Summary: Scalar sub-query: sq_count_check not required if gby keys are constant (was: Scalar sub-query: Scalar safety checks for explicit group-bys) > Scalar sub-query: sq_count_check not required if gby keys are constant > -- > > Key: HIVE-16793 > URL: https://issues.apache.org/jira/browse/HIVE-16793 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Gopal V >Assignee: Vineet Garg > Attachments: HIVE-16793.1.patch, HIVE-16793.2.patch > > > This query has an sq_count check, though is useless on a constant key. > {code} > hive> explain select * from part where p_size > (select max(p_size) from part > where p_type = '1' group by p_type); > Warning: Map Join MAPJOIN[37][bigTable=?] in task 'Map 1' is a cross product > Warning: Map Join MAPJOIN[36][bigTable=?] in task 'Map 1' is a cross product > OK > Plan optimized by CBO. > Vertex dependency in root stage > Map 1 <- Reducer 4 (BROADCAST_EDGE), Reducer 6 (BROADCAST_EDGE) > Reducer 3 <- Map 2 (SIMPLE_EDGE) > Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE) > Reducer 6 <- Map 5 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Map 1 vectorized, llap > File Output Operator [FS_64] > Select Operator [SEL_63] (rows= width=621) > > Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > Filter Operator [FIL_62] (rows= width=625) > predicate:(_col5 > _col10) > Map Join Operator [MAPJOIN_61] (rows=2 width=625) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col10"] > <-Reducer 6 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_58] > Select Operator [SEL_57] (rows=1 width=4) > Output:["_col0"] > Group By Operator [GBY_56] (rows=1 width=89) > > Output:["_col0","_col1"],aggregations:["max(VALUE._col0)"],keys:KEY._col0 > <-Map 5 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_55] > PartitionCols:_col0 > Group By Operator [GBY_54] (rows=86 width=89) > > Output:["_col0","_col1"],aggregations:["max(_col1)"],keys:'1' > Select Operator [SEL_53] (rows=1212121 width=109) > Output:["_col1"] > Filter Operator [FIL_52] (rows=1212121 width=109) > predicate:(p_type = '1') > TableScan [TS_17] (rows=2 width=109) > > tpch_flat_orc_1000@part,part,Tbl:COMPLETE,Col:COMPLETE,Output:["p_type","p_size"] > <-Map Join Operator [MAPJOIN_60] (rows=2 width=621) > > Conds:(Inner),Output:["_col0","_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"] > <-Reducer 4 [BROADCAST_EDGE] vectorized, llap > BROADCAST [RS_51] > Select Operator [SEL_50] (rows=1 width=8) > Filter Operator [FIL_49] (rows=1 width=8) > predicate:(sq_count_check(_col0) <= 1) > Group By Operator [GBY_48] (rows=1 width=8) > Output:["_col0"],aggregations:["count(VALUE._col0)"] > <-Reducer 3 [CUSTOM_SIMPLE_EDGE] vectorized, llap > PARTITION_ONLY_SHUFFLE [RS_47] > Group By Operator [GBY_46] (rows=1 width=8) > Output:["_col0"],aggregations:["count()"] > Select Operator [SEL_45] (rows=1 width=85) > Group By Operator [GBY_44] (rows=1 width=85) > Output:["_col0"],keys:KEY._col0 > <-Map 2 [SIMPLE_EDGE] vectorized, llap > SHUFFLE [RS_43] > PartitionCols:_col0 > Group By Operator [GBY_42] (rows=83 > width=85) > Output:["_col0"],keys:'1' > Select Operator [SEL_41] (rows=1212121 > width=105) > Filter Operator [FIL_40] (rows=1212121 > width=105) > predicate:(p_type = '1') > TableScan [TS_2] (rows=2 > width=105) > >