[jira] [Commented] (HIVE-13713) We miss vectorization in a case of count(*) when aggregation mode is COMPLETE
[ https://issues.apache.org/jira/browse/HIVE-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308899#comment-15308899 ] Matt McCline commented on HIVE-13713: - I messed up. It doesn't need to be in branch-2.1 > We miss vectorization in a case of count(*) when aggregation mode is COMPLETE > - > > Key: HIVE-13713 > URL: https://issues.apache.org/jira/browse/HIVE-13713 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0, 2.2.0 > > Attachments: HIVE-13713.01.patch, HIVE-13713.02.patch, > HIVE-13713.03.patch > > > E.g. vectorization_limit.q doesn't vectorize Reducer 2 for the query: > {code} > "select ctinyint, count(distinct(cdouble)) from alltypesorc group by ctinyint > order by ctinyint limit 20" > {code} > It was producing a stack trace with this error... when trying to vectorize > the COMPLETE mode GROUP BY operator. > {code} > Vector aggregate not implemented: "count" for type: "NONE > (reduce-merge-partial = true) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13713) We miss vectorization in a case of count(*) when aggregation mode is COMPLETE
[ https://issues.apache.org/jira/browse/HIVE-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15308856#comment-15308856 ] Ashutosh Chauhan commented on HIVE-13713: - [~mmccline] This is not in branch-2.1 > We miss vectorization in a case of count(*) when aggregation mode is COMPLETE > - > > Key: HIVE-13713 > URL: https://issues.apache.org/jira/browse/HIVE-13713 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0, 2.2.0 > > Attachments: HIVE-13713.01.patch, HIVE-13713.02.patch, > HIVE-13713.03.patch > > > E.g. vectorization_limit.q doesn't vectorize Reducer 2 for the query: > {code} > "select ctinyint, count(distinct(cdouble)) from alltypesorc group by ctinyint > order by ctinyint limit 20" > {code} > It was producing a stack trace with this error... when trying to vectorize > the COMPLETE mode GROUP BY operator. > {code} > Vector aggregate not implemented: "count" for type: "NONE > (reduce-merge-partial = true) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13713) We miss vectorization in a case of count(*) when aggregation mode is COMPLETE
[ https://issues.apache.org/jira/browse/HIVE-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15307023#comment-15307023 ] Matt McCline commented on HIVE-13713: - Committed to master and branch-2.1 > We miss vectorization in a case of count(*) when aggregation mode is COMPLETE > - > > Key: HIVE-13713 > URL: https://issues.apache.org/jira/browse/HIVE-13713 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.1.0, 2.2.0 > > Attachments: HIVE-13713.01.patch, HIVE-13713.02.patch, > HIVE-13713.03.patch > > > E.g. vectorization_limit.q doesn't vectorize Reducer 2 for the query: > {code} > "select ctinyint, count(distinct(cdouble)) from alltypesorc group by ctinyint > order by ctinyint limit 20" > {code} > It was producing a stack trace with this error... when trying to vectorize > the COMPLETE mode GROUP BY operator. > {code} > Vector aggregate not implemented: "count" for type: "NONE > (reduce-merge-partial = true) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13713) We miss vectorization in a case of count(*) when aggregation mode is COMPLETE
[ https://issues.apache.org/jira/browse/HIVE-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15306890#comment-15306890 ] Hive QA commented on HIVE-13713: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12806889/HIVE-13713.02.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 16 failed/errored test(s), 10169 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestJdbcWithMiniHA - did not produce a TEST-*.xml file TestJdbcWithMiniMr - did not produce a TEST-*.xml file TestMiniTezCliDriver-vector_grouping_sets.q-mapjoin_mapjoin.q-cte_5.q-and-12-more - did not produce a TEST-*.xml file TestOperationLoggingAPIWithTez - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ivyDownload org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats_list_bucket org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_udf1 org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_constprog_partitioner org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_complex_all org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec org.apache.hadoop.hive.ql.lockmgr.TestDbTxnManager2.testLocksInSubquery {noformat} Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/457/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/457/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-457/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 16 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12806889 - PreCommit-HIVE-MASTER-Build > We miss vectorization in a case of count(*) when aggregation mode is COMPLETE > - > > Key: HIVE-13713 > URL: https://issues.apache.org/jira/browse/HIVE-13713 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13713.01.patch, HIVE-13713.02.patch > > > E.g. vectorization_limit.q doesn't vectorize Reducer 2 for the query: > {code} > "select ctinyint, count(distinct(cdouble)) from alltypesorc group by ctinyint > order by ctinyint limit 20" > {code} > It was producing a stack trace with this error... when trying to vectorize > the COMPLETE mode GROUP BY operator. > {code} > Vector aggregate not implemented: "count" for type: "NONE > (reduce-merge-partial = true) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13713) We miss vectorization in a case of count(*) when aggregation mode is COMPLETE
[ https://issues.apache.org/jira/browse/HIVE-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15304640#comment-15304640 ] Sergey Shelukhin commented on HIVE-13713: - +1 pending a new test run (it looks like the above one had a lot of failures). Some nit on RB. Thanks for the comments! > We miss vectorization in a case of count(*) when aggregation mode is COMPLETE > - > > Key: HIVE-13713 > URL: https://issues.apache.org/jira/browse/HIVE-13713 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13713.01.patch > > > E.g. vectorization_limit.q doesn't vectorize Reducer 2 for the query: > {code} > "select ctinyint, count(distinct(cdouble)) from alltypesorc group by ctinyint > order by ctinyint limit 20" > {code} > It was producing a stack trace with this error... when trying to vectorize > the COMPLETE mode GROUP BY operator. > {code} > Vector aggregate not implemented: "count" for type: "NONE > (reduce-merge-partial = true) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13713) We miss vectorization in a case of count(*) when aggregation mode is COMPLETE
[ https://issues.apache.org/jira/browse/HIVE-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15278038#comment-15278038 ] Hive QA commented on HIVE-13713: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12802922/HIVE-13713.01.patch {color:green}SUCCESS:{color} +1 due to 5 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 22 failed/errored test(s), 9949 tests executed *Failed tests:* {noformat} TestCliDriver-partition_timestamp.q-vector_outer_join5.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestCliDriver-ppd_join4.q-union27.q-show_indexes_edge_cases.q-and-12-more - did not produce a TEST-*.xml file TestCliDriver-ptf_general_queries.q-unionDistinct_1.q-groupby1_noskew.q-and-12-more - did not produce a TEST-*.xml file TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file TestMiniTezCliDriver-enforce_order.q-vector_partition_diff_num_cols.q-unionDistinct_1.q-and-12-more - did not produce a TEST-*.xml file TestMiniTezCliDriver-join1.q-mapjoin_decimal.q-vectorized_distinct_gby.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-bucketsortoptimize_insert_7.q-smb_mapjoin_15.q-mapreduce1.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-skewjoinopt3.q-union27.q-multigroupby_singlemr.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_external_table_ppd org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_binary_storage_queries org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_index_bitmap3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_llap_nullscan org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_aggregate_without_gby org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_short_regress org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_short_regress org.apache.hadoop.hive.llap.tez.TestConverters.testFragmentSpecToTaskSpec org.apache.hadoop.hive.llap.tezplugins.TestLlapTaskCommunicator.testFinishableStateUpdateFailure org.apache.hadoop.hive.metastore.hbase.TestHBaseSchemaTool.oneMondoTest org.apache.hadoop.hive.ql.exec.vector.TestVectorGroupByOperator.testCountReduce org.apache.hive.service.cli.session.TestHiveSessionImpl.testLeakOperationHandle {noformat} Test results: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/228/testReport Console output: http://ec2-54-177-240-2.us-west-1.compute.amazonaws.com/job/PreCommit-HIVE-MASTER-Build/228/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-MASTER-Build-228/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 22 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12802922 - PreCommit-HIVE-MASTER-Build > We miss vectorization in a case of count(*) when aggregation mode is COMPLETE > - > > Key: HIVE-13713 > URL: https://issues.apache.org/jira/browse/HIVE-13713 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13713.01.patch > > > E.g. vectorization_limit.q doesn't vectorize Reducer 2 for the query: > {code} > "select ctinyint, count(distinct(cdouble)) from alltypesorc group by ctinyint > order by ctinyint limit 20" > {code} > It was producing a stack trace with this error... when trying to vectorize > the COMPLETE mode GROUP BY operator. > {code} > Vector aggregate not implemented: "count" for type: "NONE > (reduce-merge-partial = true) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13713) We miss vectorization in a case of count(*) when aggregation mode is COMPLETE
[ https://issues.apache.org/jira/browse/HIVE-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275482#comment-15275482 ] Matt McCline commented on HIVE-13713: - (Running tests on internal PTest cluster) > We miss vectorization in a case of count(*) when aggregation mode is COMPLETE > - > > Key: HIVE-13713 > URL: https://issues.apache.org/jira/browse/HIVE-13713 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Attachments: HIVE-13713.01.patch > > > E.g. vectorization_limit.q doesn't vectorize Reducer 2 for the query: > {code} > "select ctinyint, count(distinct(cdouble)) from alltypesorc group by ctinyint > order by ctinyint limit 20" > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13713) We miss vectorization in a case of count(*) when aggregation mode is COMPLETE
[ https://issues.apache.org/jira/browse/HIVE-13713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275153#comment-15275153 ] Matt McCline commented on HIVE-13713: - With the fix, vectorization_limit.q does vectorize Reducer 2: {code} Reducer 2 Execution mode: vectorized Reduce Operator Tree: Group By Operator keys: KEY._col0 (type: tinyint), KEY._col1 (type: double) mode: mergepartial outputColumnNames: _col0, _col1 Statistics: Num rows: 6144 Data size: 1320982 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count(_col1) keys: _col0 (type: tinyint) mode: complete outputColumnNames: _col0, _col1 Statistics: Num rows: 3072 Data size: 660491 Basic stats: COMPLETE Column stats: NONE Limit Number of rows: 20 Statistics: Num rows: 20 Data size: 4300 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 20 Data size: 4300 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe {code} > We miss vectorization in a case of count(*) when aggregation mode is COMPLETE > - > > Key: HIVE-13713 > URL: https://issues.apache.org/jira/browse/HIVE-13713 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > > E.g. vectorization_limit.q doesn't vectorize Reducer 2 for the query: > {code} > "select ctinyint, count(distinct(cdouble)) from alltypesorc group by ctinyint > order by ctinyint limit 20" > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)