[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
    [ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518216#comment-14518216 ]

Sushanth Sowmyan commented on HIVE-10450:
-----------------------------------------

+1 for inclusion into branch-1.2

> More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-10450
>                 URL: https://issues.apache.org/jira/browse/HIVE-10450
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, HIVE-10450.02.patch, HIVE-10450.03.patch, HIVE-10450.04.patch
>
>
> [~gopalv] found an error with this query:
> {noformat}
> explain select
>   s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>        store_sales.ss_store_sk = store.s_store_sk and
>        store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
>     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>     at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>     at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>     at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>     at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>     at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>     at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>     at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>     at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>     at org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>     at org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>     at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>     at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>     at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>     at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>     at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>     at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>     at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>     at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>     at org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
    [ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515132#comment-14515132 ]

Matt McCline commented on HIVE-10450:
-------------------------------------

[~gopalv] thank you for reviewing and helping solve the problem.

Note that the problem can also manifest in MR as a NullPointerException with this stack trace (when the tables have common column types but different numbers of columns):
{noformat}
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.setBatch(VectorExtractRow.java:705)
    at org.apache.hadoop.hive.ql.exec.vector.VectorExtractRowDynBatch.setBatchOnEntry(VectorExtractRowDynBatch.java:34)
    at org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:89)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
    at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
    ... 10 more
{noformat}
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
    [ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514736#comment-14514736 ]

Gopal V commented on HIVE-10450:
--------------------------------

Patch LGTM - +1.
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
    [ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514100#comment-14514100 ]

Hive QA commented on HIVE-10450:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728378/HIVE-10450.04.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8816 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3606/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3606/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3606/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728378 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
    [ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513918#comment-14513918 ]

Matt McCline commented on HIVE-10450:
-------------------------------------

Fixed the obviously related TestCliDriver .q file issues: mergejoin.q, vectorized_ptf.q, and vectorized_shufflejoin.q.
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
    [ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513876#comment-14513876 ]

Hive QA commented on HIVE-10450:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728342/HIVE-10450.03.patch

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 8814 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_shufflejoin
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3604/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3604/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3604/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728342 - PreCommit-HIVE-TRUNK-Build
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
[ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513723#comment-14513723 ]

Matt McCline commented on HIVE-10450:
-------------------------------------

Re-coded to loop and count TableScanOperator occurrences. Reject vectorization if count > 1.

Not sure I understand the dummy operator thing, but clearly patch #2 was rejecting many, many queries it should not have.

> More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-10450
>                 URL: https://issues.apache.org/jira/browse/HIVE-10450
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, HIVE-10450.02.patch, HIVE-10450.03.patch
>
>
> [~gopalv] found an error with this query:
> {noformat}
> explain select
>   s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>    store_sales.ss_store_sk = store.s_store_sk and
>    store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.reflect.InvocationTargetException
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.ha
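
The check described in the comment above (loop, count TableScanOperator occurrences, reject vectorization if the count exceeds 1) might look roughly like the sketch below. This is only an illustration, not the code in the attached patch: `Op`, `TableScanOp`, `DummyOp`, `countTableScans` and `canVectorize` are hypothetical stand-ins invented here, not Hive classes or methods, and the alias-to-operator map is an assumed shape for a MapWork's root operators.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; none of these types are Hive's real classes.
final class TableScanCountSketch {

    // Hypothetical stand-ins for the root operators of a map-side plan.
    interface Op {}
    static final class TableScanOp implements Op {}
    static final class DummyOp implements Op {}

    // Loop over the root operators and count only the table scans.
    static int countTableScans(Map<String, Op> aliasToOp) {
        int count = 0;
        for (Op op : aliasToOp.values()) {
            if (op instanceof TableScanOp) {
                count++;
            }
        }
        return count;
    }

    // Early exit: refuse to vectorize when more than one table scan is present.
    static boolean canVectorize(Map<String, Op> aliasToOp) {
        return countTableScans(aliasToOp) <= 1;
    }

    public static void main(String[] args) {
        // One real scan plus a non-scan entry (assumed to model a dummy-operator alias).
        Map<String, Op> plan = new LinkedHashMap<>();
        plan.put("store_sales", new TableScanOp());
        plan.put("dummy", new DummyOp());
        System.out.println("one scan + dummy -> vectorize? " + canVectorize(plan));

        // A second table scan in the same map work triggers the rejection.
        plan.put("store", new TableScanOp());
        System.out.println("two scans + dummy -> vectorize? " + canVectorize(plan));
    }
}
```

Counting by operator type rather than testing the map's size is the point of the sketch: under the assumption that dummy-operator aliases appear alongside the scans, a plain `size() == 1` test would also reject the first plan, which is one possible reading of why patch #2 turned away so many queries.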
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
[ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513636#comment-14513636 ]

Gopal V commented on HIVE-10450:
--------------------------------

[~mmccline]: I think the dummy operator aliases play into the .size() == 1 check in this patch, for aliases which come via the broadcast edge.

> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, HIVE-10450.02.patch
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
[ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513337#comment-14513337 ]

Hive QA commented on HIVE-10450:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728268/HIVE-10450.02.patch

{color:red}ERROR:{color} -1 due to 189 failed/errored test(s), 8815 tests executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_aggregate_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_cast_constant
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_simple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_count_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_data_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_date_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_10_0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_cast
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_expressions
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_math_funcs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_precision
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_distinct_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_if_expr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_1
org.apache.hadoop.hive.cli.TestCliDriver.test
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
[ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513268#comment-14513268 ]

Matt McCline commented on HIVE-10450:
-------------------------------------

Yes, [~gopalv] good point. Attached simpler patch.

> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, HIVE-10450.02.patch
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
[ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513030#comment-14513030 ]

Gopal V commented on HIVE-10450:
--------------------------------

[~mmccline]: this fix seems too involved - could this issue be fixed by an early-exit criterion which does not vectorize anything where there are >1 top-ops?

> Attachments: HIVE-10450.01.patch
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
[ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510814#comment-14510814 ]

Gopal V commented on HIVE-10450:
--------------------------------

Ah, that's what possibly confused me - this sort of plan is the archaic MR style JOIN.

We should be seeing the good vectorized MapJoin for that query in Tez.

> Attachments: HIVE-10450.01.patch
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
[ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510812#comment-14510812 ]

Matt McCline commented on HIVE-10450:
-------------------------------------

[~gopalv] this problem is for MR not Tez (that I know of).

> Attachments: HIVE-10450.01.patch
[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization
[ https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510634#comment-14510634 ]

Gopal V commented on HIVE-10450:
--------------------------------

[~mmccline]: this could very well be a planning error in the Tez compiler.

I'm trying to figure out how the split generation for this would work - the parallelism for the store_sales and store tables should be massively different.

This might be a scenario where we're fixing up vectorization for an incorrect plan.

> Attachments: HIVE-10450.01.patch