[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-28 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518216#comment-14518216
 ] 

Sushanth Sowmyan commented on HIVE-10450:
-

+1 for inclusion into branch-1.2

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch, HIVE-10450.03.patch, HIVE-10450.04.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-27 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14515132#comment-14515132
 ] 

Matt McCline commented on HIVE-10450:
-

[~gopalv] thank you for reviewing and helping solve the problem.

Note the problem can also manifest in MR as a NullPointerException with this 
stack trace (when tables common column types but with different number of 
columns):

{noformat}
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.exec.vector.VectorExtractRow.setBatch(VectorExtractRow.java:705)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorExtractRowDynBatch.setBatchOnEntry(VectorExtractRowDynBatch.java:34)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorReduceSinkOperator.process(VectorReduceSinkOperator.java:89)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at 
org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97)
at 
org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:162)
at 
org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
... 10 more
{noformat}

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch, HIVE-10450.03.patch, HIVE-10450.04.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-27 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514736#comment-14514736
 ] 

Gopal V commented on HIVE-10450:


Patch LGTM - +1.

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch, HIVE-10450.03.patch, HIVE-10450.04.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:244)
>   at 
> org.apache.hado

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14514100#comment-14514100
 ] 

Hive QA commented on HIVE-10450:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728378/HIVE-10450.04.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8816 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchEmptyCommit
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3606/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3606/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3606/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728378 - PreCommit-HIVE-TRUNK-Build

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch, HIVE-10450.03.patch, HIVE-10450.04.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.a

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-27 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513918#comment-14513918
 ] 

Matt McCline commented on HIVE-10450:
-

Fix obviously related TestCliDriver q files issues: mergejoin.q, 
vectorized_ptf.q, and vectorized_shufflejoin.q.

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch, HIVE-10450.03.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>   at 
> org.apache.hadoop.hive.cli.Tes

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-27 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513876#comment-14513876
 ] 

Hive QA commented on HIVE-10450:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728342/HIVE-10450.03.patch

{color:red}ERROR:{color} -1 due to 19 failed/errored test(s), 8814 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_index_bitmap_auto_partitioned
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_shufflejoin
org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler
org.apache.hive.jdbc.TestSSL.testSSLFetchHttp
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3604/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3604/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3604/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 19 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728342 - PreCommit-HIVE-TRUNK-Build

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch, HIVE-10450.03.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-27 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513723#comment-14513723
 ] 

Matt McCline commented on HIVE-10450:
-

Re-coded to loop and count TableScanOperator occurrences.  Reject vectorization 
if count > 1.  Not sure I understand the dummy operator thing, but clearly 
patch #2 was rejecting many, many queries it should have.

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch, HIVE-10450.03.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.ha

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-27 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513636#comment-14513636
 ] 

Gopal V commented on HIVE-10450:


[~mmccline]: I think the dummy operator aliases play into the .size() == 1 in 
this patch, for aliases which come via the broadcast edge.

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTe

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-26 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513337#comment-14513337
 ] 

Hive QA commented on HIVE-10450:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728268/HIVE-10450.02.patch

{color:red}ERROR:{color} -1 due to 189 failed/errored test(s), 8815 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mergejoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_aggregate_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_between_in
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_cast_constant
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_simple
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_coalesce_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_count_distinct
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_data_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_date_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_10_0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_aggregate
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_cast
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_expressions
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_math_funcs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_precision
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_round_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_udf2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_distinct_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_elt
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_groupby_reduce
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_if_expr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_1
org.apache.hadoop.hive.cli.TestCliDriver.test

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-26 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513268#comment-14513268
 ] 

Matt McCline commented on HIVE-10450:
-

Yes, [~gopalv] good point.  Attached simpler patch.

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch, HIVE-10450.01.patch, 
> HIVE-10450.02.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:244)
>   at 
> org.apache.h

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-26 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14513030#comment-14513030
 ] 

Gopal V commented on HIVE-10450:


[~mmccline]: this fix seems too involved - could this issue be fixed by an 
early-exit critera which does not vectorize anything where there are >1 top-ops?

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMin

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510814#comment-14510814
 ] 

Gopal V commented on HIVE-10450:


Ah, that's what possibly confused me - this sort of plan is the archaic MR 
style JOIN.

We should be seeing the good vectorized MapJoin for that query in Tez.

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(Test

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510812#comment-14510812
 ] 

Matt McCline commented on HIVE-10450:
-

[~gopalv] this problem is for MR not Tez (that I know of).

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:244)
>   at 
> org.apache.hadoop.hive.cli.TestMiniTezCliDriver.te

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14510634#comment-14510634
 ] 

Gopal V commented on HIVE-10450:


[~mmccline]: this could very well be a planning error in Tez compiler.

I'm trying to figure out how the split generation for this would work - the 
parallelism for store_sales and store tables should be massively different.

This might be a scenario where we're fixing up vectorization for an incorrect 
plan.

> More than one TableScan in MapWork not supported in Vectorization -- causes  
> query to fail during vectorization
> ---
>
> Key: HIVE-10450
> URL: https://issues.apache.org/jira/browse/HIVE-10450
> Project: Hive
>  Issue Type: Bug
>Reporter: Matt McCline
>Assignee: Matt McCline
>Priority: Critical
> Attachments: HIVE-10450.01.patch
>
>
> [~gopalv] found a error with this query:
> {noformat}
> explain select
> s_state, count(1)
>  from store_sales,
>  store,
>  date_dim
>  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
>store_sales.ss_store_sk = store.s_store_sk and
>store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
>  group by s_state
>  order by s_state
>  limit 100;
> {noformat}
> Stack trace:
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.reflect.InvocationTargetException
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
>   at 
> org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
>   at 
> org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
>   at 
> org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
>   at 
> org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
>   at 
> org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
>   at 
> org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
>   at 
> org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
>   at 
> org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at 
> org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
>   at 
> org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
>