[jira] [Commented] (HIVE-10476) Hive query should fail when it fails to initialize a session in SetSparkReducerParallelism [Spark Branch]

2015-04-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510534#comment-14510534
 ] 

Hive QA commented on HIVE-10476:




{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12727803/HIVE-10476.1-spark.patch

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8721 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did 
not produce a TEST-*.xml file
TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not 
produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more
 - did not produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/840/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/840/console
Test logs: 
http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-840/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12727803 - PreCommit-HIVE-SPARK-Build

 Hive query should fail when it fails to initialize a session in 
 SetSparkReducerParallelism [Spark Branch]
 -

 Key: HIVE-10476
 URL: https://issues.apache.org/jira/browse/HIVE-10476
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Chao Sun
Assignee: Chao Sun
Priority: Minor
 Attachments: HIVE-10476.1-spark.patch


 Currently, for a Hive query, HoS needs to get a session twice: once in 
 SparkSetReducerParallelism, and again when submitting the actual job.
 The issue is that sometimes there's a problem when launching a YARN 
 application (e.g., insufficient permissions), and the user then has to wait 
 for two timeouts, because both session initializations fail. This turns out 
 to happen frequently.
 This JIRA proposes to fail the query in SparkSetReducerParallelism when it 
 cannot initialize the session.
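
 A minimal fail-fast sketch of the proposal; getOrCreateSession() and the 
 exception handling are hypothetical stand-ins for the actual spark-branch 
 session APIs:
 {code}
 public final class FailFastSessionExample {

   // Hypothetical stand-in for launching the YARN application.
   static Object getOrCreateSession() throws Exception {
     throw new Exception("cannot launch YARN application: permission denied");
   }

   public static void main(String[] args) {
     try {
       Object session = getOrCreateSession();  // first (and only) init attempt
       System.out.println("session ready: " + session);
     } catch (Exception e) {
       // Fail the query right here, in SetSparkReducerParallelism, instead of
       // retrying at job submission - the user waits for one timeout, not two.
       throw new RuntimeException("Failed to initialize Spark session", e);
     }
   }
 }
 {code}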



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9671) Support Impersonation [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510542#comment-14510542
 ] 

Lefty Leverenz commented on HIVE-9671:
--

[~brocknoland], now that this has been merged to trunk (aka master) I'll ask 
again:  does this need documentation?

 Support Impersonation [Spark Branch]
 

 Key: HIVE-9671
 URL: https://issues.apache.org/jira/browse/HIVE-9671
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: spark-branch

 Attachments: HIVE-9671.1-spark.patch, HIVE-9671.1-spark.patch, 
 HIVE-9671.2-spark.patch, HIVE-9671.3-spark.patch


 SPARK-5493 in Spark 1.3 implemented proxy user authentication. We need to 
 support this option in the Spark client.
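
 For reference, SPARK-5493 exposed this as a --proxy-user option on 
 spark-submit; on the client side, impersonation typically goes through 
 Hadoop's UserGroupInformation proxy-user API. A minimal sketch (submitJob() 
 is a hypothetical placeholder, not the actual spark client code):
 {code}
 import java.security.PrivilegedExceptionAction;
 import org.apache.hadoop.security.UserGroupInformation;

 public final class ProxyUserExample {
   public static void main(String[] args) throws Exception {
     UserGroupInformation realUser = UserGroupInformation.getLoginUser();
     UserGroupInformation proxy =
         UserGroupInformation.createProxyUser("querying-user", realUser);
     // Everything inside doAs() runs as the proxied user, so the submitted
     // job is owned by the end user rather than the hive service principal.
     proxy.doAs((PrivilegedExceptionAction<Void>) () -> {
       submitJob();
       return null;
     });
   }

   static void submitJob() { /* placeholder for the actual submission call */ }
 }
 {code}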



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10347) Merge spark to trunk 4/15/2015

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510571#comment-14510571
 ] 

Lefty Leverenz commented on HIVE-10347:
---

It turns out there aren't any doc issues in this merge (unless HIVE-9671 
Support impersonation needs to be documented).

 Merge spark to trunk 4/15/2015
 --

 Key: HIVE-10347
 URL: https://issues.apache.org/jira/browse/HIVE-10347
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Szehon Ho
Assignee: Szehon Ho
 Fix For: 1.2.0

 Attachments: HIVE-10347.2.patch, HIVE-10347.2.patch, 
 HIVE-10347.3.patch, HIVE-10347.4.patch, HIVE-10347.5.patch, 
 HIVE-10347.5.patch, HIVE-10347.6.patch, HIVE-10347.6.patch, HIVE-10347.patch


 CLEAR LIBRARY CACHE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510812#comment-14510812
 ] 

Matt McCline commented on HIVE-10450:
-

[~gopalv] this problem is for MR, not Tez (as far as I know).

 More than one TableScan in MapWork not supported in Vectorization -- causes  
 query to fail during vectorization
 ---

 Key: HIVE-10450
 URL: https://issues.apache.org/jira/browse/HIVE-10450
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-10450.01.patch


 [~gopalv] found an error with this query:
 {noformat}
 explain select
 s_state, count(1)
  from store_sales,
  store,
  date_dim
  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
store_sales.ss_store_sk = store.s_store_sk and
store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
  group by s_state
  order by s_state
  limit 100;
 {noformat}
 Stack trace:
 {noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.reflect.InvocationTargetException
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
   at 
 org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:244)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_inner_join5(TestMiniTezCliDriver.java:180)
   at 

[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510814#comment-14510814
 ] 

Gopal V commented on HIVE-10450:


Ah, that's probably what confused me - this sort of plan is the archaic 
MR-style JOIN.

We should be seeing the good vectorized MapJoin for that query in Tez.

 More than one TableScan in MapWork not supported in Vectorization -- causes  
 query to fail during vectorization
 ---

 Key: HIVE-10450
 URL: https://issues.apache.org/jira/browse/HIVE-10450
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-10450.01.patch


 [~gopalv] found an error with this query:
 {noformat}
 explain select
 s_state, count(1)
  from store_sales,
  store,
  date_dim
  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
store_sales.ss_store_sk = store.s_store_sk and
store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
  group by s_state
  order by s_state
  limit 100;
 {noformat}
 Stack trace:
 {noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.reflect.InvocationTargetException
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
   at 
 org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:244)
   at 
 

[jira] [Updated] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-24 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-10450:

Attachment: HIVE-10450.01.patch

 More than one TableScan in MapWork not supported in Vectorization -- causes  
 query to fail during vectorization
 ---

 Key: HIVE-10450
 URL: https://issues.apache.org/jira/browse/HIVE-10450
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-10450.01.patch


 [~gopalv] found an error with this query:
 {noformat}
 explain select
 s_state, count(1)
  from store_sales,
  store,
  date_dim
  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
store_sales.ss_store_sk = store.s_store_sk and
store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
  group by s_state
  order by s_state
  limit 100;
 {noformat}
 Stack trace:
 {noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.reflect.InvocationTargetException
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
   at 
 org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:244)
   at 
 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_inner_join5(TestMiniTezCliDriver.java:180)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 

[jira] [Updated] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-24 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-5672:

Attachment: HIVE-5672.6.patch.tar.gz

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz, 
 HIVE-5672.6.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great, but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10450) More than one TableScan in MapWork not supported in Vectorization -- causes query to fail during vectorization

2015-04-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510634#comment-14510634
 ] 

Gopal V commented on HIVE-10450:


[~mmccline]: this could very well be a planning error in the Tez compiler.

I'm trying to figure out how the split generation for this would work - the 
parallelism for store_sales and store tables should be massively different.

This might be a scenario where we're fixing up vectorization for an incorrect 
plan.

 More than one TableScan in MapWork not supported in Vectorization -- causes  
 query to fail during vectorization
 ---

 Key: HIVE-10450
 URL: https://issues.apache.org/jira/browse/HIVE-10450
 Project: Hive
  Issue Type: Bug
Reporter: Matt McCline
Assignee: Matt McCline
Priority: Critical
 Attachments: HIVE-10450.01.patch


 [~gopalv] found an error with this query:
 {noformat}
 explain select
 s_state, count(1)
  from store_sales,
  store,
  date_dim
  where store_sales.ss_sold_date_sk = date_dim.d_date_sk and
store_sales.ss_store_sk = store.s_store_sk and
store.s_state in ('KS','AL', 'MN', 'AL', 'SC', 'VT')
  group by s_state
  order by s_state
  limit 100;
 {noformat}
 Stack trace:
 {noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: 
 java.lang.reflect.InvocationTargetException
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationNodeProcessor.doVectorize(Vectorizer.java:676)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$MapWorkVectorizationNodeProcessor.process(Vectorizer.java:735)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:95)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:79)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:54)
   at 
 org.apache.hadoop.hive.ql.lib.PreOrderWalker.walk(PreOrderWalker.java:59)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:110)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.vectorizeMapWork(Vectorizer.java:422)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.convertMapWork(Vectorizer.java:354)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer$VectorizationDispatcher.dispatch(Vectorizer.java:322)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
   at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.Vectorizer.resolve(Vectorizer.java:877)
   at 
 org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
   at 
 org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
   at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
   at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:204)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
   at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1019)
   at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:993)
   at 
 

[jira] [Updated] (HIVE-10436) closed

2015-04-24 Thread anna ken (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

anna ken updated HIVE-10436:

Description: closed  (was: Hi,

In Hadoop, while creating a table in Hive I get stuck on the error below:

15/04/21 12:35:34 INFO log.PerfLogger: PERFLOG method=Driver.run 
from=org.apache.hadoop.hive.ql.Driver
15/04/21 12:35:34 INFO log.PerfLogger: PERFLOG method=TimeToSubmit 
from=org.apache.hadoop.hive.ql.Driver
15/04/21 12:35:34 INFO log.PerfLogger: PERFLOG method=acquireReadWriteLocks 
from=org.apache.hadoop.hive.ql.Driver
15/04/21 12:35:34 INFO lockmgr.DummyTxnManager: Creating lock manager of type 
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager
15/04/21 12:35:34 INFO zookeeper.ZooKeeper: Initiating client connection, 
connectString=dkhc3013.dcsg.com:2181,dkhc3010.dcsg.com:2181,dkhc3011.dcsg.com:2181
 sessionTimeout=60 
watcher=org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager$DummyWatcher@5b9e1cd4
15/04/21 12:35:34 DEBUG lockmgr.DummyTxnManager: Adding 
/incoming/mkt/gcdb.etl_master_account_pref to list of lock inputs
15/04/21 12:35:34 DEBUG lockmgr.DummyTxnManager: Adding database:mkt_incoming 
to list of lock outputs

After restarting the ZooKeeper service I am able to run the query 
successfully, but after some time I face the same issue/error again and get 
stuck on it.

Is there any solution to overcome this issue, or any tuning I can do to 
resolve it?

Please suggest.)
Summary: closed  (was: DEBUG lockmgr.DummyTxnManager: Adding database 
:mkt_incoming to list of lock outputs)

 closed
 --

 Key: HIVE-10436
 URL: https://issues.apache.org/jira/browse/HIVE-10436
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Reporter: ankush
  Labels: hadoop-2.0, hive, zookeeper

 closed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-24 Thread Nemon Lou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511006#comment-14511006
 ] 

Nemon Lou commented on HIVE-5672:
-

This patch only adds the grammar support and does some code refactoring. It 
works as expected: Hive can write to an HDFS directory after applying this 
patch.

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz, 
 HIVE-5672.6.patch, HIVE-5672.6.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great, but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10436) closed

2015-04-24 Thread anna ken (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510944#comment-14510944
 ] 

anna ken commented on HIVE-10436:
-

resolved

 closed
 --

 Key: HIVE-10436
 URL: https://issues.apache.org/jira/browse/HIVE-10436
 Project: Hive
  Issue Type: Bug
  Components: Hive, HiveServer2
Reporter: ankush
  Labels: hadoop-2.0, hive, zookeeper

 closed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10477) Provide option to disable Spark tests in Windows OS

2015-04-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511038#comment-14511038
 ] 

Xuefu Zhang commented on HIVE-10477:


I'm wondering whether it's possible to detect the OS type in pom.xml and skip 
the Spark tests automatically if the OS is Windows.
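
For what it's worth, Maven profiles can be activated per OS family (an 
<os><family>windows</family></os> activation), which keys off the JVM's 
os.name property; a tiny sketch of the underlying check:
{code}
public final class OsCheck {
  public static void main(String[] args) {
    // Same property Maven's OS-based profile activation inspects.
    String os = System.getProperty("os.name").toLowerCase();
    boolean isWindows = os.contains("windows");
    System.out.println(isWindows ? "skip spark tests" : "run spark tests");
  }
}
{code}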

 Provide option to disable Spark tests in Windows OS
 ---

 Key: HIVE-10477
 URL: https://issues.apache.org/jira/browse/HIVE-10477
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10477.1.patch


 In the current master branch, unit tests fail on Windows because of the 
 dependency on a bash executable in itests/hive-unit/pom.xml around these 
 lines:
 {code}
 <target>
   <exec executable="bash" dir="${basedir}" failonerror="true">
     <arg line="../target/download.sh"/>
   </exec>
 </target>
 {code}
 We should provide an option to disable the Spark tests on OSes like Windows 
 where bash might be absent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-24 Thread Nemon Lou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nemon Lou updated HIVE-5672:

Attachment: HIVE-5672.6.patch

 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz, 
 HIVE-5672.6.patch, HIVE-5672.6.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great, but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-5672) Insert with custom separator not supported for non-local directory

2015-04-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511084#comment-14511084
 ] 

Xuefu Zhang commented on HIVE-5672:
---

Here is what I have about the combined grammar:
{code}
  (local = KW_LOCAL)? KW_DIRECTORY StringLiteral
      tableRowFormat? tableFileFormat?
  -> ^(TOK_DIR StringLiteral $local? tableRowFormat? tableFileFormat?)
{code}

With this, I'm sure SemanticAnalyzer has the information about whether the 
directory is local.
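
A sketch of how that check could look against the AST (the token handling is 
an assumption about the generated parser, not verified against the patch):
{code}
import org.antlr.runtime.tree.Tree;

public final class LocalDirCheck {
  // With the combined rule, local-ness is just an optional KW_LOCAL child of
  // the TOK_DIR node; the analyzer can test for it directly.
  static boolean isLocalDirectory(Tree dirNode, int kwLocalTokenType) {
    for (int i = 0; i < dirNode.getChildCount(); i++) {
      if (dirNode.getChild(i).getType() == kwLocalTokenType) {
        return true;  // the optional $local child is present
      }
    }
    return false;
  }
}
{code}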



 Insert with custom separator not supported for non-local directory
 --

 Key: HIVE-5672
 URL: https://issues.apache.org/jira/browse/HIVE-5672
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 1.0.0
Reporter: Romain Rigaux
Assignee: Nemon Lou
 Attachments: HIVE-5672.1.patch, HIVE-5672.2.patch, HIVE-5672.3.patch, 
 HIVE-5672.4.patch, HIVE-5672.5.patch, HIVE-5672.5.patch.tar.gz, 
 HIVE-5672.6.patch, HIVE-5672.6.patch.tar.gz


 https://issues.apache.org/jira/browse/HIVE-3682 is great, but non-local 
 directories don't seem to be supported:
 {code}
 insert overwrite directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select description FROM sample_07
 {code}
 {code}
 Error while compiling statement: FAILED: ParseException line 2:0 cannot 
 recognize input near 'row' 'format' 'delimited' in select clause
 {code}
 This works (with 'local'):
 {code}
 insert overwrite local directory '/tmp/test-02'
 row format delimited
 FIELDS TERMINATED BY ':'
 select code, description FROM sample_07
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10455) CBO (Calcite Return Path): Different data types at Reducer before JoinOp

2015-04-24 Thread Pengcheng Xiong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511455#comment-14511455
 ] 

Pengcheng Xiong commented on HIVE-10455:


[~jcamachorodriguez], as per [~jpullokkaran]'s request, could you please review 
the patch? Thanks!

 CBO (Calcite Return Path): Different data types at Reducer before JoinOp
 

 Key: HIVE-10455
 URL: https://issues.apache.org/jira/browse/HIVE-10455
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.2.0

 Attachments: HIVE-10455.01.patch


 The following error occurred for cbo_subq_not_in.q:
 {code}
 java.lang.Exception: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable 
 to deserialize reduce input key from x1x128x0x0x1 with properties 
 {columns=reducesinkkey0, 
 serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
  serialization.sort.order=+, columns.types=double}
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
 {code}
 An easier way to reproduce it is:
 {code}
 set hive.cbo.enable=true;
 set hive.exec.check.crossproducts=false;
 set hive.stats.fetch.column.stats=true;
 set hive.auto.convert.join=false;
 select p_size, src.key
 from 
 part join src
 on p_size=key;
 {code}
 As you can see, p_size is an integer while src.key is a string. Both should 
 be cast to double when they join. When the return path is off, this happens 
 before the Join, at the RS. However, when the return path is on, the cast is 
 considered an expression in the Join. Thus, when the reducer collects keys 
 of different types from the different join branches, it throws an exception.
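
 The coercion described above, in miniature (plain Java, just to show the 
 intended semantics, not Hive's cast machinery):
 {code}
 // Both join keys must be coerced to a common type (double) before key
 // comparison at the reducer; comparing raw int vs. string keys cannot work.
 public final class JoinKeyCoercion {
   public static void main(String[] args) {
     int pSize = 10;        // part.p_size is an int
     String srcKey = "10";  // src.key is a string
     double left = (double) pSize;
     double right = Double.parseDouble(srcKey);
     System.out.println(left == right);  // true once both sides are double
   }
 }
 {code}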



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9671) Support Impersonation [Spark Branch]

2015-04-24 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511514#comment-14511514
 ] 

Szehon Ho commented on HIVE-9671:
-

It doesn't seem to, per my understanding: it works with the current Hive 
impersonation config, just passed into Spark instead of MR.  Thanks

 Support Impersonation [Spark Branch]
 

 Key: HIVE-9671
 URL: https://issues.apache.org/jira/browse/HIVE-9671
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: spark-branch
Reporter: Brock Noland
Assignee: Brock Noland
 Fix For: spark-branch

 Attachments: HIVE-9671.1-spark.patch, HIVE-9671.1-spark.patch, 
 HIVE-9671.2-spark.patch, HIVE-9671.3-spark.patch


 SPARK-5493 in Spark 1.3 implemented proxy user authentication. We need to 
 support this option in the Spark client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10455) CBO (Calcite Return Path): Different data types at Reducer before JoinOp

2015-04-24 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-10455:
---
Attachment: HIVE-10455.01.patch

 CBO (Calcite Return Path): Different data types at Reducer before JoinOp
 

 Key: HIVE-10455
 URL: https://issues.apache.org/jira/browse/HIVE-10455
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Fix For: 1.2.0

 Attachments: HIVE-10455.01.patch


 The following error occurred for cbo_subq_not_in.q:
 {code}
 java.lang.Exception: java.lang.RuntimeException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error: Unable 
 to deserialize reduce input key from x1x128x0x0x1 with properties 
 {columns=reducesinkkey0, 
 serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
  serialization.sort.order=+, columns.types=double}
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
 at 
 org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
 {code}
 An easier way to reproduce it is:
 {code}
 set hive.cbo.enable=true;
 set hive.exec.check.crossproducts=false;
 set hive.stats.fetch.column.stats=true;
 set hive.auto.convert.join=false;
 select p_size, src.key
 from 
 part join src
 on p_size=key;
 {code}
 As you can see, p_size is an integer while src.key is a string. Both should 
 be cast to double when they join. When the return path is off, this happens 
 before the Join, at the RS. However, when the return path is on, the cast is 
 considered an expression in the Join. Thus, when the reducer collects keys 
 of different types from the different join branches, it throws an exception.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10302) Load small tables (for map join) in executor memory only once [Spark Branch]

2015-04-24 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-10302:
---
Attachment: HIVE-10302.4-spark.patch

Thanks a lot for the discussion and review. Yes, the Spark-related test 
failures are related: the loaded small tables were cleared after the map 
join, so the table container was empty in the cache. Attached v4, which 
doesn't clear the small tables when running on Spark with a dedicated 
cluster, i.e., when the small tables are cached.

 Load small tables (for map join) in executor memory only once [Spark Branch]
 

 Key: HIVE-10302
 URL: https://issues.apache.org/jira/browse/HIVE-10302
 Project: Hive
  Issue Type: Improvement
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Fix For: spark-branch

 Attachments: HIVE-10302.2-spark.patch, HIVE-10302.3-spark.patch, 
 HIVE-10302.4-spark.patch, HIVE-10302.spark-1.patch


 Usually there are multiple cores in a Spark executor, so multiple map-join 
 tasks can be running in the same executor (concurrently or sequentially). 
 Currently, each task loads its own copy of the small tables for the map join 
 into memory, which is inefficient. Ideally, we would load the small tables 
 only once and share them among the tasks running in that executor.
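
 A hedged sketch of the sharing idea (names are hypothetical): one JVM-wide 
 cache per executor, so tasks share a single loaded copy of each small table 
 instead of deserializing one copy per task.
 {code}
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;

 public final class SmallTableCacheSketch {
   private static final Map<String, Object> CACHE = new ConcurrentHashMap<>();

   // Load the small-table container for `key` at most once per executor JVM.
   static Object getOrLoad(String key) {
     return CACHE.computeIfAbsent(key, k -> loadSmallTable(k));
   }

   private static Object loadSmallTable(String key) {
     // Stand-in for deserializing the broadcast small table from disk/HDFS.
     return new Object();
   }
 }
 {code}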



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10462) CBO (Calcite Return Path): Exception thrown in conversion to MapJoin

2015-04-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511505#comment-14511505
 ] 

Laljo John Pullokkaran commented on HIVE-10462:
---

+1

 CBO (Calcite Return Path): Exception thrown in conversion to MapJoin
 

 Key: HIVE-10462
 URL: https://issues.apache.org/jira/browse/HIVE-10462
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Jesus Camacho Rodriguez
Assignee: Jesus Camacho Rodriguez
 Fix For: 1.2.0

 Attachments: HIVE-10462.01.patch, HIVE-10462.02.patch, 
 HIVE-10462.patch


 When the return path is on, the mapjoin conversion optimization fails 
 because some data structures in the Join descriptor have not been 
 initialized properly.
 The failure can be reproduced with auto_join4.q. In particular, the following 
 Exception is thrown:
 {noformat}
 org.apache.hadoop.hive.ql.parse.SemanticException: Generate Map Join Task 
 Error: null
 at 
 org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinTaskDispatcher.processCurrentTask(CommonJoinTaskDispatcher.java:516)
 at 
 org.apache.hadoop.hive.ql.optimizer.physical.AbstractJoinTaskDispatcher.dispatch(AbstractJoinTaskDispatcher.java:179)
 at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.dispatch(TaskGraphWalker.java:111)
 at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.walk(TaskGraphWalker.java:180)
 at 
 org.apache.hadoop.hive.ql.lib.TaskGraphWalker.startWalking(TaskGraphWalker.java:125)
 at 
 org.apache.hadoop.hive.ql.optimizer.physical.CommonJoinResolver.resolve(CommonJoinResolver.java:79)
 at 
 org.apache.hadoop.hive.ql.optimizer.physical.PhysicalOptimizer.optimize(PhysicalOptimizer.java:107)
 at 
 org.apache.hadoop.hive.ql.parse.MapReduceCompiler.optimizeTaskPlan(MapReduceCompiler.java:270)
 at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:227)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10084)
 at 
 org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:203)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
 at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:225)
 ...
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10431) HIVE-9555 broke hadoop-1 build

2015-04-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511509#comment-14511509
 ] 

Prasanth Jayachandran commented on HIVE-10431:
--

+1

 HIVE-9555 broke hadoop-1 build
 --

 Key: HIVE-10431
 URL: https://issues.apache.org/jira/browse/HIVE-10431
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Sergey Shelukhin
 Attachments: HIVE-10431.patch


 In HIVE-9555, RecordReaderUtils uses a direct ByteBuffer read from 
 FSDataInputStream, which is not present in hadoop-1. This breaks the 
 hadoop-1 compilation.
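
 A hedged sketch of the hadoop-2-only call involved, with a plain byte[] 
 fallback for illustration (this is not the actual fix, and the snippet only 
 compiles against hadoop-2):
 {code}
 import java.io.IOException;
 import java.nio.ByteBuffer;
 import org.apache.hadoop.fs.FSDataInputStream;

 public final class DirectReadCompat {
   // read(ByteBuffer) comes from ByteBufferReadable, which hadoop-1 lacks.
   static int readCompat(FSDataInputStream in, ByteBuffer buf) throws IOException {
     try {
       return in.read(buf);       // direct ByteBuffer read, hadoop-2 only
     } catch (UnsupportedOperationException e) {
       byte[] tmp = new byte[buf.remaining()];
       int n = in.read(tmp);      // plain InputStream read works everywhere
       if (n > 0) {
         buf.put(tmp, 0, n);
       }
       return n;
     }
   }
 }
 {code}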



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10431) HIVE-9555 broke hadoop-1 build

2015-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511419#comment-14511419
 ] 

Sergey Shelukhin commented on HIVE-10431:
-

ping?

 HIVE-9555 broke hadoop-1 build
 --

 Key: HIVE-10431
 URL: https://issues.apache.org/jira/browse/HIVE-10431
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Sergey Shelukhin
 Attachments: HIVE-10431.patch


 In HIVE-9555, RecordReaderUtils uses a direct ByteBuffer read from 
 FSDataInputStream, which is not present in hadoop-1. This breaks the 
 hadoop-1 compilation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10421) DROP TABLE with qualified table name ignores database name when checking partitions

2015-04-24 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511963#comment-14511963
 ] 

Thejas M Nair commented on HIVE-10421:
--

+1

 DROP TABLE with qualified table name ignores database name when checking 
 partitions
 ---

 Key: HIVE-10421
 URL: https://issues.apache.org/jira/browse/HIVE-10421
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10421.1.patch, HIVE-10421.2.patch


 Hive was only recently changed to allow drop table dbname.tabname. However 
 DDLTask.dropTable() is still using an older version of 
 Hive.getPartitionNames(), which only took in a single string for the table 
 name, rather than the database and table names. As a result Hive is filling 
 in the current database name as the dbname during the listPartitions call to 
 the MetaStore.
 It also appears that on the Hive Metastore side, in the non-auth path there 
 is no validation to check that the dbname.tablename actually exists - this 
 call simply returns an empty list of partitions, which causes the table 
 to be dropped without checking any of the partition information. I will open 
 a separate issue for this one.
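
 A small illustration of the name resolution the fix needs (illustrative 
 code, not the actual DDLTask change):
 {code}
 // Honor an explicit "dbname.tabname" instead of assuming the current database.
 public final class QualifiedName {
   static String[] resolve(String name, String currentDb) {
     String[] parts = name.split("\\.", 2);
     return parts.length == 2
         ? parts                              // explicit db wins
         : new String[] { currentDb, name };  // unqualified: use current db
   }

   public static void main(String[] args) {
     // ["mydb", "mytable"] - the current database must not override "mydb".
     System.out.println(java.util.Arrays.toString(resolve("mydb.mytable", "default")));
     // ["default", "mytable"]
     System.out.println(java.util.Arrays.toString(resolve("mytable", "default")));
   }
 }
 {code}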



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10483) ACID: insert overwrite with self join deadlocks on itself

2015-04-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10483:
--
Description: 
insert overwrite ta partition(part=) select xxx from tb join ta where 
part=

It seems like the Shared lock conflicts with the Exclusive lock for Insert 
Overwrite even though both are part of the same txn.
More precisely, insert overwrite requires an X lock on the partition and the 
read side needs an S lock for the query.
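
A minimal illustration of the compatibility rule at play (not Hive's actual 
lock manager code):
{code}
// Only S+S is compatible, so the read side's S lock queues behind the
// insert-overwrite X lock even when both belong to the same transaction.
public final class LockCompat {
  enum LockType { SHARED, EXCLUSIVE }

  static boolean compatible(LockType held, LockType requested) {
    return held == LockType.SHARED && requested == LockType.SHARED;
  }

  public static void main(String[] args) {
    // X already held on the partition; the same txn now requests S to read it.
    System.out.println(compatible(LockType.EXCLUSIVE, LockType.SHARED)); // false
  }
}
{code}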

  was:
insert overwrite ta partition(part=) select xxx from tb join ta where 
part=

It seems like the Shared lock conflicts with the Exclusive lock for Insert 
Overwrite even though both are part of the same txn.


 ACID: insert overwrite with self join deadlocks on itself
 -

 Key: HIVE-10483
 URL: https://issues.apache.org/jira/browse/HIVE-10483
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 insert overwrite ta partition(part=) select xxx from tb join ta where 
 part=
 It seems like the Shared lock conflicts with the Exclusive lock for Insert 
 Overwrite even though both are part of the same txn.
 More precisely, insert overwrite requires an X lock on the partition and the 
 read side needs an S lock for the query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10483) ACID: insert overwrite with self join deadlocks on itself with DbTxnManager

2015-04-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10483:
--
Description: 
insert overwrite ta partition(part=) select xxx from tb join ta where 
part=

It seems like the Shared lock conflicts with the Exclusive lock for Insert 
Overwrite even though both are part of the same txn.
More precisely, insert overwrite requires an X lock on the partition and the 
read side needs an S lock for the query.

A simpler case is
insert overwrite ta partition(part=) select * from ta

  was:
insert overwrite ta partition(part=) select xxx from tb join ta where 
part=

It seems like the Shared lock conflicts with the Exclusive lock for Insert 
Overwrite even though both are part of the same txn.
More precisely, insert overwrite requires an X lock on the partition and the 
read side needs an S lock for the query.


 ACID: insert overwrite with self join deadlocks on itself with DbTxnManager
 ---

 Key: HIVE-10483
 URL: https://issues.apache.org/jira/browse/HIVE-10483
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 insert overwrite ta partition(part=) select xxx from tb join ta where 
 part=
 It seems like the Shared lock conflicts with the Exclusive lock for Insert 
 Overwrite even though both are part of the same txn.
 More precisely, insert overwrite requires an X lock on the partition and the 
 read side needs an S lock for the query.
 A simpler case is
 insert overwrite ta partition(part=) select * from ta



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column

2015-04-24 Thread Mostafa Mokhtar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mostafa Mokhtar updated HIVE-10484:
---
Summary: Vectorization : RuntimeException Big Table Retained Mapping 
duplicate column  (was: Vectorization : Big Table Retained Mapping duplicate 
column)

 Vectorization : RuntimeException Big Table Retained Mapping duplicate column
 --

 Key: HIVE-10484
 URL: https://issues.apache.org/jira/browse/HIVE-10484
 Project: Hive
  Issue Type: Bug
  Components: Tez, Vectorization
Affects Versions: 1.2.0
Reporter: Mostafa Mokhtar
Assignee: Matt McCline
 Fix For: 1.2.0


 With vectorization and tez enabled TPC-DS Q70 fails with 
 {code}
 Caused by: java.lang.RuntimeException: Big Table Retained Mapping duplicate 
 column 6 in ordered column map {6=(value column: 6, type name: int), 
 21=(value column: 21, type name: float), 22=(value column: 22, type name: 
 int)} when adding value column 6, type int
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOrderedMap.add(VectorColumnOrderedMap.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.vector.VectorColumnOutputMapping.add(VectorColumnOutputMapping.java:40)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.determineCommonInfo(VectorMapJoinCommonOperator.java:320)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.init(VectorMapJoinCommonOperator.java:254)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.init(VectorMapJoinGenerateResultOperator.java:89)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.init(VectorMapJoinInnerGenerateResultOperator.java:97)
   at 
 org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.init(VectorMapJoinInnerLongOperator.java:79)
   ... 49 more
 {code}
 Query 
 {code}
  select s_state
from  (select s_state as s_state, sum(ss_net_profit),
  rank() over ( partition by s_state order by 
 sum(ss_net_profit) desc) as ranking
   from   store_sales, store, date_dim
   where  d_month_seq between 1193 and 1193+11
 and date_dim.d_date_sk = 
 store_sales.ss_sold_date_sk
 and store.s_store_sk  = store_sales.ss_store_sk
   group by s_state
  ) tmp1
where ranking = 5
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10447) Beeline JDBC Driver to support 2 way SSL

2015-04-24 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10447:
-
Attachment: HIVE-10447.2.patch

Addressing [~thejas]'s comments.

Thanks
Hari

 Beeline JDBC Driver to support 2 way SSL
 

 Key: HIVE-10447
 URL: https://issues.apache.org/jira/browse/HIVE-10447
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10447.1.patch, HIVE-10447.2.patch


 This jira should cover 2-way SSL authentication between the JDBC Client and 
 server which requires the driver to support it.
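
 A generic JSSE illustration of what 2-way SSL means on the client side 
 (these are standard JVM properties, not the exact JDBC URL parameters this 
 patch adds): the client supplies its own keystore in addition to a 
 truststore for verifying the server.
 {code}
 public final class TwoWaySslClientSetup {
   public static void main(String[] args) {
     // Client identity presented to the server (the 2-way SSL addition):
     System.setProperty("javax.net.ssl.keyStore", "/path/client-keystore.jks");
     System.setProperty("javax.net.ssl.keyStorePassword", "changeit");
     // Server verification (already needed for 1-way SSL):
     System.setProperty("javax.net.ssl.trustStore", "/path/truststore.jks");
     System.setProperty("javax.net.ssl.trustStorePassword", "changeit");
     // ... open the JDBC connection after these are set ...
   }
 }
 {code}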



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10483) ACID: insert overwrite with self join deadlocks on itself with DbTxnManager

2015-04-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10483:
--
Summary: ACID: insert overwrite with self join deadlocks on itself with 
DbTxnManager  (was: ACID: insert overwrite with self join deadlocks on itself)

 ACID: insert overwrite with self join deadlocks on itself with DbTxnManager
 ---

 Key: HIVE-10483
 URL: https://issues.apache.org/jira/browse/HIVE-10483
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman

 insert overwrite ta partition(part=) select xxx from tb join ta where 
 part=
 It seems like the Shared lock conflicts with the Exclusive lock for Insert 
 Overwrite even though both are part of the same txn.
 More precisely, insert overwrite requires an X lock on the partition and the 
 read side needs an S lock for the query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10312) SASL.QOP in JDBC URL is ignored for Delegation token Authentication

2015-04-24 Thread Mubashir Kazia (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512073#comment-14512073
 ] 

Mubashir Kazia commented on HIVE-10312:
---

Thanks [~xuefuz]. 
[~leftylev] Do we need to clarify the language in the current documentation 
that says QOP is supported only for Kerberos authentication? That is, do we 
consider delegation token authentication on a Kerberos-enabled HS2/HMS to be 
Kerberos authentication, or only pure Kerberos service-ticket-based 
authentication?

 SASL.QOP in JDBC URL is ignored for Delegation token Authentication
 ---

 Key: HIVE-10312
 URL: https://issues.apache.org/jira/browse/HIVE-10312
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 1.2.0
Reporter: Mubashir Kazia
Assignee: Mubashir Kazia
 Fix For: 1.2.0

 Attachments: HIVE-10312.1.patch, HIVE-10312.1.patch


 When HS2 is configured for a QOP other than auth (auth-int or auth-conf), a 
 Kerberos client connection works fine when the JDBC URL specifies the 
 matching QOP. However, when this HS2 is accessed through Oozie (delegation 
 token / DIGEST authentication), the connection fails because the JDBC driver 
 ignores the SASL.QOP parameter in the JDBC URL. The SASL.QOP setting should 
 also be valid for the DIGEST auth mechanism.
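
 For context, a hedged example of the URL setting in question (host and 
 database are placeholders; sasl.qop is the documented Hive JDBC parameter):
 {code}
 public final class QopUrlExample {
   public static void main(String[] args) {
     // The same sasl.qop setting should take effect for DIGEST (delegation
     // token) connections as it does for Kerberos ones.
     String url = "jdbc:hive2://hs2-host:10000/default;sasl.qop=auth-conf";
     System.out.println(url);
   }
 }
 {code}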



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9613) Left join query plan outputs wrong column when using subquery

2015-04-24 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511975#comment-14511975
 ] 

Vikram Dixit K commented on HIVE-9613:
--

+1 for 1.0

 Left join query plan outputs  wrong column when using subquery
 --

 Key: HIVE-9613
 URL: https://issues.apache.org/jira/browse/HIVE-9613
 Project: Hive
  Issue Type: Bug
  Components: Parser, Query Planning
Affects Versions: 0.14.0, 1.0.0
 Environment: apache hadoop 2.5.1 
Reporter: Li Xin
 Fix For: 1.2.0

 Attachments: HIVE-9613.1.patch, test.sql


 I have a query that outputs a column with wrong contents when using a 
 subquery, and the contents of that column are equal to another column's, not 
 its own.
 I have three tables, as follows:
 table 1: _hivetemp.category_city_rank_:
 ||category||city||rank||
 |jinrongfuwu|shanghai|1|
 |ktvjiuba|shanghai|2|
 table 2: _hivetemp.category_match_:
 ||src_category_en||src_category_cn||dst_category_en||dst_category_cn||
 |danbaobaoxiantouzi|投资担保|jinrongfuwu|担保/贷款|
 |zpwentiyingshi|娱乐/休闲|ktvjiuba|KTV/酒吧|
 table 3: _hivetemp.city_match_:
 ||src_city_name_en||dst_city_name_en||city_name_cn||
 |sh|shanghai|上海|
 And the query is:
 {code}
 select
 a.category,
 a.city,
 a.rank,
 b.src_category_en,
 c.src_city_name_en
 from
 hivetemp.category_city_rank a
 left outer join
 (select
 src_category_en,
 dst_category_en
 from
 hivetemp.category_match) b
 on  a.category = b.dst_category_en
 left outer join
 (select
 src_city_name_en,
 dst_city_name_en
 from
 hivetemp.city_match) c
 on  a.city = c.dst_city_name_en
 {code}
 which should output the results as follows, and I tested it in Hive 0.13:
 ||category||city||rank||src_category_en||src_city_name_en||
 |jinrongfuwu|shanghai|1|danbaobaoxiantouzi|sh|
 |ktvjiuba|shanghai|2|zpwentiyingshi|sh|
 but in Hive 0.14, the results in the column *src_category_en* are wrong and 
 are just the *city* contents:
 ||category||city||rank||src_category_en||src_city_name_en||
 |jinrongfuwu|shanghai|1|shanghai|sh|
 |ktvjiuba|shanghai|2|shanghai|sh|
 Using explain to examine the execution plan, I can see the first subquery 
 just outputs one column, *dst_category_en*, and *src_category_en* is missing.
 {quote}
b:category_match
   TableScan
 alias: category_match
 Statistics: Num rows: 131 Data size: 13149 Basic stats: COMPLETE 
 Column stats: NONE
 Select Operator
   expressions: dst_category_en (type: string)
   outputColumnNames: _col1
   Statistics: Num rows: 131 Data size: 13149 Basic stats: 
 COMPLETE Column stats: NONE
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10481) ACID table update finishes but values not really updated if column names are not all lower case

2015-04-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511977#comment-14511977
 ] 

Hive QA commented on HIVE-10481:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728027/HIVE-10481.2.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8815 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3578/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3578/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3578/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728027 - PreCommit-HIVE-TRUNK-Build

 ACID table update finishes but values not really updated if column names are 
 not all lower case
 ---

 Key: HIVE-10481
 URL: https://issues.apache.org/jira/browse/HIVE-10481
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10481.2.patch, HIVE-10481.patch


 When a column in a table is defined with an upper-case or mixed-case name, 
 an update command that uses the verbatim column name does not update the 
 value; an update with the all-lower-case column name works.
 STEPS TO REPRODUCE:
 create table testable (a string, Bb string, c string)
 clustered by (c) into 3 buckets
 stored as orc
 tblproperties ('transactional'='true');
 insert into table testable values ('a1','b1','c1'), ('a2','b2','c2'), 
 ('a3','b3','c3');
 update testable set Bb='bb';
 The job finishes, but the values are not actually updated.
 update testable set bb='bb'; this works.
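 
 The symptom is consistent with a case-sensitive comparison between the SET 
 column and the table's column names, which the metastore stores in lower 
 case. A minimal sketch of the idea, with all names invented for illustration:
 {code}
 import java.util.Arrays;
 import java.util.List;

 public class ColumnMatchSketch {
   // The update path should compare case-insensitively, the way the
   // metastore normalizes column names.
   static boolean matchesColumn(String setColumn, List<String> tableColumns) {
     return tableColumns.contains(setColumn.toLowerCase());
   }

   public static void main(String[] args) {
     List<String> cols = Arrays.asList("a", "bb", "c"); // normalized names
     System.out.println(matchesColumn("Bb", cols)); // true: fixed behavior
     System.out.println(cols.contains("Bb"));       // false: the buggy, case-sensitive check
   }
 }
 {code}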



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10421) DROP TABLE with qualified table name ignores database name when checking partitions

2015-04-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512099#comment-14512099
 ] 

Hive QA commented on HIVE-10421:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728033/HIVE-10421.2.patch

{color:red}ERROR:{color} -1 due to 15 failed/errored test(s), 8815 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23
org.apache.hive.jdbc.TestSSL.testSSLConnectionWithProperty
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3579/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3579/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3579/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 15 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728033 - PreCommit-HIVE-TRUNK-Build

 DROP TABLE with qualified table name ignores database name when checking 
 partitions
 ---

 Key: HIVE-10421
 URL: https://issues.apache.org/jira/browse/HIVE-10421
 Project: Hive
  Issue Type: Bug
Reporter: Jason Dere
Assignee: Jason Dere
 Attachments: HIVE-10421.1.patch, HIVE-10421.2.patch


 Hive was only recently changed to allow drop table dbname.tabname. However 
 DDLTask.dropTable() is still using an older version of 
 Hive.getPartitionNames(), which only took in a single string for the table 
 name, rather than the database and table names. As a result Hive is filling 
 in the current database name as the dbname during the listPartitions call to 
 the MetaStore.
 It also appears that on the Hive Metastore side, in the non-auth path there 
 is no validation to check that the dbname.tablename actually exists - this 
 call simply returns an empty list of partitions, which causes the table to 
 be dropped without checking any of the partition information. I will open a 
 separate issue for this one.
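 
 A hedged sketch of the API difference the description points at; the method 
 names follow the text above, but the exact overloads in 
 org.apache.hadoop.hive.ql.metadata.Hive may differ:
 {code}
 import java.util.List;

 import org.apache.hadoop.hive.ql.metadata.Hive;
 import org.apache.hadoop.hive.ql.metadata.HiveException;

 public class DropTableSketch {
   // Resolve partitions against the qualified database instead of letting
   // the metastore default to the current one.
   static List<String> partitionNames(Hive db, String dbName, String tableName)
       throws HiveException {
     // Old (buggy) form resolved tableName against the current database:
     //   db.getPartitionNames(tableName, (short) -1);
     return db.getPartitionNames(dbName, tableName, (short) -1);
   }
 }
 {code}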



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-3404) Create quarter UDF

2015-04-24 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-3404:
--
Attachment: HIVE-3404.2.patch

patch #2.
extends GenericUDF

 Create quarter UDF
 --

 Key: HIVE-3404
 URL: https://issues.apache.org/jira/browse/HIVE-3404
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Sanam Naz
Assignee: Alexander Pivovarov
 Attachments: HIVE-3404.1.patch.txt, HIVE-3404.2.patch


 The function QUARTER(date) would return the quarter from a string / date / 
 timestamp. This will be useful for different domains like retail, finance, etc.
 MySQL has QUARTER function
 https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_quarter
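 
 The computation itself is small; a self-contained sketch of the arithmetic 
 (purely illustrative, not the attached patch, which extends GenericUDF):
 {code}
 import java.util.Calendar;

 public class QuarterSketch {
   // Quarter of the year in 1..4. Calendar.MONTH is 0-based, so integer
   // division by 3 buckets Jan-Mar to 0, Apr-Jun to 1, and so on.
   static int quarter(Calendar cal) {
     return cal.get(Calendar.MONTH) / 3 + 1;
   }

   public static void main(String[] args) {
     Calendar cal = Calendar.getInstance();
     cal.set(2015, Calendar.APRIL, 24);
     System.out.println(quarter(cal)); // prints 2
   }
 }
 {code}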



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10312) SASL.QOP in JDBC URL is ignored for Delegation token Authentication

2015-04-24 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang resolved HIVE-10312.

Resolution: Fixed

Committed to master. Thanks, Mubashir!

 SASL.QOP in JDBC URL is ignored for Delegation token Authentication
 ---

 Key: HIVE-10312
 URL: https://issues.apache.org/jira/browse/HIVE-10312
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 1.2.0
Reporter: Mubashir Kazia
Assignee: Mubashir Kazia
 Fix For: 1.2.0

 Attachments: HIVE-10312.1.patch, HIVE-10312.1.patch


 When HS2 is configured for a QOP other than auth (auth-int or auth-conf), a 
 Kerberos client connection works fine when the JDBC URL specifies the 
 matching QOP. However, when this HS2 is accessed through Oozie (delegation 
 token / digest authentication), connections fail because the JDBC driver 
 ignores the SASL.QOP parameter in the JDBC URL. The SASL.QOP setting should 
 also be valid for the DIGEST auth mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-3404) Create quarter UDF

2015-04-24 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-3404:
--
Summary: Create quarter UDF  (was: UDF to obtain the quarter of an year if 
a date or timestamp is given .)

 Create quarter UDF
 --

 Key: HIVE-3404
 URL: https://issues.apache.org/jira/browse/HIVE-3404
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Sanam Naz
Assignee: Alexander Pivovarov
 Attachments: HIVE-3404.1.patch.txt


 Current Hive releases lack a function which returns the quarter of a year 
 for a given date or timestamp. The function QUARTER(date) would return the 
 quarter from a date / timestamp and can be used in HiveQL. This will be 
 useful for different domains like retail, finance, etc.
 MySQL has QUARTER function
 https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_quarter



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-3404) Create quarter UDF

2015-04-24 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-3404:
--
Description: 
The function QUARTER(date) would return the quarter from a string / date / 
timestamp. This will be useful for different domains like retail, finance, etc.

MySQL has QUARTER function
https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_quarter

  was:
Hive current releases lacks a function which returns the quarter of an year if 
a date or timestamp is given .The function QUARTER(date) would return the 
quarter  from a date / timestamp .This can be used in HiveQL.This will be 
useful for different domains like retail ,finance etc.

MySQL has QUARTER function
https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_quarter


 Create quarter UDF
 --

 Key: HIVE-3404
 URL: https://issues.apache.org/jira/browse/HIVE-3404
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Sanam Naz
Assignee: Alexander Pivovarov
 Attachments: HIVE-3404.1.patch.txt


 The function QUARTER(date) would return the quarter from a string / date / 
 timestamp. This will be useful for different domains like retail, finance, etc.
 MySQL has QUARTER function
 https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_quarter



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10484) Vectorization : RuntimeException Big Table Retained Mapping duplicate column

2015-04-24 Thread Matt McCline (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512088#comment-14512088
 ] 

Matt McCline commented on HIVE-10484:
-


I was able to vectorize the query. I'm wondering what environment differences 
cause the issue you reported.

{noformat}
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
Tez
  Edges:
Map 1 <- Map 2 (BROADCAST_EDGE)
Map 3 <- Map 1 (BROADCAST_EDGE)
Reducer 4 <- Map 3 (SIMPLE_EDGE)
Reducer 5 <- Reducer 4 (SIMPLE_EDGE)
#### A masked pattern was here ####
  Vertices:
Map 1 
Map Operator Tree:
TableScan
  alias: store_sales
  Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
stats: NONE
  Filter Operator
predicate: (ss_store_sk is not null and ss_sold_date_sk is 
not null) (type: boolean)
Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
Column stats: NONE
Map Join Operator
  condition map:
   Inner Join 0 to 1
  keys:
0 ss_store_sk (type: int)
1 s_store_sk (type: int)
  outputColumnNames: _col0, _col21, _col22, _col26, _col50
  input vertices:
1 Map 2
  Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
Column stats: NONE
  HybridGraceHashJoin: true
  Reduce Output Operator
key expressions: _col0 (type: int)
sort order: +
Map-reduce partition columns: _col0 (type: int)
Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
Column stats: NONE
value expressions: _col21 (type: decimal(7,2)), _col22 
(type: int), _col26 (type: int), _col50 (type: string)
Execution mode: vectorized
Map 2 
Map Operator Tree:
TableScan
  alias: store
  Statistics: Num rows: 0 Data size: 0 Basic stats: NONE Column 
stats: NONE
  Filter Operator
predicate: s_store_sk is not null (type: boolean)
Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
Column stats: NONE
Reduce Output Operator
  key expressions: s_store_sk (type: int)
  sort order: +
  Map-reduce partition columns: s_store_sk (type: int)
  Statistics: Num rows: 0 Data size: 0 Basic stats: NONE 
Column stats: NONE
  value expressions: s_state (type: string)
Execution mode: vectorized
Map 3 
Map Operator Tree:
TableScan
  alias: date_dim
  Statistics: Num rows: 73049 Data size: 81741831 Basic stats: 
COMPLETE Column stats: NONE
  Filter Operator
predicate: (d_date_sk is not null and d_month_seq BETWEEN 
1193 AND 1204) (type: boolean)
Statistics: Num rows: 18262 Data size: 20435178 Basic 
stats: COMPLETE Column stats: NONE
Map Join Operator
  condition map:
   Inner Join 0 to 1
  keys:
0 _col0 (type: int)
1 d_date_sk (type: int)
  outputColumnNames: _col0, _col21, _col22, _col26, _col50, 
_col58, _col61
  input vertices:
0 Map 1
  Statistics: Num rows: 20088 Data size: 22478696 Basic 
stats: COMPLETE Column stats: NONE
  HybridGraceHashJoin: true
  Filter Operator
predicate: ((_col61 BETWEEN 1193 AND 1204 and (_col58 = 
_col0)) and (_col26 = _col22)) (type: boolean)
Statistics: Num rows: 2511 Data size: 2809837 Basic 
stats: COMPLETE Column stats: NONE
Select Operator
  expressions: _col50 (type: string), _col21 (type: 
decimal(7,2))
  outputColumnNames: _col50, _col21
  Statistics: Num rows: 2511 Data size: 2809837 Basic 
stats: COMPLETE Column stats: NONE
  Group By Operator
aggregations: sum(_col21)
keys: _col50 (type: string)
mode: hash
outputColumnNames: _col0, _col1
Statistics: Num rows: 2511 Data size: 2809837 Basic 
stats: COMPLETE Column stats: 

[jira] [Commented] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511281#comment-14511281
 ] 

Sergey Shelukhin commented on HIVE-10474:
-

When I was running, I only saw this pattern with IO not enabled. Note that each 
6-query run is a separate LLAP cluster.

 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
 Attachments: llap-gc-pauses.png


 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, on tez with container reuse (current default LLAP configuration 
 but mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-24 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10474:
---
Attachment: llap-gc-pauses.png

I restored the HADOOP-11772 fix on the cluster and re-ran this.

The GC pressure has gone way up since I tested this last - 20-25 full 
collections every minute.

!llap-gc-pauses.png!

Something's changed recently that made the tenured generation huge - the daemon 
slows down as you keep using it. This looks like a recent performance regression.

 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
 Attachments: llap-gc-pauses.png


 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, on tez with container reuse (current default LLAP configuration 
 but mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511244#comment-14511244
 ] 

Sergey Shelukhin commented on HIVE-10474:
-

IO enabled or not?

 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
 Attachments: llap-gc-pauses.png


 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, on tez with container reuse (current default LLAP configuration 
 but mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511255#comment-14511255
 ] 

Gopal V commented on HIVE-10474:


IO is enabled - this is not a map-join so it is not the hashtable cache that's 
accumulating across queries.

 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
 Attachments: llap-gc-pauses.png


 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, on tez with container reuse (current default LLAP configuration 
 but mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511260#comment-14511260
 ] 

Gopal V commented on HIVE-10474:


I should be able to get a memory dump on the cluster, but my internet 
connection is too slow for me to copy it to my laptop.

Let me see if I can analyze it from the histograms.

 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
 Attachments: llap-gc-pauses.png


 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, on tez with container reuse (current default LLAP configuration 
 but mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-24 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511287#comment-14511287
 ] 

Gopal V commented on HIVE-10474:


Huge numbers of int[] arrays - I was expecting to find a huge number of 
ColumnVectors or something like that.

{code}
 JVM version is 25.25-b02
 Object Histogram:

 num       #instances    #bytes      Class description
 --------------------------------------------------------------------------
 1:        70473         227976176   int[]
 2:        852738        88757176    char[]
 3:        7152          58638832    double[]
 4:        30035         22348928    byte[][]
 5:        686743        21975776    java.util.Hashtable$Entry
 6:        293811        18803904    java.nio.DirectByteBuffer
 7:        762767        18306408    java.lang.String
 8:        219901        14073664    org.apache.hadoop.hive.llap.cache.LlapDataBuffer
 9:        37270         13536408    boolean[]
 10:       357189        11430048    java.util.concurrent.ConcurrentHashMap$Node
 11:       457895        10989480    java.lang.Long
 12:       129347        9312984     org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics
 13:       141337        8986752     java.lang.Object[]
 14:       316768        7641760     java.lang.String[]
 15:       1481          6373792     java.util.Hashtable$Entry[]
 16:       127001        6096048     org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry
 17:       234784        5634816     java.util.concurrent.ConcurrentSkipListMap$Node
 18:       857           4361200     java.util.concurrent.ConcurrentHashMap$Node[]
 19:       64053         3586968     org.apache.hadoop.hive.ql.io.orc.OrcProto$DoubleStatistics
 20:       221391        3542256     java.util.concurrent.atomic.AtomicInteger
 21:       133161        3195864     java.util.ArrayList
 22:       112042        2689008     java.util.Collections$UnmodifiableRandomAccessList
 23:       109038        2616912     java.util.concurrent.ConcurrentSkipListMap$Index
 24:       48730         2339040     org.apache.hadoop.hive.ql.io.orc.OrcProto$StringStatistics
 25:       66161         1587864     com.google.protobuf.LiteralByteString
 26:       32529         1301160     sun.misc.Cleaner
 27:       24668         1184064     org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapper
 28:       22467         1078416     org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$BufferChunk
 29:       14850         1069200     java.lang.reflect.Field
 30:       32529         1040928     java.nio.DirectByteBuffer$Deallocator
 31:       14496         927744      org.apache.hive.com.esotericsoftware.kryo.serializers.UnsafeCacheFields$UnsafeObjectField
 32:       516           881544      long[]
{code}

 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
 Attachments: llap-gc-pauses.png


 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, on tez with container reuse (current default LLAP configuration 
 but mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8342) Potential null dereference in ColumnTruncateMapper#jobClose()

2015-04-24 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HIVE-8342:
-
Description: 
{code}
Utilities.mvFileToFinalPath(outputPath, job, success, LOG, dynPartCtx, null,
  reporter);
{code}

Utilities.mvFileToFinalPath() calls createEmptyBuckets() where conf is 
dereferenced:
{code}
boolean isCompressed = conf.getCompressed();
TableDesc tableInfo = conf.getTableInfo();
{code}

  was:
{code}
Utilities.mvFileToFinalPath(outputPath, job, success, LOG, dynPartCtx, null,
  reporter);
{code}
Utilities.mvFileToFinalPath() calls createEmptyBuckets() where conf is 
dereferenced:
{code}
boolean isCompressed = conf.getCompressed();
TableDesc tableInfo = conf.getTableInfo();
{code}


 Potential null dereference in ColumnTruncateMapper#jobClose()
 -

 Key: HIVE-8342
 URL: https://issues.apache.org/jira/browse/HIVE-8342
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
Assignee: skrho
Priority: Minor
 Attachments: HIVE-8342_001.patch, HIVE-8342_002.patch


 {code}
 Utilities.mvFileToFinalPath(outputPath, job, success, LOG, dynPartCtx, 
 null,
   reporter);
 {code}
 Utilities.mvFileToFinalPath() calls createEmptyBuckets() where conf is 
 dereferenced:
 {code}
 boolean isCompressed = conf.getCompressed();
 TableDesc tableInfo = conf.getTableInfo();
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10473) Spark client is recreated even spark configuration is not changed

2015-04-24 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512263#comment-14512263
 ] 

Szehon Ho commented on HIVE-10473:
--

Looks good, but we should still set it if the new value is null, right?

 Spark client is recreated even spark configuration is not changed
 -

 Key: HIVE-10473
 URL: https://issues.apache.org/jira/browse/HIVE-10473
 Project: Hive
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: HIVE-10473.1-spark.patch, HIVE-10473.1.patch


 Currently, we consider a Spark setting changed whenever the set method is 
 called, even if we set it to the same value as before. We should also check 
 whether the value actually changed, since it takes time to start a new Spark 
 client.
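 
 A minimal sketch of the proposed check, with all names invented for 
 illustration (the real change lives in Hive's Spark session handling):
 {code}
 import java.util.HashMap;
 import java.util.Map;
 import java.util.Objects;

 public class SparkConfSketch {
   private final Map<String, String> conf = new HashMap<>();
   private boolean changed = false; // only this should force a new client

   void set(String key, String value) {
     String old = conf.put(key, value);
     // Mark dirty only when the value actually differs, not merely
     // because set() was called.
     if (!Objects.equals(old, value)) {
       changed = true;
     }
   }

   boolean needsNewClient() {
     return changed;
   }
 }
 {code}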



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9534) incorrect result set for query that projects a windowed aggregate

2015-04-24 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512276#comment-14512276
 ] 

Harsh J commented on HIVE-9534:
---

Applying similar SQL in PostgreSQL or Impala returns an error of the form 
"DISTINCT is not implemented for window functions". Unless Hive did add proper 
support for DISTINCT in such a context, it should likely output the same error 
(if not a bug-fix).

 incorrect result set for query that projects a windowed aggregate
 -

 Key: HIVE-9534
 URL: https://issues.apache.org/jira/browse/HIVE-9534
 Project: Hive
  Issue Type: Bug
  Components: SQL
Reporter: N Campbell

 Result set returned by Hive has one row instead of 5
 {code}
 select avg(distinct tsint.csint) over () from tsint;
 create table if not exists TSINT (RNUM int, CSINT smallint)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' 
  STORED AS TEXTFILE;
 0|\N
 1|-1
 2|0
 3|1
 4|10
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10447) Beeline JDBC Driver to support 2 way SSL

2015-04-24 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512184#comment-14512184
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10447:
--

The test failures look unrelated to the fix made for this jira. I ran TestSSL 
locally and it passes without any issues.

Thanks
Hari

 Beeline JDBC Driver to support 2 way SSL
 

 Key: HIVE-10447
 URL: https://issues.apache.org/jira/browse/HIVE-10447
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10447.1.patch, HIVE-10447.2.patch


 This jira should cover 2-way SSL authentication between the JDBC client and 
 the server, which requires support in the driver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10339) Allow JDBC Driver to pass HTTP header Key/Value pairs

2015-04-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-10339:
--
Labels: TODOC1.2  (was: )

 Allow JDBC Driver to pass HTTP header Key/Value pairs
 -

 Key: HIVE-10339
 URL: https://issues.apache.org/jira/browse/HIVE-10339
 Project: Hive
  Issue Type: Improvement
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
  Labels: TODOC1.2
 Fix For: 1.2.0

 Attachments: HIVE-10339.1.patch, HIVE-10339.2.patch


 Currently the Beeline/ODBC driver does not support carrying user-specified 
 HTTP headers.
 The Beeline JDBC driver's HTTP-mode connection string is of the form 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint.
 When the transport mode is http, the Beeline/ODBC driver should allow the 
 end user to send arbitrary HTTP header name/value pairs.
 All the Beeline driver needs to do is take the user-specified names and 
 values and call the underlying HttpClient API to set the headers.
 E.g., the Beeline connection string could be 
 jdbc:hive2://host:port/db?hive.server2.transport.mode=http;hive.server2.thrift.http.path=http_endpoint,http.header.name1=value1,
 and Beeline will set the HTTP header name1 to value1 on the underlying 
 client.
 This is required for the end user to send identity in an HTTP header down to 
 Knox via Beeline.
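 
 For concreteness, a hedged sketch of a client call under the proposed URL 
 convention (the http.header.name1=value1 form is taken from the proposal 
 above; host, endpoint, and credentials are placeholders, and the driver-side 
 parsing does not exist yet):
 {code}
 import java.sql.Connection;
 import java.sql.DriverManager;

 public class HttpHeaderSketch {
   public static void main(String[] args) throws Exception {
     // The driver would parse the http.header.* pairs and set them on the
     // underlying HTTP client for every request, e.g. for Knox to consume.
     String url = "jdbc:hive2://host:10001/db"
         + "?hive.server2.transport.mode=http"
         + ";hive.server2.thrift.http.path=http_endpoint"
         + ",http.header.name1=value1";
     try (Connection con = DriverManager.getConnection(url, "user", "")) {
       System.out.println("connected: " + con.isValid(5));
     }
   }
 }
 {code}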



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10217) LLAP: Support caching of uncompressed ORC data

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512212#comment-14512212
 ] 

Lefty Leverenz commented on HIVE-10217:
---

Glitch:  The commit message says HIVE-20217 instead of HIVE-10217.

The commit ID is 582f4e1bc39b9605d11f762480b29561a44688ae.

 LLAP: Support caching of uncompressed ORC data
 --

 Key: HIVE-10217
 URL: https://issues.apache.org/jira/browse/HIVE-10217
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Fix For: llap

 Attachments: HIVE-10127.patch


 {code}
 Caused by: java.io.IOException: ORC compression buffer size (0) is smaller 
 than LLAP low-level cache minimum allocation size (131072). Decrease the 
 value for hive.llap.io.cache.orc.alloc.min
 at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:137)
 at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:48)
 at 
 org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
 ... 4 more
 {code}
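 
 The stack trace above corresponds to a simple size guard; a hedged 
 reconstruction of the check, with variable and method names assumed:
 {code}
 import java.io.IOException;

 public class AllocGuardSketch {
   // Uncompressed ORC reports a compression buffer size of 0, which trips
   // the minimum-allocation check in the LLAP low-level cache.
   static void checkAllocation(int orcBufferSize, int minAllocation)
       throws IOException {
     if (orcBufferSize < minAllocation) {
       throw new IOException("ORC compression buffer size (" + orcBufferSize
           + ") is smaller than LLAP low-level cache minimum allocation size ("
           + minAllocation + "). Decrease the value for "
           + "hive.llap.io.cache.orc.alloc.min");
     }
   }
 }
 {code}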



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-3404) Create quarter UDF

2015-04-24 Thread Alexander Pivovarov (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Pivovarov updated HIVE-3404:
--
Attachment: HIVE-3404.2.patch

 Create quarter UDF
 --

 Key: HIVE-3404
 URL: https://issues.apache.org/jira/browse/HIVE-3404
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Sanam Naz
Assignee: Alexander Pivovarov
 Attachments: HIVE-3404.1.patch.txt, HIVE-3404.2.patch, 
 HIVE-3404.2.patch


 The function QUARTER(date) would return the quarter  from a string / date / 
 timestamp. This will be useful for different domains like retail ,finance etc.
 MySQL has QUARTER function
 https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_quarter



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10312) SASL.QOP in JDBC URL is ignored for Delegation token Authentication

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512227#comment-14512227
 ] 

Lefty Leverenz commented on HIVE-10312:
---

bq.  Do we need clarity on the language in the current documentation ...

This question is beyond my technical knowledge.  Can someone else answer?  (See 
the doc links a few comments back.)

 SASL.QOP in JDBC URL is ignored for Delegation token Authentication
 ---

 Key: HIVE-10312
 URL: https://issues.apache.org/jira/browse/HIVE-10312
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 1.2.0
Reporter: Mubashir Kazia
Assignee: Mubashir Kazia
 Fix For: 1.2.0

 Attachments: HIVE-10312.1.patch, HIVE-10312.1.patch


 When HS2 is configured for a QOP other than auth (auth-int or auth-conf), a 
 Kerberos client connection works fine when the JDBC URL specifies the 
 matching QOP. However, when this HS2 is accessed through Oozie (delegation 
 token / digest authentication), connections fail because the JDBC driver 
 ignores the SASL.QOP parameter in the JDBC URL. The SASL.QOP setting should 
 also be valid for the DIGEST auth mechanism.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10483) insert overwrite partition deadlocks on itself with DbTxnManager

2015-04-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10483:
--
Summary: insert overwrite partition deadlocks on itself with DbTxnManager  
(was: ACID: insert overwrite with self join deadlocks on itself with 
DbTxnManager)

 insert overwrite partition deadlocks on itself with DbTxnManager
 

 Key: HIVE-10483
 URL: https://issues.apache.org/jira/browse/HIVE-10483
 Project: Hive
  Issue Type: Bug
  Components: Query Planning, Query Processor, Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10483.patch


 insert overwrite ta partition(part=) select xxx from tb join ta where 
 part=
 It seems like the Shared lock conflicts with the Exclusive lock for the 
 Insert Overwrite even though both are part of the same txn.
 More precisely, the insert overwrite requires an X lock on the partition and 
 the read side needs an S lock for the query.
 A simpler case is
 insert overwrite ta partition(part=) select * from ta
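 
 A hedged sketch of the conflict at issue: an exclusive lock conflicts with 
 everything, so the statement's own shared read blocks it unless requests 
 from the same transaction are exempted (enum and method names are modeled 
 on, but not necessarily identical to, DbTxnManager's):
 {code}
 public class LockConflictSketch {
   enum LockType { SHARED_READ, SHARED_WRITE, EXCLUSIVE }

   // Plain compatibility: EXCLUSIVE conflicts with every other lock.
   static boolean conflicts(LockType held, LockType requested) {
     return held == LockType.EXCLUSIVE || requested == LockType.EXCLUSIVE;
   }

   // The fix direction: lock requests from the same transaction should not
   // deadlock against each other.
   static boolean blocks(long heldTxn, LockType held,
                         long reqTxn, LockType requested) {
     return heldTxn != reqTxn && conflicts(held, requested);
   }
 }
 {code}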



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10447) Beeline JDBC Driver to support 2 way SSL

2015-04-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512173#comment-14512173
 ] 

Hive QA commented on HIVE-10447:




{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728073/HIVE-10447.2.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8814 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3580/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3580/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3580/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728073 - PreCommit-HIVE-TRUNK-Build

 Beeline JDBC Driver to support 2 way SSL
 

 Key: HIVE-10447
 URL: https://issues.apache.org/jira/browse/HIVE-10447
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10447.1.patch, HIVE-10447.2.patch


 This jira should cover 2-way SSL authentication between the JDBC client and 
 the server, which requires support in the driver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-3404) Create quarter UDF

2015-04-24 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14512210#comment-14512210
 ] 

Hive QA commented on HIVE-3404:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12728111/HIVE-3404.2.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 8818 tests 
executed
*Failed tests:*
{noformat}
TestMinimrCliDriver-bucketmapjoin6.q-constprog_partitioner.q-infer_bucket_sort_dyn_part.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-external_table_with_space_in_location_path.q-infer_bucket_sort_merge.q-auto_sortmerge_join_16.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-groupby2.q-import_exported_table.q-bucketizedhiveinputformat.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-index_bitmap3.q-stats_counter_partitioned.q-temp_table_external.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_map_operators.q-join1.q-bucketmapjoin7.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_num_buckets.q-disable_merge_for_bucketing.q-uber_reduce.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-scriptfile1.q-scriptfile1_win.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-leftsemijoin_mr.q-load_hdfs_file_with_space_in_the_name.q-root_dir_external_table.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-list_bucket_dml_10.q-bucket_num_reducers.q-bucket6.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-load_fs2.q-file_with_header_footer.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-parallel_orderby.q-reduce_deduplicate.q-ql_rewrite_gbtoidx_cbo_2.q-and-1-more
 - did not produce a TEST-*.xml file
TestMinimrCliDriver-ql_rewrite_gbtoidx.q-smb_mapjoin_8.q - did not produce a 
TEST-*.xml file
TestMinimrCliDriver-schemeAuthority2.q-bucket4.q-input16_cc.q-and-1-more - did 
not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3581/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3581/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3581/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12728111 - PreCommit-HIVE-TRUNK-Build

 Create quarter UDF
 --

 Key: HIVE-3404
 URL: https://issues.apache.org/jira/browse/HIVE-3404
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Sanam Naz
Assignee: Alexander Pivovarov
 Attachments: HIVE-3404.1.patch.txt, HIVE-3404.2.patch


 The function QUARTER(date) would return the quarter  from a string / date / 
 timestamp. This will be useful for different domains like retail ,finance etc.
 MySQL has QUARTER function
 https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_quarter



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10474) LLAP: investigate why TPCH Q1 1k is slow

2015-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511298#comment-14511298
 ] 

Sergey Shelukhin commented on HIVE-10474:
-

Is it possible to see this w/o IO enabled on a fresh daemon?
LlapDataBuffer can be reused (it's passed between threads), DirectByteBuffer-s 
may also be reusable to an extent (or we can get read buffers from Allocator).
Gopal suggested that reuse of objects that will be collected efficiently (are 
not passed between threads) may actually hurt us. 
There's way too much protobuf stuff. I wonder if statistics are read and 
discarded somewhere where they could be cached/reused?

 LLAP: investigate why TPCH Q1 1k is slow
 

 Key: HIVE-10474
 URL: https://issues.apache.org/jira/browse/HIVE-10474
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin
 Attachments: llap-gc-pauses.png


 While most queries run faster in LLAP than just Tez with container reuse, 
 TPCH Q1 is much slower.
 On my run, on tez with container reuse (current default LLAP configuration 
 but mode == container and no daemons running)  runs 2-6 (out of 6 consecutive 
 runs in the same session) finished in 25.5sec average; with 16 LLAP daemons 
 in default config the average was 35.5sec; same w/o IO elevator (to rule out 
 its impact) it took 59.7sec w/strange distribution (later runs were slower 
 than earlier runs, still, fastest run was 49.5sec).
 So excluding IO elevator it's more than 2x degradation.
 We need to figure out why this is happening. Is it just slot discrepancy? 
 Regardless, this needs to be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-10217) LLAP: Support caching of uncompressed ORC data

2015-04-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10217.
-
Resolution: Fixed

committed to branch

 LLAP: Support caching of uncompressed ORC data
 --

 Key: HIVE-10217
 URL: https://issues.apache.org/jira/browse/HIVE-10217
 Project: Hive
  Issue Type: Sub-task
Affects Versions: llap
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Fix For: llap

 Attachments: HIVE-10127.patch


 {code}
 Caused by: java.io.IOException: ORC compression buffer size (0) is smaller 
 than LLAP low-level cache minimum allocation size (131072). Decrease the 
 value for hive.llap.io.cache.orc.alloc.min
 at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:137)
 at 
 org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:48)
 at 
 org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
 ... 4 more
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9365) The Metastore should take port configuration from hive-site.xml

2015-04-24 Thread Reuben Kuhnert (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511431#comment-14511431
 ] 

Reuben Kuhnert commented on HIVE-9365:
--

I've added a configuration property {{hive.metastore.port}}:

{code}
METASTORE_SERVER_PORT("hive.metastore.port", 9083, "Hive Metastore listener 
port"),
{code}

but that leaves the old CLI configuration:

{code}
  static public class HiveMetastoreCli extends CommonCliOptions {
    int port = DEFAULT_HIVE_METASTORE_PORT; // defaults to 9083

    @SuppressWarnings("static-access")
    public HiveMetastoreCli() {
      super("hivemetastore", true);

      // -p port
      OPTIONS.addOption(OptionBuilder
          .hasArg()
          .withArgName("port")
          .withDescription("Hive Metastore port number, default:"
              + DEFAULT_HIVE_METASTORE_PORT)
          .create('p'));

    }

    @Override
    public void parse(String[] args) {
      super.parse(args);

      // support the old syntax "hivemetastore [port]" but complain
      args = commandLine.getArgs();
      if (args.length > 0) {
        // complain about the deprecated syntax -- but still run
        System.err.println(
            "This usage has been deprecated, consider using the new command "
                + "line syntax (run with -h to see usage information)");

        port = new Integer(args[0]);
      }

      // notice that command line options take precedence over the
      // deprecated (old style) naked args...
      if (commandLine.hasOption('p')) {
        port = Integer.parseInt(commandLine.getOptionValue('p'));
      } else {
        // legacy handling
        String metastorePort = System.getenv("METASTORE_PORT");
        if (metastorePort != null) {
          port = Integer.parseInt(metastorePort);
        }
      }
    }
  }
{code}

Should we continue to allow the user to configure the port through the CLI 
(but have it override the {{hive.metastore.port}} configuration?), or should 
we complain, or disallow it completely? Let me know what the correct solution 
is and I'll go ahead and get a patch ready.
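
To make the options concrete, here is a sketch of one possible precedence 
order (CLI flag over environment variable over hive-site.xml over the 
built-in default); this is illustrative only, not committed behavior:
{code}
public class PortPrecedenceSketch {
  static final int DEFAULT_HIVE_METASTORE_PORT = 9083;

  static int resolvePort(String cliOption, String envValue, Integer hiveSitePort) {
    if (cliOption != null) {
      return Integer.parseInt(cliOption);   // explicit -p wins
    }
    if (envValue != null) {
      return Integer.parseInt(envValue);    // legacy METASTORE_PORT env var
    }
    if (hiveSitePort != null) {
      return hiveSitePort;                  // hive.metastore.port from hive-site.xml
    }
    return DEFAULT_HIVE_METASTORE_PORT;
  }
}
{code}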

Thanks

 The Metastore should take port configuration from hive-site.xml
 ---

 Key: HIVE-9365
 URL: https://issues.apache.org/jira/browse/HIVE-9365
 Project: Hive
  Issue Type: Improvement
Reporter: Nicolas Thiébaud
Assignee: Reuben Kuhnert
Priority: Minor
  Labels: metastore
   Original Estimate: 3h
  Remaining Estimate: 3h

 As opposed to the CLI. Having this configuration in the launcher script 
 creates fragmentation and is not consistent with the way the Hive stack is 
 configured.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-24 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511546#comment-14511546
 ] 

Xuefu Zhang commented on HIVE-10434:


+1

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch, 
 HIVE-10434.4-spark.patch


 Currently in HoS, SparkClientImpl first launches a remote Driver process, 
 and then waits for it to connect back to HS2. However, in certain situations 
 (for instance, a permission issue), the remote process may fail and exit 
 with an error code. In this situation, the HS2 process will still wait for 
 the process to connect, for a full timeout period, before it throws the 
 exception.
 What makes it worse, the user may need to wait for two timeout periods: one 
 for SetSparkReducerParallelism, and another for the actual Spark job. This 
 could be very annoying.
 We should cancel the timeout task once we find out that the process has 
 failed, and set the promise as failed. 
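 
 A simplified sketch of the idea, using plain java.util.concurrent types 
 rather than the actual RPC classes (the monitor thread, names, and message 
 are assumptions):
 {code}
 import java.io.IOException;
 import java.util.concurrent.CompletableFuture;
 import java.util.concurrent.Future;

 public class DriverWatchSketch {
   // Watch the child process; if it dies before connecting back, fail the
   // pending connection immediately instead of waiting out the timeout.
   static Thread watch(Process driver, Future<?> timeoutTask,
                       CompletableFuture<Void> connectPromise) {
     Thread t = new Thread(() -> {
       try {
         int code = driver.waitFor();
         if (code != 0 && !connectPromise.isDone()) {
           timeoutTask.cancel(true);
           connectPromise.completeExceptionally(
               new IOException("child process exited with code " + code));
         }
       } catch (InterruptedException ignored) {
         Thread.currentThread().interrupt();
       }
     });
     t.setDaemon(true);
     t.start();
     return t;
   }
 }
 {code}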



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10477) Provide option to disable Spark tests

2015-04-24 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10477:
-
Summary: Provide option to disable Spark tests   (was: Provide option to 
disable Spark tests in Windows OS)

 Provide option to disable Spark tests 
 --

 Key: HIVE-10477
 URL: https://issues.apache.org/jira/browse/HIVE-10477
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10477.1.patch


 In the current master branch, unit tests fail on Windows because of the 
 dependency on the bash executable in itests/hive-unit/pom.xml, around these 
 lines:
 {code}
 <target>
   <exec executable="bash" dir="${basedir}" failonerror="true">
     <arg line="../target/download.sh"/>
   </exec>
 </target>
 {code}
 We should provide an option to disable Spark tests on OSes like Windows 
 where bash might be absent.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9365) The Metastore should take port configuration from hive-site.xml

2015-04-24 Thread Reuben Kuhnert (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511523#comment-14511523
 ] 

Reuben Kuhnert commented on HIVE-9365:
--

Can do, thanks.

 The Metastore should take port configuration from hive-site.xml
 ---

 Key: HIVE-9365
 URL: https://issues.apache.org/jira/browse/HIVE-9365
 Project: Hive
  Issue Type: Improvement
Reporter: Nicolas Thiébaud
Assignee: Reuben Kuhnert
Priority: Minor
  Labels: metastore
   Original Estimate: 3h
  Remaining Estimate: 3h

 As opposed to the CLI. Having this configuration in the launcher script 
 creates fragmentation and is not consistent with the way the Hive stack 
 is configured.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10120) Disallow create table with dot/colon in column name

2015-04-24 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511525#comment-14511525
 ] 

Laljo John Pullokkaran commented on HIVE-10120:
---

+1

 Disallow create table with dot/colon in column name
 ---

 Key: HIVE-10120
 URL: https://issues.apache.org/jira/browse/HIVE-10120
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Assignee: Pengcheng Xiong
 Attachments: HIVE-10120.01.patch, HIVE-10120.02.patch


 Since we don't allow users to query column names with a dot in the middle, such 
 as emp.no, we should not allow users to create tables with columns that cannot 
 be queried. Fix the documentation to reflect this change.
 Here is an example. Consider this table:
 {code}
 CREATE TABLE a (`emp.no` string);
 select `emp.no` from a; fails with this message:
 FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp 
 from [0:emp.no]
 {code}
 The Hive documentation needs to be fixed:
 {quote}
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL) seems 
 to indicate that any Unicode character can go between the backticks in the 
 select statement, but it doesn’t like the dot/colon, or even select *, when 
 there is a column that has a dot/colon.
 {quote}
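 For illustration, a minimal sketch of the kind of check the CREATE TABLE path 
 could apply (the ColumnNameValidator class and its regex are hypothetical, not 
 Hive's actual analyzer code):
 {code}
 import java.util.regex.Pattern;
 
 public final class ColumnNameValidator {
   // Reject '.' and ':' anywhere in a column name; everything else is left to
   // the existing identifier rules.
   private static final Pattern DISALLOWED = Pattern.compile("[.:]");
 
   static void checkColumnName(String name) {
     if (DISALLOWED.matcher(name).find()) {
       throw new IllegalArgumentException(
           "Invalid column name '" + name + "': '.' and ':' are not allowed");
     }
   }
 
   public static void main(String[] args) {
     checkColumnName("empno");  // passes
     checkColumnName("emp.no"); // throws, since it could never be queried
   }
 }
 {code}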



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-24 Thread Chao Sun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chao Sun updated HIVE-10434:

Attachment: (was: HIVE-10434.4-spark.patch)

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch, 
 HIVE-10434.4-spark.patch, HIVE-10434.4-spark.patch


 Currently in HoS, SparkClientImpl first launches a remote Driver process, 
 and then waits for it to connect back to HS2. However, in certain 
 situations (for instance, a permission issue), the remote process may fail and 
 exit with an error code. In this situation, the HS2 process will still wait for 
 the process to connect, and wait for a full timeout period before it throws 
 the exception.
 What makes it worse, the user may need to wait for two timeout periods: one for 
 SparkSetReducerParallelism, and another for the actual Spark job. This 
 could be very annoying.
 We should cancel the timeout task as soon as we find out that the process has 
 failed, and set the promise as failed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9094) TimeoutException when trying to get executor count from RSC [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511539#comment-14511539
 ] 

Lefty Leverenz commented on HIVE-9094:
--

Changed TODOC1.1 to TODOC15 to match all the others.  (Might change them all to 
1.1 later.)

 TimeoutException when trying to get executor count from RSC [Spark Branch]
 ---

 Key: HIVE-9094
 URL: https://issues.apache.org/jira/browse/HIVE-9094
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chengxiang Li
  Labels: TODOC-SPARK, TODOC15
 Fix For: spark-branch

 Attachments: HIVE-9094.1-spark.patch, HIVE-9094.2-spark.patch


 In 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/532/testReport,
  join25.q failed because:
 {code}
 2014-12-12 19:14:50,084 ERROR [main]: ql.Driver 
 (SessionState.java:printError(838)) - FAILED: SemanticException Failed to get 
 spark memory/core info: java.util.concurrent.TimeoutException
 org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get spark 
 memory/core info: java.util.concurrent.TimeoutException
 at 
 org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:120)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at 
 org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:134)
 at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:199)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:362)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:297)
 at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
 at 
 org.apache.hadoop.hive.cli.TestSparkCliDriver.runTest(TestSparkCliDriver.java:234)
 at 
 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25(TestSparkCliDriver.java:162)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at junit.framework.TestCase.runTest(TestCase.java:176)
 at junit.framework.TestCase.runBare(TestCase.java:141)
 at junit.framework.TestResult$1.protect(TestResult.java:122)
 at junit.framework.TestResult.runProtected(TestResult.java:142)
 at junit.framework.TestResult.run(TestResult.java:125)
 at junit.framework.TestCase.run(TestCase.java:129)
 at junit.framework.TestSuite.runTest(TestSuite.java:255)
 at junit.framework.TestSuite.run(TestSuite.java:250)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
 at 
 

[jira] [Commented] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-24 Thread Chao Sun (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511543#comment-14511543
 ] 

Chao Sun commented on HIVE-10434:
-

[~vanzin] or [~xuefuz], can you take a look at the latest patch? The failures 
are not related.

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch, 
 HIVE-10434.4-spark.patch


 Currently in HoS, in SparkClientImpl it first launch a remote Driver process, 
 and then wait for it to connect back to the HS2. However, in certain 
 situations (for instance, permission issue), the remote process may fail and 
 exit with error code. In this situation, the HS2 process will still wait for 
 the process to connect, and wait for a full timeout period before it throws 
 the exception.
 What makes it worth, user may need to wait for two timeout periods: one for 
 the SparkSetReducerParallelism, and another for the actual Spark job. This 
 could be very annoying.
 We should cancel the timeout task once we found out that the process has 
 failed, and set the promise as failed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10434) Cancel connection when remote Spark driver process has failed [Spark Branch]

2015-04-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10434:
---
Attachment: (was: HIVE-10434.4-spark.patch)

 Cancel connection when remote Spark driver process has failed [Spark Branch] 
 -

 Key: HIVE-10434
 URL: https://issues.apache.org/jira/browse/HIVE-10434
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Affects Versions: 1.2.0
Reporter: Chao Sun
Assignee: Chao Sun
 Attachments: HIVE-10434.1-spark.patch, HIVE-10434.3-spark.patch, 
 HIVE-10434.4-spark.patch


 Currently in HoS, in SparkClientImpl it first launch a remote Driver process, 
 and then wait for it to connect back to the HS2. However, in certain 
 situations (for instance, permission issue), the remote process may fail and 
 exit with error code. In this situation, the HS2 process will still wait for 
 the process to connect, and wait for a full timeout period before it throws 
 the exception.
 What makes it worth, user may need to wait for two timeout periods: one for 
 the SparkSetReducerParallelism, and another for the actual Spark job. This 
 could be very annoying.
 We should cancel the timeout task once we found out that the process has 
 failed, and set the promise as failed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9395) Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor level. [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511554#comment-14511554
 ] 

Lefty Leverenz commented on HIVE-9395:
--

Changed TODOC1.1 to TODOC15 to match the others. (Might change them all to 1.1 
later.)

 Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor 
 level. [Spark Branch]
 --

 Key: HIVE-9395
 URL: https://issues.apache.org/jira/browse/HIVE-9395
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5, TODOC-SPARK, TODOC15
 Fix For: spark-branch

 Attachments: HIVE-9395.1-spark.patch, HIVE-9395.2-spark.patch


 SparkJobMonitor may hang if the job state returns null all the time; we should 
 move the timeout check into SparkJobMonitor to avoid it.
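 A rough sketch of the shape of that check (JobMonitorSketch and 
 waitSubmissionTimeoutMs are hypothetical names, not the actual SparkJobMonitor 
 code):
 {code}
 import java.util.concurrent.TimeUnit;
 
 abstract class JobMonitorSketch {
   // Stand-in for the configurable WAIT_SUBMISSION_TIMEOUT value.
   private final long waitSubmissionTimeoutMs = TimeUnit.SECONDS.toMillis(30);
 
   abstract String fetchState(); // may keep returning null before submission
 
   int startMonitor() throws InterruptedException {
     long start = System.currentTimeMillis();
     while (true) {
       String state = fetchState();
       if (state == null) {
         // Without this elapsed-time check, a permanently-null state loops forever.
         if (System.currentTimeMillis() - start > waitSubmissionTimeoutMs) {
           System.err.println("Job was not submitted within the timeout");
           return 1; // fail the query instead of hanging
         }
         Thread.sleep(1000);
         continue;
       }
       // ... handle RUNNING/SUCCEEDED/FAILED states as before ...
       return 0;
     }
   }
 }
 {code}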



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9395) Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor level. [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-9395:
-
Labels: Spark-M5 TODOC-SPARK TODOC15  (was: Spark-M5 TODOC-SPARK TODOC1.1)

 Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor 
 level. [Spark Branch]
 --

 Key: HIVE-9395
 URL: https://issues.apache.org/jira/browse/HIVE-9395
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5, TODOC-SPARK, TODOC15
 Fix For: spark-branch

 Attachments: HIVE-9395.1-spark.patch, HIVE-9395.2-spark.patch


 SparkJobMonitor may hang if the job state returns null all the time; we should 
 move the timeout check into SparkJobMonitor to avoid it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10472) Jenkins HMS upgrade test is not publishing results because JIRAService class is not found.

2015-04-24 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HIVE-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergio Peña updated HIVE-10472:
---
Summary: Jenkins HMS upgrade test is not publishing results because 
JIRAService class is not found.  (was: Jenkins HMS upgrade test is not 
publishing results due to GIT change)

 Jenkins HMS upgrade test is not publishing results because JIRAService class 
 is not found.
 --

 Key: HIVE-10472
 URL: https://issues.apache.org/jira/browse/HIVE-10472
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10472.1.patch


 This error is happening on Jenkins when running the HMS upgrade tests. 
 The class used to publish the results is not found in any directory.
 + cd /var/lib/jenkins/jobs/PreCommit-HIVE-METASTORE-Test/workspace
 + set +x
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/hive/ptest/execution/JIRAService
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hive.ptest.execution.JIRAService
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hive.ptest.execution.JIRAService.  
 Program will exit.
 + ret=0
 The problem is that jenkins-execute-hms-test.sh downloads the 
 code to another directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL

2015-04-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511568#comment-14511568
 ] 

Sergio Peña commented on HIVE-10239:


Patch looks good [~ngangam]
+1

 Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and 
 PostgreSQL
 

 Key: HIVE-10239
 URL: https://issues.apache.org/jira/browse/HIVE-10239
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, 
 HIVE-10239.0.patch, HIVE-10239.00.patch, HIVE-10239.01.patch, 
 HIVE-10239.02.patch, HIVE-10239.03.patch, HIVE-10239.03.patch, 
 HIVE-10239.1.patch, HIVE-10239.patch


 Need to create DB-implementation-specific scripts that use the framework 
 introduced in HIVE-9800, so that any metastore schema changes are tested across 
 all supported databases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9395) Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor level. [Spark Branch]

2015-04-24 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511563#comment-14511563
 ] 

Szehon Ho commented on HIVE-9395:
-

Ideally we could change all the fix versions, though I am not sure there is 
any way to do it besides manually.

 Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor 
 level. [Spark Branch]
 --

 Key: HIVE-9395
 URL: https://issues.apache.org/jira/browse/HIVE-9395
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5, TODOC-SPARK, TODOC15
 Fix For: spark-branch

 Attachments: HIVE-9395.1-spark.patch, HIVE-9395.2-spark.patch


 SparkJobMonitor may hang if the job state returns null all the time; we should 
 move the timeout check into SparkJobMonitor to avoid it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-04-24 Thread Gabor Liptak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Liptak updated HIVE-7150:
---
Attachment: (was: HIVE-7150.patch)

 FileInputStream is not closed in HiveConnection#getHttpClient()
 ---

 Key: HIVE-7150
 URL: https://issues.apache.org/jira/browse/HIVE-7150
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
  Labels: jdbc
 Fix For: 1.2.0

 Attachments: HIVE-7150.1.patch


 Here is related code:
 {code}
 sslTrustStore.load(new FileInputStream(sslTrustStorePath),
 sslTrustStorePassword.toCharArray());
 {code}
 The FileInputStream is not closed upon returning from the method.
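 A minimal sketch of one way to fix it with try-with-resources (the wrapper 
 class and method here are hypothetical; the real code is inline in 
 HiveConnection#getHttpClient()):
 {code}
 import java.io.FileInputStream;
 import java.security.KeyStore;
 
 final class TrustStoreLoader {
   static KeyStore load(String sslTrustStorePath, String sslTrustStorePassword)
       throws Exception {
     KeyStore sslTrustStore = KeyStore.getInstance(KeyStore.getDefaultType());
     // try-with-resources closes the stream on every exit path, including when
     // load() throws, which is what the current code fails to guarantee.
     try (FileInputStream in = new FileInputStream(sslTrustStorePath)) {
       sslTrustStore.load(in, sslTrustStorePassword.toCharArray());
     }
     return sslTrustStore;
   }
 }
 {code}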



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9365) The Metastore should take port configuration from hive-site.xml

2015-04-24 Thread Reuben Kuhnert (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuben Kuhnert updated HIVE-9365:
-
Attachment: HIVE-9365.01.patch

 The Metastore should take port configuration from hive-site.xml
 ---

 Key: HIVE-9365
 URL: https://issues.apache.org/jira/browse/HIVE-9365
 Project: Hive
  Issue Type: Improvement
Reporter: Nicolas Thiébaud
Assignee: Reuben Kuhnert
Priority: Minor
  Labels: metastore
 Attachments: HIVE-9365.01.patch

   Original Estimate: 3h
  Remaining Estimate: 3h

 As opposed to the CLI. Having this configuration in the launcher script 
 creates fragmentation and is not consistent with the way the Hive stack 
 is configured.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9365) The Metastore should take port configuration from hive-site.xml

2015-04-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511520#comment-14511520
 ] 

Sergio Peña commented on HIVE-9365:
---

We should continue to use the port number from the CLI. There might be users that 
still use this option to pass the port.
Also, overriding {{hive.metastore.port}} when the CLI option is present is 
desirable.
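For illustration, a sketch of that precedence (the class, method, and Properties 
type are illustrative, not the actual Metastore startup code):
{code}
import java.util.Properties;

final class MetastorePortResolver {
  // An explicit CLI port overrides hive.metastore.port, which in turn
  // overrides the built-in default.
  static int resolvePort(Integer cliPort, Properties hiveSite) {
    final int defaultPort = 9083; // the usual Metastore default
    if (cliPort != null) {
      return cliPort; // keep honoring users who still pass the port on the CLI
    }
    String configured = hiveSite.getProperty("hive.metastore.port");
    return configured != null ? Integer.parseInt(configured) : defaultPort;
  }
}
{code}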

 The Metastore should take port configuration from hive-site.xml
 ---

 Key: HIVE-9365
 URL: https://issues.apache.org/jira/browse/HIVE-9365
 Project: Hive
  Issue Type: Improvement
Reporter: Nicolas Thiébaud
Assignee: Reuben Kuhnert
Priority: Minor
  Labels: metastore
   Original Estimate: 3h
  Remaining Estimate: 3h

 As opposed to the CLI. Having this configuration in the launcher script 
 creates fragmentation and is not consistent with the way the Hive stack 
 is configured.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10472) Jenkins HMS upgrade test is not publishing results due to GIT change

2015-04-24 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511519#comment-14511519
 ] 

Szehon Ho commented on HIVE-10472:
--

+1, though isn't it a general issue and not only Git?

 Jenkins HMS upgrade test is not publishing results due to GIT change
 

 Key: HIVE-10472
 URL: https://issues.apache.org/jira/browse/HIVE-10472
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10472.1.patch


 This error is happening on Jenkins when running the HMS upgrade tests. 
 The class used to publish the results is not found in any directory.
 + cd /var/lib/jenkins/jobs/PreCommit-HIVE-METASTORE-Test/workspace
 + set +x
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/hive/ptest/execution/JIRAService
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hive.ptest.execution.JIRAService
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hive.ptest.execution.JIRAService.  
 Program will exit.
 + ret=0
 The problem is that jenkins-execute-hms-test.sh downloads the 
 code to another directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9094) TimeoutException when trying to get executor count from RSC [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-9094:
-
Labels: TODOC-SPARK TODOC15  (was: TODOC-SPARK TODOC1.1)

 TimeoutException when trying to get executor count from RSC [Spark Branch]
 ---

 Key: HIVE-9094
 URL: https://issues.apache.org/jira/browse/HIVE-9094
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chengxiang Li
  Labels: TODOC-SPARK, TODOC15
 Fix For: spark-branch

 Attachments: HIVE-9094.1-spark.patch, HIVE-9094.2-spark.patch


 In 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/532/testReport,
  join25.q failed because:
 {code}
 2014-12-12 19:14:50,084 ERROR [main]: ql.Driver 
 (SessionState.java:printError(838)) - FAILED: SemanticException Failed to get 
 spark memory/core info: java.util.concurrent.TimeoutException
 org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get spark 
 memory/core info: java.util.concurrent.TimeoutException
 at 
 org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:120)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at 
 org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:134)
 at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:199)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:362)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:297)
 at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
 at 
 org.apache.hadoop.hive.cli.TestSparkCliDriver.runTest(TestSparkCliDriver.java:234)
 at 
 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25(TestSparkCliDriver.java:162)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at junit.framework.TestCase.runTest(TestCase.java:176)
 at junit.framework.TestCase.runBare(TestCase.java:141)
 at junit.framework.TestResult$1.protect(TestResult.java:122)
 at junit.framework.TestResult.runProtected(TestResult.java:142)
 at junit.framework.TestResult.run(TestResult.java:125)
 at junit.framework.TestCase.run(TestCase.java:129)
 at junit.framework.TestSuite.runTest(TestSuite.java:255)
 at junit.framework.TestSuite.run(TestSuite.java:250)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
 at 
 org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
 at 
 

[jira] [Commented] (HIVE-10472) Jenkins HMS upgrade test is not publishing results due to GIT change

2015-04-24 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-10472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511551#comment-14511551
 ] 

Sergio Peña commented on HIVE-10472:


Thanks [~szehon].
It's true. It is not related to Git, but it was a bug I introduced when switching 
to Git.
I will change the subject.

 Jenkins HMS upgrade test is not publishing results due to GIT change
 

 Key: HIVE-10472
 URL: https://issues.apache.org/jira/browse/HIVE-10472
 Project: Hive
  Issue Type: Bug
Reporter: Sergio Peña
Assignee: Sergio Peña
 Attachments: HIVE-10472.1.patch


 This error is happening on Jenkins when running the HMS upgrade tests. 
 The class used to publish the results is not found in any directory.
 + cd /var/lib/jenkins/jobs/PreCommit-HIVE-METASTORE-Test/workspace
 + set +x
 Exception in thread "main" java.lang.NoClassDefFoundError: 
 org/apache/hive/ptest/execution/JIRAService
 Caused by: java.lang.ClassNotFoundException: 
 org.apache.hive.ptest.execution.JIRAService
   at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
 Could not find the main class: org.apache.hive.ptest.execution.JIRAService.  
 Program will exit.
 + ret=0
 The problem is that jenkins-execute-hms-test.sh downloads the 
 code to another directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7150) FileInputStream is not closed in HiveConnection#getHttpClient()

2015-04-24 Thread Gabor Liptak (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Liptak updated HIVE-7150:
---
Attachment: HIVE-7150.1.patch

 FileInputStream is not closed in HiveConnection#getHttpClient()
 ---

 Key: HIVE-7150
 URL: https://issues.apache.org/jira/browse/HIVE-7150
 Project: Hive
  Issue Type: Bug
Reporter: Ted Yu
  Labels: jdbc
 Fix For: 1.2.0

 Attachments: HIVE-7150.1.patch, HIVE-7150.patch


 Here is related code:
 {code}
 sslTrustStore.load(new FileInputStream(sslTrustStorePath),
 sslTrustStorePassword.toCharArray());
 {code}
 The FileInputStream is not closed upon returning from the method.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10480) LLAP: Tez task is interrupted for unknown reason after an IPC exception and then fails to report completion

2015-04-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10480:

Summary: LLAP: Tez task is interrupted for unknown reason after an IPC 
exception and then fails to report completion  (was: LLAP: Tez task is 
interrupted for no reason and then fails to report completion)

 LLAP: Tez task is interrupted for unknown reason after an IPC exception and 
 then fails to report completion
 ---

 Key: HIVE-10480
 URL: https://issues.apache.org/jira/browse/HIVE-10480
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 No idea if this is an LLAP bug, a Tez bug, a Hadoop IPC bug (due to a patch on 
 the cluster), or all three.
 So for now I will just dump all I have here.
 TPCH Q1 started taking a long time for me on a large number of runs today 
 (it didn't happen yesterday). It would always be one Map task timing out.
 Example attempt (logs from the AM):
 {noformat}
 2015-04-24 11:11:01,073 INFO [TaskCommunicator # 0] 
 tezplugins.LlapTaskCommunicator: Successfully launched task: 
 attempt_1429683757595_0321_9_00_000928_0
 2015-04-24 11:16:25,498 INFO [Dispatcher thread: Central] 
 history.HistoryEventHandler: 
 [HISTORY][DAG:dag_1429683757595_0321_9][Event:TASK_ATTEMPT_FINISHED]: 
 vertexName=Map 1, taskAttemptId=attempt_1429683757595_0321_9_00_000928_0, 
 startTime=1429899061071, finishTime=1429899385498, timeTaken=324427, 
 status=FAILED, errorEnum=TASK_HEARTBEAT_ERROR, 
 diagnostics=AttemptID:attempt_1429683757595_0321_9_00_000928_0 Timed out 
 after 300 secs, counters=Counters: 1, 
 org.apache.tez.common.counters.DAGCounter, RACK_LOCAL_TASKS=1
 {noformat}
 No other lines for this attempt in between.
 However there's this:
 {noformat}
 2015-04-24 11:11:01,074 WARN [Socket Reader #1 for port 59446] ipc.Server: 
 Unable to read call parameters for client 172.19.128.56 on connection protocol 
 org.apache.hadoop.hive.llap.protocol.LlapTaskUmbilicalProtocol for rpcKind 
 RPC_WRITABLE
 java.lang.ArrayIndexOutOfBoundsException
 2015-04-24 11:11:01,075 INFO [Socket Reader #1 for port 59446] ipc.Server: 
 Socket Reader #1 for port 59446: readAndProcess from client 172.19.128.56 
 threw exception [org.apache.hadoop.ipc.RpcServerException: IPC server unable 
 to read call parameters: null]
 {noformat}
 On LLAP, the following is logged 
 {noformat}
 2015-04-24 11:11:01,142 [TaskHeartbeatThread()] ERROR 
 org.apache.tez.runtime.task.TezTaskRunner: TaskReporter reported error
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RpcServerException):
  IPC server unable to read call parameters: null
 at org.apache.hadoop.ipc.Client.call(Client.java:1492)
 at org.apache.hadoop.ipc.Client.call(Client.java:1423)
 at 
 org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:242)
 at com.sun.proxy.$Proxy19.heartbeat(Unknown Source)
 at 
 org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.heartbeat(LlapTaskReporter.java:258)
 at 
 org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:186)
 at 
 org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:128)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 The attempt starts but is then interrupted (not clear by whom)
 {noformat}
 2015-04-24 11:11:01,144 [Initializer 
 0(container_1_0321_01_008943_sershe_20150424110948_86ce1f6f-7cd2-4a40-b9a6-4a6854f010f6:9_Map
  1_928_0)] INFO org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: 
 Initialized Input with src edge: lineitem
 2015-04-24 11:11:01,145 
 [TezTaskRunner_attempt_1429683757595_0321_9_00_000928_0(container_1_0321_01_008943_sershe_20150424110948_86ce1f6f-7cd2-4a40-b9a6-4a6854f010f6:9_Map
  1_928_0)] INFO org.apache.tez.runtime.task.TezTaskRunner: Encounted an error 
 while executing task: attempt_1429683757595_0321_9_00_000928_0
 java.lang.InterruptedException
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
 at 
 java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
 at 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:439)
 at 
 java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
 at 
 

[jira] [Commented] (HIVE-10480) LLAP: Tez task is interrupted for unknown reason after an IPC exception and then fails to report completion

2015-04-24 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511553#comment-14511553
 ] 

Sergey Shelukhin commented on HIVE-10480:
-

[~gopalv], [~sseth] fyi

 LLAP: Tez task is interrupted for unknown reason after an IPC exception and 
 then fails to report completion
 ---

 Key: HIVE-10480
 URL: https://issues.apache.org/jira/browse/HIVE-10480
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 No idea if this is an LLAP bug, a Tez bug, a Hadoop IPC bug (due to a patch on 
 the cluster), or all three.
 So for now I will just dump all I have here.
 TPCH Q1 started taking a long time for me on a large number of runs today 
 (it didn't happen yesterday). It would always be one Map task timing out.
 Example attempt (logs from the AM):
 {noformat}
 2015-04-24 11:11:01,073 INFO [TaskCommunicator # 0] 
 tezplugins.LlapTaskCommunicator: Successfully launched task: 
 attempt_1429683757595_0321_9_00_000928_0
 2015-04-24 11:16:25,498 INFO [Dispatcher thread: Central] 
 history.HistoryEventHandler: 
 [HISTORY][DAG:dag_1429683757595_0321_9][Event:TASK_ATTEMPT_FINISHED]: 
 vertexName=Map 1, taskAttemptId=attempt_1429683757595_0321_9_00_000928_0, 
 startTime=1429899061071, finishTime=1429899385498, timeTaken=324427, 
 status=FAILED, errorEnum=TASK_HEARTBEAT_ERROR, 
 diagnostics=AttemptID:attempt_1429683757595_0321_9_00_000928_0 Timed out 
 after 300 secs, counters=Counters: 1, 
 org.apache.tez.common.counters.DAGCounter, RACK_LOCAL_TASKS=1
 {noformat}
 No other lines for this attempt in between.
 However there's this:
 {noformat}
 2015-04-24 11:11:01,074 WARN [Socket Reader #1 for port 59446] ipc.Server: 
 Unable to read call parameters for client 172.19.128.56 on connection protocol 
 org.apache.hadoop.hive.llap.protocol.LlapTaskUmbilicalProtocol for rpcKind 
 RPC_WRITABLE
 java.lang.ArrayIndexOutOfBoundsException
 2015-04-24 11:11:01,075 INFO [Socket Reader #1 for port 59446] ipc.Server: 
 Socket Reader #1 for port 59446: readAndProcess from client 172.19.128.56 
 threw exception [org.apache.hadoop.ipc.RpcServerException: IPC server unable 
 to read call parameters: null]
 {noformat}
 On LLAP, the following is logged 
 {noformat}
 2015-04-24 11:11:01,142 [TaskHeartbeatThread()] ERROR 
 org.apache.tez.runtime.task.TezTaskRunner: TaskReporter reported error
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RpcServerException):
  IPC server unable to read call parameters: null
 at org.apache.hadoop.ipc.Client.call(Client.java:1492)
 at org.apache.hadoop.ipc.Client.call(Client.java:1423)
 at 
 org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:242)
 at com.sun.proxy.$Proxy19.heartbeat(Unknown Source)
 at 
 org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.heartbeat(LlapTaskReporter.java:258)
 at 
 org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:186)
 at 
 org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:128)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 The attempt starts but is then interrupted (not clear by whom)
 {noformat}
 2015-04-24 11:11:01,144 [Initializer 
 0(container_1_0321_01_008943_sershe_20150424110948_86ce1f6f-7cd2-4a40-b9a6-4a6854f010f6:9_Map
  1_928_0)] INFO org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: 
 Initialized Input with src edge: lineitem
 2015-04-24 11:11:01,145 
 [TezTaskRunner_attempt_1429683757595_0321_9_00_000928_0(container_1_0321_01_008943_sershe_20150424110948_86ce1f6f-7cd2-4a40-b9a6-4a6854f010f6:9_Map
  1_928_0)] INFO org.apache.tez.runtime.task.TezTaskRunner: Encounted an error 
 while executing task: attempt_1429683757595_0321_9_00_000928_0
 java.lang.InterruptedException
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
 at 
 java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
 at 
 java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:439)
 at 
 java.util.concurrent.ExecutorCompletionService.take(ExecutorCompletionService.java:193)
 at 
 org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.initialize(LogicalIOProcessorRuntimeTask.java:218)
 at 
 

[jira] [Commented] (HIVE-10190) CBO: AST mode checks for TABLESAMPLE with AST.toString().contains(TOK_TABLESPLITSAMPLE)

2015-04-24 Thread Reuben Kuhnert (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511602#comment-14511602
 ] 

Reuben Kuhnert commented on HIVE-10190:
---

Going to review the tests and then submit another patch, thanks.

 CBO: AST mode checks for TABLESAMPLE with 
 AST.toString().contains(TOK_TABLESPLITSAMPLE)
 -

 Key: HIVE-10190
 URL: https://issues.apache.org/jira/browse/HIVE-10190
 Project: Hive
  Issue Type: Bug
  Components: CBO
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Reuben Kuhnert
Priority: Trivial
  Labels: perfomance
 Attachments: HIVE-10190-querygen.py, HIVE-10190.01.patch, 
 HIVE-10190.02.patch, HIVE-10190.03.patch, HIVE-10190.04.patch, 
 HIVE-10190.05.patch, HIVE-10190.05.patch


 {code}
 public static boolean validateASTForUnsupportedTokens(ASTNode ast) {
   String astTree = ast.toStringTree();
   // if any of the following tokens are present in the AST, bail out
   String[] tokens = { "TOK_CHARSETLITERAL", "TOK_TABLESPLITSAMPLE" };
   for (String token : tokens) {
     if (astTree.contains(token)) {
       return false;
     }
   }
   return true;
 }
 {code}
 This is an issue for a SQL query whose AST string form is much bigger than its 
 text form (~700 KB).
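 One direction for a fix, sketched with a hypothetical TreeNode stand-in for 
 Hive's ASTNode (the real class exposes similar type/child accessors): walk the 
 tree and compare token types directly instead of materializing the whole AST 
 as a string:
 {code}
 import java.util.Set;
 
 interface TreeNode {
   int getType();
   int getChildCount();
   TreeNode getChild(int i);
 }
 
 final class AstChecks {
   // Returns true if any node in the tree has one of the unsupported token
   // types; bails out early and never builds a ~700 KB string.
   static boolean containsAny(TreeNode node, Set<Integer> badTokenTypes) {
     if (badTokenTypes.contains(node.getType())) {
       return true;
     }
     for (int i = 0; i < node.getChildCount(); i++) {
       if (containsAny(node.getChild(i), badTokenTypes)) {
         return true;
       }
     }
     return false;
   }
 }
 {code}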



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10481) ACID table update finishes but values not really updated if column names are not all lower case

2015-04-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10481:
--
Attachment: HIVE-10481.patch

[~alangates], could you review, please?

 ACID table update finishes but values not really updated if column names are 
 not all lower case
 ---

 Key: HIVE-10481
 URL: https://issues.apache.org/jira/browse/HIVE-10481
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10481.patch


 A column in the table is defined with an upper-case or mixed-case name. When an 
 update command uses the verbatim column name, the update doesn't change the 
 value; when it uses the all-lower-case column name, it works.
 STEPS TO REPRODUCE:
 create table testable (a string, Bb string, c string)
 clustered by (c) into 3 buckets
 stored as orc
 tblproperties ('transactional'='true');
 insert into table testable values ('a1','b1','c1'), ('a2','b2','c2'), 
 ('a3','b3','c3');
 update testable set Bb='bb';
 The job finishes, but the values are not really updated.
 update testable set bb='bb'; works.
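 For illustration, the shape of a case-insensitive lookup that would avoid the 
 silent mismatch (the ColumnLookup class, findColumn method, and its types are 
 hypothetical, not the actual Hive update path):
 {code}
 import java.util.List;
 
 final class ColumnLookup {
   // Hive column names are case-insensitive, so the user's verbatim "Bb" must
   // be normalized before matching the stored lower-case schema name "bb".
   static int findColumn(List<String> schemaColumns, String userColumn) {
     String wanted = userColumn.toLowerCase();
     for (int i = 0; i < schemaColumns.size(); i++) {
       if (schemaColumns.get(i).toLowerCase().equals(wanted)) {
         return i;
       }
     }
     return -1; // callers should fail loudly rather than silently skip the SET
   }
 }
 {code}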



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10473) Spark client is recreated even spark configuration is not changed

2015-04-24 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-10473:
---
Attachment: HIVE-10473.1-spark.patch

 Spark client is recreated even spark configuration is not changed
 -

 Key: HIVE-10473
 URL: https://issues.apache.org/jira/browse/HIVE-10473
 Project: Hive
  Issue Type: Bug
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
Priority: Minor
 Attachments: HIVE-10473.1-spark.patch, HIVE-10473.1.patch


 Currently, we consider a Spark setting changed as long as the set method is 
 called, even if we set it to the same value as before. We should also check 
 whether the value actually changed, since it takes time to start a new Spark 
 client.
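 A minimal sketch of that check (the SparkConfTracker class and its field and 
 method names are hypothetical): only mark the configuration as changed, and 
 thus the client as needing a restart, when the new value actually differs.
 {code}
 import java.util.HashMap;
 import java.util.Map;
 import java.util.Objects;
 
 final class SparkConfTracker {
   private final Map<String, String> conf = new HashMap<>();
   private boolean changed = false;
 
   void set(String key, String value) {
     String old = conf.put(key, value);
     if (!Objects.equals(old, value)) {
       changed = true; // starting a new Spark client is expensive, so avoid it
     }
   }
 
   boolean needsNewClient() {
     return changed;
   }
 }
 {code}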



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9395) Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor level. [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-9395:
-
Labels: Spark-M5 TODOC-SPARK TODOC1.1  (was: Spark-M5 TODOC-SPARK)

 Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor 
 level. [Spark Branch]
 --

 Key: HIVE-9395
 URL: https://issues.apache.org/jira/browse/HIVE-9395
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5, TODOC-SPARK, TODOC1.1
 Fix For: spark-branch

 Attachments: HIVE-9395.1-spark.patch, HIVE-9395.2-spark.patch


 SparkJobMonitor may hang if the job state returns null all the time; we should 
 move the timeout check into SparkJobMonitor to avoid it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9395) Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor level. [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510584#comment-14510584
 ] 

Lefty Leverenz commented on HIVE-9395:
--

This is in the 1.1.0 release.

 Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor 
 level. [Spark Branch]
 --

 Key: HIVE-9395
 URL: https://issues.apache.org/jira/browse/HIVE-9395
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5, TODOC-SPARK, TODOC1.1
 Fix For: spark-branch

 Attachments: HIVE-9395.1-spark.patch, HIVE-9395.2-spark.patch


 SparkJobMonitor may hang if the job state returns null all the time; we should 
 move the timeout check into SparkJobMonitor to avoid it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (HIVE-9395) Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor level. [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-9395:
-
Comment: was deleted

(was: This is in the 1.1.0 release.)

 Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor 
 level. [Spark Branch]
 --

 Key: HIVE-9395
 URL: https://issues.apache.org/jira/browse/HIVE-9395
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5, TODOC-SPARK, TODOC1.1
 Fix For: spark-branch

 Attachments: HIVE-9395.1-spark.patch, HIVE-9395.2-spark.patch


 SparkJobMonitor may hang if the job state returns null all the time; we should 
 move the timeout check into SparkJobMonitor to avoid it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9395) Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor level. [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510585#comment-14510585
 ] 

Lefty Leverenz commented on HIVE-9395:
--

This parameter is in the 1.1.0 release.

 Make WAIT_SUBMISSION_TIMEOUT configurable and check timeout in SparkJobMonitor 
 level. [Spark Branch]
 --

 Key: HIVE-9395
 URL: https://issues.apache.org/jira/browse/HIVE-9395
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Chengxiang Li
Assignee: Chengxiang Li
  Labels: Spark-M5, TODOC-SPARK, TODOC1.1
 Fix For: spark-branch

 Attachments: HIVE-9395.1-spark.patch, HIVE-9395.2-spark.patch


 SparkJobMonitor may hang if the job state returns null all the time; we should 
 move the timeout check into SparkJobMonitor to avoid it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9094) TimeoutException when trying to get executor count from RSC [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510590#comment-14510590
 ] 

Lefty Leverenz commented on HIVE-9094:
--

Update:  This parameter is in the 1.1.0 release, so I'm adding a TODOC1.1 label.

 TimeoutException when trying to get executor count from RSC [Spark Branch]
 ---

 Key: HIVE-9094
 URL: https://issues.apache.org/jira/browse/HIVE-9094
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chengxiang Li
  Labels: TODOC-SPARK
 Fix For: spark-branch

 Attachments: HIVE-9094.1-spark.patch, HIVE-9094.2-spark.patch


 In 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/532/testReport,
  join25.q failed because:
 {code}
 2014-12-12 19:14:50,084 ERROR [main]: ql.Driver 
 (SessionState.java:printError(838)) - FAILED: SemanticException Failed to get 
 spark memory/core info: java.util.concurrent.TimeoutException
 org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get spark 
 memory/core info: java.util.concurrent.TimeoutException
 at 
 org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:120)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at 
 org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:134)
 at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:199)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:362)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:297)
 at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
 at 
 org.apache.hadoop.hive.cli.TestSparkCliDriver.runTest(TestSparkCliDriver.java:234)
 at 
 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25(TestSparkCliDriver.java:162)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at junit.framework.TestCase.runTest(TestCase.java:176)
 at junit.framework.TestCase.runBare(TestCase.java:141)
 at junit.framework.TestResult$1.protect(TestResult.java:122)
 at junit.framework.TestResult.runProtected(TestResult.java:142)
 at junit.framework.TestResult.run(TestResult.java:125)
 at junit.framework.TestCase.run(TestCase.java:129)
 at junit.framework.TestSuite.runTest(TestSuite.java:255)
 at junit.framework.TestSuite.run(TestSuite.java:250)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
 at 
 

[jira] [Commented] (HIVE-9094) TimeoutException when trying to get executor count from RSC [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510587#comment-14510587
 ] 

Lefty Leverenz commented on HIVE-9094:
--

Update:  This parameter is in the 1.1.0 release, so I'm adding a TODOC1.1 label.

 TimeoutException when trying to get executor count from RSC [Spark Branch]
 ---

 Key: HIVE-9094
 URL: https://issues.apache.org/jira/browse/HIVE-9094
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chengxiang Li
  Labels: TODOC-SPARK
 Fix For: spark-branch

 Attachments: HIVE-9094.1-spark.patch, HIVE-9094.2-spark.patch


 In 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/532/testReport,
  join25.q failed because:
 {code}
 2014-12-12 19:14:50,084 ERROR [main]: ql.Driver 
 (SessionState.java:printError(838)) - FAILED: SemanticException Failed to get 
 spark memory/core info: java.util.concurrent.TimeoutException
 org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get spark 
 memory/core info: java.util.concurrent.TimeoutException
 at 
 org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:120)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at 
 org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
 at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at 
 org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:134)
 at 
 org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
 at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
 at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:199)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:362)
 at 
 org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:297)
 at 
 org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
 at 
 org.apache.hadoop.hive.cli.TestSparkCliDriver.runTest(TestSparkCliDriver.java:234)
 at 
 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25(TestSparkCliDriver.java:162)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at junit.framework.TestCase.runTest(TestCase.java:176)
 at junit.framework.TestCase.runBare(TestCase.java:141)
 at junit.framework.TestResult$1.protect(TestResult.java:122)
 at junit.framework.TestResult.runProtected(TestResult.java:142)
 at junit.framework.TestResult.run(TestResult.java:125)
 at junit.framework.TestCase.run(TestCase.java:129)
 at junit.framework.TestSuite.runTest(TestSuite.java:255)
 at junit.framework.TestSuite.run(TestSuite.java:250)
 at 
 org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
 at 
 org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
 at 
 

[jira] [Commented] (HIVE-9094) TimeoutException when trying to get executor count from RSC [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14510588#comment-14510588
 ] 

Lefty Leverenz commented on HIVE-9094:
--

Update:  This parameter is in the 1.1.0 release, so I'm adding a TODOC1.1 label.

 TimeoutException when trying to get executor count from RSC [Spark Branch]
 ---

 Key: HIVE-9094
 URL: https://issues.apache.org/jira/browse/HIVE-9094
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chengxiang Li
  Labels: TODOC-SPARK
 Fix For: spark-branch

 Attachments: HIVE-9094.1-spark.patch, HIVE-9094.2-spark.patch


 In 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/532/testReport,
  join25.q failed because:
 {code}
 2014-12-12 19:14:50,084 ERROR [main]: ql.Driver (SessionState.java:printError(838)) - FAILED: SemanticException Failed to get spark memory/core info: java.util.concurrent.TimeoutException
 org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get spark memory/core info: java.util.concurrent.TimeoutException
 at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:120)
 at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:134)
 at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:199)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:362)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:297)
 at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
 at org.apache.hadoop.hive.cli.TestSparkCliDriver.runTest(TestSparkCliDriver.java:234)
 at org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25(TestSparkCliDriver.java:162)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at junit.framework.TestCase.runTest(TestCase.java:176)
 at junit.framework.TestCase.runBare(TestCase.java:141)
 at junit.framework.TestResult$1.protect(TestResult.java:122)
 at junit.framework.TestResult.runProtected(TestResult.java:142)
 at junit.framework.TestResult.run(TestResult.java:125)
 at junit.framework.TestCase.run(TestCase.java:129)
 at junit.framework.TestSuite.runTest(TestSuite.java:255)
 at junit.framework.TestSuite.run(TestSuite.java:250)
 at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
 at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
 at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
 at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
 at 


[jira] [Updated] (HIVE-9094) TimeoutException when trying get executor count from RSC [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-9094:
-
Labels: TODOC-SPARK TODOC1.1  (was: TODOC-SPARK)

 TimeoutException when trying get executor count from RSC [Spark Branch]
 ---

 Key: HIVE-9094
 URL: https://issues.apache.org/jira/browse/HIVE-9094
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chengxiang Li
  Labels: TODOC-SPARK, TODOC1.1
 Fix For: spark-branch

 Attachments: HIVE-9094.1-spark.patch, HIVE-9094.2-spark.patch


 In 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/532/testReport,
  join25.q failed because:
 {code}
 2014-12-12 19:14:50,084 ERROR [main]: ql.Driver (SessionState.java:printError(838)) - FAILED: SemanticException Failed to get spark memory/core info: java.util.concurrent.TimeoutException
 org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get spark memory/core info: java.util.concurrent.TimeoutException
 at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:120)
 at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:134)
 at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:199)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:362)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:297)
 at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
 at org.apache.hadoop.hive.cli.TestSparkCliDriver.runTest(TestSparkCliDriver.java:234)
 at org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25(TestSparkCliDriver.java:162)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at junit.framework.TestCase.runTest(TestCase.java:176)
 at junit.framework.TestCase.runBare(TestCase.java:141)
 at junit.framework.TestResult$1.protect(TestResult.java:122)
 at junit.framework.TestResult.runProtected(TestResult.java:142)
 at junit.framework.TestResult.run(TestResult.java:125)
 at junit.framework.TestCase.run(TestCase.java:129)
 at junit.framework.TestSuite.runTest(TestSuite.java:255)
 at junit.framework.TestSuite.run(TestSuite.java:250)
 at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
 at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
 at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
 at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
 at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
 at 

[jira] [Issue Comment Deleted] (HIVE-9094) TimeoutException when trying get executor count from RSC [Spark Branch]

2015-04-24 Thread Lefty Leverenz (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lefty Leverenz updated HIVE-9094:
-
Comment: was deleted

(was: Update:  This parameter is in the 1.1.0 release, so I'm adding a TODOC1.1 label.)

 TimeoutException when trying get executor count from RSC [Spark Branch]
 ---

 Key: HIVE-9094
 URL: https://issues.apache.org/jira/browse/HIVE-9094
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang
Assignee: Chengxiang Li
  Labels: TODOC-SPARK, TODOC1.1
 Fix For: spark-branch

 Attachments: HIVE-9094.1-spark.patch, HIVE-9094.2-spark.patch


 In 
 http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/532/testReport,
  join25.q failed because:
 {code}
 2014-12-12 19:14:50,084 ERROR [main]: ql.Driver (SessionState.java:printError(838)) - FAILED: SemanticException Failed to get spark memory/core info: java.util.concurrent.TimeoutException
 org.apache.hadoop.hive.ql.parse.SemanticException: Failed to get spark memory/core info: java.util.concurrent.TimeoutException
 at org.apache.hadoop.hive.ql.optimizer.spark.SetSparkReducerParallelism.process(SetSparkReducerParallelism.java:120)
 at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:94)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:78)
 at org.apache.hadoop.hive.ql.lib.ForwardWalker.walk(ForwardWalker.java:79)
 at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:109)
 at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.optimizeOperatorPlan(SparkCompiler.java:134)
 at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:99)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10202)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:221)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:420)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:306)
 at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1108)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1045)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1035)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:199)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:151)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:362)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:297)
 at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:837)
 at org.apache.hadoop.hive.cli.TestSparkCliDriver.runTest(TestSparkCliDriver.java:234)
 at org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_join25(TestSparkCliDriver.java:162)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at junit.framework.TestCase.runTest(TestCase.java:176)
 at junit.framework.TestCase.runBare(TestCase.java:141)
 at junit.framework.TestResult$1.protect(TestResult.java:122)
 at junit.framework.TestResult.runProtected(TestResult.java:142)
 at junit.framework.TestResult.run(TestResult.java:125)
 at junit.framework.TestCase.run(TestCase.java:129)
 at junit.framework.TestSuite.runTest(TestSuite.java:255)
 at junit.framework.TestSuite.run(TestSuite.java:250)
 at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
 at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
 at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
 at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
 at 


[jira] [Commented] (HIVE-9863) Querying parquet tables fails with IllegalStateException [Spark Branch]

2015-04-24 Thread Sergio Peña (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511730#comment-14511730
 ] 

Sergio Peña commented on HIVE-9863:
---

Hive currently uses Parquet 1.6.0rc6, but we're waiting for the 1.6.0 bits to land in the Maven repository. We have other changes we want to make based on 1.6.0.

 Querying parquet tables fails with IllegalStateException [Spark Branch]
 ---

 Key: HIVE-9863
 URL: https://issues.apache.org/jira/browse/HIVE-9863
 Project: Hive
  Issue Type: Sub-task
  Components: Spark
Reporter: Xuefu Zhang

 This does not necessarily happen only in the Spark branch; queries such as select count(*) from table_name fail with this error:
 {code}
 hive> select * from content limit 2;
 OK
 Failed with exception java.io.IOException:java.lang.IllegalStateException: All the offsets listed in the split should be found in the file. expected: [4, 4] found: [BlockMetaData{69644, 881917418 [ColumnMetaData{GZIP [guid] BINARY  [PLAIN, BIT_PACKED], 4}, ColumnMetaData{GZIP [collection_name] BINARY  [PLAIN_DICTIONARY, BIT_PACKED], 389571}, ColumnMetaData{GZIP [doc_type] BINARY  [PLAIN_DICTIONARY, BIT_PACKED], 389790}, ColumnMetaData{GZIP [stage] INT64  [PLAIN_DICTIONARY, BIT_PACKED], 389887}, ColumnMetaData{GZIP [meta_timestamp] INT64  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 397673}, ColumnMetaData{GZIP [doc_timestamp] INT64  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 422161}, ColumnMetaData{GZIP [meta_size] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 460215}, ColumnMetaData{GZIP [content_size] INT32  [RLE, PLAIN_DICTIONARY, BIT_PACKED], 521728}, ColumnMetaData{GZIP [source] BINARY  [RLE, PLAIN, BIT_PACKED], 683740}, ColumnMetaData{GZIP [delete_flag] BOOLEAN  [RLE, PLAIN, BIT_PACKED], 683787}, ColumnMetaData{GZIP [meta] BINARY  [RLE, PLAIN, BIT_PACKED], 683834}, ColumnMetaData{GZIP [content] BINARY  [RLE, PLAIN, BIT_PACKED], 6992365}]}] out of: [4, 129785482, 260224757] in range 0, 134217728
 Time taken: 0.253 seconds
 hive>
 {code}
 I can reproduce the problem in either local or yarn-cluster mode. It seems to happen with MR as well, so I suspect this is a Parquet problem.
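
As a hedged illustration of the invariant that fails above (this is not actual Parquet code; all names below are made up), a split lists the row-group offsets it expects, and every row group whose start offset falls inside the split's byte range must match that list. With the numbers from the error message, the range [0, 134217728) contains the row groups starting at 4 and 129785482, so a split listing [4, 4] is inconsistent:

{code}
// Hypothetical sketch of the offset-vs-range check from the error above.
import java.util.ArrayList;
import java.util.List;

public class SplitOffsetCheck {
  // Returns the row-group start offsets that fall inside [start, start + length).
  static List<Long> rowGroupsInRange(long[] rowGroupStarts, long start, long length) {
    List<Long> inRange = new ArrayList<>();
    for (long offset : rowGroupStarts) {
      if (offset >= start && offset < start + length) {
        inRange.add(offset);
      }
    }
    return inRange;
  }

  public static void main(String[] args) {
    // Offsets from the error message: row groups start at 4, 129785482 and
    // 260224757, and the split covers [0, 134217728).
    long[] starts = {4L, 129785482L, 260224757L};
    System.out.println(rowGroupsInRange(starts, 0L, 134217728L)); // [4, 129785482]
  }
}
{code}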





[jira] [Commented] (HIVE-10480) LLAP: Tez task is interrupted for unknown reason after an IPC exception and then fails to report completion

2015-04-24 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511738#comment-14511738
 ] 

Siddharth Seth commented on HIVE-10480:
---

This is what is happening here.

The heartbeat being sent from the LLAP daemon to the AM for a task is corrupt - TEZ-2367.
This causes an error to be reported and an interrupt on the task.
The NPE and the "Ignoring exception" message can be ignored - they're caused by the task being unregistered, as Prasanth pointed out. It's not the root cause of the failure, and logging it always causes confusion. The log line has already been pruned in Tez (yesterday).

Since the daemon considers the task to be dead, it won't send another heartbeat to the AM.
The AM has no idea that the task is dead, since the last heartbeat was corrupt. The regular timeout mechanism kicks in, and the task is considered dead after 5 minutes (the default timeout).
A new attempt of the same task is set up and runs to completion; the timeout path is sketched below.
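
To make that timeout path concrete, here is a minimal sketch of an AM-side liveness check of the kind described above. All names here (TaskLivenessMonitor, scheduleNewAttempt) are hypothetical, not the actual Tez classes; only the 5-minute constant mirrors the default timeout mentioned above.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TaskLivenessMonitor {
  // Mirrors the 5-minute default task timeout mentioned above.
  private static final long TIMEOUT_MS = 5 * 60 * 1000L;
  private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();

  // Called for every well-formed heartbeat. A corrupt heartbeat (TEZ-2367)
  // never reaches this point, so the attempt's timestamp simply goes stale.
  public void onHeartbeat(String attemptId) {
    lastHeartbeat.put(attemptId, System.currentTimeMillis());
  }

  // Invoked periodically. Attempts whose last heartbeat is older than the
  // timeout are declared dead and rescheduled, which is why the failed
  // attempt above is retried and the query still completes.
  public void checkTimeouts() {
    long now = System.currentTimeMillis();
    for (Map.Entry<String, Long> e : lastHeartbeat.entrySet()) {
      if (now - e.getValue() > TIMEOUT_MS) {
        lastHeartbeat.remove(e.getKey());
        scheduleNewAttempt(e.getKey());
      }
    }
  }

  private void scheduleNewAttempt(String attemptId) {
    // Placeholder: the real AM launches a new attempt of the same task.
    System.out.println("Timed out: " + attemptId + "; scheduling a new attempt");
  }
}
{code}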

 LLAP: Tez task is interrupted for unknown reason after an IPC exception and 
 then fails to report completion
 ---

 Key: HIVE-10480
 URL: https://issues.apache.org/jira/browse/HIVE-10480
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 No idea if this is an LLAP bug, a Tez bug, a Hadoop IPC bug (due to a patch on the cluster), or all 3.
 So for now I will just dump everything I have here.
 TPC-H Q1 started taking a long time for me on a large number of runs today (it didn't happen yesterday). It would always be one Map task timing out.
 Example attempt (logs from the AM):
 {noformat}
 2015-04-24 11:11:01,073 INFO [TaskCommunicator # 0] tezplugins.LlapTaskCommunicator: Successfully launched task: attempt_1429683757595_0321_9_00_000928_0
 2015-04-24 11:16:25,498 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1429683757595_0321_9][Event:TASK_ATTEMPT_FINISHED]: vertexName=Map 1, taskAttemptId=attempt_1429683757595_0321_9_00_000928_0, startTime=1429899061071, finishTime=1429899385498, timeTaken=324427, status=FAILED, errorEnum=TASK_HEARTBEAT_ERROR, diagnostics=AttemptID:attempt_1429683757595_0321_9_00_000928_0 Timed out after 300 secs, counters=Counters: 1, org.apache.tez.common.counters.DAGCounter, RACK_LOCAL_TASKS=1
 {noformat}
 No other lines for this attempt in between.
 However there's this:
 {noformat}
 2015-04-24 11:11:01,074 WARN [Socket Reader #1 for port 59446] ipc.Server: Unable to read call parameters for client 172.19.128.56 on connection protocol org.apache.hadoop.hive.llap.protocol.LlapTaskUmbilicalProtocol for rpcKind RPC_WRITABLE
 java.lang.ArrayIndexOutOfBoundsException
 2015-04-24 11:11:01,075 INFO [Socket Reader #1 for port 59446] ipc.Server: Socket Reader #1 for port 59446: readAndProcess from client 172.19.128.56 threw exception [org.apache.hadoop.ipc.RpcServerException: IPC server unable to read call parameters: null]
 {noformat}
 On LLAP, the following is logged 
 {noformat}
 2015-04-24 11:11:01,142 [TaskHeartbeatThread()] ERROR org.apache.tez.runtime.task.TezTaskRunner: TaskReporter reported error
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RpcServerException): IPC server unable to read call parameters: null
 at org.apache.hadoop.ipc.Client.call(Client.java:1492)
 at org.apache.hadoop.ipc.Client.call(Client.java:1423)
 at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:242)
 at com.sun.proxy.$Proxy19.heartbeat(Unknown Source)
 at org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.heartbeat(LlapTaskReporter.java:258)
 at org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:186)
 at org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:128)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 The attempt starts but is then interrupted (not clear by whom)
 {noformat}
 2015-04-24 11:11:01,144 [Initializer 0(container_1_0321_01_008943_sershe_20150424110948_86ce1f6f-7cd2-4a40-b9a6-4a6854f010f6:9_Map 1_928_0)] INFO org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Initialized Input with src edge: lineitem
 2015-04-24 11:11:01,145 [TezTaskRunner_attempt_1429683757595_0321_9_00_000928_0(container_1_0321_01_008943_sershe_20150424110948_86ce1f6f-7cd2-4a40-b9a6-4a6854f010f6:9_Map 1_928_0)] INFO 

[jira] [Updated] (HIVE-10481) ACID table update finishes but values not really updated if column names are not all lower case

2015-04-24 Thread Eugene Koifman (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koifman updated HIVE-10481:
--
Attachment: HIVE-10481.2.patch

 ACID table update finishes but values not really updated if column names are 
 not all lower case
 ---

 Key: HIVE-10481
 URL: https://issues.apache.org/jira/browse/HIVE-10481
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10481.2.patch, HIVE-10481.patch


 A column in the table is defined with an upper-case or mixed-case name. When an update command uses the verbatim column name, the update doesn't actually change the value; when the update uses the all-lower-case column name, it works.
 STEPS TO REPRODUCE:
 create table testable (a string, Bb string, c string)
 clustered by (c) into 3 buckets
 stored as orc
 tblproperties ('transactional'='true');
 insert into table testable values ('a1','b1','c1'), ('a2','b2','c2'), ('a3','b3','c3');
 update testable set Bb='bb';
 The job finishes, but the values are not really updated.
 update testable set bb='bb'; works.
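
As a hypothetical illustration of the class of comparison that goes wrong here (this is not the attached patch): Hive identifiers are case-insensitive and column names are stored lower-cased, so any column lookup done with a case-sensitive equals() silently misses "Bb" vs "bb".

{code}
// Hypothetical sketch, not the attached patch: resolving an UPDATE's SET
// column against the table schema must ignore identifier case.
import java.util.Arrays;
import java.util.List;

public class ColumnLookupSketch {
  static int findColumn(List<String> schemaCols, String setCol) {
    for (int i = 0; i < schemaCols.size(); i++) {
      if (schemaCols.get(i).equalsIgnoreCase(setCol)) { // not equals()
        return i;
      }
    }
    return -1; // not found
  }

  public static void main(String[] args) {
    List<String> cols = Arrays.asList("a", "bb", "c"); // schema stores lower case
    System.out.println(findColumn(cols, "Bb")); // 1, instead of silently missing
  }
}
{code}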





[jira] [Updated] (HIVE-10239) Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and PostgreSQL

2015-04-24 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-10239:
-
Attachment: HIVE-10239.1.patch

 Create scripts to do metastore upgrade tests on jenkins for Derby, Oracle and 
 PostgreSQL
 

 Key: HIVE-10239
 URL: https://issues.apache.org/jira/browse/HIVE-10239
 Project: Hive
  Issue Type: Improvement
Affects Versions: 1.1.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
 Attachments: HIVE-10239-donotcommit.patch, HIVE-10239.0.patch, 
 HIVE-10239.0.patch, HIVE-10239.00.patch, HIVE-10239.01.patch, 
 HIVE-10239.02.patch, HIVE-10239.03.patch, HIVE-10239.03.patch, 
 HIVE-10239.1.patch, HIVE-10239.patch


 Need to create DB-implementation-specific scripts that use the framework introduced in HIVE-9800, so that any metastore schema changes are tested across all supported databases.





[jira] [Updated] (HIVE-10442) HIVE-10098 broke hadoop-1 build

2015-04-24 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-10442:

Attachment: HIVE-10442.1.patch

 HIVE-10098 broke hadoop-1 build
 ---

 Key: HIVE-10442
 URL: https://issues.apache.org/jira/browse/HIVE-10442
 Project: Hive
  Issue Type: Bug
Affects Versions: 1.2.0
Reporter: Prasanth Jayachandran
Assignee: Yongzhi Chen
 Attachments: HIVE-10442.1.patch


 The fs.addDelegationTokens() method does not seem to exist in Hadoop 1.2.1. This breaks the hadoop-1 builds.
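
For context, here is a minimal sketch of the usual reflection-based shim pattern for this kind of API gap. This is an assumption about the shape of a fix, not the attached patch, and the single-token fallback signature is likewise assumed: probe for the Hadoop 2 method via reflection and fall back to the older API when it is absent.

{code}
import java.lang.reflect.Method;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.token.Token;

public final class DelegationTokenShim {
  private DelegationTokenShim() {}

  public static void addTokens(FileSystem fs, String renewer, Credentials creds)
      throws Exception {
    try {
      // Hadoop 2: FileSystem.addDelegationTokens(String, Credentials).
      Method m = FileSystem.class.getMethod(
          "addDelegationTokens", String.class, Credentials.class);
      m.invoke(fs, renewer, creds);
    } catch (NoSuchMethodException e) {
      // Hadoop 1 (assumed fallback): the older single-token API.
      Token<?> token = fs.getDelegationToken(renewer);
      if (token != null) {
        creds.addToken(token.getService(), token);
      }
    }
  }
}
{code}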





[jira] [Commented] (HIVE-10477) Provide option to disable Spark tests

2015-04-24 Thread Hari Sankar Sivarama Subramaniyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511604#comment-14511604
 ] 

Hari Sankar Sivarama Subramaniyan commented on HIVE-10477:
--

[~xuefuz] It is possible to detect the OS type, but the change made for this JIRA is a more generic way of disabling the Spark tests if required.

Thanks
Hari

 Provide option to disable Spark tests 
 --

 Key: HIVE-10477
 URL: https://issues.apache.org/jira/browse/HIVE-10477
 Project: Hive
  Issue Type: Bug
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
 Attachments: HIVE-10477.1.patch


 In the current master branch, unit tests fail on Windows because of the dependency on the bash executable in itests/hive-unit/pom.xml, around these lines:
 {code}
 <target>
   <exec executable="bash" dir="${basedir}" failonerror="true">
     <arg line="../target/download.sh"/>
   </exec>
 </target>
 {code}
 We should provide an option to disable Spark tests in OSes like Windows where bash might be absent.





[jira] [Commented] (HIVE-10480) LLAP: Tez task is interrupted for unknown reason after an IPC exception and then fails to report completion

2015-04-24 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511612#comment-14511612
 ] 

Prasanth Jayachandran commented on HIVE-10480:
--

This spurious access$300 exception is due to the task reporter callable (or some other field) becoming null, which may be because the task reporter was shut down. The reporter gets shut down when the AM sends a shouldDie signal, and the AM sends shouldDie when TezTaskRunner sends a failure notification. It looks like some information is missing (why the Tez task runner reports the failure). Can you attach the entire log if it's not too big?
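
For illustration, a tiny hypothetical sketch of the race being described (these names are made up, not the actual LlapTaskReporter fields): the heartbeat thread dereferences state that the shouldDie-triggered shutdown nulls out, which yields a spurious NPE unrelated to the root failure.

{code}
// Hypothetical sketch of the shutdown race described above; names are made up.
public class ReporterRaceSketch {
  private volatile Runnable heartbeatCallable = () -> { /* send heartbeat */ };

  // Runs when the AM sends a shouldDie signal.
  void shutdown() {
    heartbeatCallable = null;
  }

  // Runs on the heartbeat thread; throws a spurious NPE if shutdown() won the race.
  void heartbeatOnce() {
    heartbeatCallable.run();
  }
}
{code}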

 LLAP: Tez task is interrupted for unknown reason after an IPC exception and 
 then fails to report completion
 ---

 Key: HIVE-10480
 URL: https://issues.apache.org/jira/browse/HIVE-10480
 Project: Hive
  Issue Type: Sub-task
Reporter: Sergey Shelukhin

 No idea if this is an LLAP bug, a Tez bug, a Hadoop IPC bug (due to a patch on the cluster), or all 3.
 So for now I will just dump everything I have here.
 TPC-H Q1 started taking a long time for me on a large number of runs today (it didn't happen yesterday). It would always be one Map task timing out.
 Example attempt (logs from the AM):
 {noformat}
 2015-04-24 11:11:01,073 INFO [TaskCommunicator # 0] tezplugins.LlapTaskCommunicator: Successfully launched task: attempt_1429683757595_0321_9_00_000928_0
 2015-04-24 11:16:25,498 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1429683757595_0321_9][Event:TASK_ATTEMPT_FINISHED]: vertexName=Map 1, taskAttemptId=attempt_1429683757595_0321_9_00_000928_0, startTime=1429899061071, finishTime=1429899385498, timeTaken=324427, status=FAILED, errorEnum=TASK_HEARTBEAT_ERROR, diagnostics=AttemptID:attempt_1429683757595_0321_9_00_000928_0 Timed out after 300 secs, counters=Counters: 1, org.apache.tez.common.counters.DAGCounter, RACK_LOCAL_TASKS=1
 {noformat}
 No other lines for this attempt in between.
 However there's this:
 {noformat}
 2015-04-24 11:11:01,074 WARN [Socket Reader #1 for port 59446] ipc.Server: Unable to read call parameters for client 172.19.128.56 on connection protocol org.apache.hadoop.hive.llap.protocol.LlapTaskUmbilicalProtocol for rpcKind RPC_WRITABLE
 java.lang.ArrayIndexOutOfBoundsException
 2015-04-24 11:11:01,075 INFO [Socket Reader #1 for port 59446] ipc.Server: Socket Reader #1 for port 59446: readAndProcess from client 172.19.128.56 threw exception [org.apache.hadoop.ipc.RpcServerException: IPC server unable to read call parameters: null]
 {noformat}
 On LLAP, the following is logged 
 {noformat}
 2015-04-24 11:11:01,142 [TaskHeartbeatThread()] ERROR org.apache.tez.runtime.task.TezTaskRunner: TaskReporter reported error
 org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RpcServerException): IPC server unable to read call parameters: null
 at org.apache.hadoop.ipc.Client.call(Client.java:1492)
 at org.apache.hadoop.ipc.Client.call(Client.java:1423)
 at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:242)
 at com.sun.proxy.$Proxy19.heartbeat(Unknown Source)
 at org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.heartbeat(LlapTaskReporter.java:258)
 at org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:186)
 at org.apache.hadoop.hive.llap.daemon.impl.LlapTaskReporter$HeartbeatCallable.call(LlapTaskReporter.java:128)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
 {noformat}
 The attempt starts but is then interrupted (not clear by whom)
 {noformat}
 2015-04-24 11:11:01,144 [Initializer 0(container_1_0321_01_008943_sershe_20150424110948_86ce1f6f-7cd2-4a40-b9a6-4a6854f010f6:9_Map 1_928_0)] INFO org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Initialized Input with src edge: lineitem
 2015-04-24 11:11:01,145 [TezTaskRunner_attempt_1429683757595_0321_9_00_000928_0(container_1_0321_01_008943_sershe_20150424110948_86ce1f6f-7cd2-4a40-b9a6-4a6854f010f6:9_Map 1_928_0)] INFO org.apache.tez.runtime.task.TezTaskRunner: Encounted an error while executing task: attempt_1429683757595_0321_9_00_000928_0
 java.lang.InterruptedException
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1220)
 at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:335)
  

[jira] [Commented] (HIVE-10481) ACID table update finishes but values not really updated if column names are not all lower case

2015-04-24 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14511616#comment-14511616
 ] 

Alan Gates commented on HIVE-10481:
---

+1, assuming all the other IUD and ACID tests pass.

 ACID table update finishes but values not really updated if column names are 
 not all lower case
 ---

 Key: HIVE-10481
 URL: https://issues.apache.org/jira/browse/HIVE-10481
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 1.0.0
Reporter: Eugene Koifman
Assignee: Eugene Koifman
 Attachments: HIVE-10481.patch


 A column in the table is defined with an upper-case or mixed-case name. When an update command uses the verbatim column name, the update doesn't actually change the value; when the update uses the all-lower-case column name, it works.
 STEPS TO REPRODUCE:
 create table testable (a string, Bb string, c string)
 clustered by (c) into 3 buckets
 stored as orc
 tblproperties ('transactional'='true');
 insert into table testable values ('a1','b1','c1'), ('a2','b2','c2'), ('a3','b3','c3');
 update testable set Bb='bb';
 The job finishes, but the values are not really updated.
 update testable set bb='bb'; works.




