[jira] [Commented] (HIVE-13269) Simplify comparison expressions using column stats

2016-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194712#comment-15194712
 ] 

Hive QA commented on HIVE-13269:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793290/HIVE-13269.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 200 failed/errored test(s), 9826 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppd
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_const_type
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input39
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input41
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_grp_diff_keys
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3

[jira] [Commented] (HIVE-13239) "java.lang.OutOfMemoryError: unable to create new native thread" occurs at Hive on Tez

2016-03-14 Thread Wataru Yukawa (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194663#comment-15194663
 ] 

Wataru Yukawa commented on HIVE-13239:
--

Maybe This problem seems to be same as 
https://issues.apache.org/jira/browse/HIVE-13273

> "java.lang.OutOfMemoryError: unable to create new native thread" occurs at 
> Hive on Tez
> --
>
> Key: HIVE-13239
> URL: https://issues.apache.org/jira/browse/HIVE-13239
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
> Environment: HDP2.3.4
> JDK1.8
> CentOS 6
>Reporter: Wataru Yukawa
>
> "ps -L $(pgrep -f hiveserver2) | wc -l" is more than 15,000
> HiveServer2 memory leak occurs.
> hive query
> {code}
>  FROM hoge_tmp
>  INSERT INTO TABLE hoge PARTITION (...)
>SELECT ...   WHERE ...
> {code}
> stacktrace
> {code}
> org.apache.hive.service.cli.HiveSQLException: Error while processing 
> statement: FAILED: Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.tez.TezTask. unable to create new native thread
> at 
> org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156)
> at 
> org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:183)
> at 
> org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:410)
> at 
> org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:391)
> at 
> org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:261)
> at 
> org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
> at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
> at 
> org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at 
> org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.start(DFSOutputStream.java:2238)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1753)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:444)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:577)
> at 
> org.apache.tez.common.TezCommonUtils.createFileForAM(TezCommonUtils.java:310)
> at 
> org.apache.tez.client.TezClientUtils.createApplicationSubmissionContext(TezClientUtils.java:559)
> at org.apache.tez.client.TezClient.start(TezClient.java:395)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:271)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:151)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> 

[jira] [Commented] (HIVE-13273) "java.lang.OutOfMemoryError: unable to create new native thread" occurs at Hive on MapReduce

2016-03-14 Thread Wataru Yukawa (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194659#comment-15194659
 ] 

Wataru Yukawa commented on HIVE-13273:
--

This problem seems to be same as 
https://community.hortonworks.com/questions/20116/logfdscacheflushtimer-thread-increase.html

I upgrade HDP2.4, so it seems that this problem is resolved.

> "java.lang.OutOfMemoryError: unable to create new native thread" occurs at 
> Hive on MapReduce
> 
>
> Key: HIVE-13273
> URL: https://issues.apache.org/jira/browse/HIVE-13273
> Project: Hive
>  Issue Type: Bug
>  Components: HiveServer2
> Environment: HDP2.3.4
> JDK1.8
> CentOS 6
>Reporter: Wataru Yukawa
>Assignee: Vaibhav Gumashta
>
> "ps -L $(pgrep -f hiveserver2) | wc -l" is more than 15,000
> HiveServer2 memory leak occurs.
> heapstats result is https://gyazo.com/27dcaf678fb8d2e4003af55a79c2020e
> hiveserver2.log
> {code}
> 2016-03-13 13:25:06,041 INFO  [HiveServer2-Handler-Pool: Thread-98838]: 
> retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(144)) - 
> Exception while invoking getFileInfo of class Cl
> ientNamenodeProtocolTranslatorPB over ...:8020 after 3 fail over attempts. 
> Trying to fail over immediately.
> java.io.IOException: Failed on local exception: java.io.IOException: Couldn't 
> set up IO streams; Host Details : local host is: "..."; destination host is: 
> "...":8020; 
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
> at org.apache.hadoop.ipc.Client.call(Client.java:1431)
> at org.apache.hadoop.ipc.Client.call(Client.java:1358)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
> at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
> at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1315)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1311)
> at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1311)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
> at 
> org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:85)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:95)
> at 
> org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:190)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
> at 
> org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:431)
> at 
> org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1237)
> at 

[jira] [Updated] (HIVE-13285) Orc concatenation may drop old files from moving to final path

2016-03-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13285:
-
Attachment: HIVE-13285.2.patch

> Orc concatenation may drop old files from moving to final path
> --
>
> Key: HIVE-13285
> URL: https://issues.apache.org/jira/browse/HIVE-13285
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13285.1.patch, HIVE-13285.2.patch
>
>
> ORC concatenation uses combine hive input format for merging files. Under 
> specific case where all files within a combine split are incompatible for 
> merge (old files without stripe statistics) then these files are added to 
> incompatible file set. But this file set is not processed as closeOp() will 
> not be called (no output file writer will exist which will skip 
> super.closeOp()). As a result, these incompatible files are not moved to 
> final path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when ve

2016-03-14 Thread Rui Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194645#comment-15194645
 ] 

Rui Li commented on HIVE-13277:
---

I built a local snapshot of kryo with latest code and verified the query can 
pass. So the root cause should be the kryo (3.0.3) we use. I'm afraid there's 
not much we can do at the moment because the fix hasn't been released yet.
[~xuefuz] what do you think?

> Exception "Unable to create serializer 
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " 
> occurred during query execution on spark engine when vectorized execution is 
> switched on
> -
>
> Key: HIVE-13277
> URL: https://issues.apache.org/jira/browse/HIVE-13277
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>Reporter: Xin Hao
>
> Found when executing TPCx-BB query2 for Hive on Spark engine, and switch on :
> Found during TPCx-BB query2 execution on spark engine when vectorized 
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true; 
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for 
> Apache Hive 2.0.0)
> It's OK for spark engine when hive.vectorized.execution.enabled is switched 
> off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For MR engine, the query could pass and no exception occurred when vectorized 
> execution is either switched on or switched off.
> Detail Error Message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO 
> spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 
> bytes
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN 
> scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): 
> java.lang.RuntimeException: Failed to load plan: 
> hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml:
>  org.apache.hive.com.esotericsoftware.kryo.KryoException: 
> java.lang.IllegalArgumentException: Unable to create serializer 
> "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for 
> class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - childOperators 
> (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) - reducer 
> (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO  [stderr-redir-1]: client.SparkClientImpl 
> (SparkClientImpl.java:run(593)) -at 
> org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO  

[jira] [Commented] (HIVE-13285) Orc concatenation may drop old files from moving to final path

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194640#comment-15194640
 ] 

Prasanth Jayachandran commented on HIVE-13285:
--

Yes. That's correct. I will update the patch.

> Orc concatenation may drop old files from moving to final path
> --
>
> Key: HIVE-13285
> URL: https://issues.apache.org/jira/browse/HIVE-13285
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13285.1.patch
>
>
> ORC concatenation uses combine hive input format for merging files. Under 
> specific case where all files within a combine split are incompatible for 
> merge (old files without stripe statistics) then these files are added to 
> incompatible file set. But this file set is not processed as closeOp() will 
> not be called (no output file writer will exist which will skip 
> super.closeOp()). As a result, these incompatible files are not moved to 
> final path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13285) Orc concatenation may drop old files from moving to final path

2016-03-14 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194637#comment-15194637
 ] 

Gopal V commented on HIVE-13285:


As far as I understand, the issue is that relevant super.closeOp() is not 
called in outWriter == null case.

Can you make the patch clearer to imply that super.closeOp() has to be always 
called?

> Orc concatenation may drop old files from moving to final path
> --
>
> Key: HIVE-13285
> URL: https://issues.apache.org/jira/browse/HIVE-13285
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13285.1.patch
>
>
> ORC concatenation uses combine hive input format for merging files. Under 
> specific case where all files within a combine split are incompatible for 
> merge (old files without stripe statistics) then these files are added to 
> incompatible file set. But this file set is not processed as closeOp() will 
> not be called (no output file writer will exist which will skip 
> super.closeOp()). As a result, these incompatible files are not moved to 
> final path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13285) Orc concatenation may drop old files from moving to final path

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194633#comment-15194633
 ] 

Prasanth Jayachandran commented on HIVE-13285:
--

RB is not responding. I will upload this patch later to RB.

> Orc concatenation may drop old files from moving to final path
> --
>
> Key: HIVE-13285
> URL: https://issues.apache.org/jira/browse/HIVE-13285
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13285.1.patch
>
>
> ORC concatenation uses combine hive input format for merging files. Under 
> specific case where all files within a combine split are incompatible for 
> merge (old files without stripe statistics) then these files are added to 
> incompatible file set. But this file set is not processed as closeOp() will 
> not be called (no output file writer will exist which will skip 
> super.closeOp()). As a result, these incompatible files are not moved to 
> final path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13285) Orc concatenation may drop old files from moving to final path

2016-03-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13285:
-
Status: Patch Available  (was: Open)

> Orc concatenation may drop old files from moving to final path
> --
>
> Key: HIVE-13285
> URL: https://issues.apache.org/jira/browse/HIVE-13285
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.0.0, 0.14.0, 1.3.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13285.1.patch
>
>
> ORC concatenation uses combine hive input format for merging files. Under 
> specific case where all files within a combine split are incompatible for 
> merge (old files without stripe statistics) then these files are added to 
> incompatible file set. But this file set is not processed as closeOp() will 
> not be called (no output file writer will exist which will skip 
> super.closeOp()). As a result, these incompatible files are not moved to 
> final path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13285) Orc concatenation may drop old files from moving to final path

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194628#comment-15194628
 ] 

Prasanth Jayachandran commented on HIVE-13285:
--

[~gopalv]/[~daijy] Could someone please take a look at this patch? This patch 
makes sure closeOp() is called even when outWriter is null as we might have 
some incompatible files to move to final path.

> Orc concatenation may drop old files from moving to final path
> --
>
> Key: HIVE-13285
> URL: https://issues.apache.org/jira/browse/HIVE-13285
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13285.1.patch
>
>
> ORC concatenation uses combine hive input format for merging files. Under 
> specific case where all files within a combine split are incompatible for 
> merge (old files without stripe statistics) then these files are added to 
> incompatible file set. But this file set is not processed as closeOp() will 
> not be called (no output file writer will exist which will skip 
> super.closeOp()). As a result, these incompatible files are not moved to 
> final path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13285) Orc concatenation may drop old files from moving to final path

2016-03-14 Thread Prasanth Jayachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanth Jayachandran updated HIVE-13285:
-
Attachment: HIVE-13285.1.patch

> Orc concatenation may drop old files from moving to final path
> --
>
> Key: HIVE-13285
> URL: https://issues.apache.org/jira/browse/HIVE-13285
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>Priority: Critical
> Attachments: HIVE-13285.1.patch
>
>
> ORC concatenation uses combine hive input format for merging files. Under 
> specific case where all files within a combine split are incompatible for 
> merge (old files without stripe statistics) then these files are added to 
> incompatible file set. But this file set is not processed as closeOp() will 
> not be called (no output file writer will exist which will skip 
> super.closeOp()). As a result, these incompatible files are not moved to 
> final path.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13286) Query ID is being reused across queries

2016-03-14 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194593#comment-15194593
 ] 

Aihua Xu commented on HIVE-13286:
-

I will take a look. That seems to be an issue and not my intention. 




> Query ID is being reused across queries
> ---
>
> Key: HIVE-13286
> URL: https://issues.apache.org/jira/browse/HIVE-13286
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Critical
>
> [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is 
> being reused across queries. This defeats the purpose of a query id. I am not 
> sure what the purpose of the change in that jira is but it breaks the 
> assumption about a query id being unique for each query. Please take a look 
> into this at the earliest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-11424) Rule to transform OR clauses into IN clauses in CBO

2016-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194516#comment-15194516
 ] 

Hive QA commented on HIVE-11424:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793292/HIVE-11424.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 38 failed/errored test(s), 9807 tests 
executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more
 - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_flatten_and_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fold_eq_with_case_when
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucketpruning1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query13
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query34
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query71
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query73
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query85
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query91
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_case
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7268/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7268/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7268/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 38 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793292 - PreCommit-HIVE-TRUNK-Build

> Rule to transform OR clauses into IN clauses in CBO
> ---
>
> Key: HIVE-11424
> URL: https://issues.apache.org/jira/browse/HIVE-11424
> Project: Hive
>  Issue Type: Bug
>Reporter: Jesus Camacho Rodriguez
>Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11424.01.patch, HIVE-11424.01.patch, 
> HIVE-11424.03.patch, HIVE-11424.03.patch, HIVE-11424.2.patch, HIVE-11424.patch
>
>
> We create a rule that will transform OR clauses into IN clauses 

[jira] [Commented] (HIVE-13286) Query ID is being reused across queries

2016-03-14 Thread Vikram Dixit K (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194509#comment-15194509
 ] 

Vikram Dixit K commented on HIVE-13286:
---

I think it primarily comes down to this: the hive conf object once modified 
with a generated query id, never resets the query id for a subsequent query.

{code}
+String queryId = confOverlay.get(HiveConf.ConfVars.HIVEQUERYID.varname);
+if (queryId == null || queryId.isEmpty()) {
+  queryId = QueryPlan.makeQueryId();
+  confOverlay.put(HiveConf.ConfVars.HIVEQUERYID.varname, queryId);
+  sessionState.getConf().setVar(HiveConf.ConfVars.HIVEQUERYID, queryId);
+}
{code}

Once the query id has been set by a previous query, it never changes. This is 
incorrect behavior. I am not sure about what the change was trying to do but 
this needs to get fixed. Thanks!

> Query ID is being reused across queries
> ---
>
> Key: HIVE-13286
> URL: https://issues.apache.org/jira/browse/HIVE-13286
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Critical
>
> [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is 
> being reused across queries. This defeats the purpose of a query id. I am not 
> sure what the purpose of the change in that jira is but it breaks the 
> assumption about a query id being unique for each query. Please take a look 
> into this at the earliest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13278) Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark

2016-03-14 Thread Xin Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xin Hao updated HIVE-13278:
---
Description: 
Many redundant 'File not found' messages appeared in container log during query 
execution with Hive on Spark.
Certainly, it doesn't prevent the query from running successfully. So mark it 
as Minor currently.

Error message example:
16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: 
/tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

  was:
Many redundant 'File not found' messages appeared in container log during query 
execution with Hive on Spark

Error message example:
16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: 
/tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
at 
org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
at 
org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)


> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark
> 
>
> Key: HIVE-13278
> 

[jira] [Updated] (HIVE-13286) Query ID is being reused across queries

2016-03-14 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-13286:
--
Description: [~aihuaxu] I see this commit made via HIVE-11488. I see that 
query id is being reused across queries. This defeats the purpose of a query 
id. I am not sure what the purpose of the change in that jira is but it breaks 
the assumption about a query id being unique for each query. Please take a look 
into this at the earliest.  (was: [~aihuaxu] I see this commit made via 
HIVE-11488. I see that query id is being reused across queries. This defeats 
the purpose of a query id. I am not sure what the purpose of the change in that 
jira is but it breaks the assumption about a query id being unique for each 
query. Please take a look into this.)

> Query ID is being reused across queries
> ---
>
> Key: HIVE-13286
> URL: https://issues.apache.org/jira/browse/HIVE-13286
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Critical
>
> [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is 
> being reused across queries. This defeats the purpose of a query id. I am not 
> sure what the purpose of the change in that jira is but it breaks the 
> assumption about a query id being unique for each query. Please take a look 
> into this at the earliest.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13278) Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark

2016-03-14 Thread Xin Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194503#comment-15194503
 ] 

Xin Hao commented on HIVE-13278:


Yes, this problem doesn't prevent the query from running successfully.

> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark
> 
>
> Key: HIVE-13278
> URL: https://issues.apache.org/jira/browse/HIVE-13278
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Found based on :
> Apache Hive 2.0.0
> Apache Spark 1.6.0
>Reporter: Xin Hao
>Priority: Minor
>
> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark
> Error message example:
> 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: 
> /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13286) Query ID is being reused across queries

2016-03-14 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-13286:
--
Assignee: Aihua Xu  (was: Pengcheng Xiong)

> Query ID is being reused across queries
> ---
>
> Key: HIVE-13286
> URL: https://issues.apache.org/jira/browse/HIVE-13286
> Project: Hive
>  Issue Type: Bug
>  Components: Parser
>Affects Versions: 2.0.0
>Reporter: Vikram Dixit K
>Assignee: Aihua Xu
>Priority: Critical
>
> [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is 
> being reused across queries. This defeats the purpose of a query id. I am not 
> sure what the purpose of the change in that jira is but it breaks the 
> assumption about a query id being unique for each query. Please take a look 
> into this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13278) Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark

2016-03-14 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194499#comment-15194499
 ] 

Xuefu Zhang commented on HIVE-13278:


[~xhao1], to clarify, this problem doesn't prevent the query from running 
successfully, right? Thanks.

> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark
> 
>
> Key: HIVE-13278
> URL: https://issues.apache.org/jira/browse/HIVE-13278
> Project: Hive
>  Issue Type: Bug
> Environment: Hive on Spark engine
> Found based on :
> Apache Hive 2.0.0
> Apache Spark 1.6.0
>Reporter: Xin Hao
>Priority: Minor
>
> Many redundant 'File not found' messages appeared in container log during 
> query execution with Hive on Spark
> Error message example:
> 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: 
> /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-03-14 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194464#comment-15194464
 ] 

Wei Zheng commented on HIVE-13249:
--

OK, one more concern, is it a good idea to have performTimeOuts() and 
countOpenTxns() share the same check intervals? We may want to do it more 
frequent for the latter one.

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194446#comment-15194446
 ] 

Prasanth Jayachandran edited comment on HIVE-13226 at 3/15/16 12:22 AM:


Test failures are unrelated. [~gopalv] Can you please review the latest patch? 


was (Author: prasanth_j):
Test failures are unrealted. [~gopalv] Can you please review the latest patch? 

> Improve tez print summary to print query execution breakdown
> 
>
> Key: HIVE-13226
> URL: https://issues.apache.org/jira/browse/HIVE-13226
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13226.1.patch, HIVE-13226.2.patch, 
> HIVE-13226.3.patch, sampleoutput.png
>
>
> When tez print summary is enabled, methods summary is printed which are 
> difficult to correlate with the actual execution time. We can improve that to 
> print  the execution times in the sequence of operations that happens behind 
> the scenes.
> Instead of printing the methods name it will be useful to print something 
> like below
> 1) Query Compilation time
> 2) Query Submit to DAG Submit time
> 3) DAG Submit to DAG Accept time
> 4) DAG Accept to DAG Start time
> 5) DAG Start to DAG End time
> With this it will be easier to find out where the actual time is spent. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194446#comment-15194446
 ] 

Prasanth Jayachandran commented on HIVE-13226:
--

Test failures are unrealted. [~gopalv] Can you please review the latest patch? 

> Improve tez print summary to print query execution breakdown
> 
>
> Key: HIVE-13226
> URL: https://issues.apache.org/jira/browse/HIVE-13226
> Project: Hive
>  Issue Type: Improvement
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
> Attachments: HIVE-13226.1.patch, HIVE-13226.2.patch, 
> HIVE-13226.3.patch, sampleoutput.png
>
>
> When tez print summary is enabled, methods summary is printed which are 
> difficult to correlate with the actual execution time. We can improve that to 
> print  the execution times in the sequence of operations that happens behind 
> the scenes.
> Instead of printing the methods name it will be useful to print something 
> like below
> 1) Query Compilation time
> 2) Query Submit to DAG Submit time
> 3) DAG Submit to DAG Accept time
> 4) DAG Accept to DAG Start time
> 5) DAG Start to DAG End time
> With this it will be easier to find out where the actual time is spent. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR

2016-03-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13084:

Attachment: HIVE-13084.03.patch

> Vectorization add support for PROJECTION Multi-AND/OR
> -
>
> Key: HIVE-13084
> URL: https://issues.apache.org/jira/browse/HIVE-13084
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Rajesh Balamohan
>Assignee: Matt McCline
> Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, 
> HIVE-13084.03.patch, vector_between_date.q
>
>
> When there is case statement in group by, hive throws unable to vectorize 
> exception.
> e.g query just to demonstrate the problem
> {noformat}
> explain select l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts 
> group by l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END;
> org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize 
> expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2
>   File Output Operator [FS_7]
> Group By Operator [GBY_5] (rows=888777234 width=108)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE]
>   SHUFFLE [RS_4]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_3] (rows=1777554469 width=108)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_1] (rows=1777554469 width=108)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=1777554469 width=108)
>   
> rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"]
> {noformat}
> \cc [~mmccline], [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR

2016-03-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13084:

Attachment: (was: HIVE-13084.03.patch)

> Vectorization add support for PROJECTION Multi-AND/OR
> -
>
> Key: HIVE-13084
> URL: https://issues.apache.org/jira/browse/HIVE-13084
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Rajesh Balamohan
>Assignee: Matt McCline
> Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, 
> HIVE-13084.03.patch, vector_between_date.q
>
>
> When there is case statement in group by, hive throws unable to vectorize 
> exception.
> e.g query just to demonstrate the problem
> {noformat}
> explain select l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts 
> group by l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END;
> org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize 
> expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2
>   File Output Operator [FS_7]
> Group By Operator [GBY_5] (rows=888777234 width=108)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE]
>   SHUFFLE [RS_4]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_3] (rows=1777554469 width=108)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_1] (rows=1777554469 width=108)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=1777554469 width=108)
>   
> rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"]
> {noformat}
> \cc [~mmccline], [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR

2016-03-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13084:

Attachment: HIVE-13084.03.patch

> Vectorization add support for PROJECTION Multi-AND/OR
> -
>
> Key: HIVE-13084
> URL: https://issues.apache.org/jira/browse/HIVE-13084
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Rajesh Balamohan
>Assignee: Matt McCline
> Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, 
> HIVE-13084.03.patch, vector_between_date.q
>
>
> When there is case statement in group by, hive throws unable to vectorize 
> exception.
> e.g query just to demonstrate the problem
> {noformat}
> explain select l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts 
> group by l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END;
> org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize 
> expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2
>   File Output Operator [FS_7]
> Group By Operator [GBY_5] (rows=888777234 width=108)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE]
>   SHUFFLE [RS_4]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_3] (rows=1777554469 width=108)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_1] (rows=1777554469 width=108)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=1777554469 width=108)
>   
> rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"]
> {noformat}
> \cc [~mmccline], [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR

2016-03-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13084:

Attachment: (was: HIVE-13084.03.patch)

> Vectorization add support for PROJECTION Multi-AND/OR
> -
>
> Key: HIVE-13084
> URL: https://issues.apache.org/jira/browse/HIVE-13084
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Rajesh Balamohan
>Assignee: Matt McCline
> Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, 
> vector_between_date.q
>
>
> When there is case statement in group by, hive throws unable to vectorize 
> exception.
> e.g query just to demonstrate the problem
> {noformat}
> explain select l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts 
> group by l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END;
> org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize 
> expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2
>   File Output Operator [FS_7]
> Group By Operator [GBY_5] (rows=888777234 width=108)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE]
>   SHUFFLE [RS_4]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_3] (rows=1777554469 width=108)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_1] (rows=1777554469 width=108)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=1777554469 width=108)
>   
> rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"]
> {noformat}
> \cc [~mmccline], [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13185) orc.ReaderImp.ensureOrcFooter() method fails on small text files with IndexOutOfBoundsException

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13185:

   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

Committed to master.

> orc.ReaderImp.ensureOrcFooter() method fails on small text files with 
> IndexOutOfBoundsException
> ---
>
> Key: HIVE-13185
> URL: https://issues.apache.org/jira/browse/HIVE-13185
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Affects Versions: 2.1.0
>Reporter: Illya Yalovyy
>Assignee: Illya Yalovyy
> Fix For: 2.1.0
>
> Attachments: HIVE-13185.1.patch
>
>
> Steps to reproduce:
> 1. Create a Text source table with one line of data:
> {code}
> create table src (id int);
> insert overwrite table src values (1);
> {code}
> 2. Create a target table:
> {code}
> create table trg (id int);
> {code}
> 3. Try to load small text file to the target table:
> {code}
> load data inpath 'user/hive/warehouse/src/00_0' into table trg;
> {code}
> *Error message:*
> {quote}
> FAILED: SemanticException Unable to load data to destination table. Error: 
> java.lang.IndexOutOfBoundsException
> {quote}
> *Stack trace:*
> {noformat}
> org.apache.hadoop.hive.ql.parse.SemanticException: Unable to load data to 
> destination table. Error: java.lang.IndexOutOfBoundsException
>   at 
> org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.ensureFileFormatsMatch(LoadSemanticAnalyzer.java:340)
>   at 
> org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:224)
>   at 
> org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:242)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:481)
>   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>   at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1190)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1285)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1104)
> ...
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-13284) Make ORC Reader resilient to 0 length files

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-13284.
-
Resolution: Duplicate

Actually I just realized it's a dup of HIVE-13185

> Make ORC Reader resilient to 0 length files
> ---
>
> Key: HIVE-13284
> URL: https://issues.apache.org/jira/browse/HIVE-13284
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> HIVE-13040 creates 0 length ORC files. Reading such files will throw 
> following exception. ORC is resilient to corrupt footers but not 0 length 
> files.
> {code}
> Processing data file file:/app/warehouse/concat_incompat/00_0 [length: 0]
> Exception in thread "main" java.lang.IndexOutOfBoundsException
>   at java.nio.Buffer.checkIndex(Buffer.java:540)
>   at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
>   at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:510)
>   at 
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:361)
>   at 
> org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:83)
>   at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.getReader(FileDump.java:239)
>   at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:312)
>   at 
> org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:291)
>   at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:138)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9457) Fix obsolete parameter name in HiveConf description of hive.hashtable.initialCapacity

2016-03-14 Thread Shannon Ladymon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shannon Ladymon updated HIVE-9457:
--
Attachment: HIVE-9457.2.patch

Rebased patch attached

> Fix obsolete parameter name in HiveConf description of 
> hive.hashtable.initialCapacity
> -
>
> Key: HIVE-9457
> URL: https://issues.apache.org/jira/browse/HIVE-9457
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.14.0
>Reporter: Lefty Leverenz
>Assignee: Shannon Ladymon
>Priority: Minor
> Attachments: HIVE-9457.2.patch, HIVE-9457.patch
>
>
> The description of *hive.hashtable.initialCapacity* in HiveConf.java refers 
> to a parameter that existed in an early patch for HIVE-7616 
> ("hive.hashtable.stats.key.estimate.adjustment") but was renamed in later 
> patches.  So change *hive.hashtable.stats.key.estimate.adjustment* to 
> *hive.hashtable.key.count.adjustment* in this parameter definition in 
> HiveConf.java:
> {code}
> HIVEHASHTABLETHRESHOLD("hive.hashtable.initialCapacity", 10, "Initial 
> capacity of " +
> "mapjoin hashtable if statistics are absent, or if 
> hive.hashtable.stats.key.estimate.adjustment is set to 0"),
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13281) Update some default configs for LLAP

2016-03-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194398#comment-15194398
 ] 

Siddharth Seth commented on HIVE-13281:
---

uber has not been tested enough for it to be default - in fact I think it is 
broken rightnow. If running with containers, container would be faster.

> Update some default configs for LLAP
> 
>
> Key: HIVE-13281
> URL: https://issues.apache.org/jira/browse/HIVE-13281
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>
> Disable uber mode.
> Enable llap.io by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13281) Update some default configs for LLAP

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194378#comment-15194378
 ] 

Sergey Shelukhin commented on HIVE-13281:
-

Hmm. Why? Isn't uber still faster than container, although slower than LLAP? 
Right now, uber is only disabled in "all" mode.

> Update some default configs for LLAP
> 
>
> Key: HIVE-13281
> URL: https://issues.apache.org/jira/browse/HIVE-13281
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>
> Disable uber mode.
> Enable llap.io by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-13241) LLAP: Incremental Caching marks some small chunks as "incomplete CB"

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-13241:
---

Assignee: Sergey Shelukhin

> LLAP: Incremental Caching marks some small chunks as "incomplete CB"
> 
>
> Key: HIVE-13241
> URL: https://issues.apache.org/jira/browse/HIVE-13241
> Project: Hive
>  Issue Type: Bug
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
>
> Run #3 of a query with 1 node still has cache misses.
> {code}
> LLAP IO Summary
> --
>   VERTICES ROWGROUPS  META_HIT  META_MISS  DATA_HIT  DATA_MISS  ALLOCATION
>  USED  TOTAL_IO
> --
>  Map 111  1116  01.65GB93.61MB  0B
>0B32.72s
> --
> {code}
> {code}
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(695)) - Locking 
> 0x1c44401d(1) due to reuse
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(701)) - Adding an 
> already-uncompressed buffer 0x1c44401d(2)
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(695)) - Locking 
> 0x4e51b032(1) due to reuse
> 2016-03-08T21:05:39,417 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:prepareRangesForCompressedRead(701)) - Adding an 
> already-uncompressed buffer 0x4e51b032(2)
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:addOneCompressionBuffer(1161)) - Found CB at 1373931, 
> chunk length 86587, total 86590, compressed
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:addIncompleteCompressionBuffer(1241)) - Replacing 
> data range [1373931, 1408408), size: 34474(!) type: direct (and 0 previous 
> chunks) with incomplete CB start: 1373931 end: 1408408 in the buffers
> 2016-03-08T21:05:39,418 INFO  
> [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.EncodedReaderImpl 
> (EncodedReaderImpl.java:createRgColumnStreamData(441)) - Getting data for 
> column 7 RG 14 stream DATA at 1460521, 319811 index position 0: compressed 
> [1626961, 1780332)
> {code}
> {code}
> 2016-03-08T21:05:38,925 INFO  
> [IO-Elevator-Thread-7[attempt_1455662455106_2688_3_00_01_0]]: 
> encoded.OrcEncodedDataReader (OrcEncodedDataReader.java:readFileData(878)) - 
> Disk ranges after disk read (file 5372745, base offset 3): [{start: 18986 
> end: 20660 cache buffer: 0x660faf7c(1)}, {start: 20660 end: 35775 cache 
> buffer: 0x1dcb1d97(1)}, {start: 318852 end: 422353 cache buffer: 
> 0x6c7f9a05(1)}, {start: 1148616 end: 1262468 cache buffer: 0x196e1d41(1)}, 
> {start: 1262468 end: 1376342 cache buffer: 0x201255f(1)}, {data range 
> [1376342, 1410766), size: 34424 type: direct}, {start: 1631359 end: 1714694 
> cache buffer: 0x47e3a72d(1)}, {start: 1714694 end: 1785770 cache buffer: 
> 0x57dca266(1)}, {start: 4975035 end: 5095215 cache buffer: 0x3e3139c9(1)}, 
> {start: 5095215 end: 5197863 cache buffer: 0x3511c88d(1)}, {start: 7448387 
> end: 7572268 cache buffer: 0x6f11dbcd(1)}, {start: 7572268 end: 7696182 cache 
> buffer: 0x5d6c9bdb(1)}, {data range [7696182, 7710537), size: 14355 type: 
> direct}, {start: 8235756 end: 8345367 cache buffer: 0x6a241ece(1)}, {start: 
> 8345367 end: 8455009 cache buffer: 0x51caf6a7(1)}, {data range [8455009, 
> 8497906), size: 42897 type: direct}, {start: 9035815 end: 9159708 cache 
> buffer: 0x306480e0(1)}, {start: 9159708 end: 9283629 cache buffer: 
> 0x9ef7774(1)}, {data range [9283629, 9297965), size: 14336 type: direct}, 
> {start: 9989884 end: 10113731 cache buffer: 0x43f7cae9(1)}, {start: 10113731 
> end: 10237589 cache buffer: 0x458e63fe(1)}, {data range [10237589, 10252034), 
> size: 14445 type: direct}, {start: 11897896 end: 12021787 cache buffer: 
> 0x51f9982f(1)}, {start: 12021787 end: 12145656 cache buffer: 0x23df01b3(1)}, 
> {data range [12145656, 12160046), size: 14390 type: direct}, {start: 12851928 
> end: 12975795 cache buffer: 0x5e0237a3(1)}, {start: 12975795 end: 13099664 
> 

[jira] [Resolved] (HIVE-13281) Update some default configs for LLAP

2016-03-14 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth resolved HIVE-13281.
---
Resolution: Duplicate

> Update some default configs for LLAP
> 
>
> Key: HIVE-13281
> URL: https://issues.apache.org/jira/browse/HIVE-13281
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>
> Disable uber mode.
> Enable llap.io by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13281) Update some default configs for LLAP

2016-03-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194374#comment-15194374
 ] 

Siddharth Seth commented on HIVE-13281:
---

The config property needs to be set to false for uber. Could you please do that 
in 12283 - I'll close this.

> Update some default configs for LLAP
> 
>
> Key: HIVE-13281
> URL: https://issues.apache.org/jira/browse/HIVE-13281
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>
> Disable uber mode.
> Enable llap.io by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-4144) Add "select database()" command to show the current database

2016-03-14 Thread Shannon Ladymon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shannon Ladymon updated HIVE-4144:
--
Labels:   (was: TODOC13)

> Add "select database()" command to show the current database
> 
>
> Key: HIVE-4144
> URL: https://issues.apache.org/jira/browse/HIVE-4144
> Project: Hive
>  Issue Type: Bug
>  Components: SQL
>Reporter: Mark Grover
>Assignee: Navis
> Fix For: 0.13.0
>
> Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, 
> HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, 
> HIVE-4144.14.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, 
> HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, 
> HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch
>
>
> A recent hive-user mailing list conversation asked about having a command to 
> show the current database.
> http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E
> MySQL seems to have a command to do so:
> {code}
> select database();
> {code}
> http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database
> We should look into having something similar in Hive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13281) Update some default configs for LLAP

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194366#comment-15194366
 ] 

Sergey Shelukhin commented on HIVE-13281:
-

That is already done in HIVE-13283 and HIVE-13218

> Update some default configs for LLAP
> 
>
> Key: HIVE-13281
> URL: https://issues.apache.org/jira/browse/HIVE-13281
> Project: Hive
>  Issue Type: Task
>Reporter: Siddharth Seth
>
> Disable uber mode.
> Enable llap.io by default



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194364#comment-15194364
 ] 

Sergey Shelukhin commented on HIVE-13283:
-

Enabling io by default has to be done carefully. It would enable IO outside of 
the daemon.

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13283.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194364#comment-15194364
 ] 

Sergey Shelukhin edited comment on HIVE-13283 at 3/14/16 11:17 PM:
---

Enabling io by default has to be done carefully. It would enable IO outside of 
the daemon. Hence this JIRA


was (Author: sershe):
Enabling io by default has to be done carefully. It would enable IO outside of 
the daemon.

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13283.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-9660:
---
Attachment: HIVE-9660.WIP1.patch

> store end offset of compressed data for RG in RowIndex in ORC
> -
>
> Key: HIVE-9660
> URL: https://issues.apache.org/jira/browse/HIVE-9660
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-9660.WIP0.patch, HIVE-9660.WIP1.patch
>
>
> Right now the end offset is estimated, which in some cases results in tons of 
> extra data being read.
> We can add a separate array to RowIndex (positions_v2?) that stores number of 
> compressed buffers for each RG, or end offset, or something, to remove this 
> estimation magic



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194281#comment-15194281
 ] 

Sergey Shelukhin edited comment on HIVE-13283 at 3/14/16 11:15 PM:
---

[~hagleitn] [~vikram.dixit] -can you take a look- -actually, nm, this will not 
do what is needed- ok, now it's ready


was (Author: sershe):
[~hagleitn] [~vikram.dixit] -can you take a look- actually, nm, this will not 
do what is needed

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13283.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-03-14 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194362#comment-15194362
 ] 

Alan Gates commented on HIVE-13249:
---

I think the problem with having the thread in TxnHandler instead of 
AcidHousekeeper is that every client will be independently deciding whether the 
system has too many transactions.  This has a couple of problems.  One, it's 
inefficient as every client is running the count query.  But two, it means 
differing configurations could result in some clients seeing the system as 
overloaded and being locked out while others are not.  In fact, a malicious 
client could game the system and set his config high so that he can continue to 
open transactions when other clients cannot.

On the 90% what I'm suggesting is this:
# In the initial state it accepts new transactions until it hits X number total 
transactions open, where X is the configured value
# When it hits X, a full flag is set
# Once the full flag is set no new transactions are allowed in
# The full flag is not unset until the number of open transactions hits X * 0.9.

This is standard procedure with thresholds so that you give the system some 
time to drain off rather than building up a set of clients retrying on opening 
their transactions that all race to get that one available transaction once one 
transaction commits or aborts.

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194361#comment-15194361
 ] 

Siddharth Seth commented on HIVE-13283:
---

HIVE-13281

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13283.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13283:

Attachment: HIVE-13283.patch

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13283.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-03-14 Thread Wei Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194349#comment-15194349
 ] 

Wei Zheng commented on HIVE-13249:
--

Thanks [~alangates] for review. I had discussion with [~ekoifman] last week and 
thought it would be better to have a separate threadpool for that. We can 
revisit this and see if AcidHouseKeeperService is a good reuse.

For the open transactions limit, we just want to fail the incoming open 
transaction requests if the number from background counter is above threshold. 
So even if e.g. we let it in when under 90%, it will still lurch in and out 
around that 90% line.

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves

2016-03-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13261:
---
Status: Patch Available  (was: Open)

> Can not compute column stats for partition when schema evolves
> --
>
> Key: HIVE-13261
> URL: https://issues.apache.org/jira/browse/HIVE-13261
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13261.01.patch
>
>
> To repro
> {code}
> CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS 
> TEXTFILE;
> insert into table partitioned1 partition(part=1) values(1, 'original'),(2, 
> 'original'), (3, 'original'),(4, 'original');
> -- Table-Non-Cascade ADD COLUMNS ...
> alter table partitioned1 add columns(c int, d string);
> insert into table partitioned1 partition(part=2) values(1, 'new', 10, 
> 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, 
> 'forty');
> insert into table partitioned1 partition(part=1) values(5, 'new', 100, 
> 'hundred'),(6, 'new', 200, 'two hundred');
> analyze table partitioned1 compute statistics for columns;
> {code}
> Error msg:
> {code}
> 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - 
> NoSuchObjectException(message:Column c for which stats gathering is requested 
> doesn't exist.)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves

2016-03-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13261:
---
Attachment: (was: HIVE-13261.01.patch)

> Can not compute column stats for partition when schema evolves
> --
>
> Key: HIVE-13261
> URL: https://issues.apache.org/jira/browse/HIVE-13261
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13261.01.patch
>
>
> To repro
> {code}
> CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS 
> TEXTFILE;
> insert into table partitioned1 partition(part=1) values(1, 'original'),(2, 
> 'original'), (3, 'original'),(4, 'original');
> -- Table-Non-Cascade ADD COLUMNS ...
> alter table partitioned1 add columns(c int, d string);
> insert into table partitioned1 partition(part=2) values(1, 'new', 10, 
> 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, 
> 'forty');
> insert into table partitioned1 partition(part=1) values(5, 'new', 100, 
> 'hundred'),(6, 'new', 200, 'two hundred');
> analyze table partitioned1 compute statistics for columns;
> {code}
> Error msg:
> {code}
> 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - 
> NoSuchObjectException(message:Column c for which stats gathering is requested 
> doesn't exist.)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves

2016-03-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13261:
---
Attachment: HIVE-13261.01.patch

> Can not compute column stats for partition when schema evolves
> --
>
> Key: HIVE-13261
> URL: https://issues.apache.org/jira/browse/HIVE-13261
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13261.01.patch
>
>
> To repro
> {code}
> CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS 
> TEXTFILE;
> insert into table partitioned1 partition(part=1) values(1, 'original'),(2, 
> 'original'), (3, 'original'),(4, 'original');
> -- Table-Non-Cascade ADD COLUMNS ...
> alter table partitioned1 add columns(c int, d string);
> insert into table partitioned1 partition(part=2) values(1, 'new', 10, 
> 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, 
> 'forty');
> insert into table partitioned1 partition(part=1) values(5, 'new', 100, 
> 'hundred'),(6, 'new', 200, 'two hundred');
> analyze table partitioned1 compute statistics for columns;
> {code}
> Error msg:
> {code}
> 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - 
> NoSuchObjectException(message:Column c for which stats gathering is requested 
> doesn't exist.)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves

2016-03-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13261:
---
Attachment: (was: HIVE-13261.01.patch)

> Can not compute column stats for partition when schema evolves
> --
>
> Key: HIVE-13261
> URL: https://issues.apache.org/jira/browse/HIVE-13261
> Project: Hive
>  Issue Type: Bug
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13261.01.patch
>
>
> To repro
> {code}
> CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS 
> TEXTFILE;
> insert into table partitioned1 partition(part=1) values(1, 'original'),(2, 
> 'original'), (3, 'original'),(4, 'original');
> -- Table-Non-Cascade ADD COLUMNS ...
> alter table partitioned1 add columns(c int, d string);
> insert into table partitioned1 partition(part=2) values(1, 'new', 10, 
> 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, 
> 'forty');
> insert into table partitioned1 partition(part=1) values(5, 'new', 100, 
> 'hundred'),(6, 'new', 200, 'two hundred');
> analyze table partitioned1 compute statistics for columns;
> {code}
> Error msg:
> {code}
> 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: 
> metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - 
> NoSuchObjectException(message:Column c for which stats gathering is requested 
> doesn't exist.)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492)
> at 
> org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13223) HoS may hang for queries that run on 0 splits

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194334#comment-15194334
 ] 

Sergey Shelukhin commented on HIVE-13223:
-

You should test if it did and file a separate JIRA assigned to [~ashutoshc] if 
so ;)

> HoS  may hang for queries that run on 0 splits 
> ---
>
> Key: HIVE-13223
> URL: https://issues.apache.org/jira/browse/HIVE-13223
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13223.1.patch, HIVE-13223.patch
>
>
> Can be seen on all timed out tests after HIVE-13040 went in



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13125) Support masking and filtering of rows/columns

2016-03-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13125:
---
Status: Open  (was: Patch Available)

> Support masking and filtering of rows/columns
> -
>
> Key: HIVE-13125
> URL: https://issues.apache.org/jira/browse/HIVE-13125
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13125.01.patch, HIVE-13125.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13125) Support masking and filtering of rows/columns

2016-03-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13125:
---
Status: Patch Available  (was: Open)

> Support masking and filtering of rows/columns
> -
>
> Key: HIVE-13125
> URL: https://issues.apache.org/jira/browse/HIVE-13125
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13125.01.patch, HIVE-13125.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13125) Support masking and filtering of rows/columns

2016-03-14 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-13125:
---
Attachment: HIVE-13125.02.patch

> Support masking and filtering of rows/columns
> -
>
> Key: HIVE-13125
> URL: https://issues.apache.org/jira/browse/HIVE-13125
> Project: Hive
>  Issue Type: New Feature
>Reporter: Pengcheng Xiong
>Assignee: Pengcheng Xiong
> Attachments: HIVE-13125.01.patch, HIVE-13125.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13223) HoS may hang for queries that run on 0 splits

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194320#comment-15194320
 ] 

Prasanth Jayachandran commented on HIVE-13223:
--

AFAIK, filedump worked before this empty file bucket changes (HIVE-13040).  

> HoS  may hang for queries that run on 0 splits 
> ---
>
> Key: HIVE-13223
> URL: https://issues.apache.org/jira/browse/HIVE-13223
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13223.1.patch, HIVE-13223.patch
>
>
> Can be seen on all timed out tests after HIVE-13040 went in



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13223) HoS may hang for queries that run on 0 splits

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194318#comment-15194318
 ] 

Sergey Shelukhin commented on HIVE-13223:
-

+1 pending tests. [~prasanth_j] wrong JIRA? :)

> HoS  may hang for queries that run on 0 splits 
> ---
>
> Key: HIVE-13223
> URL: https://issues.apache.org/jira/browse/HIVE-13223
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13223.1.patch, HIVE-13223.patch
>
>
> Can be seen on all timed out tests after HIVE-13040 went in



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13223) HoS may hang for queries that run on 0 splits

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194305#comment-15194305
 ] 

Prasanth Jayachandran commented on HIVE-13223:
--

I created an empty ORC file and ran orcfiledump on it. It threw exception

{code}
create table concat_incompat(key string, value string) stored as orc;
insert overwrite table concat_incompat select * from src where key > 1000; // 
return 0 rows

hive --orcfiledump file:///app/warehouse/concat_incompat/00_0
Processing data file file:/app/warehouse/concat_incompat/00_0 [length: 0]
Exception in thread "main" java.lang.IndexOutOfBoundsException
at java.nio.Buffer.checkIndex(Buffer.java:540)
at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
at 
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:510)
at 
org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:361)
at 
org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:83)
at 
org.apache.hadoop.hive.ql.io.orc.FileDump.getReader(FileDump.java:239)
at 
org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:312)
at 
org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:291)
at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:138)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}

> HoS  may hang for queries that run on 0 splits 
> ---
>
> Key: HIVE-13223
> URL: https://issues.apache.org/jira/browse/HIVE-13223
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.0
>Reporter: Ashutosh Chauhan
>Assignee: Ashutosh Chauhan
> Attachments: HIVE-13223.1.patch, HIVE-13223.patch
>
>
> Can be seen on all timed out tests after HIVE-13040 went in



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194281#comment-15194281
 ] 

Sergey Shelukhin edited comment on HIVE-13283 at 3/14/16 10:20 PM:
---

[~hagleitn] [~vikram.dixit] -can you take a look- actually, nm, this will not 
do what is needed


was (Author: sershe):
[~hagleitn] [~vikram.dixit] can you take a look?

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13283:

Attachment: (was: HIVE-13283.patch)

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13283:

Attachment: HIVE-13283.patch

[~hagleitn] [~vikram.dixit] can you take a look?

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13283.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13283:

Status: Patch Available  (was: Open)

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-13283.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10176) skip.header.line.count causes values to be skipped when performing insert values

2016-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194279#comment-15194279
 ] 

Hive QA commented on HIVE-10176:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793284/HIVE-10176.3.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7267/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7267/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7267/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7267/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
>From https://github.com/apache/hive
   214e4b6..1c44f4c  branch-1   -> origin/branch-1
   d4c1fdc..b6af012  master -> origin/master
+ git reset --hard HEAD
HEAD is now at d4c1fdc HIVE-13251: hive can't read the decimal in AVRO file 
generated from previous version (Reviewed by Szehon Ho)
+ git clean -f -d
+ git checkout master
Already on 'master'
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
+ git reset --hard origin/master
HEAD is now at b6af012 HIVE-13201 : Compaction shouldn't be allowed on non-ACID 
table (Wei Zheng, reviewed by Alan Gates)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793284 - PreCommit-HIVE-TRUNK-Build

> skip.header.line.count causes values to be skipped when performing insert 
> values
> 
>
> Key: HIVE-10176
> URL: https://issues.apache.org/jira/browse/HIVE-10176
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Wenbo Wang
>Assignee: Vladyslav Pavlenko
> Attachments: HIVE-10176.1.patch, HIVE-10176.2.patch, 
> HIVE-10176.3.patch, data
>
>
> When inserting values in to tables with TBLPROPERTIES 
> ("skip.header.line.count"="1") the first value listed is also skipped. 
> create table test (row int, name string) TBLPROPERTIES 
> ("skip.header.line.count"="1"); 
> load data local inpath '/root/data' into table test;
> insert into table test values (1, 'a'), (2, 'b'), (3, 'c');
> (1, 'a') isn't inserted into the table. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9457) Fix obsolete parameter name in HiveConf description of hive.hashtable.initialCapacity

2016-03-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194277#comment-15194277
 ] 

Sergey Shelukhin commented on HIVE-9457:


+1

> Fix obsolete parameter name in HiveConf description of 
> hive.hashtable.initialCapacity
> -
>
> Key: HIVE-9457
> URL: https://issues.apache.org/jira/browse/HIVE-9457
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.14.0
>Reporter: Lefty Leverenz
>Assignee: Shannon Ladymon
>Priority: Minor
> Attachments: HIVE-9457.patch
>
>
> The description of *hive.hashtable.initialCapacity* in HiveConf.java refers 
> to a parameter that existed in an early patch for HIVE-7616 
> ("hive.hashtable.stats.key.estimate.adjustment") but was renamed in later 
> patches.  So change *hive.hashtable.stats.key.estimate.adjustment* to 
> *hive.hashtable.key.count.adjustment* in this parameter definition in 
> HiveConf.java:
> {code}
> HIVEHASHTABLETHRESHOLD("hive.hashtable.initialCapacity", 10, "Initial 
> capacity of " +
> "mapjoin hashtable if statistics are absent, or if 
> hive.hashtable.stats.key.estimate.adjustment is set to 0"),
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13183) More logs in operation logs

2016-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194272#comment-15194272
 ] 

Hive QA commented on HIVE-13183:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793283/HIVE-13183.02.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9820 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7266/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7266/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7266/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793283 - PreCommit-HIVE-TRUNK-Build

> More logs in operation logs
> ---
>
> Key: HIVE-13183
> URL: https://issues.apache.org/jira/browse/HIVE-13183
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajat Khandelwal
>Assignee: Rajat Khandelwal
> Attachments: HIVE-13183.02.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-9457) Fix obsolete parameter name in HiveConf description of hive.hashtable.initialCapacity

2016-03-14 Thread Shannon Ladymon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194268#comment-15194268
 ] 

Shannon Ladymon commented on HIVE-9457:
---

[~sershe], if you get a chance could you review this?  It's a small patch 
updating the description of a parameter in HiveConf.

> Fix obsolete parameter name in HiveConf description of 
> hive.hashtable.initialCapacity
> -
>
> Key: HIVE-9457
> URL: https://issues.apache.org/jira/browse/HIVE-9457
> Project: Hive
>  Issue Type: Bug
>  Components: Documentation
>Affects Versions: 0.14.0
>Reporter: Lefty Leverenz
>Assignee: Shannon Ladymon
>Priority: Minor
> Attachments: HIVE-9457.patch
>
>
> The description of *hive.hashtable.initialCapacity* in HiveConf.java refers 
> to a parameter that existed in an early patch for HIVE-7616 
> ("hive.hashtable.stats.key.estimate.adjustment") but was renamed in later 
> patches.  So change *hive.hashtable.stats.key.estimate.adjustment* to 
> *hive.hashtable.key.count.adjustment* in this parameter definition in 
> HiveConf.java:
> {code}
> HIVEHASHTABLETHRESHOLD("hive.hashtable.initialCapacity", 10, "Initial 
> capacity of " +
> "mapjoin hashtable if statistics are absent, or if 
> hive.hashtable.stats.key.estimate.adjustment is set to 0"),
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin reassigned HIVE-13283:
---

Assignee: Sergey Shelukhin

> LLAP: make sure IO elevator is enabled by default in the daemons
> 
>
> Key: HIVE-13283
> URL: https://issues.apache.org/jira/browse/HIVE-13283
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13235) Insert from select generates incorrect result when hive.optimize.constant.propagation is on

2016-03-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13235:

Status: Patch Available  (was: Open)

Attached patch-2: for the cases when the column has both name and alias, we 
will use NamedColumnInfo which will match against column name during comparison 
rather than alias since alias is not visible yet for such cases.

> Insert from select generates incorrect result when 
> hive.optimize.constant.propagation is on
> ---
>
> Key: HIVE-13235
> URL: https://issues.apache.org/jira/browse/HIVE-13235
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13235.1.patch, HIVE-13235.2.patch
>
>
> The following query returns incorrect result when constant optimization is 
> turned on. The subquery happens to have an alias p1 to be the same as the 
> input partition name. Constant optimizer will optimize it incorrectly as the 
> constant.
> When constant optimizer is turned off, we will get the correct result.
> {noformat}
> set hive.cbo.enable=false;
> set hive.optimize.constant.propagation = true;
> create table t1(c1 string, c2 double) partitioned by (p1 string, p2 string);
> create table t2(p1 double, c2 string);
> insert into table t1 partition(p1='40', p2='p2') values('c1', 0.0);
> INSERT OVERWRITE TABLE t2  select if((c2 = 0.0), c2, '0') as p1, 2 as p2 from 
> t1 where c1 = 'c1' and p1 = '40';
> select * from t2;
> 40   2
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12977) Pass credentials in the current UGI while creating Tez session

2016-03-14 Thread Vinoth Sathappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Sathappan updated HIVE-12977:

Attachment: HIVE-12977.1.patch

> Pass credentials in the current UGI while creating Tez session
> --
>
> Key: HIVE-12977
> URL: https://issues.apache.org/jira/browse/HIVE-12977
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vinoth Sathappan
>Assignee: Vinoth Sathappan
> Attachments: HIVE-12977.1.patch, HIVE-12977.1.patch
>
>
> The credentials present in the current UGI i.e. 
> UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the 
> Tez session. It is instantiated with null credentials 
> session = TezClient.create("HIVE-" + sessionId, tezConfig, true,
> commonLocalResources, null);
> In this case, Tez fails to access resources even if the tokens are available 
> in memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13201) Compaction shouldn't be allowed on non-ACID table

2016-03-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13201:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to master and branch-1

> Compaction shouldn't be allowed on non-ACID table
> -
>
> Key: HIVE-13201
> URL: https://issues.apache.org/jira/browse/HIVE-13201
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13201.1.patch, HIVE-13201.2.patch
>
>
> Looks like compaction is allowed on non-ACID table, although that's of no 
> sense and does nothing. Moreover the compaction request will be enqueued into 
> COMPACTION_QUEUE metastore table, which brings unnecessary overhead.
> We should prevent compaction commands being allowed on non-ACID tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13235) Insert from select generates incorrect result when hive.optimize.constant.propagation is on

2016-03-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13235:

Attachment: HIVE-13235.2.patch

> Insert from select generates incorrect result when 
> hive.optimize.constant.propagation is on
> ---
>
> Key: HIVE-13235
> URL: https://issues.apache.org/jira/browse/HIVE-13235
> Project: Hive
>  Issue Type: Bug
>  Components: Query Planning
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13235.1.patch, HIVE-13235.2.patch
>
>
> The following query returns incorrect result when constant optimization is 
> turned on. The subquery happens to have an alias p1 to be the same as the 
> input partition name. Constant optimizer will optimize it incorrectly as the 
> constant.
> When constant optimizer is turned off, we will get the correct result.
> {noformat}
> set hive.cbo.enable=false;
> set hive.optimize.constant.propagation = true;
> create table t1(c1 string, c2 double) partitioned by (p1 string, p2 string);
> create table t2(p1 double, c2 string);
> insert into table t1 partition(p1='40', p2='p2') values('c1', 0.0);
> INSERT OVERWRITE TABLE t2  select if((c2 = 0.0), c2, '0') as p1, 2 as p2 from 
> t1 where c1 = 'c1' and p1 = '40';
> select * from t2;
> 40   2
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13201) Compaction shouldn't be allowed on non-ACID table

2016-03-14 Thread Wei Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei Zheng updated HIVE-13201:
-
Attachment: HIVE-13201.2.patch

Thanks [~alangates]. I've updated those 3 tests to make the tables ACID.

> Compaction shouldn't be allowed on non-ACID table
> -
>
> Key: HIVE-13201
> URL: https://issues.apache.org/jira/browse/HIVE-13201
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13201.1.patch, HIVE-13201.2.patch
>
>
> Looks like compaction is allowed on non-ACID table, although that's of no 
> sense and does nothing. Moreover the compaction request will be enqueued into 
> COMPACTION_QUEUE metastore table, which brings unnecessary overhead.
> We should prevent compaction commands being allowed on non-ACID tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-11292) MiniLlapCliDriver for running tests in llap

2016-03-14 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K resolved HIVE-11292.
---
   Resolution: Fixed
Fix Version/s: 2.0.0

> MiniLlapCliDriver for running tests in llap
> ---
>
> Key: HIVE-11292
> URL: https://issues.apache.org/jira/browse/HIVE-11292
> Project: Hive
>  Issue Type: Bug
>  Components: Test
>Affects Versions: llap
>Reporter: Vikram Dixit K
>Assignee: Vikram Dixit K
> Fix For: 2.0.0
>
>
> Create MiniLlapCliDriver for running unit tests in llap mode.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR

2016-03-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13084:

Status: Patch Available  (was: In Progress)

> Vectorization add support for PROJECTION Multi-AND/OR
> -
>
> Key: HIVE-13084
> URL: https://issues.apache.org/jira/browse/HIVE-13084
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Rajesh Balamohan
>Assignee: Matt McCline
> Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, 
> HIVE-13084.03.patch, vector_between_date.q
>
>
> When there is case statement in group by, hive throws unable to vectorize 
> exception.
> e.g query just to demonstrate the problem
> {noformat}
> explain select l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts 
> group by l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END;
> org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize 
> expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2
>   File Output Operator [FS_7]
> Group By Operator [GBY_5] (rows=888777234 width=108)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE]
>   SHUFFLE [RS_4]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_3] (rows=1777554469 width=108)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_1] (rows=1777554469 width=108)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=1777554469 width=108)
>   
> rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"]
> {noformat}
> \cc [~mmccline], [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR

2016-03-14 Thread Matt McCline (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt McCline updated HIVE-13084:

Status: In Progress  (was: Patch Available)

> Vectorization add support for PROJECTION Multi-AND/OR
> -
>
> Key: HIVE-13084
> URL: https://issues.apache.org/jira/browse/HIVE-13084
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Rajesh Balamohan
>Assignee: Matt McCline
> Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, 
> vector_between_date.q
>
>
> When there is case statement in group by, hive throws unable to vectorize 
> exception.
> e.g query just to demonstrate the problem
> {noformat}
> explain select l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts 
> group by l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END;
> org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize 
> expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2
>   File Output Operator [FS_7]
> Group By Operator [GBY_5] (rows=888777234 width=108)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE]
>   SHUFFLE [RS_4]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_3] (rows=1777554469 width=108)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_1] (rows=1777554469 width=108)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=1777554469 width=108)
>   
> rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"]
> {noformat}
> \cc [~mmccline], [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-03-14 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194098#comment-15194098
 ] 

Alan Gates commented on HIVE-13249:
---

There's already a threadpool in AcidHouseKeeperService.  This should use that 
rather than having its own separate threadpool.

Once the number of open transactions exceeds the threshold you should require 
it to drain a ways below that (maybe 90% of the threshold) before allowing new 
transactions.  This avoid it constantly lurching in and out of trouble.



> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13249) Hard upper bound on number of open transactions

2016-03-14 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-13249:
--
Status: Open  (was: Patch Available)

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12977) Pass credentials in the current UGI while creating Tez session

2016-03-14 Thread Vinoth Sathappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Sathappan updated HIVE-12977:

Status: Patch Available  (was: Open)

> Pass credentials in the current UGI while creating Tez session
> --
>
> Key: HIVE-12977
> URL: https://issues.apache.org/jira/browse/HIVE-12977
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vinoth Sathappan
>Assignee: Vinoth Sathappan
> Attachments: HIVE-12977.1.patch
>
>
> The credentials present in the current UGI i.e. 
> UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the 
> Tez session. It is instantiated with null credentials 
> session = TezClient.create("HIVE-" + sessionId, tezConfig, true,
> commonLocalResources, null);
> In this case, Tez fails to access resources even if the tokens are available 
> in memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12977) Pass credentials in the current UGI while creating Tez session

2016-03-14 Thread Vinoth Sathappan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinoth Sathappan updated HIVE-12977:

Status: Open  (was: Patch Available)

> Pass credentials in the current UGI while creating Tez session
> --
>
> Key: HIVE-12977
> URL: https://issues.apache.org/jira/browse/HIVE-12977
> Project: Hive
>  Issue Type: Bug
>  Components: Tez
>Reporter: Vinoth Sathappan
>Assignee: Vinoth Sathappan
> Attachments: HIVE-12977.1.patch
>
>
> The credentials present in the current UGI i.e. 
> UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the 
> Tez session. It is instantiated with null credentials 
> session = TezClient.create("HIVE-" + sessionId, tezConfig, true,
> commonLocalResources, null);
> In this case, Tez fails to access resources even if the tokens are available 
> in memory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12481) Occasionally "Request is a replay" will be thrown from HS2

2016-03-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12481:

   Resolution: Fixed
Fix Version/s: 2.1.0
 Release Note: Added a new JDBC connection property "retries" so if any 
transport connection fails, JDBC client will retry for the times specified by 
this parameter. 
   Status: Resolved  (was: Patch Available)

> Occasionally "Request is a replay" will be thrown from HS2
> --
>
> Key: HIVE-12481
> URL: https://issues.apache.org/jira/browse/HIVE-12481
> Project: Hive
>  Issue Type: Improvement
>  Components: Authentication
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12481.2.patch, HIVE-12481.3.patch, HIVE-12481.patch
>
>
> We have seen the following exception thrown from HS2 in secured cluster when 
> many queries are running simultaneously on single HS2 instance.
> The cause I can guess is that it happens that two queries are submitted at 
> the same time and have the same timestamp. For such case, we can add a retry 
> for the query.
>  
> {noformat}
> 2015-11-18 16:12:33,117 ERROR org.apache.thrift.transport.TSaslTransport: 
> SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: Failure unspecified at GSS-API level (Mechanism level: Request 
> is a replay (34))]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:177)
> at 
> org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
> at 
> org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
> at 
> org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism 
> level: Request is a replay (34))
> at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788)
> at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
> at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:155)
> ... 14 more
> Caused by: KrbException: Request is a replay (34)
> at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:308)
> at sun.security.krb5.KrbApReq.(KrbApReq.java:144)
> at 
> sun.security.jgss.krb5.InitSecContextToken.(InitSecContextToken.java:108)
> at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:771)
> ... 17 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12481) Occasionally "Request is a replay" will be thrown from HS2

2016-03-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-12481:

Labels: TODOC2.1  (was: )

> Occasionally "Request is a replay" will be thrown from HS2
> --
>
> Key: HIVE-12481
> URL: https://issues.apache.org/jira/browse/HIVE-12481
> Project: Hive
>  Issue Type: Improvement
>  Components: Authentication
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
>  Labels: TODOC2.1
> Fix For: 2.1.0
>
> Attachments: HIVE-12481.2.patch, HIVE-12481.3.patch, HIVE-12481.patch
>
>
> We have seen the following exception thrown from HS2 in secured cluster when 
> many queries are running simultaneously on single HS2 instance.
> The cause I can guess is that it happens that two queries are submitted at 
> the same time and have the same timestamp. For such case, we can add a retry 
> for the query.
>  
> {noformat}
> 2015-11-18 16:12:33,117 ERROR org.apache.thrift.transport.TSaslTransport: 
> SASL negotiation failure
> javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: Failure unspecified at GSS-API level (Mechanism level: Request 
> is a replay (34))]
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:177)
> at 
> org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539)
> at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283)
> at 
> org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
> at 
> org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:356)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651)
> at 
> org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736)
> at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism 
> level: Request is a replay (34))
> at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788)
> at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342)
> at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285)
> at 
> com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:155)
> ... 14 more
> Caused by: KrbException: Request is a replay (34)
> at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:308)
> at sun.security.krb5.KrbApReq.(KrbApReq.java:144)
> at 
> sun.security.jgss.krb5.InitSecContextToken.(InitSecContextToken.java:108)
> at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:771)
> ... 17 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions

2016-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193937#comment-15193937
 ] 

Hive QA commented on HIVE-13249:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793243/HIVE-13249.1.patch

{color:red}ERROR:{color} -1 due to build exiting with an error

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7265/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7265/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7265/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Tests exited with: NonZeroExitCodeException
Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit 
status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]]
+ export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera
+ export 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ 
PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m '
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m '
+ export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost 
-Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7265/source-prep.txt
+ [[ false == \t\r\u\e ]]
+ mkdir -p maven ivy
+ [[ git = \s\v\n ]]
+ [[ git = \g\i\t ]]
+ [[ -z master ]]
+ [[ -d apache-github-source-source ]]
+ [[ ! -d apache-github-source-source/.git ]]
+ [[ ! -d apache-github-source-source ]]
+ cd apache-github-source-source
+ git fetch origin
+ git reset --hard HEAD
HEAD is now at d4c1fdc HIVE-13251: hive can't read the decimal in AVRO file 
generated from previous version (Reviewed by Szehon Ho)
+ git clean -f -d
Removing 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ColMultiAndCol.java
Removing 
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ColMultiOrCol.java
Removing ql/src/test/queries/clientpositive/vector_multi_and.q
Removing ql/src/test/queries/clientpositive/vector_multi_or.q
Removing ql/src/test/results/clientpositive/vector_multi_and.q.out
Removing ql/src/test/results/clientpositive/vector_multi_or.q.out
+ git checkout master
Already on 'master'
+ git reset --hard origin/master
HEAD is now at d4c1fdc HIVE-13251: hive can't read the decimal in AVRO file 
generated from previous version (Reviewed by Szehon Ho)
+ git merge --ff-only origin/master
Already up-to-date.
+ git gc
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh 
/data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0, p1, or p2
+ exit 1
'
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793243 - PreCommit-HIVE-TRUNK-Build

> Hard upper bound on number of open transactions
> ---
>
> Key: HIVE-13249
> URL: https://issues.apache.org/jira/browse/HIVE-13249
> Project: Hive
>  Issue Type: Improvement
>  Components: Transactions
>Affects Versions: 2.0.0
>Reporter: Wei Zheng
>Assignee: Wei Zheng
> Attachments: HIVE-13249.1.patch
>
>
> We need to have a safeguard by adding an upper bound for open transactions to 
> avoid huge number of open-transaction requests, usually due to improper 
> configuration of clients such as Storm.
> Once that limit is reached, clients will start failing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR

2016-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193929#comment-15193929
 ] 

Hive QA commented on HIVE-13084:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793246/HIVE-13084.02.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9807 tests executed
*Failed tests:*
{noformat}
TestMiniTezCliDriver-dynpart_sort_optimization2.q-cte_mat_1.q-tez_bmj_schema_evolution.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testVectorizeAndOrProjectionExpression
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7264/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7264/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7264/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793246 - PreCommit-HIVE-TRUNK-Build

> Vectorization add support for PROJECTION Multi-AND/OR
> -
>
> Key: HIVE-13084
> URL: https://issues.apache.org/jira/browse/HIVE-13084
> Project: Hive
>  Issue Type: Bug
>  Components: Vectorization
>Reporter: Rajesh Balamohan
>Assignee: Matt McCline
> Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, 
> vector_between_date.q
>
>
> When there is case statement in group by, hive throws unable to vectorize 
> exception.
> e.g query just to demonstrate the problem
> {noformat}
> explain select l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts 
> group by l_partkey, case when l_commitdate between '2015-06-30' AND 
> '2015-07-06' THEN '2015-06-30' END;
> org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize 
> expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc
> Vertex dependency in root stage
> Reducer 2 <- Map 1 (SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
> limit:-1
> Stage-1
>   Reducer 2
>   File Output Operator [FS_7]
> Group By Operator [GBY_5] (rows=888777234 width=108)
>   Output:["_col0","_col1"],keys:KEY._col0, KEY._col1
> <-Map 1 [SIMPLE_EDGE]
>   SHUFFLE [RS_4]
> PartitionCols:_col0, _col1
> Group By Operator [GBY_3] (rows=1777554469 width=108)
>   Output:["_col0","_col1"],keys:_col0, _col1
>   Select Operator [SEL_1] (rows=1777554469 width=108)
> Output:["_col0","_col1"]
> TableScan [TS_0] (rows=1777554469 width=108)
>   
> rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"]
> {noformat}
> \cc [~mmccline], [~gopalv]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13258) LLAP: Add hdfs bytes read and spilled bytes to tez print summary

2016-03-14 Thread Prasanth Jayachandran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193915#comment-15193915
 ] 

Prasanth Jayachandran commented on HIVE-13258:
--

FileSystemCounters does not show up for LLAP. [~sseth] Do we need any tez side 
changes for these? Can we hold on to FileSystem reference somewhere and when we 
unregister the task read FS counters, update the TezCounters (that we add in 
registerTask()) and remove the references?

> LLAP: Add hdfs bytes read and spilled bytes to tez print summary
> 
>
> Key: HIVE-13258
> URL: https://issues.apache.org/jira/browse/HIVE-13258
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Prasanth Jayachandran
>Assignee: Prasanth Jayachandran
>
> When printing counters to console it will be useful to print hdfs bytes read 
> and spilled bytes which will help with debugging issues faster. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12995) LLAP: Synthetic file ids need collision checks

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-12995:

Attachment: HIVE-12995.04.patch

The same patch... looks like it disappeared from the queue

> LLAP: Synthetic file ids need collision checks
> --
>
> Key: HIVE-12995
> URL: https://issues.apache.org/jira/browse/HIVE-12995
> Project: Hive
>  Issue Type: Bug
>  Components: llap
>Affects Versions: 2.1.0
>Reporter: Gopal V
>Assignee: Sergey Shelukhin
> Attachments: HIVE-12995.01.patch, HIVE-12995.02.patch, 
> HIVE-12995.03.patch, HIVE-12995.04.patch, HIVE-12995.patch
>
>
> LLAP synthetic file ids do not have any way of checking whether a collision 
> occurs other than a data-error.
> Synthetic file-ids have only been used with unit tests so far - but they will 
> be needed to add cache mechanisms to non-HDFS filesystems.
> In case of Synthetic file-ids, it is recommended that we track the full-tuple 
> (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id 
> can be compared against the parameters & only accepted if those match.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13218) LLAP: better configs part 1

2016-03-14 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-13218:

   Resolution: Fixed
Fix Version/s: 2.1.0
   Status: Resolved  (was: Patch Available)

> LLAP: better configs part 1
> ---
>
> Key: HIVE-13218
> URL: https://issues.apache.org/jira/browse/HIVE-13218
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.1.0
>
> Attachments: HIVE-13218.01.patch, HIVE-13218.patch
>
>
> 1) IO threads need to be settable when creating the package, and should be 
> equal to the number of executors by default.
> 2) uber should be disabled in "all" mode as it's slower than running in LLAP.
> Maybe others.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler

2016-03-14 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-13280:

Description: 
With a simple query (select from orc table and insert into HBase external 
table):
{code:sql}
insert into table register.register  select * from aa_temp
{code}
The aa_temp table have 45 orc files. It generate 45 mappers.
Some mappers fail with this error:
{noformat}
Caused by: java.lang.IllegalArgumentException: Must specify table name
at 
org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
at 
org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101)
at 
org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126)
... 25 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due 
to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
killedVertices:0 (state=08S01,code=2)
{noformat}

If I do an ALTER CONCATENATE for aa_temp. And redo the query. Everything is 
fine because there are only one mapper.



  was:
With a simple query (select from orc table and insert into HBase external 
table):
{code:sql}
insert into table register.register  select * from aa_temp
{code}
Some mapper fail with this error:
{noformat}
Caused by: java.lang.IllegalArgumentException: Must specify table name
at 
org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
at 
org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101)
at 
org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126)
... 25 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due 
to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
killedVertices:0 (state=08S01,code=2)
{noformat}



> Error when more than 1 mapper for HBase storage handler
> ---
>
> Key: HIVE-13280
> URL: https://issues.apache.org/jira/browse/HIVE-13280
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.0.0
>Reporter: Damien Carol
>
> With a simple query (select from orc table and insert into HBase external 
> table):
> {code:sql}
> insert into table register.register  select * from aa_temp
> {code}
> The aa_temp table have 45 orc files. It generate 45 mappers.
> Some mappers fail with this error:
> {noformat}
> Caused by: java.lang.IllegalArgumentException: Must specify table name
> at 
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
> at 
> org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101)
> at 
> org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126)
> ... 25 more
> ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
> killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed 
> due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. 
> failedVertices:1 killedVertices:0 (state=08S01,code=2)
> {noformat}
> If I do an ALTER CONCATENATE for aa_temp. And redo the query. Everything is 
> fine because there are only one mapper.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler

2016-03-14 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-13280:

Component/s: HBase Handler

> Error when more than 1 mapper for HBase storage handler
> ---
>
> Key: HIVE-13280
> URL: https://issues.apache.org/jira/browse/HIVE-13280
> Project: Hive
>  Issue Type: Bug
>  Components: HBase Handler
>Affects Versions: 2.0.0
>Reporter: Damien Carol
>
> With a simple query (select from orc table and insert into HBase external 
> table):
> {code:sql}
> insert into table register.register  select * from aa_temp
> {code}
> Some mapper fail with this error:
> {noformat}
> Caused by: java.lang.IllegalArgumentException: Must specify table name
> at 
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
> at 
> org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101)
> at 
> org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126)
> ... 25 more
> ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
> killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed 
> due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. 
> failedVertices:1 killedVertices:0 (state=08S01,code=2)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler

2016-03-14 Thread Damien Carol (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damien Carol updated HIVE-13280:

Description: 


{noformat}
Caused by: java.lang.IllegalArgumentException: Must specify table name
at 
org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
at 
org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101)
at 
org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300)
at 
org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290)
at 
org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126)
... 25 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due 
to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 
killedVertices:0 (state=08S01,code=2)
{noformat}


> Error when more than 1 mapper for HBase storage handler
> ---
>
> Key: HIVE-13280
> URL: https://issues.apache.org/jira/browse/HIVE-13280
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Damien Carol
>
> {noformat}
> Caused by: java.lang.IllegalArgumentException: Must specify table name
> at 
> org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188)
> at 
> org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101)
> at 
> org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300)
> at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290)
> at 
> org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126)
> ... 25 more
> ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 
> killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed 
> due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. 
> failedVertices:1 killedVertices:0 (state=08S01,code=2)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13243) Hive drop table on encyption zone fails for external tables

2016-03-14 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193582#comment-15193582
 ] 

Chaoyu Tang commented on HIVE-13243:


The four test failures seem not related to this patch. [~spena], could you help 
to review the patch? Thanks

> Hive drop table on encyption zone fails for external tables
> ---
>
> Key: HIVE-13243
> URL: https://issues.apache.org/jira/browse/HIVE-13243
> Project: Hive
>  Issue Type: Bug
>  Components: Encryption, Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13243.1.patch, HIVE-13243.2.patch, HIVE-13243.patch
>
>
> When dropping an external table with its data located in an encryption zone, 
> hive should not throw out MetaException(message:Unable to drop table because 
> it is in an encryption zone and trash is enabled. Use PURGE option to skip 
> trash.) in checkTrashPurgeCombination since the data should not get deleted 
> (or trashed) anyway regardless HDFS Trash is enabled or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13243) Hive drop table on encyption zone fails for external tables

2016-03-14 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193583#comment-15193583
 ] 

Chaoyu Tang commented on HIVE-13243:


[~spena] Thanks!

> Hive drop table on encyption zone fails for external tables
> ---
>
> Key: HIVE-13243
> URL: https://issues.apache.org/jira/browse/HIVE-13243
> Project: Hive
>  Issue Type: Bug
>  Components: Encryption, Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13243.1.patch, HIVE-13243.2.patch, HIVE-13243.patch
>
>
> When dropping an external table with its data located in an encryption zone, 
> hive should not throw out MetaException(message:Unable to drop table because 
> it is in an encryption zone and trash is enabled. Use PURGE option to skip 
> trash.) in checkTrashPurgeCombination since the data should not get deleted 
> (or trashed) anyway regardless HDFS Trash is enabled or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13243) Hive drop table on encyption zone fails for external tables

2016-03-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HIVE-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193574#comment-15193574
 ] 

Sergio Peña commented on HIVE-13243:


Great. Tests are good.
+1

> Hive drop table on encyption zone fails for external tables
> ---
>
> Key: HIVE-13243
> URL: https://issues.apache.org/jira/browse/HIVE-13243
> Project: Hive
>  Issue Type: Bug
>  Components: Encryption, Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13243.1.patch, HIVE-13243.2.patch, HIVE-13243.patch
>
>
> When dropping an external table with its data located in an encryption zone, 
> hive should not throw out MetaException(message:Unable to drop table because 
> it is in an encryption zone and trash is enabled. Use PURGE option to skip 
> trash.) in checkTrashPurgeCombination since the data should not get deleted 
> (or trashed) anyway regardless HDFS Trash is enabled or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-03-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: HIVE-13149.4.patch

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch
>
>
> In SessionState class, currently we will always try to get a HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} 
> regardless of if the connection will be used later or not. 
> When SessionState is accessed by the tasks in TaskRunner.java, although most 
> of the tasks other than some like StatsTask, don't need to access HMS. 
> Currently a new HMS connection will be established for each Task thread. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, then the connections are created but unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-03-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: (was: HIVE-13149.4.patch)

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch
>
>
> In SessionState class, currently we will always try to get a HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} 
> regardless of if the connection will be used later or not. 
> When SessionState is accessed by the tasks in TaskRunner.java, although most 
> of the tasks other than some like StatsTask, don't need to access HMS. 
> Currently a new HMS connection will be established for each Task thread. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, then the connections are created but unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2

2016-03-14 Thread Aihua Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aihua Xu updated HIVE-13149:

Attachment: HIVE-13149.4.patch

> Remove some unnecessary HMS connections from HS2 
> -
>
> Key: HIVE-13149
> URL: https://issues.apache.org/jira/browse/HIVE-13149
> Project: Hive
>  Issue Type: Sub-task
>  Components: HiveServer2
>Affects Versions: 2.0.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, 
> HIVE-13149.3.patch, HIVE-13149.4.patch
>
>
> In SessionState class, currently we will always try to get a HMS connection 
> in {{start(SessionState startSs, boolean isAsync, LogHelper console)}} 
> regardless of if the connection will be used later or not. 
> When SessionState is accessed by the tasks in TaskRunner.java, although most 
> of the tasks other than some like StatsTask, don't need to access HMS. 
> Currently a new HMS connection will be established for each Task thread. If 
> HiveServer2 is configured to run in parallel and the query involves many 
> tasks, then the connections are created but unused.
> {noformat}
>   @Override
>   public void run() {
> runner = Thread.currentThread();
> try {
>   OperationLog.setCurrentOperationLog(operationLog);
>   SessionState.start(ss);
>   runSequential();
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13243) Hive drop table on encyption zone fails for external tables

2016-03-14 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193532#comment-15193532
 ] 

Hive QA commented on HIVE-13243:




Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12792596/HIVE-13243.2.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9818 tests executed
*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not 
produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more
 - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more 
- did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not 
produce a TEST-*.xml file
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7263/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7263/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7263/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12792596 - PreCommit-HIVE-TRUNK-Build

> Hive drop table on encyption zone fails for external tables
> ---
>
> Key: HIVE-13243
> URL: https://issues.apache.org/jira/browse/HIVE-13243
> Project: Hive
>  Issue Type: Bug
>  Components: Encryption, Metastore
>Reporter: Chaoyu Tang
>Assignee: Chaoyu Tang
> Attachments: HIVE-13243.1.patch, HIVE-13243.2.patch, HIVE-13243.patch
>
>
> When dropping an external table with its data located in an encryption zone, 
> hive should not throw out MetaException(message:Unable to drop table because 
> it is in an encryption zone and trash is enabled. Use PURGE option to skip 
> trash.) in checkTrashPurgeCombination since the data should not get deleted 
> (or trashed) anyway regardless HDFS Trash is enabled or not.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13232) Aggressively drop compression buffers in ORC OutStreams

2016-03-14 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-13232:
-
Attachment: HIVE-13232.patch

At first I didn't think that I could unit test this change, but then I realized 
that I could use the OutStream.getBufferSize to observe the change. This patch 
just adds the new test.

> Aggressively drop compression buffers in ORC OutStreams
> ---
>
> Key: HIVE-13232
> URL: https://issues.apache.org/jira/browse/HIVE-13232
> Project: Hive
>  Issue Type: Bug
>  Components: ORC
>Reporter: Owen O'Malley
>Assignee: Owen O'Malley
> Fix For: 0.14.1, 1.3.0, 2.1.0
>
> Attachments: HIVE-13232.patch, HIVE-13232.patch, HIVE-13232.patch
>
>
> In Hive 0.11, when ORC's OutStream's were flushed they dropped all of the 
> their buffers. In the patch for HIVE-4324, we inadvertently changed that 
> behavior so that one of the buffers is held on to. For queries with a lot of 
> writers and thus under significant memory pressure this can have a 
> significant impact on the memory usage. 
> Note that "hive.optimize.sort.dynamic.partition" avoids this problem by 
> sorting on the dynamic partition key and thus only a single ORC writer is 
> open at once. This will use memory more effectively and avoid creating ORC 
> files with very small stripes, which will produce better downstream 
> performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HIVE-13279) SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's file system

2016-03-14 Thread Aleksey Vovchenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-13279 started by Aleksey Vovchenko.

> SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's 
> file system
> --
>
> Key: HIVE-13279
> URL: https://issues.apache.org/jira/browse/HIVE-13279
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Aleksey Vovchenko
>Assignee: Aleksey Vovchenko
>
> h2. STEP 1. Create test Tables
> Execute in command line:
> {noformat} 
> nano test.data
> {noformat} 
> Add to file:
> {noformat}
> 1,aa
> 2,aa
> 3,ff
> 4,sad
> 5,adsf
> 6,adsf
> 7,affss
> {noformat}
> {noformat}
> hadoop fs -put test.data /
> {noformat} 
> {noformat}
> hive> create table test (x int, y string, z string) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',';
> hive> create table ptest(x int, y string) partitioned by(z string); 
> hive> LOAD DATA  INPATH '/test.data' OVERWRITE INTO TABLE test;
> hive> insert overwrite table ptest partition(z=65) select * from test;
> hive> insert overwrite table ptest partition(z=67) select * from test;
> {noformat}
> h2. STEP 2. Compare lastUpdateTime
> Execute in Hive shell:
> {noformat}
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
> {noformat}
> lastUpdateTime should be different.
> h2. STEP 3. Put data into hdfs and compare lastUpdateTime
> Execute in command line:
> {noformat}
> hadoop fs -put test.data /user/hive/warehouse/ptest
> {noformat}
> Execute in Hive shell:
> {noformat}
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
> {noformat}
> lastUpdateTime should be different but they are same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13279) SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's file system

2016-03-14 Thread Aleksey Vovchenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Vovchenko updated HIVE-13279:
-
Status: Open  (was: Patch Available)

> SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's 
> file system
> --
>
> Key: HIVE-13279
> URL: https://issues.apache.org/jira/browse/HIVE-13279
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Aleksey Vovchenko
>Assignee: Aleksey Vovchenko
>
> h2. STEP 1. Create test Tables
> Execute in command line:
> {noformat} 
> nano test.data
> {noformat} 
> Add to file:
> {noformat}
> 1,aa
> 2,aa
> 3,ff
> 4,sad
> 5,adsf
> 6,adsf
> 7,affss
> {noformat}
> {noformat}
> hadoop fs -put test.data /
> {noformat} 
> {noformat}
> hive> create table test (x int, y string, z string) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',';
> hive> create table ptest(x int, y string) partitioned by(z string); 
> hive> LOAD DATA  INPATH '/test.data' OVERWRITE INTO TABLE test;
> hive> insert overwrite table ptest partition(z=65) select * from test;
> hive> insert overwrite table ptest partition(z=67) select * from test;
> {noformat}
> h2. STEP 2. Compare lastUpdateTime
> Execute in Hive shell:
> {noformat}
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
> {noformat}
> lastUpdateTime should be different.
> h2. STEP 3. Put data into hdfs and compare lastUpdateTime
> Execute in command line:
> {noformat}
> hadoop fs -put test.data /user/hive/warehouse/ptest
> {noformat}
> Execute in Hive shell:
> {noformat}
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
> {noformat}
> lastUpdateTime should be different but they are same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13279) SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's file system

2016-03-14 Thread Aleksey Vovchenko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Vovchenko updated HIVE-13279:
-
Status: Patch Available  (was: Open)

> SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's 
> file system
> --
>
> Key: HIVE-13279
> URL: https://issues.apache.org/jira/browse/HIVE-13279
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Aleksey Vovchenko
>Assignee: Aleksey Vovchenko
>
> h2. STEP 1. Create test Tables
> Execute in command line:
> {noformat} 
> nano test.data
> {noformat} 
> Add to file:
> {noformat}
> 1,aa
> 2,aa
> 3,ff
> 4,sad
> 5,adsf
> 6,adsf
> 7,affss
> {noformat}
> {noformat}
> hadoop fs -put test.data /
> {noformat} 
> {noformat}
> hive> create table test (x int, y string, z string) ROW FORMAT DELIMITED 
> FIELDS TERMINATED BY ',';
> hive> create table ptest(x int, y string) partitioned by(z string); 
> hive> LOAD DATA  INPATH '/test.data' OVERWRITE INTO TABLE test;
> hive> insert overwrite table ptest partition(z=65) select * from test;
> hive> insert overwrite table ptest partition(z=67) select * from test;
> {noformat}
> h2. STEP 2. Compare lastUpdateTime
> Execute in Hive shell:
> {noformat}
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
> {noformat}
> lastUpdateTime should be different.
> h2. STEP 3. Put data into hdfs and compare lastUpdateTime
> Execute in command line:
> {noformat}
> hadoop fs -put test.data /user/hive/warehouse/ptest
> {noformat}
> Execute in Hive shell:
> {noformat}
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65');
> hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67');
> {noformat}
> lastUpdateTime should be different but they are same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-13251) hive can't read the decimal in AVRO file generated from previous version

2016-03-14 Thread Aihua Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-13251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193423#comment-15193423
 ] 

Aihua Xu commented on HIVE-13251:
-

dec_old.avro is binary. Seems we had issue to apply patch for such file as I 
found from other jira in HIVE-5823.

I will do the same. I have verified the avro_decimal_old.q passed locally.

Commit instructions:
1. add attachment dec.avro to data/files folder
2. apply attached patch.
3. commit




> hive can't read the decimal in AVRO file generated from previous version
> 
>
> Key: HIVE-13251
> URL: https://issues.apache.org/jira/browse/HIVE-13251
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 2.1.0
>Reporter: Aihua Xu
>Assignee: Aihua Xu
> Attachments: HIVE-13251.1.patch
>
>
> HIVE-7174 makes the avro schema change to match avro definition, while it 
> breaks the compatibility if the file is generated from the previous Hive 
> although the file schema from the file for such decimal is not correct based 
> on avro definition. We should allow to read old file format "precision" : 
> "4", "scale": "8", but when we write, we should write in the new format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-13276) Hive on Spark doesn't work when spark.master=local

2016-03-14 Thread Xuefu Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuefu Zhang updated HIVE-13276:
---
Assignee: (was: Xuefu Zhang)

> Hive on Spark doesn't work when spark.master=local
> --
>
> Key: HIVE-13276
> URL: https://issues.apache.org/jira/browse/HIVE-13276
> Project: Hive
>  Issue Type: Bug
>  Components: Spark
>Affects Versions: 2.1.0
>Reporter: Xuefu Zhang
>
> The following problem occurs with latest Hive master and Spark 1.6.1. I'm 
> using hive CLI on mac.
> {code}
>   set mapreduce.job.reduces=
> java.lang.NoClassDefFoundError: Could not initialize class 
> org.apache.spark.rdd.RDDOperationScope$
>   at org.apache.spark.SparkContext.withScope(SparkContext.scala:714)
>   at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:991)
>   at 
> org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:419)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:205)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:145)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:117)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.execute(LocalHiveSparkClient.java:130)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:71)
>   at 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:94)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:156)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1837)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1578)
>   at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1351)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1122)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1110)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>   at 
> org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778)
>   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> FAILED: Execution Error, return code -101 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask. Could not initialize class 
> org.apache.spark.rdd.RDDOperationScope$
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-12619) Switching the field order within an array of structs causes the query to fail

2016-03-14 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HIVE-12619:
---
Status: Patch Available  (was: Open)

> Switching the field order within an array of structs causes the query to fail
> -
>
> Key: HIVE-12619
> URL: https://issues.apache.org/jira/browse/HIVE-12619
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Ang Zhang
>Assignee: Mohammad Kamrul Islam
>Priority: Minor
> Attachments: HIVE-12619.2.patch, HIVE-12619.3.patch
>
>
> Switching the field order within an array of structs causes the query to fail 
> or return the wrong data for the fields, but switching the field order within 
> just a struct works.
> How to reproduce:
> Case1 if the two fields have the same type, query will return wrong data for 
> the fields
> drop table if exists schema_test;
> create table schema_test (msg array) stored 
> as parquet;
> insert into table schema_test select stack(2, array(named_struct('f1', 'abc', 
> 'f2', 'abc2')), array(named_struct('f1', 'efg', 'f2', 'efg2'))) from one 
> limit 2;
> select * from schema_test;
> --returns
> --[{"f1":"efg","f2":"efg2"}]
> --[{"f1":"abc","f2":"abc2"}]
> alter table schema_test change msg msg array;
> select * from schema_test;
> --returns
> --[{"f2":"efg","f1":"efg2"}]
> --[{"f2":"abc","f1":"abc2"}]
> Case2: if the two fields have different type, the query will fail
> drop table if exists schema_test;
> create table schema_test (msg array) stored as 
> parquet;
> insert into table schema_test select stack(2, array(named_struct('f1', 'abc', 
> 'f2', 1)), array(named_struct('f1', 'efg', 'f2', 2))) from one limit 2;
> select * from schema_test;
> --returns
> --[{"f1":"efg","f2":2}]
> --[{"f1":"abc","f2":1}]
> alter table schema_test change msg msg array;
> select * from schema_test;
> Failed with exception 
> java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to 
> org.apache.hadoop.io.IntWritable



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


  1   2   >