[jira] [Commented] (HIVE-13269) Simplify comparison expressions using column stats
[ https://issues.apache.org/jira/browse/HIVE-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194712#comment-15194712 ]

Hive QA commented on HIVE-13269:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793290/HIVE-13269.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 200 failed/errored test(s), 9826 tests executed

*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_join_pkfk
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_simple_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constprog_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_decimal_udf
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_rdd_cache
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_opt_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynpart_sort_optimization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_ppd
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_infer_const_type
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input39
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input41
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join42
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_cond_pushdown_unqual3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters_overlap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_grp_diff_keys
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_view
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_lineage3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_3
[jira] [Commented] (HIVE-13239) "java.lang.OutOfMemoryError: unable to create new native thread" occurs at Hive on Tez
[ https://issues.apache.org/jira/browse/HIVE-13239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194663#comment-15194663 ]

Wataru Yukawa commented on HIVE-13239:
--

This problem seems to be the same as https://issues.apache.org/jira/browse/HIVE-13273

> "java.lang.OutOfMemoryError: unable to create new native thread" occurs at
> Hive on Tez
> --
>
>                 Key: HIVE-13239
>                 URL: https://issues.apache.org/jira/browse/HIVE-13239
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>         Environment: HDP2.3.4
> JDK1.8
> CentOS 6
>            Reporter: Wataru Yukawa
>
> "ps -L $(pgrep -f hiveserver2) | wc -l" is more than 15,000
> HiveServer2 memory leak occurs.
> hive query
> {code}
> FROM hoge_tmp
> INSERT INTO TABLE hoge PARTITION (...)
> SELECT ... WHERE ...
> {code}
> stacktrace
> {code}
> org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.tez.TezTask. unable to create new native thread
> at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156)
> at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:183)
> at org.apache.hive.service.cli.operation.Operation.run(Operation.java:257)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:410)
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatement(HiveSessionImpl.java:391)
> at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:261)
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:486)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.OutOfMemoryError: unable to create new native thread
> at java.lang.Thread.start0(Native Method)
> at java.lang.Thread.start(Thread.java:714)
> at org.apache.hadoop.hdfs.DFSOutputStream.start(DFSOutputStream.java:2238)
> at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1753)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703)
> at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638)
> at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
> at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:444)
> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:577)
> at org.apache.tez.common.TezCommonUtils.createFileForAM(TezCommonUtils.java:310)
> at org.apache.tez.client.TezClientUtils.createApplicationSubmissionContext(TezClientUtils.java:559)
> at org.apache.tez.client.TezClient.start(TezClient.java:395)
> at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:196)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.updateSession(TezTask.java:271)
> at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:151)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at
[jira] [Commented] (HIVE-13273) "java.lang.OutOfMemoryError: unable to create new native thread" occurs at Hive on MapReduce
[ https://issues.apache.org/jira/browse/HIVE-13273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194659#comment-15194659 ]

Wataru Yukawa commented on HIVE-13273:
--

This problem seems to be the same as https://community.hortonworks.com/questions/20116/logfdscacheflushtimer-thread-increase.html

I upgraded to HDP 2.4, and this problem seems to be resolved.

> "java.lang.OutOfMemoryError: unable to create new native thread" occurs at
> Hive on MapReduce
>
>                 Key: HIVE-13273
>                 URL: https://issues.apache.org/jira/browse/HIVE-13273
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>         Environment: HDP2.3.4
> JDK1.8
> CentOS 6
>            Reporter: Wataru Yukawa
>            Assignee: Vaibhav Gumashta
>
> "ps -L $(pgrep -f hiveserver2) | wc -l" is more than 15,000
> HiveServer2 memory leak occurs.
> heapstats result is https://gyazo.com/27dcaf678fb8d2e4003af55a79c2020e
> hiveserver2.log
> {code}
> 2016-03-13 13:25:06,041 INFO [HiveServer2-Handler-Pool: Thread-98838]: retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(144)) - Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over ...:8020 after 3 fail over attempts. Trying to fail over immediately.
> java.io.IOException: Failed on local exception: java.io.IOException: Couldn't set up IO streams; Host Details : local host is: "..."; destination host is: "...":8020;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
> at org.apache.hadoop.ipc.Client.call(Client.java:1431)
> at org.apache.hadoop.ipc.Client.call(Client.java:1358)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
> at com.sun.proxy.$Proxy15.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:771)
> at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
> at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
> at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2116)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1315)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1311)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1311)
> at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1424)
> at org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:85)
> at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:95)
> at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:190)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
> at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
> at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
> at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:431)
> at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1237)
> at
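Both OOM reports above share the same symptom: HiveServer2 accumulating native threads until `Thread.start0` fails. Alongside the `ps -L $(pgrep -f hiveserver2) | wc -l` check quoted in the reports, the count can also be watched from inside the JVM. A minimal sketch (illustrative, not part of Hive):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// In-JVM counterpart to `ps -L <pid> | wc -l`: track live and peak thread
// counts so a leak like the one reported (>15,000 threads) shows up long
// before the OS refuses to create new native threads.
public class ThreadCountSketch {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        int live = mx.getThreadCount();      // currently live threads
        int peak = mx.getPeakThreadCount();  // peak since JVM start
        System.out.println("live threads: " + live + ", peak: " + peak);
        if (live < 1 || peak < live) {
            throw new AssertionError("unexpected thread counts");
        }
    }
}
```

Note that an OutOfMemoryError with this message is usually not heap exhaustion: it typically means the per-user process limit (`ulimit -u`) or a kernel thread ceiling was hit, which is consistent with the ~15,000 threads observed.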
[jira] [Updated] (HIVE-13285) Orc concatenation may drop old files from moving to final path
[ https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-13285:
-

    Attachment: HIVE-13285.2.patch

> Orc concatenation may drop old files from moving to final path
> --
>
>                 Key: HIVE-13285
>                 URL: https://issues.apache.org/jira/browse/HIVE-13285
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-13285.1.patch, HIVE-13285.2.patch
>
> ORC concatenation uses combine hive input format for merging files. Under
> specific case where all files within a combine split are incompatible for
> merge (old files without stripe statistics) then these files are added to
> incompatible file set. But this file set is not processed as closeOp() will
> not be called (no output file writer will exist which will skip
> super.closeOp()). As a result, these incompatible files are not moved to
> final path.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-13277) Exception "Unable to create serializer 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' " occurred during query execution on spark engine when vectorized execution is switched on
[ https://issues.apache.org/jira/browse/HIVE-13277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194645#comment-15194645 ]

Rui Li commented on HIVE-13277:
---

I built a local snapshot of kryo with the latest code and verified the query can pass. So the root cause should be the kryo version (3.0.3) we use. I'm afraid there's not much we can do at the moment because the fix hasn't been released yet. [~xuefuz] what do you think?

> Exception "Unable to create serializer
> 'org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer' "
> occurred during query execution on spark engine when vectorized execution is
> switched on
> -
>
>                 Key: HIVE-13277
>                 URL: https://issues.apache.org/jira/browse/HIVE-13277
>             Project: Hive
>          Issue Type: Bug
>         Environment: Hive on Spark engine
> Hive Version: Apache Hive 2.0.0
> Spark Version: Apache Spark 1.6.0
>            Reporter: Xin Hao
>
> Found during TPCx-BB query2 execution on the Spark engine when vectorized
> execution is switched on:
> (1) set hive.vectorized.execution.enabled=true;
> (2) set hive.vectorized.execution.reduce.enabled=true; (default value for Apache Hive 2.0.0)
> It's OK for the Spark engine when hive.vectorized.execution.enabled is switched off:
> (1) set hive.vectorized.execution.enabled=false;
> (2) set hive.vectorized.execution.reduce.enabled=true;
> For the MR engine, the query passes and no exception occurs with vectorized execution either switched on or off.
> Detail error message is below:
> {noformat}
> 2016-03-14T10:09:33,692 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 INFO spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 154 bytes
> 2016-03-14T10:09:33,818 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - 16/03/14 10:09:33 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 4.0 (TID 25, bhx3): java.lang.RuntimeException: Failed to load plan: hdfs://bhx3:8020/tmp/hive/root/40b90ebd-32d4-47bc-a5ab-12ff1c05d0d2/hive_2016-03-14_10-08-56_307_7692316402338632647-1/-mr-10002/ab0c0021-0c1a-496e-9703-87d5879353c8/reduce.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Unable to create serializer "org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer" for class: org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator
> 2016-03-14T10:09:33,818 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - Serialization trace:
> 2016-03-14T10:09:33,818 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorLimitOperator)
> 2016-03-14T10:09:33,818 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - childOperators (org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator)
> 2016-03-14T10:09:33,818 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - reducer (org.apache.hadoop.hive.ql.plan.ReduceWork)
> 2016-03-14T10:09:33,818 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:451)
> 2016-03-14T10:09:33,818 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - at org.apache.hadoop.hive.ql.exec.Utilities.getReduceWork(Utilities.java:306)
> 2016-03-14T10:09:33,819 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.init(SparkReduceRecordHandler.java:117)
> 2016-03-14T10:09:33,819 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:46)
> 2016-03-14T10:09:33,819 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunction.call(HiveReduceFunction.java:28)
> 2016-03-14T10:09:33,819 INFO [stderr-redir-1]: client.SparkClientImpl (SparkClientImpl.java:run(593)) - at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:192)
> 2016-03-14T10:09:33,819 INFO
[jira] [Commented] (HIVE-13285) Orc concatenation may drop old files from moving to final path
[ https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194640#comment-15194640 ]

Prasanth Jayachandran commented on HIVE-13285:
--

Yes. That's correct. I will update the patch.

> Orc concatenation may drop old files from moving to final path
> --
>
>                 Key: HIVE-13285
>                 URL: https://issues.apache.org/jira/browse/HIVE-13285
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-13285.1.patch
>
> ORC concatenation uses combine hive input format for merging files. Under
> specific case where all files within a combine split are incompatible for
> merge (old files without stripe statistics) then these files are added to
> incompatible file set. But this file set is not processed as closeOp() will
> not be called (no output file writer will exist which will skip
> super.closeOp()). As a result, these incompatible files are not moved to
> final path.
[jira] [Commented] (HIVE-13285) Orc concatenation may drop old files from moving to final path
[ https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194637#comment-15194637 ]

Gopal V commented on HIVE-13285:

As far as I understand, the issue is that the relevant super.closeOp() is not called in the outWriter == null case. Can you make the patch clearer, to show that super.closeOp() always has to be called?

> Orc concatenation may drop old files from moving to final path
> --
>
>                 Key: HIVE-13285
>                 URL: https://issues.apache.org/jira/browse/HIVE-13285
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-13285.1.patch
>
> ORC concatenation uses combine hive input format for merging files. Under
> specific case where all files within a combine split are incompatible for
> merge (old files without stripe statistics) then these files are added to
> incompatible file set. But this file set is not processed as closeOp() will
> not be called (no output file writer will exist which will skip
> super.closeOp()). As a result, these incompatible files are not moved to
> final path.
[jira] [Commented] (HIVE-13285) Orc concatenation may drop old files from moving to final path
[ https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194633#comment-15194633 ]

Prasanth Jayachandran commented on HIVE-13285:
--

RB is not responding. I will upload this patch to RB later.

> Orc concatenation may drop old files from moving to final path
> --
>
>                 Key: HIVE-13285
>                 URL: https://issues.apache.org/jira/browse/HIVE-13285
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-13285.1.patch
>
> ORC concatenation uses combine hive input format for merging files. Under
> specific case where all files within a combine split are incompatible for
> merge (old files without stripe statistics) then these files are added to
> incompatible file set. But this file set is not processed as closeOp() will
> not be called (no output file writer will exist which will skip
> super.closeOp()). As a result, these incompatible files are not moved to
> final path.
[jira] [Updated] (HIVE-13285) Orc concatenation may drop old files from moving to final path
[ https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-13285:
-

    Status: Patch Available (was: Open)

> Orc concatenation may drop old files from moving to final path
> --
>
>                 Key: HIVE-13285
>                 URL: https://issues.apache.org/jira/browse/HIVE-13285
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 2.0.0, 1.1.0, 1.2.0, 1.0.0, 0.14.0, 1.3.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-13285.1.patch
>
> ORC concatenation uses combine hive input format for merging files. Under
> specific case where all files within a combine split are incompatible for
> merge (old files without stripe statistics) then these files are added to
> incompatible file set. But this file set is not processed as closeOp() will
> not be called (no output file writer will exist which will skip
> super.closeOp()). As a result, these incompatible files are not moved to
> final path.
[jira] [Commented] (HIVE-13285) Orc concatenation may drop old files from moving to final path
[ https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194628#comment-15194628 ]

Prasanth Jayachandran commented on HIVE-13285:
--

[~gopalv]/[~daijy] Could someone please take a look at this patch? This patch makes sure closeOp() is called even when outWriter is null, as we might have some incompatible files to move to the final path.

> Orc concatenation may drop old files from moving to final path
> --
>
>                 Key: HIVE-13285
>                 URL: https://issues.apache.org/jira/browse/HIVE-13285
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-13285.1.patch
>
> ORC concatenation uses combine hive input format for merging files. Under
> specific case where all files within a combine split are incompatible for
> merge (old files without stripe statistics) then these files are added to
> incompatible file set. But this file set is not processed as closeOp() will
> not be called (no output file writer will exist which will skip
> super.closeOp()). As a result, these incompatible files are not moved to
> final path.
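The close-path bug discussed in these HIVE-13285 comments can be illustrated with a small sketch. The class and method names below are illustrative stand-ins for Hive's merge operators, not the actual patch:

```java
import java.util.ArrayList;
import java.util.List;

// Superclass close path: responsible for moving merge-incompatible files
// (e.g. old ORC files without stripe statistics) to the final path.
class FileMergeOperator {
    final List<String> incompatibleFiles = new ArrayList<>();
    final List<String> finalPath = new ArrayList<>();

    void closeOp(boolean abort) {
        if (!abort) {
            finalPath.addAll(incompatibleFiles);
        }
    }
}

class OrcMergeOperator extends FileMergeOperator {
    Object outWriter; // stays null when every file in the split was incompatible

    @Override
    void closeOp(boolean abort) {
        // Buggy version returned early here, skipping super.closeOp():
        //   if (outWriter == null) { return; }
        if (outWriter != null) {
            outWriter = null; // close and flush the writer here
        }
        super.closeOp(abort); // always delegate, per the fix discussed above
    }
}

public class OrcConcatSketch {
    public static void main(String[] args) {
        OrcMergeOperator op = new OrcMergeOperator();
        op.incompatibleFiles.add("old_no_stripe_stats.orc");
        op.closeOp(false);
        // With the always-delegate close, the incompatible file reaches the
        // final path even though outWriter was never created.
        if (!op.finalPath.contains("old_no_stripe_stats.orc")) {
            throw new AssertionError("incompatible file was dropped");
        }
        System.out.println("moved " + op.finalPath.size() + " incompatible file(s)");
    }
}
```

The design point is the one Gopal raises: the superclass close must run unconditionally, because it owns the file-moving bookkeeping that exists independently of whether a writer was ever opened.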
[jira] [Updated] (HIVE-13285) Orc concatenation may drop old files from moving to final path
[ https://issues.apache.org/jira/browse/HIVE-13285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-13285:
-

    Attachment: HIVE-13285.1.patch

> Orc concatenation may drop old files from moving to final path
> --
>
>                 Key: HIVE-13285
>                 URL: https://issues.apache.org/jira/browse/HIVE-13285
>             Project: Hive
>          Issue Type: Bug
>          Components: ORC
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.3.0, 2.0.0, 2.1.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>         Attachments: HIVE-13285.1.patch
>
> ORC concatenation uses combine hive input format for merging files. Under
> specific case where all files within a combine split are incompatible for
> merge (old files without stripe statistics) then these files are added to
> incompatible file set. But this file set is not processed as closeOp() will
> not be called (no output file writer will exist which will skip
> super.closeOp()). As a result, these incompatible files are not moved to
> final path.
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194593#comment-15194593 ]

Aihua Xu commented on HIVE-13286:
-

I will take a look. That seems to be an issue and not my intention.

> Query ID is being reused across queries
> ---
>
>                 Key: HIVE-13286
>                 URL: https://issues.apache.org/jira/browse/HIVE-13286
>             Project: Hive
>          Issue Type: Bug
>          Components: Parser
>    Affects Versions: 2.0.0
>            Reporter: Vikram Dixit K
>            Assignee: Aihua Xu
>            Priority: Critical
>
> [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is
> being reused across queries. This defeats the purpose of a query id. I am not
> sure what the purpose of the change in that jira is but it breaks the
> assumption about a query id being unique for each query. Please take a look
> into this at the earliest.
[jira] [Commented] (HIVE-11424) Rule to transform OR clauses into IN clauses in CBO
[ https://issues.apache.org/jira/browse/HIVE-11424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194516#comment-15194516 ]

Hive QA commented on HIVE-11424:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12793292/HIVE-11424.03.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 38 failed/errored test(s), 9807 tests executed

*Failed tests:*
{noformat}
TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file
TestSparkCliDriver-timestamp_lazy.q-bucketsortoptimize_insert_4.q-date_udf.q-and-12-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_deep_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_filter
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_part
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_dynamic_partition_skip_default
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_cond_pushdown
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_flatten_and_or
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_fold_eq_with_case_when
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_query_multiskew_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pointlookup4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_bucketpruning1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_case
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query13
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query34
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query71
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query73
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query85
org.apache.hadoop.hive.cli.TestPerfCliDriver.testPerfCliDriver_query91
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_pcr
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_case
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7268/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7268/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7268/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 38 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12793292 - PreCommit-HIVE-TRUNK-Build

> Rule to transform OR clauses into IN clauses in CBO
> ---
>
>                 Key: HIVE-11424
>                 URL: https://issues.apache.org/jira/browse/HIVE-11424
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-11424.01.patch, HIVE-11424.01.patch, HIVE-11424.03.patch, HIVE-11424.03.patch, HIVE-11424.2.patch, HIVE-11424.patch
>
> We create a rule that will transform OR clauses into IN clauses
[jira] [Commented] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194509#comment-15194509 ] Vikram Dixit K commented on HIVE-13286: --- I think it primarily comes down to this: the HiveConf object, once modified with a generated query ID, never resets it for a subsequent query. {code} +String queryId = confOverlay.get(HiveConf.ConfVars.HIVEQUERYID.varname); +if (queryId == null || queryId.isEmpty()) { + queryId = QueryPlan.makeQueryId(); + confOverlay.put(HiveConf.ConfVars.HIVEQUERYID.varname, queryId); + sessionState.getConf().setVar(HiveConf.ConfVars.HIVEQUERYID, queryId); +} {code} Once the query ID has been set by a previous query, it never changes. This is incorrect behavior. I am not sure what the change was trying to do, but it needs to be fixed. Thanks! > Query ID is being reused across queries > --- > > Key: HIVE-13286 > URL: https://issues.apache.org/jira/browse/HIVE-13286 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Aihua Xu >Priority: Critical > > [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is > being reused across queries. This defeats the purpose of a query id. I am not > sure what the purpose of the change in that jira is but it breaks the > assumption about a query id being unique for each query. Please take a look > into this at the earliest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
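The behavior Vikram is asking for — mint a fresh query ID for every execution unless the caller explicitly supplies one, instead of reusing a value cached in session-level configuration — can be sketched as below. This is a hand-written illustration, not Hive's actual code: `makeQueryId` here only mimics the timestamp-plus-UUID shape of `QueryPlan.makeQueryId()`, and the bare `"hive.query.id"` key stands in for `HiveConf.ConfVars.HIVEQUERYID.varname`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Illustrative sketch: a query ID must be minted per execution, never
// carried over from session-level configuration across queries.
public class QueryIdSketch {
    // Hypothetical stand-in for QueryPlan.makeQueryId(): unique per call.
    static String makeQueryId() {
        return "hive_" + System.currentTimeMillis() + "_" + UUID.randomUUID();
    }

    // Resolve the ID for one query: honor an explicit override in the
    // per-query overlay, otherwise mint a new one -- but never fall back
    // to an ID stored by a previous query.
    static String resolveQueryId(Map<String, String> confOverlay) {
        String queryId = confOverlay.get("hive.query.id");
        if (queryId == null || queryId.isEmpty()) {
            queryId = makeQueryId();
            confOverlay.put("hive.query.id", queryId);
        }
        return queryId;
    }

    public static void main(String[] args) {
        // Two queries, each with its own fresh overlay, get distinct IDs.
        String first = resolveQueryId(new HashMap<>());
        String second = resolveQueryId(new HashMap<>());
        System.out.println(first.equals(second) ? "REUSED" : "UNIQUE"); // prints "UNIQUE"
    }
}
```

The key point is that the overlay (per-query scope) is the only place the generated ID is written back; writing it into the session's shared conf is what makes it sticky across queries.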
[jira] [Updated] (HIVE-13278) Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xin Hao updated HIVE-13278: --- Description: Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark. Certainly, it doesn't prevent the query from running successfully. So mark it as Minor currently. Error message example: 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565) at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) was: Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark Error message example: 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66) at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565) at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) at 
org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) > Many redundant 'File not found' messages appeared in container log during > query execution with Hive on Spark > > > Key: HIVE-13278 >
[jira] [Updated] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-13286: -- Description: [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is being reused across queries. This defeats the purpose of a query id. I am not sure what the purpose of the change in that jira is but it breaks the assumption about a query id being unique for each query. Please take a look into this at the earliest. (was: [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is being reused across queries. This defeats the purpose of a query id. I am not sure what the purpose of the change in that jira is but it breaks the assumption about a query id being unique for each query. Please take a look into this.) > Query ID is being reused across queries > --- > > Key: HIVE-13286 > URL: https://issues.apache.org/jira/browse/HIVE-13286 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Aihua Xu >Priority: Critical > > [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is > being reused across queries. This defeats the purpose of a query id. I am not > sure what the purpose of the change in that jira is but it breaks the > assumption about a query id being unique for each query. Please take a look > into this at the earliest. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13278) Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194503#comment-15194503 ] Xin Hao commented on HIVE-13278: Yes, this problem doesn't prevent the query from running successfully. > Many redundant 'File not found' messages appeared in container log during > query execution with Hive on Spark > > > Key: HIVE-13278 > URL: https://issues.apache.org/jira/browse/HIVE-13278 > Project: Hive > Issue Type: Bug > Environment: Hive on Spark engine > Found based on : > Apache Hive 2.0.0 > Apache Spark 1.6.0 >Reporter: Xin Hao >Priority: Minor > > Many redundant 'File not found' messages appeared in container log during > query execution with Hive on Spark > Error message example: > 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: > /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66) > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363) 
> at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13286) Query ID is being reused across queries
[ https://issues.apache.org/jira/browse/HIVE-13286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K updated HIVE-13286: -- Assignee: Aihua Xu (was: Pengcheng Xiong) > Query ID is being reused across queries > --- > > Key: HIVE-13286 > URL: https://issues.apache.org/jira/browse/HIVE-13286 > Project: Hive > Issue Type: Bug > Components: Parser >Affects Versions: 2.0.0 >Reporter: Vikram Dixit K >Assignee: Aihua Xu >Priority: Critical > > [~aihuaxu] I see this commit made via HIVE-11488. I see that query id is > being reused across queries. This defeats the purpose of a query id. I am not > sure what the purpose of the change in that jira is but it breaks the > assumption about a query id being unique for each query. Please take a look > into this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13278) Many redundant 'File not found' messages appeared in container log during query execution with Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-13278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194499#comment-15194499 ] Xuefu Zhang commented on HIVE-13278: [~xhao1], to clarify, this problem doesn't prevent the query from running successfully, right? Thanks. > Many redundant 'File not found' messages appeared in container log during > query execution with Hive on Spark > > > Key: HIVE-13278 > URL: https://issues.apache.org/jira/browse/HIVE-13278 > Project: Hive > Issue Type: Bug > Environment: Hive on Spark engine > Found based on : > Apache Hive 2.0.0 > Apache Spark 1.6.0 >Reporter: Xin Hao >Priority: Minor > > Many redundant 'File not found' messages appeared in container log during > query execution with Hive on Spark > Error message example: > 16/03/14 01:45:06 INFO exec.Utilities: File not found: File does not exist: > /tmp/hive/hadoop/2d378538-f5d3-493c-9276-c62dd6634fb4/hive_2016-03-14_01-44-16_835_623058724409492515-6/-mr-10010/0a6d0cae-1eb3-448c-883b-590b3b198a73/reduce.xml > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66) > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1932) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1873) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1853) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1825) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:565) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:87) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:363) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194464#comment-15194464 ] Wei Zheng commented on HIVE-13249: -- OK, one more concern: is it a good idea to have performTimeOuts() and countOpenTxns() share the same check interval? We may want to run the latter more frequently. > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
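The safeguard under discussion — rejecting new open-transaction requests once a configured hard cap is reached, so misconfigured clients fail fast instead of flooding the metastore — can be sketched as a simple counter guard. The class and method names here are illustrative, not Hive's actual TxnHandler API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative guard: cap the number of concurrently open transactions
// and fail fast once the hard limit is reached (names are hypothetical,
// not Hive's actual transaction-handler API).
public class OpenTxnLimiter {
    private final int maxOpenTxns;
    private final AtomicInteger openTxns = new AtomicInteger();

    public OpenTxnLimiter(int maxOpenTxns) {
        this.maxOpenTxns = maxOpenTxns;
    }

    /** Reserves one transaction slot, or throws when the cap is hit. */
    public void openTxn() {
        if (openTxns.incrementAndGet() > maxOpenTxns) {
            // Roll back the optimistic increment before rejecting.
            openTxns.decrementAndGet();
            throw new IllegalStateException(
                "Open transaction limit of " + maxOpenTxns + " reached");
        }
    }

    /** Releases a slot when the transaction commits or aborts. */
    public void commitOrAbort() {
        openTxns.decrementAndGet();
    }

    public int openCount() {
        return openTxns.get();
    }
}
```

Wei's interval concern maps onto a design like this too: if the count is instead refreshed by a periodic background scan (as the patch apparently does) rather than tracked per request, a stale count can admit transactions past the cap, which is why a shorter refresh interval for counting than for timeout reaping may be warranted.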
[jira] [Comment Edited] (HIVE-13226) Improve tez print summary to print query execution breakdown
[ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194446#comment-15194446 ] Prasanth Jayachandran edited comment on HIVE-13226 at 3/15/16 12:22 AM: Test failures are unrelated. [~gopalv] Can you please review the latest patch? was (Author: prasanth_j): Test failures are unrealted. [~gopalv] Can you please review the latest patch? > Improve tez print summary to print query execution breakdown > > > Key: HIVE-13226 > URL: https://issues.apache.org/jira/browse/HIVE-13226 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13226.1.patch, HIVE-13226.2.patch, > HIVE-13226.3.patch, sampleoutput.png > > > When tez print summary is enabled, methods summary is printed which are > difficult to correlate with the actual execution time. We can improve that to > print the execution times in the sequence of operations that happens behind > the scenes. > Instead of printing the methods name it will be useful to print something > like below > 1) Query Compilation time > 2) Query Submit to DAG Submit time > 3) DAG Submit to DAG Accept time > 4) DAG Accept to DAG Start time > 5) DAG Start to DAG End time > With this it will be easier to find out where the actual time is spent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13226) Improve tez print summary to print query execution breakdown
[ https://issues.apache.org/jira/browse/HIVE-13226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194446#comment-15194446 ] Prasanth Jayachandran commented on HIVE-13226: -- Test failures are unrelated. [~gopalv] Can you please review the latest patch? > Improve tez print summary to print query execution breakdown > > > Key: HIVE-13226 > URL: https://issues.apache.org/jira/browse/HIVE-13226 > Project: Hive > Issue Type: Improvement >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-13226.1.patch, HIVE-13226.2.patch, > HIVE-13226.3.patch, sampleoutput.png > > > When tez print summary is enabled, methods summary is printed which are > difficult to correlate with the actual execution time. We can improve that to > print the execution times in the sequence of operations that happens behind > the scenes. > Instead of printing the methods name it will be useful to print something > like below > 1) Query Compilation time > 2) Query Submit to DAG Submit time > 3) DAG Submit to DAG Accept time > 4) DAG Accept to DAG Start time > 5) DAG Start to DAG End time > With this it will be easier to find out where the actual time is spent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
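The breakdown proposed in the issue amounts to recording a timestamp at each phase boundary (compile end, query submit, DAG submit, DAG accept, DAG start, DAG end) and reporting the deltas between consecutive boundaries in order. A minimal sketch of that bookkeeping, with phase names taken from the issue description:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the proposed summary: record a timestamp at each phase
// boundary in order, then report elapsed time between any two phases.
public class PhaseTimer {
    // LinkedHashMap preserves the order in which phases were recorded.
    private final Map<String, Long> marks = new LinkedHashMap<>();

    public void mark(String phase) {
        marks.put(phase, System.nanoTime());
    }

    /** Elapsed milliseconds between two previously recorded boundaries. */
    public long elapsedMs(String from, String to) {
        return (marks.get(to) - marks.get(from)) / 1_000_000;
    }
}
```

In the actual patch these numbers would presumably come from Hive's existing PerfLogger events rather than ad-hoc `System.nanoTime()` calls; the sketch only shows the delta-between-boundaries idea.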
[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13084: Attachment: HIVE-13084.03.patch > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13084: Attachment: (was: HIVE-13084.03.patch) > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13084: Attachment: HIVE-13084.03.patch > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13084: Attachment: (was: HIVE-13084.03.patch) > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13185) orc.ReaderImp.ensureOrcFooter() method fails on small text files with IndexOutOfBoundsException
[ https://issues.apache.org/jira/browse/HIVE-13185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13185: Resolution: Fixed Fix Version/s: 2.1.0 Status: Resolved (was: Patch Available) Committed to master. > orc.ReaderImp.ensureOrcFooter() method fails on small text files with > IndexOutOfBoundsException > --- > > Key: HIVE-13185 > URL: https://issues.apache.org/jira/browse/HIVE-13185 > Project: Hive > Issue Type: Bug > Components: ORC >Affects Versions: 2.1.0 >Reporter: Illya Yalovyy >Assignee: Illya Yalovyy > Fix For: 2.1.0 > > Attachments: HIVE-13185.1.patch > > > Steps to reproduce: > 1. Create a Text source table with one line of data: > {code} > create table src (id int); > insert overwrite table src values (1); > {code} > 2. Create a target table: > {code} > create table trg (id int); > {code} > 3. Try to load small text file to the target table: > {code} > load data inpath 'user/hive/warehouse/src/00_0' into table trg; > {code} > *Error message:* > {quote} > FAILED: SemanticException Unable to load data to destination table. Error: > java.lang.IndexOutOfBoundsException > {quote} > *Stack trace:* > {noformat} > org.apache.hadoop.hive.ql.parse.SemanticException: Unable to load data to > destination table. 
Error: java.lang.IndexOutOfBoundsException > at > org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.ensureFileFormatsMatch(LoadSemanticAnalyzer.java:340) > at > org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:224) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:242) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:481) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317) > at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1190) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1285) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1116) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1104) > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
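The underlying failure here (and in the duplicate HIVE-13284 below) is that the footer check indexes into the last few bytes of the file without first verifying the file is at least that long, so a tiny text file or a 0-length file raises IndexOutOfBoundsException instead of a clean "not an ORC file" result. A defensive version can be sketched as follows — the `"ORC"` magic string is real, but the method shape and the simplified tail layout (postscript-length byte last, magic just before it) are illustrative, not the exact ORC reader code:

```java
import java.nio.charset.StandardCharsets;

// Illustrative footer sanity check that degrades to a clean "not ORC"
// answer instead of IndexOutOfBoundsException on tiny or empty files.
public class OrcFooterCheck {
    private static final byte[] MAGIC = "ORC".getBytes(StandardCharsets.US_ASCII);

    /** True only when the tail is long enough AND ends with the magic. */
    static boolean looksLikeOrcTail(byte[] tail) {
        // A file shorter than magic + postscript-length byte cannot be
        // ORC; bail out before indexing into the buffer. This guard is
        // what the original ensureOrcFooter() path was missing.
        if (tail == null || tail.length < MAGIC.length + 1) {
            return false;
        }
        int psLen = tail[tail.length - 1] & 0xff; // postscript length byte
        int magicStart = tail.length - 1 - MAGIC.length;
        if (psLen < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (tail[magicStart + i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

With a check like this, loading a one-line text file (or the 0-length files produced by HIVE-13040) is rejected with a meaningful "file format does not match" error rather than a raw buffer exception.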
[jira] [Resolved] (HIVE-13284) Make ORC Reader resilient to 0 length files
[ https://issues.apache.org/jira/browse/HIVE-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-13284. - Resolution: Duplicate Actually I just realized it's a dup of HIVE-13185 > Make ORC Reader resilient to 0 length files > --- > > Key: HIVE-13284 > URL: https://issues.apache.org/jira/browse/HIVE-13284 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > HIVE-13040 creates 0 length ORC files. Reading such files will throw > following exception. ORC is resilient to corrupt footers but not 0 length > files. > {code} > Processing data file file:/app/warehouse/concat_incompat/00_0 [length: 0] > Exception in thread "main" java.lang.IndexOutOfBoundsException > at java.nio.Buffer.checkIndex(Buffer.java:540) > at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:510) > at > org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:361) > at > org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:83) > at > org.apache.hadoop.hive.ql.io.orc.FileDump.getReader(FileDump.java:239) > at > org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:312) > at > org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:291) > at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:138) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at org.apache.hadoop.util.RunJar.run(RunJar.java:221) > at org.apache.hadoop.util.RunJar.main(RunJar.java:136) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9457) Fix obsolete parameter name in HiveConf description of hive.hashtable.initialCapacity
[ https://issues.apache.org/jira/browse/HIVE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shannon Ladymon updated HIVE-9457: -- Attachment: HIVE-9457.2.patch Rebased patch attached > Fix obsolete parameter name in HiveConf description of > hive.hashtable.initialCapacity > - > > Key: HIVE-9457 > URL: https://issues.apache.org/jira/browse/HIVE-9457 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 0.14.0 >Reporter: Lefty Leverenz >Assignee: Shannon Ladymon >Priority: Minor > Attachments: HIVE-9457.2.patch, HIVE-9457.patch > > > The description of *hive.hashtable.initialCapacity* in HiveConf.java refers > to a parameter that existed in an early patch for HIVE-7616 > ("hive.hashtable.stats.key.estimate.adjustment") but was renamed in later > patches. So change *hive.hashtable.stats.key.estimate.adjustment* to > *hive.hashtable.key.count.adjustment* in this parameter definition in > HiveConf.java: > {code} > HIVEHASHTABLETHRESHOLD("hive.hashtable.initialCapacity", 10, "Initial > capacity of " + > "mapjoin hashtable if statistics are absent, or if > hive.hashtable.stats.key.estimate.adjustment is set to 0"), > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13281) Update some default configs for LLAP
[ https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194398#comment-15194398 ] Siddharth Seth commented on HIVE-13281: --- uber has not been tested enough for it to be default - in fact I think it is broken right now. If running with containers, containers would be faster. > Update some default configs for LLAP > > > Key: HIVE-13281 > URL: https://issues.apache.org/jira/browse/HIVE-13281 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth > > Disable uber mode. > Enable llap.io by default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13281) Update some default configs for LLAP
[ https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194378#comment-15194378 ] Sergey Shelukhin commented on HIVE-13281: - Hmm. Why? Isn't uber still faster than container, although slower than LLAP? Right now, uber is only disabled in "all" mode. > Update some default configs for LLAP > > > Key: HIVE-13281 > URL: https://issues.apache.org/jira/browse/HIVE-13281 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth > > Disable uber mode. > Enable llap.io by default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13241) LLAP: Incremental Caching marks some small chunks as "incomplete CB"
[ https://issues.apache.org/jira/browse/HIVE-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-13241: --- Assignee: Sergey Shelukhin > LLAP: Incremental Caching marks some small chunks as "incomplete CB" > > > Key: HIVE-13241 > URL: https://issues.apache.org/jira/browse/HIVE-13241 > Project: Hive > Issue Type: Bug >Reporter: Gopal V >Assignee: Sergey Shelukhin > > Run #3 of a query with 1 node still has cache misses. > {code} > LLAP IO Summary > -- > VERTICES ROWGROUPS META_HIT META_MISS DATA_HIT DATA_MISS ALLOCATION > USED TOTAL_IO > -- > Map 111 1116 01.65GB93.61MB 0B >0B32.72s > -- > {code} > {code} > 2016-03-08T21:05:39,417 INFO > [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: > encoded.EncodedReaderImpl > (EncodedReaderImpl.java:prepareRangesForCompressedRead(695)) - Locking > 0x1c44401d(1) due to reuse > 2016-03-08T21:05:39,417 INFO > [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: > encoded.EncodedReaderImpl > (EncodedReaderImpl.java:prepareRangesForCompressedRead(701)) - Adding an > already-uncompressed buffer 0x1c44401d(2) > 2016-03-08T21:05:39,417 INFO > [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: > encoded.EncodedReaderImpl > (EncodedReaderImpl.java:prepareRangesForCompressedRead(695)) - Locking > 0x4e51b032(1) due to reuse > 2016-03-08T21:05:39,417 INFO > [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: > encoded.EncodedReaderImpl > (EncodedReaderImpl.java:prepareRangesForCompressedRead(701)) - Adding an > already-uncompressed buffer 0x4e51b032(2) > 2016-03-08T21:05:39,418 INFO > [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: > encoded.EncodedReaderImpl > (EncodedReaderImpl.java:addOneCompressionBuffer(1161)) - Found CB at 1373931, > chunk length 86587, total 86590, compressed > 2016-03-08T21:05:39,418 INFO > [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: > encoded.EncodedReaderImpl > 
(EncodedReaderImpl.java:addIncompleteCompressionBuffer(1241)) - Replacing > data range [1373931, 1408408), size: 34474(!) type: direct (and 0 previous > chunks) with incomplete CB start: 1373931 end: 1408408 in the buffers > 2016-03-08T21:05:39,418 INFO > [IO-Elevator-Thread-9[attempt_1455662455106_2688_3_00_01_0]]: > encoded.EncodedReaderImpl > (EncodedReaderImpl.java:createRgColumnStreamData(441)) - Getting data for > column 7 RG 14 stream DATA at 1460521, 319811 index position 0: compressed > [1626961, 1780332) > {code} > {code} > 2016-03-08T21:05:38,925 INFO > [IO-Elevator-Thread-7[attempt_1455662455106_2688_3_00_01_0]]: > encoded.OrcEncodedDataReader (OrcEncodedDataReader.java:readFileData(878)) - > Disk ranges after disk read (file 5372745, base offset 3): [{start: 18986 > end: 20660 cache buffer: 0x660faf7c(1)}, {start: 20660 end: 35775 cache > buffer: 0x1dcb1d97(1)}, {start: 318852 end: 422353 cache buffer: > 0x6c7f9a05(1)}, {start: 1148616 end: 1262468 cache buffer: 0x196e1d41(1)}, > {start: 1262468 end: 1376342 cache buffer: 0x201255f(1)}, {data range > [1376342, 1410766), size: 34424 type: direct}, {start: 1631359 end: 1714694 > cache buffer: 0x47e3a72d(1)}, {start: 1714694 end: 1785770 cache buffer: > 0x57dca266(1)}, {start: 4975035 end: 5095215 cache buffer: 0x3e3139c9(1)}, > {start: 5095215 end: 5197863 cache buffer: 0x3511c88d(1)}, {start: 7448387 > end: 7572268 cache buffer: 0x6f11dbcd(1)}, {start: 7572268 end: 7696182 cache > buffer: 0x5d6c9bdb(1)}, {data range [7696182, 7710537), size: 14355 type: > direct}, {start: 8235756 end: 8345367 cache buffer: 0x6a241ece(1)}, {start: > 8345367 end: 8455009 cache buffer: 0x51caf6a7(1)}, {data range [8455009, > 8497906), size: 42897 type: direct}, {start: 9035815 end: 9159708 cache > buffer: 0x306480e0(1)}, {start: 9159708 end: 9283629 cache buffer: > 0x9ef7774(1)}, {data range [9283629, 9297965), size: 14336 type: direct}, > {start: 9989884 end: 10113731 cache buffer: 0x43f7cae9(1)}, {start: 10113731 > end: 
10237589 cache buffer: 0x458e63fe(1)}, {data range [10237589, 10252034), > size: 14445 type: direct}, {start: 11897896 end: 12021787 cache buffer: > 0x51f9982f(1)}, {start: 12021787 end: 12145656 cache buffer: 0x23df01b3(1)}, > {data range [12145656, 12160046), size: 14390 type: direct}, {start: 12851928 > end: 12975795 cache buffer: 0x5e0237a3(1)}, {start: 12975795 end: 13099664 >
[jira] [Resolved] (HIVE-13281) Update some default configs for LLAP
[ https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-13281. --- Resolution: Duplicate > Update some default configs for LLAP > > > Key: HIVE-13281 > URL: https://issues.apache.org/jira/browse/HIVE-13281 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth > > Disable uber mode. > Enable llap.io by default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13281) Update some default configs for LLAP
[ https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194374#comment-15194374 ] Siddharth Seth commented on HIVE-13281: --- The config property needs to be set to false for uber. Could you please do that in 12283 - I'll close this. > Update some default configs for LLAP > > > Key: HIVE-13281 > URL: https://issues.apache.org/jira/browse/HIVE-13281 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth > > Disable uber mode. > Enable llap.io by default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-4144) Add "select database()" command to show the current database
[ https://issues.apache.org/jira/browse/HIVE-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shannon Ladymon updated HIVE-4144: -- Labels: (was: TODOC13) > Add "select database()" command to show the current database > > > Key: HIVE-4144 > URL: https://issues.apache.org/jira/browse/HIVE-4144 > Project: Hive > Issue Type: Bug > Components: SQL >Reporter: Mark Grover >Assignee: Navis > Fix For: 0.13.0 > > Attachments: D9597.5.patch, HIVE-4144.10.patch.txt, > HIVE-4144.11.patch.txt, HIVE-4144.12.patch.txt, HIVE-4144.13.patch.txt, > HIVE-4144.14.patch.txt, HIVE-4144.6.patch.txt, HIVE-4144.7.patch.txt, > HIVE-4144.8.patch.txt, HIVE-4144.9.patch.txt, HIVE-4144.D9597.1.patch, > HIVE-4144.D9597.2.patch, HIVE-4144.D9597.3.patch, HIVE-4144.D9597.4.patch > > > A recent hive-user mailing list conversation asked about having a command to > show the current database. > http://mail-archives.apache.org/mod_mbox/hive-user/201303.mbox/%3CCAMGr+0i+CRY69m3id=DxthmUCWLf0NxpKMCtROb=uauh2va...@mail.gmail.com%3E > MySQL seems to have a command to do so: > {code} > select database(); > {code} > http://dev.mysql.com/doc/refman/5.0/en/information-functions.html#function_database > We should look into having something similar in Hive. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13281) Update some default configs for LLAP
[ https://issues.apache.org/jira/browse/HIVE-13281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194366#comment-15194366 ] Sergey Shelukhin commented on HIVE-13281: - That is already done in HIVE-13283 and HIVE-13218 > Update some default configs for LLAP > > > Key: HIVE-13281 > URL: https://issues.apache.org/jira/browse/HIVE-13281 > Project: Hive > Issue Type: Task >Reporter: Siddharth Seth > > Disable uber mode. > Enable llap.io by default -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194364#comment-15194364 ] Sergey Shelukhin commented on HIVE-13283: - Enabling io by default has to be done carefully. It would enable IO outside of the daemon. > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194364#comment-15194364 ] Sergey Shelukhin edited comment on HIVE-13283 at 3/14/16 11:17 PM: --- Enabling io by default has to be done carefully. It would enable IO outside of the daemon. Hence this JIRA was (Author: sershe): Enabling io by default has to be done carefully. It would enable IO outside of the daemon. > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9660) store end offset of compressed data for RG in RowIndex in ORC
[ https://issues.apache.org/jira/browse/HIVE-9660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-9660: --- Attachment: HIVE-9660.WIP1.patch > store end offset of compressed data for RG in RowIndex in ORC > - > > Key: HIVE-9660 > URL: https://issues.apache.org/jira/browse/HIVE-9660 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-9660.WIP0.patch, HIVE-9660.WIP1.patch > > > Right now the end offset is estimated, which in some cases results in tons of > extra data being read. > We can add a separate array to RowIndex (positions_v2?) that stores number of > compressed buffers for each RG, or end offset, or something, to remove this > estimation magic -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194281#comment-15194281 ] Sergey Shelukhin edited comment on HIVE-13283 at 3/14/16 11:15 PM: --- [~hagleitn] [~vikram.dixit] -can you take a look- -actually, nm, this will not do what is needed- ok, now it's ready was (Author: sershe): [~hagleitn] [~vikram.dixit] -can you take a look- actually, nm, this will not do what is needed > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194362#comment-15194362 ] Alan Gates commented on HIVE-13249: --- I think the problem with having the thread in TxnHandler instead of AcidHousekeeper is that every client will be independently deciding whether the system has too many transactions. This has a couple of problems. One, it's inefficient as every client is running the count query. But two, it means differing configurations could result in some clients seeing the system as overloaded and being locked out while others are not. In fact, a malicious client could game the system and set his config high so that he can continue to open transactions when other clients cannot. On the 90% what I'm suggesting is this: # In the initial state it accepts new transactions until it hits X number total transactions open, where X is the configured value # When it hits X, a full flag is set # Once the full flag is set no new transactions are allowed in # The full flag is not unset until the number of open transactions hits X * 0.9. This is standard procedure with thresholds so that you give the system some time to drain off rather than building up a set of clients retrying on opening their transactions that all race to get that one available transaction once one transaction commits or aborts. > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
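The 90% drain-off threshold Alan describes is a standard hysteresis pattern. A minimal sketch of that logic, with hypothetical names (this is not the HIVE-13249 patch code, just an illustration of the state machine described above):

```java
// Hysteresis throttle for open transactions, as described in the comment:
// - admit new transactions until the count reaches the configured max X;
// - once X is hit, set a "full" flag and reject new transactions;
// - clear the flag only after the count drains below X * 0.9, so clients
//   don't race for single slots right at the limit.
class TxnThrottle {
    private final int maxOpenTxns;
    private boolean full = false;

    TxnThrottle(int maxOpenTxns) {
        this.maxOpenTxns = maxOpenTxns;
    }

    /** Returns true if a new open-transaction request should be admitted. */
    boolean admit(int currentOpenTxns) {
        if (currentOpenTxns >= maxOpenTxns) {
            full = true;                           // hit the hard upper bound
        } else if (full && currentOpenTxns < maxOpenTxns * 0.9) {
            full = false;                          // drained below 90%: reopen
        }
        return !full;
    }
}
```

Note how a count of 95 against a max of 100 is still rejected once the flag is set, even though it is below the limit; that is the drain-off behavior the comment is asking for.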
[jira] [Commented] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194361#comment-15194361 ] Siddharth Seth commented on HIVE-13283: --- HIVE-13281 > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13283: Attachment: HIVE-13283.patch > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194349#comment-15194349 ] Wei Zheng commented on HIVE-13249: -- Thanks [~alangates] for the review. I had a discussion with [~ekoifman] last week and we thought it would be better to have a separate thread pool for that. We can revisit this and see if AcidHouseKeeperService is a good fit for reuse. For the open transactions limit, we just want to fail incoming open-transaction requests if the number from the background counter is above the threshold. So even if e.g. we let it in when under 90%, it will still lurch in and out around that 90% line. > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves
[ https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13261: --- Status: Patch Available (was: Open) > Can not compute column stats for partition when schema evolves > -- > > Key: HIVE-13261 > URL: https://issues.apache.org/jira/browse/HIVE-13261 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13261.01.patch > > > To repro > {code} > CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS > TEXTFILE; > insert into table partitioned1 partition(part=1) values(1, 'original'),(2, > 'original'), (3, 'original'),(4, 'original'); > -- Table-Non-Cascade ADD COLUMNS ... > alter table partitioned1 add columns(c int, d string); > insert into table partitioned1 partition(part=2) values(1, 'new', 10, > 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, > 'forty'); > insert into table partitioned1 partition(part=1) values(5, 'new', 100, > 'hundred'),(6, 'new', 200, 'two hundred'); > analyze table partitioned1 compute statistics for columns; > {code} > Error msg: > {code} > 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: > metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - > NoSuchObjectException(message:Column c for which stats gathering is requested > doesn't exist.) > at > org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492) > at > org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves
[ https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13261: --- Attachment: (was: HIVE-13261.01.patch) > Can not compute column stats for partition when schema evolves > -- > > Key: HIVE-13261 > URL: https://issues.apache.org/jira/browse/HIVE-13261 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13261.01.patch > > > To repro > {code} > CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS > TEXTFILE; > insert into table partitioned1 partition(part=1) values(1, 'original'),(2, > 'original'), (3, 'original'),(4, 'original'); > -- Table-Non-Cascade ADD COLUMNS ... > alter table partitioned1 add columns(c int, d string); > insert into table partitioned1 partition(part=2) values(1, 'new', 10, > 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, > 'forty'); > insert into table partitioned1 partition(part=1) values(5, 'new', 100, > 'hundred'),(6, 'new', 200, 'two hundred'); > analyze table partitioned1 compute statistics for columns; > {code} > Error msg: > {code} > 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: > metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - > NoSuchObjectException(message:Column c for which stats gathering is requested > doesn't exist.) > at > org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492) > at > org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves
[ https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13261: --- Attachment: HIVE-13261.01.patch > Can not compute column stats for partition when schema evolves > -- > > Key: HIVE-13261 > URL: https://issues.apache.org/jira/browse/HIVE-13261 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13261.01.patch > > > To repro > {code} > CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS > TEXTFILE; > insert into table partitioned1 partition(part=1) values(1, 'original'),(2, > 'original'), (3, 'original'),(4, 'original'); > -- Table-Non-Cascade ADD COLUMNS ... > alter table partitioned1 add columns(c int, d string); > insert into table partitioned1 partition(part=2) values(1, 'new', 10, > 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, > 'forty'); > insert into table partitioned1 partition(part=1) values(5, 'new', 100, > 'hundred'),(6, 'new', 200, 'two hundred'); > analyze table partitioned1 compute statistics for columns; > {code} > Error msg: > {code} > 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: > metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - > NoSuchObjectException(message:Column c for which stats gathering is requested > doesn't exist.) > at > org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492) > at > org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13261) Can not compute column stats for partition when schema evolves
[ https://issues.apache.org/jira/browse/HIVE-13261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13261: --- Attachment: (was: HIVE-13261.01.patch) > Can not compute column stats for partition when schema evolves > -- > > Key: HIVE-13261 > URL: https://issues.apache.org/jira/browse/HIVE-13261 > Project: Hive > Issue Type: Bug >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13261.01.patch > > > To repro > {code} > CREATE TABLE partitioned1(a INT, b STRING) PARTITIONED BY(part INT) STORED AS > TEXTFILE; > insert into table partitioned1 partition(part=1) values(1, 'original'),(2, > 'original'), (3, 'original'),(4, 'original'); > -- Table-Non-Cascade ADD COLUMNS ... > alter table partitioned1 add columns(c int, d string); > insert into table partitioned1 partition(part=2) values(1, 'new', 10, > 'ten'),(2, 'new', 20, 'twenty'), (3, 'new', 30, 'thirty'),(4, 'new', 40, > 'forty'); > insert into table partitioned1 partition(part=1) values(5, 'new', 100, > 'hundred'),(6, 'new', 200, 'two hundred'); > analyze table partitioned1 compute statistics for columns; > {code} > Error msg: > {code} > 2016-03-10T14:55:43,205 ERROR [abc3eb8d-7432-47ae-b76f-54c8d7020312 main[]]: > metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(177)) - > NoSuchObjectException(message:Column c for which stats gathering is requested > doesn't exist.) > at > org.apache.hadoop.hive.metastore.ObjectStore.writeMPartitionColumnStatistics(ObjectStore.java:6492) > at > org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:6574) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13223) HoS may hang for queries that run on 0 splits
[ https://issues.apache.org/jira/browse/HIVE-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194334#comment-15194334 ] Sergey Shelukhin commented on HIVE-13223: - You should test if it did and file a separate JIRA assigned to [~ashutoshc] if so ;) > HoS may hang for queries that run on 0 splits > --- > > Key: HIVE-13223 > URL: https://issues.apache.org/jira/browse/HIVE-13223 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13223.1.patch, HIVE-13223.patch > > > Can be seen on all timed out tests after HIVE-13040 went in -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13125) Support masking and filtering of rows/columns
[ https://issues.apache.org/jira/browse/HIVE-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13125: --- Status: Open (was: Patch Available) > Support masking and filtering of rows/columns > - > > Key: HIVE-13125 > URL: https://issues.apache.org/jira/browse/HIVE-13125 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13125.01.patch, HIVE-13125.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13125) Support masking and filtering of rows/columns
[ https://issues.apache.org/jira/browse/HIVE-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13125: --- Status: Patch Available (was: Open) > Support masking and filtering of rows/columns > - > > Key: HIVE-13125 > URL: https://issues.apache.org/jira/browse/HIVE-13125 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13125.01.patch, HIVE-13125.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13125) Support masking and filtering of rows/columns
[ https://issues.apache.org/jira/browse/HIVE-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-13125: --- Attachment: HIVE-13125.02.patch > Support masking and filtering of rows/columns > - > > Key: HIVE-13125 > URL: https://issues.apache.org/jira/browse/HIVE-13125 > Project: Hive > Issue Type: New Feature >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-13125.01.patch, HIVE-13125.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13223) HoS may hang for queries that run on 0 splits
[ https://issues.apache.org/jira/browse/HIVE-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194320#comment-15194320 ] Prasanth Jayachandran commented on HIVE-13223: -- AFAIK, filedump worked before this empty file bucket changes (HIVE-13040). > HoS may hang for queries that run on 0 splits > --- > > Key: HIVE-13223 > URL: https://issues.apache.org/jira/browse/HIVE-13223 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13223.1.patch, HIVE-13223.patch > > > Can be seen on all timed out tests after HIVE-13040 went in -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13223) HoS may hang for queries that run on 0 splits
[ https://issues.apache.org/jira/browse/HIVE-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194318#comment-15194318 ] Sergey Shelukhin commented on HIVE-13223: - +1 pending tests. [~prasanth_j] wrong JIRA? :) > HoS may hang for queries that run on 0 splits > --- > > Key: HIVE-13223 > URL: https://issues.apache.org/jira/browse/HIVE-13223 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13223.1.patch, HIVE-13223.patch > > > Can be seen on all timed out tests after HIVE-13040 went in -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13223) HoS may hang for queries that run on 0 splits
[ https://issues.apache.org/jira/browse/HIVE-13223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194305#comment-15194305 ] Prasanth Jayachandran commented on HIVE-13223: -- I created an empty ORC file and ran orcfiledump on it. It threw exception {code} create table concat_incompat(key string, value string) stored as orc; insert overwrite table concat_incompat select * from src where key > 1000; // return 0 rows hive --orcfiledump file:///app/warehouse/concat_incompat/00_0 Processing data file file:/app/warehouse/concat_incompat/00_0 [length: 0] Exception in thread "main" java.lang.IndexOutOfBoundsException at java.nio.Buffer.checkIndex(Buffer.java:540) at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139) at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.extractMetaInfoFromFooter(ReaderImpl.java:510) at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.(ReaderImpl.java:361) at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:83) at org.apache.hadoop.hive.ql.io.orc.FileDump.getReader(FileDump.java:239) at org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaDataImpl(FileDump.java:312) at org.apache.hadoop.hive.ql.io.orc.FileDump.printMetaData(FileDump.java:291) at org.apache.hadoop.hive.ql.io.orc.FileDump.main(FileDump.java:138) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) {code} > HoS may hang for queries that run on 0 splits > --- > > Key: HIVE-13223 > URL: https://issues.apache.org/jira/browse/HIVE-13223 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.1.0 >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-13223.1.patch, 
HIVE-13223.patch > > > Can be seen on all timed out tests after HIVE-13040 went in -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194281#comment-15194281 ] Sergey Shelukhin edited comment on HIVE-13283 at 3/14/16 10:20 PM: --- [~hagleitn] [~vikram.dixit] -can you take a look- actually, nm, this will not do what is needed was (Author: sershe): [~hagleitn] [~vikram.dixit] can you take a look? > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13283: Attachment: (was: HIVE-13283.patch) > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13283: Attachment: HIVE-13283.patch [~hagleitn] [~vikram.dixit] can you take a look? > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13283: Status: Patch Available (was: Open) > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-13283.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10176) skip.header.line.count causes values to be skipped when performing insert values
[ https://issues.apache.org/jira/browse/HIVE-10176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194279#comment-15194279 ] Hive QA commented on HIVE-10176: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12793284/HIVE-10176.3.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7267/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7267/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7267/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7267/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p 
maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin >From https://github.com/apache/hive 214e4b6..1c44f4c branch-1 -> origin/branch-1 d4c1fdc..b6af012 master -> origin/master + git reset --hard HEAD HEAD is now at d4c1fdc HIVE-13251: hive can't read the decimal in AVRO file generated from previous version (Reviewed by Szehon Ho) + git clean -f -d + git checkout master Already on 'master' Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded. + git reset --hard origin/master HEAD is now at b6af012 HIVE-13201 : Compaction shouldn't be allowed on non-ACID table (Wei Zheng, reviewed by Alan Gates) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12793284 - PreCommit-HIVE-TRUNK-Build > skip.header.line.count causes values to be skipped when performing insert > values > > > Key: HIVE-10176 > URL: https://issues.apache.org/jira/browse/HIVE-10176 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Wenbo Wang >Assignee: Vladyslav Pavlenko > Attachments: HIVE-10176.1.patch, HIVE-10176.2.patch, > HIVE-10176.3.patch, data > > > When inserting values in to tables with TBLPROPERTIES > ("skip.header.line.count"="1") the first value listed is also skipped. 
> create table test (row int, name string) TBLPROPERTIES > ("skip.header.line.count"="1"); > load data local inpath '/root/data' into table test; > insert into table test values (1, 'a'), (2, 'b'), (3, 'c'); > (1, 'a') isn't inserted into the table. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
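The reported behavior is consistent with the reader applying skip.header.line.count to every file it is handed, including the file that INSERT ... VALUES just materialized. A plain-Java simulation of that mechanism (illustrative only, not Hive code):

```java
import java.util.Arrays;
import java.util.List;

public class HeaderSkipSimulation {
    // Simulation of the bug: the reader skips the configured number of
    // "header" lines from whatever file it is given, including the temp
    // file that INSERT ... VALUES just wrote.
    static List<String> readTable(List<String> fileLines, int skipHeaderLineCount) {
        int from = Math.min(skipHeaderLineCount, fileLines.size());
        return fileLines.subList(from, fileLines.size());
    }

    public static void main(String[] args) {
        List<String> inserted = Arrays.asList("1\ta", "2\tb", "3\tc");
        List<String> visible = readTable(inserted, 1);
        System.out.println(visible); // (1, 'a') is gone, matching the report
    }
}
```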
[jira] [Commented] (HIVE-9457) Fix obsolete parameter name in HiveConf description of hive.hashtable.initialCapacity
[ https://issues.apache.org/jira/browse/HIVE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194277#comment-15194277 ] Sergey Shelukhin commented on HIVE-9457: +1 > Fix obsolete parameter name in HiveConf description of > hive.hashtable.initialCapacity > - > > Key: HIVE-9457 > URL: https://issues.apache.org/jira/browse/HIVE-9457 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 0.14.0 >Reporter: Lefty Leverenz >Assignee: Shannon Ladymon >Priority: Minor > Attachments: HIVE-9457.patch > > > The description of *hive.hashtable.initialCapacity* in HiveConf.java refers > to a parameter that existed in an early patch for HIVE-7616 > ("hive.hashtable.stats.key.estimate.adjustment") but was renamed in later > patches. So change *hive.hashtable.stats.key.estimate.adjustment* to > *hive.hashtable.key.count.adjustment* in this parameter definition in > HiveConf.java: > {code} > HIVEHASHTABLETHRESHOLD("hive.hashtable.initialCapacity", 10, "Initial > capacity of " + > "mapjoin hashtable if statistics are absent, or if > hive.hashtable.stats.key.estimate.adjustment is set to 0"), > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
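Applying the rename described in the issue to the snippet it quotes, the corrected entry in HiveConf.java would presumably read (a sketch of the one-line change, not the actual patch contents):

```java
HIVEHASHTABLETHRESHOLD("hive.hashtable.initialCapacity", 10, "Initial capacity of " +
    "mapjoin hashtable if statistics are absent, or if hive.hashtable.key.count.adjustment is set to 0"),
```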
[jira] [Commented] (HIVE-13183) More logs in operation logs
[ https://issues.apache.org/jira/browse/HIVE-13183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194272#comment-15194272 ] Hive QA commented on HIVE-13183: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12793283/HIVE-13183.02.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9820 tests executed *Failed tests:* {noformat} TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8 org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver_tez_join_hash org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_join_hash {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7266/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7266/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7266/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12793283 - PreCommit-HIVE-TRUNK-Build > More logs in operation logs > --- > > Key: HIVE-13183 > URL: https://issues.apache.org/jira/browse/HIVE-13183 > Project: Hive > Issue Type: Improvement >Reporter: Rajat Khandelwal >Assignee: Rajat Khandelwal > Attachments: HIVE-13183.02.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9457) Fix obsolete parameter name in HiveConf description of hive.hashtable.initialCapacity
[ https://issues.apache.org/jira/browse/HIVE-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194268#comment-15194268 ] Shannon Ladymon commented on HIVE-9457: --- [~sershe], if you get a chance could you review this? It's a small patch updating the description of a parameter in HiveConf. > Fix obsolete parameter name in HiveConf description of > hive.hashtable.initialCapacity > - > > Key: HIVE-9457 > URL: https://issues.apache.org/jira/browse/HIVE-9457 > Project: Hive > Issue Type: Bug > Components: Documentation >Affects Versions: 0.14.0 >Reporter: Lefty Leverenz >Assignee: Shannon Ladymon >Priority: Minor > Attachments: HIVE-9457.patch > > > The description of *hive.hashtable.initialCapacity* in HiveConf.java refers > to a parameter that existed in an early patch for HIVE-7616 > ("hive.hashtable.stats.key.estimate.adjustment") but was renamed in later > patches. So change *hive.hashtable.stats.key.estimate.adjustment* to > *hive.hashtable.key.count.adjustment* in this parameter definition in > HiveConf.java: > {code} > HIVEHASHTABLETHRESHOLD("hive.hashtable.initialCapacity", 10, "Initial > capacity of " + > "mapjoin hashtable if statistics are absent, or if > hive.hashtable.stats.key.estimate.adjustment is set to 0"), > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-13283) LLAP: make sure IO elevator is enabled by default in the daemons
[ https://issues.apache.org/jira/browse/HIVE-13283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-13283: --- Assignee: Sergey Shelukhin > LLAP: make sure IO elevator is enabled by default in the daemons > > > Key: HIVE-13283 > URL: https://issues.apache.org/jira/browse/HIVE-13283 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13235) Insert from select generates incorrect result when hive.optimize.constant.propagation is on
[ https://issues.apache.org/jira/browse/HIVE-13235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13235: Status: Patch Available (was: Open) Attached patch-2: for the cases when the column has both name and alias, we will use NamedColumnInfo which will match against column name during comparison rather than alias since alias is not visible yet for such cases. > Insert from select generates incorrect result when > hive.optimize.constant.propagation is on > --- > > Key: HIVE-13235 > URL: https://issues.apache.org/jira/browse/HIVE-13235 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13235.1.patch, HIVE-13235.2.patch > > > The following query returns incorrect result when constant optimization is > turned on. The subquery happens to have an alias p1 to be the same as the > input partition name. Constant optimizer will optimize it incorrectly as the > constant. > When constant optimizer is turned off, we will get the correct result. > {noformat} > set hive.cbo.enable=false; > set hive.optimize.constant.propagation = true; > create table t1(c1 string, c2 double) partitioned by (p1 string, p2 string); > create table t2(p1 double, c2 string); > insert into table t1 partition(p1='40', p2='p2') values('c1', 0.0); > INSERT OVERWRITE TABLE t2 select if((c2 = 0.0), c2, '0') as p1, 2 as p2 from > t1 where c1 = 'c1' and p1 = '40'; > select * from t2; > 40 2 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
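The alias collision described above can be illustrated with a small name-lookup simulation (plain Java, not the optimizer's real code): if the constant folder resolves references by name alone, the SELECT alias p1 is indistinguishable from the partition column pinned to '40' by the WHERE clause, so the aliased expression folds to the constant:

```java
import java.util.HashMap;
import java.util.Map;

public class AliasFoldingSimulation {
    // Simulation: fold a column reference to a constant when a constant is
    // known for that *name*, ignoring whether the name is really an alias.
    static String fold(String columnRef, Map<String, String> knownConstants) {
        return knownConstants.getOrDefault(columnRef, columnRef);
    }

    public static void main(String[] args) {
        Map<String, String> constants = new HashMap<>();
        constants.put("p1", "40"); // derived from WHERE ... p1 = '40'

        // The expression aliased AS p1 is wrongly folded to the constant,
        // while an ordinary column reference is left alone.
        System.out.println(fold("p1", constants));
        System.out.println(fold("c2", constants));
    }
}
```

Matching by the underlying column name instead of the alias, as the patch comment describes, removes the collision.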
[jira] [Updated] (HIVE-12977) Pass credentials in the current UGI while creating Tez session
[ https://issues.apache.org/jira/browse/HIVE-12977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Sathappan updated HIVE-12977: Attachment: HIVE-12977.1.patch > Pass credentials in the current UGI while creating Tez session > -- > > Key: HIVE-12977 > URL: https://issues.apache.org/jira/browse/HIVE-12977 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Vinoth Sathappan >Assignee: Vinoth Sathappan > Attachments: HIVE-12977.1.patch, HIVE-12977.1.patch > > > The credentials present in the current UGI i.e. > UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the > Tez session. It is instantiated with null credentials > session = TezClient.create("HIVE-" + sessionId, tezConfig, true, > commonLocalResources, null); > In this case, Tez fails to access resources even if the tokens are available > in memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
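Based solely on the call quoted in the issue description, the intended change is presumably to pass the current UGI's credentials in place of the null argument (a sketch, not the actual patch contents):

```java
// Sketch: forward the caller's tokens instead of passing null credentials.
Credentials credentials = UserGroupInformation.getCurrentUser().getCredentials();
session = TezClient.create("HIVE-" + sessionId, tezConfig, true,
    commonLocalResources, credentials);
```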
[jira] [Updated] (HIVE-13201) Compaction shouldn't be allowed on non-ACID table
[ https://issues.apache.org/jira/browse/HIVE-13201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13201: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed to master and branch-1 > Compaction shouldn't be allowed on non-ACID table > - > > Key: HIVE-13201 > URL: https://issues.apache.org/jira/browse/HIVE-13201 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13201.1.patch, HIVE-13201.2.patch > > > Looks like compaction is allowed on non-ACID table, although that makes no > sense and does nothing. Moreover the compaction request will be enqueued into > COMPACTION_QUEUE metastore table, which brings unnecessary overhead. > We should prevent compaction commands being allowed on non-ACID tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13235) Insert from select generates incorrect result when hive.optimize.constant.propagation is on
[ https://issues.apache.org/jira/browse/HIVE-13235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13235: Attachment: HIVE-13235.2.patch > Insert from select generates incorrect result when > hive.optimize.constant.propagation is on > --- > > Key: HIVE-13235 > URL: https://issues.apache.org/jira/browse/HIVE-13235 > Project: Hive > Issue Type: Bug > Components: Query Planning >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13235.1.patch, HIVE-13235.2.patch > > > The following query returns incorrect result when constant optimization is > turned on. The subquery happens to have an alias p1 to be the same as the > input partition name. Constant optimizer will optimize it incorrectly as the > constant. > When constant optimizer is turned off, we will get the correct result. > {noformat} > set hive.cbo.enable=false; > set hive.optimize.constant.propagation = true; > create table t1(c1 string, c2 double) partitioned by (p1 string, p2 string); > create table t2(p1 double, c2 string); > insert into table t1 partition(p1='40', p2='p2') values('c1', 0.0); > INSERT OVERWRITE TABLE t2 select if((c2 = 0.0), c2, '0') as p1, 2 as p2 from > t1 where c1 = 'c1' and p1 = '40'; > select * from t2; > 40 2 > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13201) Compaction shouldn't be allowed on non-ACID table
[ https://issues.apache.org/jira/browse/HIVE-13201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei Zheng updated HIVE-13201: - Attachment: HIVE-13201.2.patch Thanks [~alangates]. I've updated those 3 tests to make the tables ACID. > Compaction shouldn't be allowed on non-ACID table > - > > Key: HIVE-13201 > URL: https://issues.apache.org/jira/browse/HIVE-13201 > Project: Hive > Issue Type: Bug > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13201.1.patch, HIVE-13201.2.patch > > > Looks like compaction is allowed on non-ACID table, although that makes no > sense and does nothing. Moreover the compaction request will be enqueued into > COMPACTION_QUEUE metastore table, which brings unnecessary overhead. > We should prevent compaction commands being allowed on non-ACID tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-11292) MiniLlapCliDriver for running tests in llap
[ https://issues.apache.org/jira/browse/HIVE-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vikram Dixit K resolved HIVE-11292. --- Resolution: Fixed Fix Version/s: 2.0.0 > MiniLlapCliDriver for running tests in llap > --- > > Key: HIVE-11292 > URL: https://issues.apache.org/jira/browse/HIVE-11292 > Project: Hive > Issue Type: Bug > Components: Test >Affects Versions: llap >Reporter: Vikram Dixit K >Assignee: Vikram Dixit K > Fix For: 2.0.0 > > > Create MiniLlapCliDriver for running unit tests in llap mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13084: Status: Patch Available (was: In Progress) > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > HIVE-13084.03.patch, vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-13084: Status: In Progress (was: Patch Available) > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194098#comment-15194098 ] Alan Gates commented on HIVE-13249: --- There's already a threadpool in AcidHouseKeeperService. This should use that rather than having its own separate threadpool. Once the number of open transactions exceeds the threshold you should require it to drain a ways below that (maybe 90% of the threshold) before allowing new transactions. This avoids it constantly lurching in and out of trouble. > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
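The drain-below-threshold behavior described in the review comment can be sketched as a small hysteresis gate (illustrative Java, with a 50% low-water mark for a clearer demo than the suggested 90%):

```java
public class OpenTxnGate {
    private final int limit;
    private final int lowWater;
    private int open = 0;
    private boolean rejecting = false;

    OpenTxnGate(int limit, double drainFraction) {
        this.limit = limit;
        this.lowWater = (int) (limit * drainFraction);
    }

    // Returns false while the gate is rejecting new transactions.
    synchronized boolean tryOpen() {
        if (rejecting) {
            if (open > lowWater) {
                return false; // still draining toward the low-water mark
            }
            rejecting = false;
        }
        open++;
        if (open >= limit) {
            rejecting = true; // hard limit hit; reject until drained
        }
        return true;
    }

    synchronized void close() {
        open--;
    }

    public static void main(String[] args) {
        OpenTxnGate gate = new OpenTxnGate(10, 0.5); // lowWater = 5
        for (int i = 0; i < 10; i++) {
            System.out.println(gate.tryOpen()); // all true, up to the limit
        }
        System.out.println(gate.tryOpen());       // false: limit reached
        for (int i = 0; i < 5; i++) gate.close(); // drain to the low-water mark
        System.out.println(gate.tryOpen());       // true: accepting again
    }
}
```

Requiring the count to fall well below the limit before re-admitting, rather than re-admitting at limit minus one, is what prevents the constant lurching in and out of trouble.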
[jira] [Updated] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Gates updated HIVE-13249: -- Status: Open (was: Patch Available) > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch > > > We need to have a safeguard by adding an upper bound for open transactions to > avoid huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12977) Pass credentials in the current UGI while creating Tez session
[ https://issues.apache.org/jira/browse/HIVE-12977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Sathappan updated HIVE-12977: Status: Patch Available (was: Open) > Pass credentials in the current UGI while creating Tez session > -- > > Key: HIVE-12977 > URL: https://issues.apache.org/jira/browse/HIVE-12977 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Vinoth Sathappan >Assignee: Vinoth Sathappan > Attachments: HIVE-12977.1.patch > > > The credentials present in the current UGI i.e. > UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the > Tez session. It is instantiated with null credentials > session = TezClient.create("HIVE-" + sessionId, tezConfig, true, > commonLocalResources, null); > In this case, Tez fails to access resources even if the tokens are available > in memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12977) Pass credentials in the current UGI while creating Tez session
[ https://issues.apache.org/jira/browse/HIVE-12977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinoth Sathappan updated HIVE-12977: Status: Open (was: Patch Available) > Pass credentials in the current UGI while creating Tez session > -- > > Key: HIVE-12977 > URL: https://issues.apache.org/jira/browse/HIVE-12977 > Project: Hive > Issue Type: Bug > Components: Tez >Reporter: Vinoth Sathappan >Assignee: Vinoth Sathappan > Attachments: HIVE-12977.1.patch > > > The credentials present in the current UGI i.e. > UserGroupInformation.getCurrentUser().getCredentials() isn't passed to the > Tez session. It is instantiated with null credentials > session = TezClient.create("HIVE-" + sessionId, tezConfig, true, > commonLocalResources, null); > In this case, Tez fails to access resources even if the tokens are available > in memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12481) Occasionally "Request is a replay" will be thrown from HS2
[ https://issues.apache.org/jira/browse/HIVE-12481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-12481: Resolution: Fixed Fix Version/s: 2.1.0 Release Note: Added a new JDBC connection property "retries" so if any transport connection fails, JDBC client will retry for the times specified by this parameter. Status: Resolved (was: Patch Available) > Occasionally "Request is a replay" will be thrown from HS2 > -- > > Key: HIVE-12481 > URL: https://issues.apache.org/jira/browse/HIVE-12481 > Project: Hive > Issue Type: Improvement > Components: Authentication >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-12481.2.patch, HIVE-12481.3.patch, HIVE-12481.patch > > > We have seen the following exception thrown from HS2 in secured cluster when > many queries are running simultaneously on single HS2 instance. > The cause I can guess is that it happens that two queries are submitted at > the same time and have the same timestamp. For such case, we can add a retry > for the query. 
> > {noformat} > 2015-11-18 16:12:33,117 ERROR org.apache.thrift.transport.TSaslTransport: > SASL negotiation failure > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: Failure unspecified at GSS-API level (Mechanism level: Request > is a replay (34))] > at > com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:177) > at > org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539) > at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:356) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism > level: Request is a replay (34)) > at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788) > at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342) > at 
sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285) > at > com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:155) > ... 14 more > Caused by: KrbException: Request is a replay (34) > at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:308) > at sun.security.krb5.KrbApReq.(KrbApReq.java:144) > at > sun.security.jgss.krb5.InitSecContextToken.(InitSecContextToken.java:108) > at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:771) > ... 17 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
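The "retries" behavior from the release note can be sketched as a generic retry wrapper (illustrative Java; the names are hypothetical, not Hive's JDBC internals):

```java
public class ReplayRetry {
    interface Op<T> { T run() throws Exception; }

    // Retry wrapper: one initial attempt plus `retries` re-attempts; the last
    // failure is rethrown (wrapped) once all attempts are exhausted.
    static <T> T withRetries(int retries, Op<T> op) {
        Exception last = null;
        for (int attempt = 0; attempt <= retries; attempt++) {
            try {
                return op.run();
            } catch (Exception e) {
                last = e; // e.g. "GSS initiate failed ... Request is a replay (34)"
            }
        }
        throw new RuntimeException("all attempts failed", last);
    }

    // Stand-in for opening the SASL transport: fails once with the replay
    // error, then succeeds, as when two requests shared the same timestamp.
    static String flakyConnect(int[] attempts) {
        if (attempts[0]++ == 0) {
            throw new RuntimeException("Request is a replay (34)");
        }
        return "connected";
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        System.out.println(withRetries(2, () -> flakyConnect(attempts)));
    }
}
```

A retry is a reasonable mitigation here because the replay rejection is transient: the colliding timestamp differs on the next attempt.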
[jira] [Updated] (HIVE-12481) Occasionally "Request is a replay" will be thrown from HS2
[ https://issues.apache.org/jira/browse/HIVE-12481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-12481: Labels: TODOC2.1 (was: ) > Occasionally "Request is a replay" will be thrown from HS2 > -- > > Key: HIVE-12481 > URL: https://issues.apache.org/jira/browse/HIVE-12481 > Project: Hive > Issue Type: Improvement > Components: Authentication >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Labels: TODOC2.1 > Fix For: 2.1.0 > > Attachments: HIVE-12481.2.patch, HIVE-12481.3.patch, HIVE-12481.patch > > > We have seen the following exception thrown from HS2 in secured cluster when > many queries are running simultaneously on single HS2 instance. > The cause I can guess is that it happens that two queries are submitted at > the same time and have the same timestamp. For such case, we can add a retry > for the query. > > {noformat} > 2015-11-18 16:12:33,117 ERROR org.apache.thrift.transport.TSaslTransport: > SASL negotiation failure > javax.security.sasl.SaslException: GSS initiate failed [Caused by > GSSException: Failure unspecified at GSS-API level (Mechanism level: Request > is a replay (34))] > at > com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:177) > at > org.apache.thrift.transport.TSaslTransport$SaslParticipant.evaluateChallengeOrResponse(TSaslTransport.java:539) > at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:283) > at > org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41) > at > org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736) > at java.security.AccessController.doPrivileged(Native Method) > at 
javax.security.auth.Subject.doAs(Subject.java:356) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1651) > at > org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736) > at > org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Caused by: GSSException: Failure unspecified at GSS-API level (Mechanism > level: Request is a replay (34)) > at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:788) > at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:342) > at sun.security.jgss.GSSContextImpl.acceptSecContext(GSSContextImpl.java:285) > at > com.sun.security.sasl.gsskerb.GssKrb5Server.evaluateResponse(GssKrb5Server.java:155) > ... 14 more > Caused by: KrbException: Request is a replay (34) > at sun.security.krb5.KrbApReq.authenticate(KrbApReq.java:308) > at sun.security.krb5.KrbApReq.(KrbApReq.java:144) > at > sun.security.jgss.krb5.InitSecContextToken.(InitSecContextToken.java:108) > at sun.security.jgss.krb5.Krb5Context.acceptSecContext(Krb5Context.java:771) > ... 17 more > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
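The retry suggested in the description above could look roughly like the following. This is an illustrative sketch only, not the actual HIVE-12481 patch: the `isReplayError` helper and the `Runnable`-based API are hypothetical stand-ins, and the real fix would live in Hive's Thrift transport handling.

```java
// Illustrative sketch of the retry idea from the description: if SASL
// negotiation fails with a Kerberos "Request is a replay (34)" error,
// retry the operation a bounded number of times. The exception-matching
// helper and the Runnable-based API are hypothetical stand-ins.
class ReplayRetry {
    static final int MAX_ATTEMPTS = 3;

    // A replay failure surfaces as a (possibly nested) exception whose
    // message contains "Request is a replay".
    static boolean isReplayError(Throwable t) {
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains("Request is a replay")) {
                return true;
            }
        }
        return false;
    }

    // Run the action, retrying only on replay errors.
    static void runWithRetry(Runnable action) {
        for (int attempt = 1; ; attempt++) {
            try {
                action.run();
                return;
            } catch (RuntimeException e) {
                if (!isReplayError(e) || attempt >= MAX_ATTEMPTS) {
                    throw e;
                }
                // Back off briefly so the next attempt gets a fresh timestamp.
                try {
                    Thread.sleep(10L * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

The key point is that only the replay failure is retried; any other SASL error still propagates immediately.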
[jira] [Commented] (HIVE-13249) Hard upper bound on number of open transactions
[ https://issues.apache.org/jira/browse/HIVE-13249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193937#comment-15193937 ] Hive QA commented on HIVE-13249: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12793243/HIVE-13249.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7265/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7265/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7265/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n /usr/java/jdk1.7.0_45-cloudera ]] + export JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + JAVA_HOME=/usr/java/jdk1.7.0_45-cloudera + export PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + PATH=/usr/java/jdk1.7.0_45-cloudera/bin/:/usr/local/apache-maven-3.0.5/bin:/usr/java/jdk1.7.0_45-cloudera/bin:/usr/local/apache-ant-1.9.1/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hiveptest/bin + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'M2_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + M2_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-TRUNK-Build-7265/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p 
maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! -d apache-github-source-source ]] + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at d4c1fdc HIVE-13251: hive can't read the decimal in AVRO file generated from previous version (Reviewed by Szehon Ho) + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ColMultiAndCol.java Removing ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/ColMultiOrCol.java Removing ql/src/test/queries/clientpositive/vector_multi_and.q Removing ql/src/test/queries/clientpositive/vector_multi_or.q Removing ql/src/test/results/clientpositive/vector_multi_and.q.out Removing ql/src/test/results/clientpositive/vector_multi_or.q.out + git checkout master Already on 'master' + git reset --hard origin/master HEAD is now at d4c1fdc HIVE-13251: hive can't read the decimal in AVRO file generated from previous version (Reviewed by Szehon Ho) + git merge --ff-only origin/master Already up-to-date. + git gc + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. 
ATTACHMENT ID: 12793243 - PreCommit-HIVE-TRUNK-Build > Hard upper bound on number of open transactions > --- > > Key: HIVE-13249 > URL: https://issues.apache.org/jira/browse/HIVE-13249 > Project: Hive > Issue Type: Improvement > Components: Transactions >Affects Versions: 2.0.0 >Reporter: Wei Zheng >Assignee: Wei Zheng > Attachments: HIVE-13249.1.patch > > > We need a safeguard in the form of an upper bound on open transactions, to > avoid a huge number of open-transaction requests, usually due to improper > configuration of clients such as Storm. > Once that limit is reached, clients will start failing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
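The safeguard described in the issue summary can be sketched as a simple counter with a hard cap; the class and method names below are hypothetical (the real change would go into the metastore transaction handling), but they show the fail-fast behavior: once the cap is hit, further open-transaction requests are rejected rather than queued.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of a hard upper bound on open transactions. Class and
// method names are illustrative stand-ins, not Hive's actual TxnHandler.
class TxnLimiter {
    private final int maxOpenTxns;
    private final AtomicInteger openTxns = new AtomicInteger();

    TxnLimiter(int maxOpenTxns) { this.maxOpenTxns = maxOpenTxns; }

    // Open a transaction or fail fast, so misconfigured clients
    // (e.g. a runaway Storm topology) cannot exhaust the metastore.
    void openTxn() {
        if (openTxns.incrementAndGet() > maxOpenTxns) {
            openTxns.decrementAndGet();
            throw new IllegalStateException(
                "Number of open transactions reached the limit " + maxOpenTxns);
        }
    }

    // Commit or abort frees the slot.
    void commitOrAbort() { openTxns.decrementAndGet(); }

    int current() { return openTxns.get(); }
}
```

The increment-then-check pattern keeps the check race-free without a lock: a thread that overshoots the cap undoes its own increment before failing.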
[jira] [Commented] (HIVE-13084) Vectorization add support for PROJECTION Multi-AND/OR
[ https://issues.apache.org/jira/browse/HIVE-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193929#comment-15193929 ] Hive QA commented on HIVE-13084: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12793246/HIVE-13084.02.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 9807 tests executed *Failed tests:* {noformat} TestMiniTezCliDriver-dynpart_sort_optimization2.q-cte_mat_1.q-tez_bmj_schema_evolution.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file org.apache.hadoop.hive.ql.exec.vector.TestVectorizationContext.testVectorizeAndOrProjectionExpression {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7264/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7264/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7264/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12793246 - PreCommit-HIVE-TRUNK-Build > Vectorization add support for PROJECTION Multi-AND/OR > - > > Key: HIVE-13084 > URL: https://issues.apache.org/jira/browse/HIVE-13084 > Project: Hive > Issue Type: Bug > Components: Vectorization >Reporter: Rajesh Balamohan >Assignee: Matt McCline > Attachments: HIVE-13084.01.patch, HIVE-13084.02.patch, > vector_between_date.q > > > When there is case statement in group by, hive throws unable to vectorize > exception. > e.g query just to demonstrate the problem > {noformat} > explain select l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END as wk from lineitem_test_l_shipdate_ts > group by l_partkey, case when l_commitdate between '2015-06-30' AND > '2015-07-06' THEN '2015-06-30' END; > org.apache.hadoop.hive.ql.metadata.HiveException: Could not vectorize > expression: org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc > Vertex dependency in root stage > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Stage-0 > Fetch Operator > limit:-1 > Stage-1 > Reducer 2 > File Output Operator [FS_7] > Group By Operator [GBY_5] (rows=888777234 width=108) > Output:["_col0","_col1"],keys:KEY._col0, KEY._col1 > <-Map 1 [SIMPLE_EDGE] > SHUFFLE [RS_4] > PartitionCols:_col0, _col1 > Group By Operator [GBY_3] (rows=1777554469 width=108) > Output:["_col0","_col1"],keys:_col0, _col1 > Select Operator [SEL_1] (rows=1777554469 width=108) > Output:["_col0","_col1"] > TableScan [TS_0] (rows=1777554469 width=108) > > rajesh@lineitem_test_l_shipdate_ts,lineitem_test_l_shipdate_ts,Tbl:COMPLETE,Col:NONE,Output:["l_partkey","l_commitdate"] > {noformat} > \cc [~mmccline], [~gopalv] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
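Conceptually, a PROJECTION-mode multi-AND produces an output boolean column from several input boolean columns, instead of filtering rows. The sketch below is a simplified stand-in for the `ColMultiAndCol` expression added by this patch, operating on plain `long[]` arrays (0/1 values) rather than Hive's `VectorizedRowBatch`.

```java
// Simplified sketch of a projection-mode multi-AND over column vectors:
// out[i] = in0[i] AND in1[i] AND ... for every row. Booleans are stored
// as 0/1 longs, as in Hive's vectorized execution. This is an illustration
// of the idea, not the actual ColMultiAndCol implementation.
class MultiAndProjection {
    static long[] evaluate(int rowCount, long[]... inputs) {
        long[] out = new long[rowCount];
        for (int i = 0; i < rowCount; i++) {
            long v = 1;
            for (long[] col : inputs) {
                v &= col[i];  // bitwise AND on 0/1 values is logical AND
            }
            out[i] = v;
        }
        return out;
    }
}
```

A multi-OR projection is the same loop with `v = 0` and `v |= col[i]`.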
[jira] [Commented] (HIVE-13258) LLAP: Add hdfs bytes read and spilled bytes to tez print summary
[ https://issues.apache.org/jira/browse/HIVE-13258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193915#comment-15193915 ] Prasanth Jayachandran commented on HIVE-13258: -- FileSystemCounters do not show up for LLAP. [~sseth] Do we need any Tez-side changes for these? Can we hold on to the FileSystem reference somewhere and, when we unregister the task, read the FS counters, update the TezCounters (that we add in registerTask()) and remove the references? > LLAP: Add hdfs bytes read and spilled bytes to tez print summary > > > Key: HIVE-13258 > URL: https://issues.apache.org/jira/browse/HIVE-13258 > Project: Hive > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > When printing counters to the console, it will be useful to print HDFS bytes read > and spilled bytes, which will help with debugging issues faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
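The lifecycle floated in the comment above (keep a per-task reference, fold its filesystem counters into the registered counters at unregister time, then drop the reference) can be sketched like this. All types here are simplified stand-ins for the Tez/LLAP classes, not the real API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the register/unregister counter flow: per-task FS counters are
// accumulated while the task runs, folded into published totals when the
// task is unregistered, and the per-task reference is dropped so nothing
// leaks. Simplified stand-in types, not the actual Tez/LLAP classes.
class TaskCounters {
    private final Map<String, Map<String, Long>> perTask = new HashMap<>();
    private final Map<String, Long> published = new HashMap<>();

    void registerTask(String taskId) {
        perTask.put(taskId, new HashMap<>());
    }

    void recordFsBytes(String taskId, String counter, long delta) {
        perTask.get(taskId).merge(counter, delta, Long::sum);
    }

    // On unregister: read the task's FS counters, update the published
    // totals, then remove the reference.
    void unregisterTask(String taskId) {
        Map<String, Long> counters = perTask.remove(taskId);
        if (counters != null) {
            counters.forEach((k, v) -> published.merge(k, v, Long::sum));
        }
    }

    long get(String counter) {
        return published.getOrDefault(counter, 0L);
    }
}
```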
[jira] [Updated] (HIVE-12995) LLAP: Synthetic file ids need collision checks
[ https://issues.apache.org/jira/browse/HIVE-12995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12995: Attachment: HIVE-12995.04.patch The same patch... looks like it disappeared from the queue > LLAP: Synthetic file ids need collision checks > -- > > Key: HIVE-12995 > URL: https://issues.apache.org/jira/browse/HIVE-12995 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.1.0 >Reporter: Gopal V >Assignee: Sergey Shelukhin > Attachments: HIVE-12995.01.patch, HIVE-12995.02.patch, > HIVE-12995.03.patch, HIVE-12995.04.patch, HIVE-12995.patch > > > LLAP synthetic file ids do not have any way of checking whether a collision > occurs other than a data-error. > Synthetic file-ids have only been used with unit tests so far - but they will > be needed to add cache mechanisms to non-HDFS filesystems. > In case of Synthetic file-ids, it is recommended that we track the full-tuple > (path, mtime, len) in the cache so that a cache-hit for the synthetic file-id > can be compared against the parameters & only accepted if those match. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
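The recommendation in the description, tracking the full (path, mtime, len) tuple so a synthetic-file-id hit is accepted only when all three match, can be sketched as below. The class names are illustrative, not LLAP's actual cache code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Sketch of the collision check: store the full (path, mtime, len) tuple
// alongside the synthetic file-id, and accept a cache hit only when all
// three match. A colliding or stale id is treated as a miss instead of
// silently returning wrong data. Illustrative names, not LLAP's classes.
class SyntheticIdCache {
    static final class FileKey {
        final String path; final long mtime; final long len;
        FileKey(String path, long mtime, long len) {
            this.path = path; this.mtime = mtime; this.len = len;
        }
        boolean matches(String p, long m, long l) {
            return path.equals(p) && mtime == m && len == l;
        }
    }

    private final Map<Long, FileKey> byId = new HashMap<>();

    // Synthetic id: a hash of the tuple, which can collide.
    static long syntheticId(String path, long mtime, long len) {
        return Objects.hash(path, mtime, len);
    }

    void put(String path, long mtime, long len) {
        byId.put(syntheticId(path, mtime, len), new FileKey(path, mtime, len));
    }

    // Valid only if the stored tuple matches the caller's parameters.
    boolean isValidHit(String path, long mtime, long len) {
        FileKey k = byId.get(syntheticId(path, mtime, len));
        return k != null && k.matches(path, mtime, len);
    }
}
```

This is what makes synthetic ids safe on non-HDFS filesystems: a file rewritten in place (new mtime or length) can no longer be served from a stale cache entry.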
[jira] [Updated] (HIVE-13218) LLAP: better configs part 1
[ https://issues.apache.org/jira/browse/HIVE-13218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-13218: Resolution: Fixed Fix Version/s: 2.1.0 Status: Resolved (was: Patch Available) > LLAP: better configs part 1 > --- > > Key: HIVE-13218 > URL: https://issues.apache.org/jira/browse/HIVE-13218 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Fix For: 2.1.0 > > Attachments: HIVE-13218.01.patch, HIVE-13218.patch > > > 1) IO threads need to be settable when creating the package, and should be > equal to the number of executors by default. > 2) uber should be disabled in "all" mode as it's slower than running in LLAP. > Maybe others. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-13280: Description: With a simple query (select from orc table and insert into HBase external table): {code:sql} insert into table register.register select * from aa_temp {code} The aa_temp table has 45 ORC files. It generates 45 mappers. Some mappers fail with this error: {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) ... 25 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2) {noformat} If I do an ALTER CONCATENATE on aa_temp and redo the query, everything is fine because there is only one mapper. 
was: With a simple query (select from orc table and insert into HBase external table): {code:sql} insert into table register.register select * from aa_temp {code} Some mapper fail with this error: {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) ... 25 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2) {noformat} > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > The aa_temp table has 45 ORC files. It generates 45 mappers. 
> Some mappers fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} > If I do an ALTER CONCATENATE on aa_temp and redo the query, everything is > fine because there is only one mapper. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
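The "Must specify table name" failure means `TableOutputFormat.setConf` found no output-table property in the Configuration handed to some of the tasks. The sketch below shows the shape of that check; a plain `Map` stands in for Hadoop's `Configuration`, and the property key shown is an assumption about HBase's `TableOutputFormat.OUTPUT_TABLE` constant.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of what TableOutputFormat.setConf enforces when it throws
// "Must specify table name": the output-table property must be present in
// the task's Configuration. A plain Map stands in for Hadoop Configuration;
// the key is an assumed value of TableOutputFormat.OUTPUT_TABLE.
class OutputTableCheck {
    static final String OUTPUT_TABLE = "hbase.mapred.outputtable";

    static void validate(Map<String, String> conf) {
        String table = conf.get(OUTPUT_TABLE);
        if (table == null || table.isEmpty()) {
            throw new IllegalArgumentException("Must specify table name");
        }
    }
}
```

This framing also explains the symptom: when only some mappers fail, only some tasks received a Configuration missing this property.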
[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-13280: Component/s: HBase Handler > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug > Components: HBase Handler >Affects Versions: 2.0.0 >Reporter: Damien Carol > > With a simple query (select from orc table and insert into HBase external > table): > {code:sql} > insert into table register.register select * from aa_temp > {code} > Some mapper fail with this error: > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13280) Error when more than 1 mapper for HBase storage handler
[ https://issues.apache.org/jira/browse/HIVE-13280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Damien Carol updated HIVE-13280: Description: {noformat} Caused by: java.lang.IllegalArgumentException: Must specify table name at org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) at org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) at org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) ... 25 more ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0 (state=08S01,code=2) {noformat} > Error when more than 1 mapper for HBase storage handler > --- > > Key: HIVE-13280 > URL: https://issues.apache.org/jira/browse/HIVE-13280 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Damien Carol > > {noformat} > Caused by: java.lang.IllegalArgumentException: Must specify table name > at > org.apache.hadoop.hbase.mapreduce.TableOutputFormat.setConf(TableOutputFormat.java:188) > at > org.apache.hive.common.util.ReflectionUtil.setConf(ReflectionUtil.java:101) > at > org.apache.hive.common.util.ReflectionUtil.newInstance(ReflectionUtil.java:87) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:300) > at > org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveOutputFormat(HiveFileFormatUtils.java:290) > at > org.apache.hadoop.hive.ql.exec.FileSinkOperator.createHiveOutputFormat(FileSinkOperator.java:1126) > ... 
25 more > ]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 > killedTasks:35, Vertex vertex_1457964631631_0015_3_00 [Map 1] killed/failed > due to:OWN_TASK_FAILURE]DAG did not succeed due to VERTEX_FAILURE. > failedVertices:1 killedVertices:0 (state=08S01,code=2) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13243) Hive drop table on encryption zone fails for external tables
[ https://issues.apache.org/jira/browse/HIVE-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193582#comment-15193582 ] Chaoyu Tang commented on HIVE-13243: The four test failures seem unrelated to this patch. [~spena], could you help review the patch? Thanks > Hive drop table on encryption zone fails for external tables > --- > > Key: HIVE-13243 > URL: https://issues.apache.org/jira/browse/HIVE-13243 > Project: Hive > Issue Type: Bug > Components: Encryption, Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-13243.1.patch, HIVE-13243.2.patch, HIVE-13243.patch > > > When dropping an external table with its data located in an encryption zone, > Hive should not throw a MetaException(message:Unable to drop table because > it is in an encryption zone and trash is enabled. Use PURGE option to skip > trash.) in checkTrashPurgeCombination, since the data should not get deleted > (or trashed) anyway, regardless of whether HDFS Trash is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
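The behavior the description argues for can be sketched as below: the trash/purge check only matters when dropping the table will actually delete data, so an EXTERNAL table (whose data Hive does not remove) should never trigger the exception. Method and parameter names here are illustrative, not the metastore's exact signature.

```java
// Sketch of the proposed fix: skip the trash/purge combination check for
// external tables, since their data is neither deleted nor trashed on drop.
// Illustrative names, not the metastore's actual checkTrashPurgeCombination
// signature.
class DropTableCheck {
    static void checkTrashPurgeCombination(
            boolean tableIsExternal, boolean inEncryptionZone,
            boolean trashEnabled, boolean purgeRequested) {
        if (tableIsExternal) {
            return; // data is not deleted or trashed, nothing to guard
        }
        if (inEncryptionZone && trashEnabled && !purgeRequested) {
            throw new IllegalStateException(
                "Unable to drop table because it is in an encryption zone "
                + "and trash is enabled. Use PURGE option to skip trash.");
        }
    }
}
```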
[jira] [Commented] (HIVE-13243) Hive drop table on encryption zone fails for external tables
[ https://issues.apache.org/jira/browse/HIVE-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193583#comment-15193583 ] Chaoyu Tang commented on HIVE-13243: [~spena] Thanks! > Hive drop table on encryption zone fails for external tables > --- > > Key: HIVE-13243 > URL: https://issues.apache.org/jira/browse/HIVE-13243 > Project: Hive > Issue Type: Bug > Components: Encryption, Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-13243.1.patch, HIVE-13243.2.patch, HIVE-13243.patch > > > When dropping an external table with its data located in an encryption zone, > Hive should not throw a MetaException(message:Unable to drop table because > it is in an encryption zone and trash is enabled. Use PURGE option to skip > trash.) in checkTrashPurgeCombination, since the data should not get deleted > (or trashed) anyway, regardless of whether HDFS Trash is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13243) Hive drop table on encryption zone fails for external tables
[ https://issues.apache.org/jira/browse/HIVE-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193574#comment-15193574 ] Sergio Peña commented on HIVE-13243: Great. Tests are good. +1 > Hive drop table on encryption zone fails for external tables > --- > > Key: HIVE-13243 > URL: https://issues.apache.org/jira/browse/HIVE-13243 > Project: Hive > Issue Type: Bug > Components: Encryption, Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-13243.1.patch, HIVE-13243.2.patch, HIVE-13243.patch > > > When dropping an external table with its data located in an encryption zone, > Hive should not throw a MetaException(message:Unable to drop table because > it is in an encryption zone and trash is enabled. Use PURGE option to skip > trash.) in checkTrashPurgeCombination, since the data should not get deleted > (or trashed) anyway, regardless of whether HDFS Trash is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13149: Attachment: HIVE-13149.4.patch > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch > > > In the SessionState class, we currently always try to get an HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, > regardless of whether the connection will be used later. > SessionState is also accessed by the tasks in TaskRunner.java, although most > of the tasks, other than a few like StatsTask, don't need to access HMS. > Currently a new HMS connection is established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
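The idea behind removing these connections is lazy initialization: instead of opening a metastore connection eagerly in `SessionState.start()`, defer it until something actually asks for the client. The `MetaClient` type and the connection counter below are stand-ins for illustration, not Hive's actual classes.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the lazy-connection idea: tasks that never touch the metastore
// (most of them) never pay for a connection; tasks like StatsTask get one on
// first access. MetaClient and the counter are illustrative stand-ins.
class LazySession {
    static final AtomicInteger CONNECTIONS_OPENED = new AtomicInteger();

    static final class MetaClient {
        MetaClient() { CONNECTIONS_OPENED.incrementAndGet(); }
    }

    private MetaClient client; // null until first use

    // Connect on first access only; subsequent calls reuse the client.
    synchronized MetaClient getMetaClient() {
        if (client == null) {
            client = new MetaClient();
        }
        return client;
    }
}
```

With this shape, a parallel query with many tasks opens connections only for the tasks that need them, rather than one per `TaskRunner` thread.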
[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13149: Attachment: (was: HIVE-13149.4.patch) > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch > > > In the SessionState class, we currently always try to get an HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, > regardless of whether the connection will be used later. > SessionState is also accessed by the tasks in TaskRunner.java, although most > of the tasks, other than a few like StatsTask, don't need to access HMS. > Currently a new HMS connection is established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13149) Remove some unnecessary HMS connections from HS2
[ https://issues.apache.org/jira/browse/HIVE-13149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-13149: Attachment: HIVE-13149.4.patch > Remove some unnecessary HMS connections from HS2 > - > > Key: HIVE-13149 > URL: https://issues.apache.org/jira/browse/HIVE-13149 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13149.1.patch, HIVE-13149.2.patch, > HIVE-13149.3.patch, HIVE-13149.4.patch > > > In the SessionState class, we currently always try to get an HMS connection > in {{start(SessionState startSs, boolean isAsync, LogHelper console)}}, > regardless of whether the connection will be used later. > SessionState is also accessed by the tasks in TaskRunner.java, although most > of the tasks, other than a few like StatsTask, don't need to access HMS. > Currently a new HMS connection is established for each Task thread. If > HiveServer2 is configured to run in parallel and the query involves many > tasks, the connections are created but unused. > {noformat} > @Override > public void run() { > runner = Thread.currentThread(); > try { > OperationLog.setCurrentOperationLog(operationLog); > SessionState.start(ss); > runSequential(); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13243) Hive drop table on encryption zone fails for external tables
[ https://issues.apache.org/jira/browse/HIVE-13243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193532#comment-15193532 ] Hive QA commented on HIVE-13243: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12792596/HIVE-13243.2.patch {color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9818 tests executed *Failed tests:* {noformat} TestSparkCliDriver-groupby3_map.q-sample2.q-auto_join14.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-groupby_map_ppr_multi_distinct.q-table_access_keys_stats.q-groupby4_noskew.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-join_rc.q-insert1.q-vectorized_rcfile_columnar.q-and-12-more - did not produce a TEST-*.xml file TestSparkCliDriver-ppd_join4.q-join9.q-ppd_join3.q-and-12-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7263/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/7263/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-7263/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12792596 - PreCommit-HIVE-TRUNK-Build > Hive drop table on encryption zone fails for external tables > --- > > Key: HIVE-13243 > URL: https://issues.apache.org/jira/browse/HIVE-13243 > Project: Hive > Issue Type: Bug > Components: Encryption, Metastore >Reporter: Chaoyu Tang >Assignee: Chaoyu Tang > Attachments: HIVE-13243.1.patch, HIVE-13243.2.patch, HIVE-13243.patch > > > When dropping an external table with its data located in an encryption zone, > Hive should not throw a MetaException(message:Unable to drop table because > it is in an encryption zone and trash is enabled. Use PURGE option to skip > trash.) in checkTrashPurgeCombination, since the data should not get deleted > (or trashed) anyway, regardless of whether HDFS Trash is enabled. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13232) Aggressively drop compression buffers in ORC OutStreams
[ https://issues.apache.org/jira/browse/HIVE-13232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-13232: - Attachment: HIVE-13232.patch At first I didn't think that I could unit test this change, but then I realized that I could use the OutStream.getBufferSize to observe the change. This patch just adds the new test. > Aggressively drop compression buffers in ORC OutStreams > --- > > Key: HIVE-13232 > URL: https://issues.apache.org/jira/browse/HIVE-13232 > Project: Hive > Issue Type: Bug > Components: ORC >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 0.14.1, 1.3.0, 2.1.0 > > Attachments: HIVE-13232.patch, HIVE-13232.patch, HIVE-13232.patch > > > In Hive 0.11, when ORC's OutStreams were flushed, they dropped all of > their buffers. In the patch for HIVE-4324, we inadvertently changed that > behavior so that one of the buffers is held on to. For queries with a lot of > writers, and thus under significant memory pressure, this can have a > significant impact on the memory usage. > Note that "hive.optimize.sort.dynamic.partition" avoids this problem by > sorting on the dynamic partition key and thus only a single ORC writer is > open at once. This will use memory more effectively and avoid creating ORC > files with very small stripes, which will produce better downstream > performance. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
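The testing trick mentioned above, observing held memory through a buffer-size accessor, can be sketched with a toy stream that drops all of its buffers on flush. This class is illustrative only, not ORC's actual `OutStream`.

```java
// Toy sketch of the buffer behavior under test: a stream that reports the
// memory held by its compression buffers and drops them all on flush. A
// post-flush buffer size of 0 is the observable proof that no buffer is
// retained. Illustrative only, not ORC's OutStream.
class DroppingStream {
    private byte[] current;      // buffer being filled
    private byte[] compressed;   // pending compressed buffer

    void write(int size) {
        if (current == null) current = new byte[size];
    }

    void spill(int size) {
        if (compressed == null) compressed = new byte[size];
    }

    // Report held buffer memory, the hook a unit test can assert on.
    long getBufferSize() {
        long total = 0;
        if (current != null) total += current.length;
        if (compressed != null) total += compressed.length;
        return total;
    }

    // Aggressively drop *all* buffers, as in the pre-HIVE-4324 behavior.
    void flush() {
        current = null;
        compressed = null;
    }
}
```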
[jira] [Work started] (HIVE-13279) SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's file system
[ https://issues.apache.org/jira/browse/HIVE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-13279 started by Aleksey Vovchenko. > SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's > file system > -- > > Key: HIVE-13279 > URL: https://issues.apache.org/jira/browse/HIVE-13279 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Aleksey Vovchenko >Assignee: Aleksey Vovchenko > > h2. STEP 1. Create test tables > Execute on the command line: > {noformat} > nano test.data > {noformat} > Add to the file: > {noformat} > 1,aa > 2,aa > 3,ff > 4,sad > 5,adsf > 6,adsf > 7,affss > {noformat} > {noformat} > hadoop fs -put test.data / > {noformat} > {noformat} > hive> create table test (x int, y string, z string) ROW FORMAT DELIMITED > FIELDS TERMINATED BY ','; > hive> create table ptest(x int, y string) partitioned by(z string); > hive> LOAD DATA INPATH '/test.data' OVERWRITE INTO TABLE test; > hive> insert overwrite table ptest partition(z=65) select * from test; > hive> insert overwrite table ptest partition(z=67) select * from test; > {noformat} > h2. STEP 2. Compare lastUpdateTime > Execute in the Hive shell: > {noformat} > hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65'); > hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67'); > {noformat} > The lastUpdateTime values should be different. > h2. STEP 3. Put data into HDFS and compare lastUpdateTime > Execute on the command line: > {noformat} > hadoop fs -put test.data /user/hive/warehouse/ptest > {noformat} > Execute in the Hive shell: > {noformat} > hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='65'); > hive> SHOW TABLE EXTENDED FROM default LIKE 'ptest' PARTITION(z='67'); > {noformat} > The lastUpdateTime values should still be different, but they are the same. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
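The expectation behind the repro above is roughly that lastUpdateTime reflects the newest modification time of the files under the *partition's* directory, not the table directory as a whole. A minimal sketch of that expectation on a local file system (a hypothetical helper for illustration, not Hive's HDFS-based implementation):

```python
import os

def last_update_time(path):
    """Return the latest modification time (epoch seconds) found under
    `path`, including the directory entry itself. Hypothetical model of
    what SHOW TABLE EXTENDED's lastUpdateTime is expected to report for
    a partition directory; Hive's real code walks HDFS, not a local FS."""
    latest = os.stat(path).st_mtime
    for root, _dirs, files in os.walk(path):
        for name in files:
            latest = max(latest, os.stat(os.path.join(root, name)).st_mtime)
    return latest
```

Under this model, `hadoop fs -put test.data /user/hive/warehouse/ptest` touches only the table directory, so computing the value per partition subdirectory (z=65 vs z=67) should still yield the two distinct timestamps the bug report expects.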
[jira] [Updated] (HIVE-13279) SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's file system
[ https://issues.apache.org/jira/browse/HIVE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Vovchenko updated HIVE-13279: - Status: Open (was: Patch Available) > SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's > file system > -- > > Key: HIVE-13279 > URL: https://issues.apache.org/jira/browse/HIVE-13279 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Aleksey Vovchenko >Assignee: Aleksey Vovchenko -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-13279) SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's file system
[ https://issues.apache.org/jira/browse/HIVE-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Vovchenko updated HIVE-13279: - Status: Patch Available (was: Open) > SHOW TABLE EXTENDED doesn't show the correct lastUpdateTime of partition's > file system > -- > > Key: HIVE-13279 > URL: https://issues.apache.org/jira/browse/HIVE-13279 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Aleksey Vovchenko >Assignee: Aleksey Vovchenko -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-13251) hive can't read the decimal in AVRO file generated from previous version
[ https://issues.apache.org/jira/browse/HIVE-13251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193423#comment-15193423 ] Aihua Xu commented on HIVE-13251: - dec_old.avro is binary. It seems we have had trouble applying patches containing such files, as I found in another JIRA, HIVE-5823. I will do the same. I have verified that avro_decimal_old.q passes locally. Commit instructions: 1. add the attachment dec.avro to the data/files folder 2. apply the attached patch 3. commit > hive can't read the decimal in AVRO file generated from previous version > > > Key: HIVE-13251 > URL: https://issues.apache.org/jira/browse/HIVE-13251 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 2.1.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Attachments: HIVE-13251.1.patch > > > HIVE-7174 changed the Avro schema to match the Avro definition, but this > breaks compatibility when the file was generated by a previous Hive version, > even though the file's schema for such a decimal is not correct per the Avro > definition. We should allow reading the old file format ("precision" : > "4", "scale": "8"), but when we write, we should write in the new format. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
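The compatibility direction described above amounts to a read-side shim: accept the old, non-conforming representation in which "precision" and "scale" appear as JSON strings, but treat them as the integers the Avro specification requires. A hedged sketch of that normalization (a hypothetical helper, not the actual HIVE-13251 patch):

```python
import json

def normalize_decimal_attrs(schema_json):
    """Read-side shim: coerce string-valued "precision"/"scale" (the
    old Hive output mentioned in the report) to ints, which is what the
    Avro decimal logical type expects. Hypothetical illustration of the
    compatibility fix's direction, not Hive's real Avro deserializer."""
    schema = json.loads(schema_json)
    for key in ("precision", "scale"):
        if key in schema and isinstance(schema[key], str):
            schema[key] = int(schema[key])
    return schema

# Old-style schema fragment with string attributes (values hypothetical):
old = '{"type": "bytes", "logicalType": "decimal", "precision": "4", "scale": "2"}'
normalized = normalize_decimal_attrs(old)
# normalized carries precision=4 and scale=2 as ints.
```

On the write path, by contrast, only the new integer form would ever be emitted, so files produced going forward conform to the spec while old files remain readable.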
[jira] [Updated] (HIVE-13276) Hive on Spark doesn't work when spark.master=local
[ https://issues.apache.org/jira/browse/HIVE-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated HIVE-13276: --- Assignee: (was: Xuefu Zhang) > Hive on Spark doesn't work when spark.master=local > -- > > Key: HIVE-13276 > URL: https://issues.apache.org/jira/browse/HIVE-13276 > Project: Hive > Issue Type: Bug > Components: Spark >Affects Versions: 2.1.0 >Reporter: Xuefu Zhang > > The following problem occurs with latest Hive master and Spark 1.6.1. I'm > using hive CLI on mac. > {code} > set mapreduce.job.reduces= > java.lang.NoClassDefFoundError: Could not initialize class > org.apache.spark.rdd.RDDOperationScope$ > at org.apache.spark.SparkContext.withScope(SparkContext.scala:714) > at org.apache.spark.SparkContext.hadoopRDD(SparkContext.scala:991) > at > org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:419) > at > org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:205) > at > org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:145) > at > org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:117) > at > org.apache.hadoop.hive.ql.exec.spark.LocalHiveSparkClient.execute(LocalHiveSparkClient.java:130) > at > org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:71) > at > org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:94) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:156) > at > org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:101) > at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1837) > at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1578) > at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1351) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1122) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1110) > at > 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184) > at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400) > at > org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:778) > at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:717) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:645) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.apache.hadoop.util.RunJar.main(RunJar.java:212) > FAILED: Execution Error, return code -101 from > org.apache.hadoop.hive.ql.exec.spark.SparkTask. Could not initialize class > org.apache.spark.rdd.RDDOperationScope$ > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12619) Switching the field order within an array of structs causes the query to fail
[ https://issues.apache.org/jira/browse/HIVE-12619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jimmy Xiang updated HIVE-12619: --- Status: Patch Available (was: Open) > Switching the field order within an array of structs causes the query to fail > - > > Key: HIVE-12619 > URL: https://issues.apache.org/jira/browse/HIVE-12619 > Project: Hive > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Ang Zhang >Assignee: Mohammad Kamrul Islam >Priority: Minor > Attachments: HIVE-12619.2.patch, HIVE-12619.3.patch > > > Switching the field order within an array of structs causes the query to fail > or return the wrong data for the fields, but switching the field order within > just a struct works. > How to reproduce: > Case 1: if the two fields have the same type, the query will return wrong data for > the fields > drop table if exists schema_test; > create table schema_test (msg array<struct<f1: string, f2: string>>) stored > as parquet; > insert into table schema_test select stack(2, array(named_struct('f1', 'abc', > 'f2', 'abc2')), array(named_struct('f1', 'efg', 'f2', 'efg2'))) from one > limit 2; > select * from schema_test; > --returns > --[{"f1":"efg","f2":"efg2"}] > --[{"f1":"abc","f2":"abc2"}] > alter table schema_test change msg msg array<struct<f2: string, f1: string>>; > select * from schema_test; > --returns > --[{"f2":"efg","f1":"efg2"}] > --[{"f2":"abc","f1":"abc2"}] > Case 2: if the two fields have different types, the query will fail > drop table if exists schema_test; > create table schema_test (msg array<struct<f1: string, f2: int>>) stored as > parquet; > insert into table schema_test select stack(2, array(named_struct('f1', 'abc', > 'f2', 1)), array(named_struct('f1', 'efg', 'f2', 2))) from one limit 2; > select * from schema_test; > --returns > --[{"f1":"efg","f2":2}] > --[{"f1":"abc","f2":1}] > alter table schema_test change msg msg array<struct<f2: int, f1: string>>; > select * from schema_test; > Failed with exception > java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: > java.lang.ClassCastException: org.apache.hadoop.io.Text 
cannot be cast to > org.apache.hadoop.io.IntWritable -- This message was sent by Atlassian JIRA (v6.3.4#6332)
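The failure mode above suggests positional rather than by-name resolution of struct fields between the Parquet file schema and the altered table schema: values written as (f1, f2) get paired with the declared order (f2, f1), mislabeling same-typed fields and producing a ClassCastException for differently-typed ones. A minimal sketch of by-name resolution, the direction that would make the reordered schema work (illustration only, not Hive's Parquet reader):

```python
def resolve_struct(file_struct, table_field_order):
    """Map a struct decoded from the file (field name -> value) onto the
    table schema's declared field order by *name*, not by position.
    Hypothetical model: positional pairing is what mislabels values
    (Case 1) or pairs a Text value with an int field (Case 2) above."""
    return {name: file_struct[name] for name in table_field_order}

# The file was written with fields in the order (f1, f2); the table
# schema was later altered to declare (f2, f1). By-name lookup still
# pairs each value with the right field:
row = {"f1": "abc", "f2": 1}
resolved = resolve_struct(row, ["f2", "f1"])
# resolved pairs f2 with 1 and f1 with "abc", regardless of declared order.
```

Note the asymmetry the report points out: a top-level struct already resolves correctly after reordering, so the bug is specific to structs nested inside an array.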