[jira] [Commented] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures
[ https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573694#comment-14573694 ]

Richard Williams commented on HIVE-10410:
-----------------------------------------

[~ctang.ma] Specifically, we have found that this issue occurs whenever numerous JDBC clients (or rather, numerous clients that set the runAsync flag in TExecuteStatementReq to true, as JDBC does) are executing queries against HiveServer2 concurrently, as that is what causes multiple threads in the async execution thread pool to use their shared MetaStoreClient at once.

I'll go ahead and regenerate the patch we've been running based on the Hive trunk. It's very simplistic: it just removes the code that sets the Hive objects in the pooled threads to the Hive object from the calling thread.

As for the shared SessionState and HiveConf, those are suspicious as well and might be causing other problems; however, since we began patching HiveServer2 to prevent the sharing of the Hive object, this particular issue has disappeared for us.

Apparent race condition in HiveServer2 causing intermittent query failures
--------------------------------------------------------------------------

Key: HIVE-10410
URL: https://issues.apache.org/jira/browse/HIVE-10410
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.13.1
Environment: CDH 5.3.3, CentOS 6.4
Reporter: Richard Williams

On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC occasionally trigger odd Thrift exceptions in HiveServer2's connections to the metastore, with messages such as "Read a negative frame size (-2147418110)!" or "out of sequence response". For certain metastore calls (for example, showDatabases), these Thrift exceptions are converted to MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient from retrying the calls and thus causes the failure to bubble out to the JDBC client.

Note that as far as we can tell, this issue appears to affect only queries that are submitted with the runAsync flag on TExecuteStatementReq set to true (which, in practice, seems to mean all JDBC queries), and it appears to manifest only when HiveServer2 is using the new HTTP transport mechanism. When both these conditions hold, we can fairly reliably reproduce the issue by spawning about 100 simple, concurrent Hive queries (we have been using "show databases"), two or three of which typically fail. When either condition does not hold, we can no longer reproduce the issue.

Some example stack traces from the HiveServer2 logs:

{noformat}
2015-04-16 13:54:55,486 ERROR hive.log: Got exception: org.apache.thrift.transport.TTransportException Read a negative frame size (-2147418110)!
org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)!
        at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435)
        at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414)
        at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
        at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62)
        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600)
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837)
        at org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60)
        at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90)
        at com.sun.proxy.$Proxy6.getDatabases(Unknown Source)
        at org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139)
        at org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445)
        at
{noformat}
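The failure mode described above — several async-pool threads driving one non-thread-safe client — can be sketched in plain Java. The class and method names below are illustrative stand-ins, not Hive's actual code: `FakeMetaStoreClient` mimics a client that holds per-connection state (the way a Thrift client tracks message sequence ids), and the `ThreadLocal` confines one instance to each worker thread, which is essentially the effect of the patch letting each pooled thread build its own Hive object instead of inheriting the caller's.

```java
// Illustrative sketch only: names are hypothetical, not Hive's actual classes.
public class ClientConfinement {

    // Stand-in for a non-thread-safe client with per-connection state.
    // If two threads interleaved calls, seqid bookkeeping would be corrupted,
    // analogous to the "out of sequence response" Thrift errors above.
    public static class FakeMetaStoreClient {
        private int seqid = 0;
        public int nextCall() { return ++seqid; }
    }

    // One client per thread, created lazily on first use by that thread.
    private static final ThreadLocal<FakeMetaStoreClient> CLIENT =
            ThreadLocal.withInitial(FakeMetaStoreClient::new);

    public static FakeMetaStoreClient client() {
        return CLIENT.get();
    }
}
```

Within a single thread the same instance is returned on every call, so state stays consistent; a different thread gets its own instance and can never interleave with another thread's sequence ids.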
[jira] [Updated] (HIVE-10551) OOM when running query_89 with vectorization on hybridgrace=false
[ https://issues.apache.org/jira/browse/HIVE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-10551:
----------------------------------
    Assignee:     (was: Matt McCline)

OOM when running query_89 with vectorization on hybridgrace=false
-----------------------------------------------------------------

Key: HIVE-10551
URL: https://issues.apache.org/jira/browse/HIVE-10551
Project: Hive
Issue Type: Bug
Reporter: Rajesh Balamohan
Attachments: HIVE-10551-explain-plan.log, hive-10551.png, hive_10551.png

- TPC-DS Query_89 @ 10 TB scale
- Trunk version of Hive + Tez 0.7.0-SNAPSHOT
- Additional settings: hive.vectorized.groupby.maxentries=1024, tez.runtime.io.sort.factor=200, tez.runtime.io.sort.mb=1800, hive.tez.container.size=4096, hive.mapjoin.hybridgrace.hashtable=false

Will attach the profiler snapshot asap.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10551) OOM when running query_89 with vectorization on hybridgrace=false
[ https://issues.apache.org/jira/browse/HIVE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-10551:
----------------------------------
    Assignee: Matt McCline  (was: Vikram Dixit K)

OOM when running query_89 with vectorization on hybridgrace=false
-----------------------------------------------------------------

Key: HIVE-10551
URL: https://issues.apache.org/jira/browse/HIVE-10551
Project: Hive
Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Matt McCline
Attachments: HIVE-10551-explain-plan.log, hive-10551.png, hive_10551.png

- TPC-DS Query_89 @ 10 TB scale
- Trunk version of Hive + Tez 0.7.0-SNAPSHOT
- Additional settings: hive.vectorized.groupby.maxentries=1024, tez.runtime.io.sort.factor=200, tez.runtime.io.sort.mb=1800, hive.tez.container.size=4096, hive.mapjoin.hybridgrace.hashtable=false

Will attach the profiler snapshot asap.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS
[ https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573932#comment-14573932 ]

Hive QA commented on HIVE-10910:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737742/HIVE-10910.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9002 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4182/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4182/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4182/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12737742 - PreCommit-HIVE-TRUNK-Build

Alter table drop partition queries in encrypted zone failing to remove data from HDFS
-------------------------------------------------------------------------------------

Key: HIVE-10910
URL: https://issues.apache.org/jira/browse/HIVE-10910
Project: Hive
Issue Type: Sub-task
Components: Hive
Affects Versions: 1.2.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Eugene Koifman
Attachments: HIVE-10910.patch

An ALTER TABLE query that drops a partition removes the partition's metadata but fails to remove the data from HDFS:

{noformat}
hive> create table table_1(name string, age int, gpa double) partitioned by (b string) stored as textfile;
OK
Time taken: 0.732 seconds
hive> alter table table_1 add partition (b='2010-10-10');
OK
Time taken: 0.496 seconds
hive> show partitions table_1;
OK
b=2010-10-10
Time taken: 0.781 seconds, Fetched: 1 row(s)
hive> alter table table_1 drop partition (b='2010-10-10');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException Failed to move to trash: hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10
hive> show partitions table_1;
OK
Time taken: 0.622 seconds
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
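A plausible reading of the "Failed to move to trash" error above is that HDFS cannot rename files out of an encryption zone, while the per-user trash directory normally lives outside the zone, so the trash step of DROP PARTITION fails at the zone boundary. The containment rule can be sketched with a small, self-contained check; the paths and method name here are illustrative, not Hive's or HDFS's actual code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of the encryption-zone constraint on trash moves.
public class EncryptionZoneTrash {

    // A trash move stays legal only if it does not cross the zone boundary:
    // either the file is outside the zone, or the trash dir is inside it.
    public static boolean canMoveToTrash(String zone, String file, String trashDir) {
        Path z = Paths.get(zone);
        boolean fileInZone = Paths.get(file).startsWith(z);
        boolean trashInZone = Paths.get(trashDir).startsWith(z);
        return !fileInZone || trashInZone;
    }
}
```

Under this reading, a partition directory inside the zone with a trash directory outside it (the usual `/user/<name>/.Trash` layout) is exactly the failing case in the transcript above.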
[jira] [Commented] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch
[ https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573962#comment-14573962 ]

Ferdinand Xu commented on HIVE-10943:
-------------------------------------

Hi [~xuefuz], is there anything else that needs to be done to enable the precommit?

Beeline-cli: Enable precommit for beelie-cli branch
---------------------------------------------------

Key: HIVE-10943
URL: https://issues.apache.org/jira/browse/HIVE-10943
Project: Hive
Issue Type: Sub-task
Components: Testing Infrastructure
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor

NO PRECOMMIT TESTS

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10872) LLAP: make sure tests pass
[ https://issues.apache.org/jira/browse/HIVE-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573866#comment-14573866 ]

Hive QA commented on HIVE-10872:
--------------------------------

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12737685/HIVE-10872.03.patch

{color:red}ERROR:{color} -1 due to 424 failed/errored test(s), 8732 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketizedhiveinputformat_auto
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_list_bucket_dml_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_llap
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_stats11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_char_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_decimal_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_interval_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_left_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_leftsemi_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_nullsafe_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join0
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_outer_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_varchar_mapjoin1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_context
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_nested_mapjoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestCompareCliDriver.testCompareCliDriver_llap_0
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.initializationError
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_windowing
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
{noformat}
[jira] [Updated] (HIVE-10939) Make TestFileDump robust
[ https://issues.apache.org/jira/browse/HIVE-10939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-10939:
------------------------------------
    Attachment:     (was: HIVE-10939.patch)

Make TestFileDump robust
------------------------

Key: HIVE-10939
URL: https://issues.apache.org/jira/browse/HIVE-10939
Project: Hive
Issue Type: Test
Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Attachments: HIVE-10939.patch

It fails on Windows OS currently.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10939) Make TestFileDump robust
[ https://issues.apache.org/jira/browse/HIVE-10939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-10939:
------------------------------------
    Attachment: HIVE-10939.patch

Make TestFileDump robust
------------------------

Key: HIVE-10939
URL: https://issues.apache.org/jira/browse/HIVE-10939
Project: Hive
Issue Type: Test
Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Attachments: HIVE-10939.patch

It fails on Windows OS currently.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10941) Provide option to disable spark tests outside itests
[ https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10941:
-----------------------------------------------------
    Attachment: HIVE-10941.1.patch

[~sushanth] Can you please take a look at this patch? Thanks, Hari

Provide option to disable spark tests outside itests
----------------------------------------------------

Key: HIVE-10941
URL: https://issues.apache.org/jira/browse/HIVE-10941
Project: Hive
Issue Type: Bug
Components: Tests
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan
Attachments: HIVE-10941.1.patch

HIVE-10477 provided an option to disable the Spark module; however, we missed the following tests, which live outside the itests directory. The option needs to disable these tests as well:

{code}
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10941) Provide option to disable spark tests outside itests
[ https://issues.apache.org/jira/browse/HIVE-10941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sankar Sivarama Subramaniyan updated HIVE-10941:
-----------------------------------------------------
    Description:
HIVE-10477 provided an option to disable the Spark module; however, we missed the following tests, which live outside the itests directory. The option needs to disable these tests as well:
{code}
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{code}

was:
HIVE-10477 provided an option to disable spark module, however we missed the following files that are outside itests directory. i.e we need to club the option with disabling the following tests as well :
{code}
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
The above tests need to be disabled.
{code}

Provide option to disable spark tests outside itests
----------------------------------------------------

Key: HIVE-10941
URL: https://issues.apache.org/jira/browse/HIVE-10941
Project: Hive
Issue Type: Bug
Components: Tests
Reporter: Hari Sankar Sivarama Subramaniyan
Assignee: Hari Sankar Sivarama Subramaniyan

HIVE-10477 provided an option to disable the Spark module; however, we missed the following tests, which live outside the itests directory. The option needs to disable these tests as well:
{code}
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testMultiSessionMultipleUse
org.apache.hadoop.hive.ql.exec.spark.session.TestSparkSessionManagerImpl.testSingleSessionMultipleUse
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable
org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10933) Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()
[ https://issues.apache.org/jira/browse/HIVE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573956#comment-14573956 ]

Chaoyu Tang commented on HIVE-10933:
------------------------------------

I could not reproduce your issue in trunk and believe it has been resolved by HIVE-5847 since Hive 0.14. Could you try again in Hive 1.2?

Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()
--------------------------------------------------------------------------------

Key: HIVE-10933
URL: https://issues.apache.org/jira/browse/HIVE-10933
Project: Hive
Issue Type: Bug
Components: API
Affects Versions: 0.13.0
Reporter: Son Nguyen
Assignee: Chaoyu Tang

DatabaseMetaData.getColumns() returns COLUMN_SIZE as 0 for a column defined as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns the correct value, 32. Here is the program segment that reproduces the issue:

{code}
try {
    statement = connection.createStatement();
    statement.execute("drop table if exists son_table");
    statement.execute("create table son_table( col1 varchar(32) )");
    statement.close();
} catch (Exception e) {
    return;
}

// get column info using metadata
try {
    DatabaseMetaData dmd = null;
    ResultSet resultSet = null;
    dmd = connection.getMetaData();
    resultSet = dmd.getColumns(null, null, "son_table", "col1");
    if (resultSet.next()) {
        String tabName = resultSet.getString("TABLE_NAME");
        String colName = resultSet.getString("COLUMN_NAME");
        String dataType = resultSet.getString("DATA_TYPE");
        String typeName = resultSet.getString("TYPE_NAME");
        int precision = resultSet.getInt("COLUMN_SIZE");
        // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0.
        System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.",
                colName, dataType, typeName, precision);
    }
} catch (Exception e) {
    return;
}
{code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10939) Make TestFileDump robust
[ https://issues.apache.org/jira/browse/HIVE-10939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-10939:
------------------------------------
    Attachment: HIVE-10939.patch

[~hsubramaniyan] Can you please review this?

Make TestFileDump robust
------------------------

Key: HIVE-10939
URL: https://issues.apache.org/jira/browse/HIVE-10939
Project: Hive
Issue Type: Test
Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Attachments: HIVE-10939.patch

It fails on Windows OS currently.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS
[ https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573934#comment-14573934 ]

Eugene Koifman commented on HIVE-10910:
---------------------------------------

The failures are not related.

Alter table drop partition queries in encrypted zone failing to remove data from HDFS
-------------------------------------------------------------------------------------

Key: HIVE-10910
URL: https://issues.apache.org/jira/browse/HIVE-10910
Project: Hive
Issue Type: Sub-task
Components: Hive
Affects Versions: 1.2.0
Reporter: Aswathy Chellammal Sreekumar
Assignee: Eugene Koifman
Attachments: HIVE-10910.patch

An ALTER TABLE query that drops a partition removes the partition's metadata but fails to remove the data from HDFS:

{noformat}
hive> create table table_1(name string, age int, gpa double) partitioned by (b string) stored as textfile;
OK
Time taken: 0.732 seconds
hive> alter table table_1 add partition (b='2010-10-10');
OK
Time taken: 0.496 seconds
hive> show partitions table_1;
OK
b=2010-10-10
Time taken: 0.781 seconds, Fetched: 1 row(s)
hive> alter table table_1 drop partition (b='2010-10-10');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException Failed to move to trash: hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10
hive> show partitions table_1;
OK
Time taken: 0.622 seconds
{noformat}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values
[ https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vikram Dixit K updated HIVE-10929:
----------------------------------
    Attachment: HIVE-10929.2.patch

In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values
----------------------------------------------------------------------------------------------------

Key: HIVE-10929
URL: https://issues.apache.org/jira/browse/HIVE-10929
Project: Hive
Issue Type: Bug
Components: Tez
Affects Versions: 1.2.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Attachments: HIVE-10929.1.patch, HIVE-10929.2.patch

{code}
create table dummy(i int);
insert into table dummy values (1);
select * from dummy;

create table partunion1(id1 int) partitioned by (part1 string);

set hive.exec.dynamic.partition.mode=nonstrict;
set hive.execution.engine=tez;

explain insert into table partunion1 partition(part1)
select temps.* from (
  select 1 as id1, '2014' as part1 from dummy
  union all
  select 2 as id1, '2014' as part1 from dummy
) temps;

insert into table partunion1 partition(part1)
select temps.* from (
  select 1 as id1, '2014' as part1 from dummy
  union all
  select 2 as id1, '2014' as part1 from dummy
) temps;

select * from partunion1;
{code}

fails.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6791) Support variable substition for Beeline shell command
[ https://issues.apache.org/jira/browse/HIVE-6791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-6791:
-------------------------------
    Attachment: HIVE-6791-beeline-cli.patch

Hi [~xuefuz], [~chinnalalam], could you review the patch? I have created HIVE-10943 to see whether it breaks other functionality. Thank you!

Support variable substition for Beeline shell command
-----------------------------------------------------

Key: HIVE-6791
URL: https://issues.apache.org/jira/browse/HIVE-6791
Project: Hive
Issue Type: New Feature
Components: CLI, Clients
Affects Versions: 0.14.0
Reporter: Xuefu Zhang
Assignee: Ferdinand Xu
Attachments: HIVE-6791-beeline-cli.patch

A follow-up task from HIVE-6694. Similar to HIVE-6570.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10939) Make TestFileDump robust
[ https://issues.apache.org/jira/browse/HIVE-10939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573875#comment-14573875 ]

Gunther Hagleitner commented on HIVE-10939:
-------------------------------------------

+1

Make TestFileDump robust
------------------------

Key: HIVE-10939
URL: https://issues.apache.org/jira/browse/HIVE-10939
Project: Hive
Issue Type: Test
Components: Tests
Affects Versions: 1.3.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Attachments: HIVE-10939.patch

It fails on Windows OS currently.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10943) Beeline-cli: Enable precommit for beelie-cli branch
[ https://issues.apache.org/jira/browse/HIVE-10943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdinand Xu updated HIVE-10943:
--------------------------------
    Attachment: HIVE-10943.patch

Beeline-cli: Enable precommit for beelie-cli branch
---------------------------------------------------

Key: HIVE-10943
URL: https://issues.apache.org/jira/browse/HIVE-10943
Project: Hive
Issue Type: Sub-task
Components: Testing Infrastructure
Reporter: Ferdinand Xu
Assignee: Ferdinand Xu
Priority: Minor
Attachments: HIVE-10943.patch

NO PRECOMMIT TESTS

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10935) LLAP: merge master to branch
[ https://issues.apache.org/jira/browse/HIVE-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin resolved HIVE-10935.
-------------------------------------
    Resolution: Fixed
    Fix Version/s: llap

Done. Needed this for recent commits and for test patch.

LLAP: merge master to branch
----------------------------

Key: HIVE-10935
URL: https://issues.apache.org/jira/browse/HIVE-10935
Project: Hive
Issue Type: Sub-task
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
Fix For: llap

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive
[ https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573687#comment-14573687 ]

Sergey Shelukhin commented on HIVE-10761:
-----------------------------------------

Hello. Immediately after integrating this, I am getting a non-stop stream of NPEs (several a second) in the log when running HS2:

{noformat}
2015-06-04 15:17:13,648 WARN [org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor@6fca5907()]: common.JvmPauseMonitor (JvmPauseMonitor.java:incrementMetricsCounter(205)) - Error Reporting JvmPauseMonitor to Metrics system
java.lang.NullPointerException
        at org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor.incrementMetricsCounter(JvmPauseMonitor.java:203)
        at org.apache.hadoop.hive.common.JvmPauseMonitor$Monitor.run(JvmPauseMonitor.java:195)
        at java.lang.Thread.run(Thread.java:745)
{noformat}

Create codahale-based metrics system for Hive
---------------------------------------------

Key: HIVE-10761
URL: https://issues.apache.org/jira/browse/HIVE-10761
Project: Hive
Issue Type: New Feature
Components: Diagnosability
Reporter: Szehon Ho
Assignee: Szehon Ho
Fix For: 1.3.0
Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, hms-metrics.json

There is a current Hive metrics system that hooks into JMX reporting, but all its measurements and models are custom. This issue is to add another metrics system based on Codahale (i.e. Yammer, Dropwizard), which has the following advantages:
* A well-defined metric model for frequently needed metrics (e.g. JVM metrics)
* Well-defined measurements for all metrics (e.g. max, mean, stddev, mean_rate, etc.)
* Built-in reporting frameworks such as JMX, console, log, and JSON web server

It is used by many projects, including several Apache projects such as Oozie. Overall, monitoring tools should find it easier to understand these common metric, measurement, and reporting models.

The existing metrics subsystem will be kept and can be enabled if backward compatibility is desired.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
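The NPE reported above suggests that the pause monitor assumes a metrics object which may never have been initialized. A common defensive pattern is to make the increment a no-op when the metrics system is absent, rather than throwing on every pause check. This is an illustrative sketch only; the class and field names are hypothetical, not the actual JvmPauseMonitor internals.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: guard an optional metrics counter against null,
// so a missing metrics system degrades to a no-op instead of an NPE storm.
public class PauseCounter {

    private final AtomicLong pauses; // null when no metrics system is configured

    public PauseCounter(AtomicLong pauses) {
        this.pauses = pauses;
    }

    // Increment the pause metric only if metrics are wired up.
    public void reportPause() {
        if (pauses != null) {
            pauses.incrementAndGet();
        }
    }
}
```

With this shape, the monitor thread keeps running unchanged whether or not metrics were configured, and the warning-per-pause log noise disappears.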
[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive
[ https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573620#comment-14573620 ]

Thejas M Nair commented on HIVE-10761:
--------------------------------------

[~szehon] If it is committed only to master, the fix version should be 2.0.0; if it is committed to branch-1 as well, the fix version should also be 1.3.0.

Create codahale-based metrics system for Hive
---------------------------------------------

Key: HIVE-10761
URL: https://issues.apache.org/jira/browse/HIVE-10761
Project: Hive
Issue Type: New Feature
Components: Diagnosability
Reporter: Szehon Ho
Assignee: Szehon Ho
Fix For: 1.3.0
Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, hms-metrics.json

There is a current Hive metrics system that hooks into JMX reporting, but all its measurements and models are custom. This issue is to add another metrics system based on Codahale (i.e. Yammer, Dropwizard), which has the following advantages:
* A well-defined metric model for frequently needed metrics (e.g. JVM metrics)
* Well-defined measurements for all metrics (e.g. max, mean, stddev, mean_rate, etc.)
* Built-in reporting frameworks such as JMX, console, log, and JSON web server

It is used by many projects, including several Apache projects such as Oozie. Overall, monitoring tools should find it easier to understand these common metric, measurement, and reporting models.

The existing metrics subsystem will be kept and can be enabled if backward compatibility is desired.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10551) OOM when running query_89 with vectorization on hybridgrace=false
[ https://issues.apache.org/jira/browse/HIVE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573716#comment-14573716 ] Vikram Dixit K commented on HIVE-10551: --- [~mmccline] for your reference. OOM when running query_89 with vectorization on hybridgrace=false --- Key: HIVE-10551 URL: https://issues.apache.org/jira/browse/HIVE-10551 Project: Hive Issue Type: Bug Reporter: Rajesh Balamohan Attachments: HIVE-10551-explain-plan.log, hive-10551.png, hive_10551.png - TPC-DS Query_89 @ 10 TB scale - Trunk version of Hive + Tez 0.7.0-SNAPSHOT - Additional settings ( hive.vectorized.groupby.maxentries=1024 , tez.runtime.io.sort.factor=200 tez.runtime.io.sort.mb=1800 hive.tez.container.size=4096 ,hive.mapjoin.hybridgrace.hashtable=false ) Will attach the profiler snapshot asap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument
[ https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-10427: -- Labels: TODOC2.0 (was: TODOC1.3) collect_list() and collect_set() should accept struct types as argument --- Key: HIVE-10427 URL: https://issues.apache.org/jira/browse/HIVE-10427 Project: Hive Issue Type: Wish Components: UDF Reporter: Alexander Behm Assignee: Chao Sun Labels: TODOC2.0 Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, HIVE-10427.3.patch, HIVE-10427.4.patch The collect_list() and collect_set() functions currently only accept scalar argument types. It would be very useful if these functions could also accept struct argument types for creating nested data from flat data. For example, suppose I wanted to create a nested customers/orders table from two flat tables, customers and orders. Then it'd be very convenient to write something like this: {code} insert into table nested_customers_orders select c.*, collect_list(named_struct('oid', o.oid, 'order_date', o.date, ...)) from customers c inner join orders o on (c.cid = o.cid) group by c.cid {code} Thank you for your consideration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10936) incorrect result set when hive.vectorized.execution.enabled = true with predicate casting to CHAR or VARCHAR
[ https://issues.apache.org/jira/browse/HIVE-10936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] N Campbell updated HIVE-10936: -- Attachment: GO_TIME_DIM.zip incorrect result set when hive.vectorized.execution.enabled = true with predicate casting to CHAR or VARCHAR Key: HIVE-10936 URL: https://issues.apache.org/jira/browse/HIVE-10936 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 0.14.0 Environment: In this case using HDP install of Hive - 0.14.0.2.2.4.2-2 Reporter: N Campbell Attachments: GO_TIME_DIM.zip Query returns data when set hive.vectorized.execution.enabled = false -or- if target of CAST is STRING and not CHAR/VARCHAR set hive.vectorized.execution.enabled = true; select `GO_TIME_DIM`.`day_key` from `gosalesdw1021`.`go_time_dim` `GO_TIME_DIM` where CAST(`GO_TIME_DIM`.`current_year` AS CHAR(4)) = '2010' group by `GO_TIME_DIM`.`day_key`; create table GO_TIME_DIM ( DAY_KEY int , DAY_DATE timestamp , MONTH_KEY int , CURRENT_MONTH smallint , MONTH_NUMBER int , QUARTER_KEY int , CURRENT_QUARTER smallint , CURRENT_YEAR smallint , DAY_OF_WEEK smallint , DAY_OF_MONTH smallint , DAYS_IN_MONTH smallint , DAY_OF_YEAR smallint , WEEK_OF_MONTH smallint , WEEK_OF_QUARTER smallint , WEEK_OF_YEAR smallint , MONTH_EN string , WEEKDAY_EN string , MONTH_DE string , WEEKDAY_DE string , MONTH_FR string , WEEKDAY_FR string , MONTH_JA string , WEEKDAY_JA string , MONTH_AR string , WEEKDAY_AR string , MONTH_CS string , WEEKDAY_CS string , MONTH_DA string , WEEKDAY_DA string , MONTH_EL string , WEEKDAY_EL string , MONTH_ES string , WEEKDAY_ES string , MONTH_FI string , WEEKDAY_FI string , MONTH_HR string , WEEKDAY_HR string , MONTH_HU string , WEEKDAY_HU string , MONTH_ID string , WEEKDAY_ID string , MONTH_IT string , WEEKDAY_IT string , MONTH_KK string , WEEKDAY_KK string , MONTH_KO string , WEEKDAY_KO string , MONTH_MS string , WEEKDAY_MS string , MONTH_NL string , WEEKDAY_NL string , MONTH_NO string , WEEKDAY_NO string , MONTH_PL string , 
WEEKDAY_PL string , MONTH_PT string , WEEKDAY_PT string , MONTH_RO string , WEEKDAY_RO string , MONTH_RU string , WEEKDAY_RU string , MONTH_SC string , WEEKDAY_SC string , MONTH_SL string , WEEKDAY_SL string , MONTH_SV string , WEEKDAY_SV string , MONTH_TC string , WEEKDAY_TC string , MONTH_TH string , WEEKDAY_TH string , MONTH_TR string , WEEKDAY_TR string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' LINES TERMINATED BY '\n' STORED AS TEXTFILE LOCATION '../GO_TIME_DIM'; Then create an ORC equivalent table and load it insert overwrite table GO_TIME_DIM select * from TEXT.GO_TIME_DIM ; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10874) Fail in TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2.q due to duplicate column name
[ https://issues.apache.org/jira/browse/HIVE-10874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10874: -- Fix Version/s: 1.2.1 Fail in TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2.q due to duplicate column name - Key: HIVE-10874 URL: https://issues.apache.org/jira/browse/HIVE-10874 Project: Hive Issue Type: Bug Reporter: Jesus Camacho Rodriguez Assignee: Jesus Camacho Rodriguez Fix For: 1.2.1 Attachments: HIVE-10874.01.patch, HIVE-10874.patch Aggregate operators may derive row types with duplicate column names. The reason is that the column names for grouping sets columns and aggregation columns might be generated automatically, but we do not check whether the column name already exists in the same row. This error can be reproduced by TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2.q, which fails with the following trace: {code} junit.framework.AssertionFailedError: Unexpected exception java.lang.AssertionError: RecordType(BIGINT $f1, BIGINT $f1) at org.apache.calcite.rel.core.Project.isValid(Project.java:200) at org.apache.calcite.rel.core.Project.init(Project.java:85) at org.apache.calcite.rel.core.Project.init(Project.java:91) at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject.init(HiveProject.java:70) at org.apache.hadoop.hive.ql.optimizer.calcite.reloperators.HiveProject.create(HiveProject.java:103) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.PlanModifierForASTConv.introduceDerivedTable(PlanModifierForASTConv.java:211) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.PlanModifierForASTConv.convertOpTree(PlanModifierForASTConv.java:67) at org.apache.hadoop.hive.ql.optimizer.calcite.translator.ASTConverter.convert(ASTConverter.java:94) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.getOptimizedAST(CalcitePlanner.java:617) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.genOPTree(CalcitePlanner.java:248) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10108) at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:208) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227) ... {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
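The HIVE-10874 failure above boils down to auto-generated column names colliding within one row type: the assertion shows `RecordType(BIGINT $f1, BIGINT $f1)`, i.e. two columns both named `$f1`. A minimal sketch of the kind of uniquification the ticket calls for — check each generated name against those already emitted for the row and add a suffix on collision. Class name and suffix scheme here are illustrative, not the actual patch (Calcite has its own uniquify utilities):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class UniquifyNames {
    // Make every name in the list unique by appending "_<n>" on collision.
    public static List<String> uniquify(List<String> names) {
        Set<String> seen = new HashSet<>();
        List<String> out = new ArrayList<>();
        for (String n : names) {
            String candidate = n;
            int i = 0;
            // seen.add() returns false if the name was already used in this row
            while (!seen.add(candidate)) {
                candidate = n + "_" + i++;
            }
            out.add(candidate);
        }
        return out;
    }

    public static void main(String[] args) {
        // The colliding names from the RecordType in the stack trace above
        System.out.println(uniquify(Arrays.asList("$f1", "$f1")));  // [$f1, $f1_0]
    }
}
```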
[jira] [Updated] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures
[ https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Williams updated HIVE-10410: Attachment: HIVE-10410.1.patch Apparent race condition in HiveServer2 causing intermittent query failures -- Key: HIVE-10410 URL: https://issues.apache.org/jira/browse/HIVE-10410 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.1 Environment: CDH 5.3.3 CentOS 6.4 Reporter: Richard Williams Attachments: HIVE-10410.1.patch On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC occasionally trigger odd Thrift exceptions with messages such as Read a negative frame size (-2147418110)! or out of sequence response in HiveServer2's connections to the metastore. For certain metastore calls (for example, showDatabases), these Thrift exceptions are converted to MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient from retrying these calls and thus causes the failure to bubble out to the JDBC client. Note that as far as we can tell, this issue appears to only affect queries that are submitted with the runAsync flag on TExecuteStatementReq set to true (which, in practice, seems to mean all JDBC queries), and it appears to only manifest when HiveServer2 is using the new HTTP transport mechanism. When both these conditions hold, we are able to fairly reliably reproduce the issue by spawning about 100 simple, concurrent hive queries (we have been using show databases), two or three of which typically fail. However, when either of these conditions do not hold, we are no longer able to reproduce the issue. Some example stack traces from the HiveServer2 logs: {noformat} 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: org.apache.thrift.transport.TTransportException Read a negative frame size (-2147418110)! org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)! 
at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435) at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414) at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837) at org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60) at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90) at com.sun.proxy.$Proxy6.getDatabases(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139) at org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554) at 
org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145) at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200) at
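Per the comment at the top of this digest, the HIVE-10410 patch stops async pool threads from inheriting the caller's Hive object, so each thread lazily creates its own (Thrift clients are not thread-safe, so sharing one interleaves frames and produces the negative-frame-size and out-of-sequence errors above). A sketch of the thread-local pattern involved; `FakeClient` is illustrative, standing in for the non-thread-safe metastore client:

```java
public class ThreadLocalClientDemo {
    static class FakeClient { }  // stands in for the non-thread-safe metastore client

    // Each thread that calls get() lazily receives its own client instance.
    private static final ThreadLocal<FakeClient> CLIENT =
        ThreadLocal.withInitial(FakeClient::new);

    public static FakeClient get() {
        return CLIENT.get();
    }

    // Two pool threads fetch "their" client; the instances must be distinct,
    // so their wire traffic can never interleave on one transport.
    public static boolean distinctPerThread() {
        final Object[] seen = new Object[2];
        Thread t1 = new Thread(() -> seen[0] = get());
        Thread t2 = new Thread(() -> seen[1] = get());
        t1.start(); t2.start();
        try {
            t1.join(); t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0] != null && seen[1] != null && seen[0] != seen[1];
    }

    public static void main(String[] args) {
        System.out.println(distinctPerThread());  // true
    }
}
```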
[jira] [Commented] (HIVE-9664) Hive add jar command should be able to download and add jars from a repository
[ https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573744#comment-14573744 ] Anthony Hsu commented on HIVE-9664: --- Thanks. Looks good. Hive add jar command should be able to download and add jars from a repository Key: HIVE-9664 URL: https://issues.apache.org/jira/browse/HIVE-9664 Project: Hive Issue Type: Improvement Affects Versions: 0.14.0 Reporter: Anant Nag Assignee: Anant Nag Labels: TODOC1.2, hive, patch Fix For: 1.2.0 Attachments: HIVE-9664.4.patch, HIVE-9664.5.patch, HIVE-9664.patch, HIVE-9664.patch, HIVE-9664.patch Currently Hive's add jar command takes a local path to the dependency jar. This clutters the local file-system as users may forget to remove this jar later It would be nice if Hive supported a Gradle like notation to download the jar from a repository. Example: add jar org:module:version It should also be backward compatible and should take jar from the local file-system as well. RB: https://reviews.apache.org/r/31628/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
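HIVE-9664 proposes accepting `add jar org:module:version` while staying backward compatible with local paths, so the command must first decide which form it was given. A hedged sketch of that dispatch decision — the class name and heuristic are illustrative, not the committed implementation (see the linked review board request for the real one):

```java
public class JarCoordinate {
    // Heuristic: a Gradle/Ivy coordinate has exactly three non-empty ':'-separated
    // parts and no path separator; anything else is treated as a local file path.
    public static boolean isIvyCoordinate(String s) {
        if (s.contains("/")) {
            return false;  // looks like a filesystem path
        }
        String[] parts = s.split(":");
        return parts.length == 3
            && !parts[0].isEmpty() && !parts[1].isEmpty() && !parts[2].isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isIvyCoordinate("org.apache.hive:hive-exec:1.2.0")); // true
        System.out.println(isIvyCoordinate("/tmp/my-udfs.jar"));                // false
    }
}
```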
[jira] [Updated] (HIVE-10748) Replace StringBuffer with StringBuilder where possible
[ https://issues.apache.org/jira/browse/HIVE-10748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexander Pivovarov updated HIVE-10748: --- Fix Version/s: 2.0.0 Replace StringBuffer with StringBuilder where possible -- Key: HIVE-10748 URL: https://issues.apache.org/jira/browse/HIVE-10748 Project: Hive Issue Type: Improvement Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Priority: Minor Fix For: 1.3.0, 2.0.0 Attachments: HIVE-10748.1.patch, HIVE-10748.1.patch, HIVE-10748.2.patch I found 40 places in Hive where new StringBuffer( is used. Where possible, it is recommended that StringBuilder be used in preference to StringBuffer as it will be faster under most implementations https://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html -- This message was sent by Atlassian JIRA (v6.3.4#6332)
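The HIVE-10748 change is mechanical: `StringBuilder` has the same API as `StringBuffer` minus per-call synchronization, so for method-local accumulators it is a drop-in replacement with identical output. A small before/after sketch (method names are illustrative, not from the patch):

```java
public class BuilderSwap {
    // Before: StringBuffer synchronizes every append, needless for a local variable.
    public static String joinWithBuffer(String[] parts) {
        StringBuffer sb = new StringBuffer();
        for (String p : parts) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(p);
        }
        return sb.toString();
    }

    // After: StringBuilder, same API and output, no lock acquisition per call.
    public static String joinWithBuilder(String[] parts) {
        StringBuilder sb = new StringBuilder();
        for (String p : parts) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"a", "b", "c"};
        System.out.println(joinWithBuffer(parts));   // a, b, c
        System.out.println(joinWithBuilder(parts));  // a, b, c
    }
}
```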
[jira] [Updated] (HIVE-10937) LLAP: make ObjectCache for plans work properly in the daemon
[ https://issues.apache.org/jira/browse/HIVE-10937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10937: Fix Version/s: llap LLAP: make ObjectCache for plans work properly in the daemon Key: HIVE-10937 URL: https://issues.apache.org/jira/browse/HIVE-10937 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap There's perf hit otherwise, esp. when stupid planner creates 1009 reducers of 4Mb each. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10907) Hive on Tez: Classcast exception in some cases with SMB joins
[ https://issues.apache.org/jira/browse/HIVE-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573591#comment-14573591 ] Hive QA commented on HIVE-10907: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12737668/HIVE-10907.4.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8998 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4179/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4179/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4179/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12737668 - PreCommit-HIVE-TRUNK-Build Hive on Tez: Classcast exception in some cases with SMB joins - Key: HIVE-10907 URL: https://issues.apache.org/jira/browse/HIVE-10907 Project: Hive Issue Type: Bug Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10907.1.patch, HIVE-10907.2.patch, HIVE-10907.3.patch, HIVE-10907.4.patch In cases where there is a mix of Map side work and reduce side work, we get a classcast exception because we assume homogeneity in the code. We need to fix this correctly. For now this is a workaround. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive
[ https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573613#comment-14573613 ] Mithun Radhakrishnan commented on HIVE-10761: - Hey, Sush, Szehon. I can confirm that Yahoo cares about HS2 metrics. :p I'm not familiar with codehale, but if it works with JMX, that's cool. Lemme do some homework. Thanks for the heads-up and the nifty addition, chaps. Create codahale-based metrics system for Hive - Key: HIVE-10761 URL: https://issues.apache.org/jira/browse/HIVE-10761 Project: Hive Issue Type: New Feature Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.3.0 Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, hms-metrics.json There is a current Hive metrics system that hooks up to a JMX reporting, but all its measurements, models are custom. This is to make another metrics system that will be based on Codahale (ie yammer, dropwizard), which has the following advantage: * Well-defined metric model for frequently-needed metrics (ie JVM metrics) * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, etc), * Built-in reporting frameworks like JMX, Console, Log, JSON webserver It is used for many projects, including several Apache projects like Oozie. Overall, monitoring tools should find it easier to understand these common metric, measurement, reporting models. The existing metric subsystem will be kept and can be enabled if backward compatibility is desired. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive
[ https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573617#comment-14573617 ] Mithun Radhakrishnan commented on HIVE-10761: - Question: Are we proposing to deprecate the old metrics system on trunk? What release are we considering deprecation and removal? Create codahale-based metrics system for Hive - Key: HIVE-10761 URL: https://issues.apache.org/jira/browse/HIVE-10761 Project: Hive Issue Type: New Feature Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.3.0 Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, hms-metrics.json There is a current Hive metrics system that hooks up to a JMX reporting, but all its measurements, models are custom. This is to make another metrics system that will be based on Codahale (ie yammer, dropwizard), which has the following advantage: * Well-defined metric model for frequently-needed metrics (ie JVM metrics) * Well-defined measurements for all metrics (ie max, mean, stddev, mean_rate, etc), * Built-in reporting frameworks like JMX, Console, Log, JSON webserver It is used for many projects, including several Apache projects like Oozie. Overall, monitoring tools should find it easier to understand these common metric, measurement, reporting models. The existing metric subsystem will be kept and can be enabled if backward compatibility is desired. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument
[ https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-10427: -- Labels: TODOC1.3 (was: ) collect_list() and collect_set() should accept struct types as argument --- Key: HIVE-10427 URL: https://issues.apache.org/jira/browse/HIVE-10427 Project: Hive Issue Type: Wish Components: UDF Reporter: Alexander Behm Assignee: Chao Sun Labels: TODOC1.3 Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, HIVE-10427.3.patch, HIVE-10427.4.patch The collect_list() and collect_set() functions currently only accept scalar argument types. It would be very useful if these functions could also accept struct argument types for creating nested data from flat data. For example, suppose I wanted to create a nested customers/orders table from two flat tables, customers and orders. Then it'd be very convenient to write something like this: {code} insert into table nested_customers_orders select c.*, collect_list(named_struct('oid', o.oid, 'order_date', o.date, ...)) from customers c inner join orders o on (c.cid = o.cid) group by c.cid {code} Thank you for your consideration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10165) Improve hive-hcatalog-streaming extensibility and support updates and deletes.
[ https://issues.apache.org/jira/browse/HIVE-10165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573630#comment-14573630 ] Alan Gates commented on HIVE-10165: --- package.html: * this is excellent documentation. We may want to move much of this into the wiki for users. * Compactions are done by the metastore server, not HiveServer2. * Currently, when issuing queries on streaming tables, query client must set hive.input.format = org.apache.hadoop.hive.ql.io.HiveInputFormat hive.vectorized.execution.enabled = false The above client settings are a temporary requirement and the intention is to drop the need for them in the near future. I don't believe either of those are true anymore (as of Hive 0.14). LockImpl: * internalAcquire: Why do you recreate the connection to the metastore each time through the loop? This seems expensive. Same comment for building the lock request. This shouldn't change as you go through the loop. * internalRelease: You've built in handling for releasing locks that are not part of transactions. When do you envision users locking something that isn't part of a transaction? Since this is doing write operations I would assume you'll always have a transaction. MutatorDestination: This appears to be a simple struct that records data about a table; why have it as an interface with an impl? TransactionImpl: * Why do commit() and abort() release the locks? Since these locks are part of a transaction they will always be released when the transaction is committed or aborted. MutatorClient: * Why is Lock external to this class? It seems like Lock is a component of this class. Or do you envision users using one Lock object to manage multiple MutatorClients? MutatorCoordinator: * Comments in the class javadoc: it's origTxnId, bucketId that controls the ordering, not lastTxnId, since origTxnId is immutable.
* In the constructor, why are you passing in CreatePartitionHelper and SequenceValidator when there's only one instance of these? * resetMutator: this code is closing the Mutator every time you switch Mutators. But if I understand correctly this is going to result in writing a footer in the ORC file. You're going to end up with a thousand tiny stripes in your files. That is not what you want. You do need to make sure you don't have too many open at a time to avoid OOMs and too-many-open-file-handles errors. But you'll need to keep a list of which ones are open and then close them on an LRU basis (or maybe pick the one with the most records since it will give you the best stripe size) as you need to open more, rather than closing each one each time. [~owen.omalley] comments? CreatePartitionHelper: * createPartitionIfNotExists: Why are you running the Driver class here? Why not call IMetaStoreClient.addPartition()? That would be much lighter weight. Hive doesn't currently have a deadlock detector. ([~ekoifman] is working on fixing this as part of HIVE-9675.) The way this is written it could deadlock with other stream writers or with SQL users. This code will eventually recover since it only tries to lock so many times and then gives up. I'm not sure there's anything to do about this for now, but it should be documented. Improve hive-hcatalog-streaming extensibility and support updates and deletes. -- Key: HIVE-10165 URL: https://issues.apache.org/jira/browse/HIVE-10165 Project: Hive Issue Type: Improvement Components: HCatalog Affects Versions: 1.2.0 Reporter: Elliot West Assignee: Elliot West Labels: streaming_api Attachments: HIVE-10165.0.patch, HIVE-10165.4.patch, HIVE-10165.5.patch, mutate-system-overview.png h3. Overview I'd like to extend the [hive-hcatalog-streaming|https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest] API so that it also supports the writing of record updates and deletes in addition to the already supported inserts. h3.
Motivation We have many Hadoop processes outside of Hive that merge changed facts into existing datasets. Traditionally we achieve this by: reading in a ground-truth dataset and a modified dataset, grouping by a key, sorting by a sequence and then applying a function to determine inserted, updated, and deleted rows. However, in our current scheme we must rewrite all partitions that may potentially contain changes. In practice the number of mutated records is very small when compared with the records contained in a partition. This approach results in a number of operational issues: * Excessive amount of write activity required for small data changes. * Downstream applications cannot robustly read these datasets while they are being updated. * Due to scale of
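Alan Gates's resetMutator comment suggests keeping mutators open and closing them on an LRU basis instead of closing on every partition/bucket switch (which writes a tiny ORC stripe each time). A minimal sketch of that idea using `java.util.LinkedHashMap` in access order with `removeEldestEntry`; the `Mutator`/`MutatorCache` names are illustrative, not the hive-hcatalog-streaming API:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MutatorCache {
    public interface Mutator { void close(); }

    private final int maxOpen;
    private final List<String> closed = new ArrayList<>();
    private final LinkedHashMap<String, Mutator> open;

    public MutatorCache(int maxOpen) {
        this.maxOpen = maxOpen;
        // accessOrder=true: iteration order is least-recently-used first
        this.open = new LinkedHashMap<String, Mutator>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Mutator> eldest) {
                if (size() > MutatorCache.this.maxOpen) {
                    eldest.getValue().close();  // one big stripe instead of many tiny ones
                    return true;                // evict from the map
                }
                return false;
            }
        };
    }

    // Reuse the open mutator for this partition, or open one, possibly
    // closing the least recently used mutator to stay under the cap.
    public Mutator get(String partition) {
        Mutator m = open.get(partition);  // also refreshes LRU recency
        if (m == null) {
            m = () -> closed.add(partition);  // real impl would flush the ORC footer
            open.put(partition, m);           // may trigger eviction of the LRU entry
        }
        return m;
    }

    public List<String> closedOrder() { return closed; }

    public static void main(String[] args) {
        MutatorCache cache = new MutatorCache(2);
        for (String p : new String[]{"p1", "p2", "p3", "p2", "p4"}) {
            cache.get(p);
        }
        System.out.println(cache.closedOrder());  // [p1, p3]: LRU mutators closed first
    }
}
```

Note that touching "p2" again keeps it open, so "p3" is the one closed when "p4" arrives — the write pattern, not arrival order, decides which file handle is given up.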
[jira] [Commented] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument
[ https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573788#comment-14573788 ] Lefty Leverenz commented on HIVE-10427: --- Doc note: Adding TODOC2.0 label since this was committed to master today. If it is also committed to branch-1, please replace TODOC2.0 with TODOC1.3. Documentation for collect_list() and collect_set() is in the UDAF section on the UDFs page: * [Built-in Aggregate Functions (UDAF) | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inAggregateFunctions(UDAF)] collect_list() and collect_set() should accept struct types as argument --- Key: HIVE-10427 URL: https://issues.apache.org/jira/browse/HIVE-10427 Project: Hive Issue Type: Wish Components: UDF Reporter: Alexander Behm Assignee: Chao Sun Labels: TODOC2.0 Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, HIVE-10427.3.patch, HIVE-10427.4.patch The collect_list() and collect_set() functions currently only accept scalar argument types. It would be very useful if these functions could also accept struct argument types for creating nested data from flat data. For example, suppose I wanted to create a nested customers/orders table from two flat tables, customers and orders. Then it'd be very convenient to write something like this: {code} insert into table nested_customers_orders select c.*, collect_list(named_struct('oid', o.oid, 'order_date', o.date, ...)) from customers c inner join orders o on (c.cid = o.cid) group by c.cid {code} Thank you for your consideration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS
[ https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-10910: -- Attachment: HIVE-10910.patch [~sushanth] or [~hagleitn] could you review please? Alter table drop partition queries in encrypted zone failing to remove data from HDFS - Key: HIVE-10910 URL: https://issues.apache.org/jira/browse/HIVE-10910 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: 1.2.0 Reporter: Aswathy Chellammal Sreekumar Assignee: Eugene Koifman Attachments: HIVE-10910.patch Alter table query trying to drop partition removes metadata of partition but fails to remove the data from HDFS hive> create table table_1(name string, age int, gpa double) partitioned by (b string) stored as textfile; OK Time taken: 0.732 seconds hive> alter table table_1 add partition (b='2010-10-10'); OK Time taken: 0.496 seconds hive> show partitions table_1; OK b=2010-10-10 Time taken: 0.781 seconds, Fetched: 1 row(s) hive> alter table table_1 drop partition (b='2010-10-10'); FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException Failed to move to trash: hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10 hive> show partitions table_1; OK Time taken: 0.622 seconds -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10872) LLAP: make sure tests pass
[ https://issues.apache.org/jira/browse/HIVE-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573558#comment-14573558 ] Sergey Shelukhin commented on HIVE-10872: - Btw, I ran main (non-itest) tests locally, and they all pass except some obscure avro test that is probably just related to running on mac LLAP: make sure tests pass -- Key: HIVE-10872 URL: https://issues.apache.org/jira/browse/HIVE-10872 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10872.01.patch, HIVE-10872.02.patch, HIVE-10872.03.patch, HIVE-10872.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10872) LLAP: make sure tests pass
[ https://issues.apache.org/jira/browse/HIVE-10872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10872: Attachment: HIVE-10872.03.patch This should build. LLAP: make sure tests pass -- Key: HIVE-10872 URL: https://issues.apache.org/jira/browse/HIVE-10872 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Attachments: HIVE-10872.01.patch, HIVE-10872.02.patch, HIVE-10872.03.patch, HIVE-10872.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573596#comment-14573596 ] Mithun Radhakrishnan commented on HIVE-10754: - I see what we're trying to achieve, but I still need help understanding how this change fixes the problem. (Sorry. :/) Here's the relevant code from {{Job.java}} from Hadoop 2.6. {code:java|title=Job.java|borderStyle=solid|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#CE} @Deprecated public Job(Configuration conf) throws IOException { this(new JobConf(conf)); } Job(JobConf conf) throws IOException { super(conf, null); // propagate existing user credentials to job this.credentials.mergeAll(this.ugi.getCredentials()); this.cluster = null; } public static Job getInstance(Configuration conf) throws IOException { // create with a null Cluster JobConf jobConf = new JobConf(conf); return new Job(jobConf); } {code} # The current implementation of {{HCatLoader.setLocation()}} calls {{new Job( Configuration )}}, which clones the {{JobConf}} inline and calls the private constructor {{Job(JobConf)}}. # Your improved implementation of {{HCatLoader.setLocation()}} calls {{Job.getInstance()}}. This method clones the {{JobConf}} explicitly, and then calls the private constructor {{Job(jobConf)}}. bq. These two are different (JobConf is not cloned when we call new Job(conf)). Both of these seem identical in effect to me. :/ There's no way for {{HCatLoader.setLocation()}} to call the {{Job(JobConf)}} constructor, because it's package-private, right? 
Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader Key: HIVE-10754 URL: https://issues.apache.org/jira/browse/HIVE-10754 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10754.patch
{noformat}
Create table tbl1 (key string, value string) stored as rcfile;
Create table tbl2 (key string, value string);
insert into tbl1 values( '1', '111');
insert into tbl2 values('1', '2');
{noformat}
Pig script:
{noformat}
src_tbl1 = FILTER tbl1 BY (key == '1');
prj_tbl1 = FOREACH src_tbl1 GENERATE key as tbl1_key, value as tbl1_value, '333' as tbl1_v1;
src_tbl2 = FILTER tbl2 BY (key == '1');
prj_tbl2 = FOREACH src_tbl2 GENERATE key as tbl2_key, value as tbl2_value;
dump prj_tbl1;
dump prj_tbl2;
result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
prj_result = FOREACH result GENERATE prj_tbl1::tbl1_key AS key1, prj_tbl1::tbl1_value AS value1, prj_tbl1::tbl1_v1 AS v1, prj_tbl2::tbl2_key AS key2, prj_tbl2::tbl2_value AS value2;
dump prj_result;
{noformat}
The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2). We need to clone the job instance in HCatLoader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
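The shared-vs-cloned Job question above is easier to see outside Hadoop. Below is a minimal, self-contained sketch of the failure mode: if two loaders share one mutable configuration, the second loader's settings clobber the first's, while an explicit copy isolates them. This uses `java.util.Properties` as a stand-in for Hadoop's `Configuration` — it is an illustration of the mechanism, not the actual HCatLoader code, and the property name is hypothetical.

```java
import java.util.Properties;

// Illustration only: Properties stands in for Hadoop's mutable Configuration.
public class CloneDemo {
    // Simulates a loader configuring its input path. When `clone` is false the
    // loader writes into the shared conf; when true it writes into a private copy.
    static String plan(Properties shared, boolean clone) {
        Properties conf = clone ? (Properties) shared.clone() : shared;
        conf.setProperty("mapreduce.input.dir", "/data/tbl2"); // second loader's setting
        return shared.getProperty("mapreduce.input.dir");      // what the first loader now sees
    }

    public static void main(String[] args) {
        Properties shared = new Properties();
        shared.setProperty("mapreduce.input.dir", "/data/tbl1");
        System.out.println(plan(shared, false)); // /data/tbl2 — tbl1's setting clobbered
        shared.setProperty("mapreduce.input.dir", "/data/tbl1");
        System.out.println(plan(shared, true));  // /data/tbl1 — the copy isolates the change
    }
}
```

This is why cloning the Job (and hence its JobConf) per loader matters when two tables are loaded in one Pig script.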
[jira] [Assigned] (HIVE-10779) LLAP: Daemons should shutdown in case of fatal errors
[ https://issues.apache.org/jira/browse/HIVE-10779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth reassigned HIVE-10779: - Assignee: Siddharth Seth LLAP: Daemons should shutdown in case of fatal errors - Key: HIVE-10779 URL: https://issues.apache.org/jira/browse/HIVE-10779 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-10779.1.txt For example, the scheduler loop exiting. Currently they end up getting stuck - while still accepting new work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10779) LLAP: Daemons should shutdown in case of fatal errors
[ https://issues.apache.org/jira/browse/HIVE-10779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-10779: -- Attachment: HIVE-10779.1.txt Patch adds an UncaughtExceptionHandler and a shutdown hook to stop services. LLAP: Daemons should shutdown in case of fatal errors - Key: HIVE-10779 URL: https://issues.apache.org/jira/browse/HIVE-10779 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Attachments: HIVE-10779.1.txt For example, the scheduler loop exiting. Currently they end up getting stuck - while still accepting new work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
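The pattern the patch describes — an `UncaughtExceptionHandler` that turns a fatal thread death into a clean shutdown, plus a JVM shutdown hook that stops services — can be sketched as follows. This is a generic illustration of the mechanism, not the HIVE-10779 patch itself; names like `stopServices` are hypothetical.

```java
// Sketch: a daemon worker whose loop dies with a fatal error. Without a
// handler, the daemon would keep running (and accepting work) with no
// scheduler; the handler initiates shutdown instead.
public class FatalErrorShutdown {
    static volatile boolean servicesStopped = false;

    // Stand-in for stopping the daemon's services (scheduler, executors, ...).
    static void stopServices() { servicesStopped = true; }

    static boolean runAndHandle() {
        Thread worker = new Thread(() -> { throw new IllegalStateException("scheduler loop died"); });
        worker.setUncaughtExceptionHandler((t, e) -> {
            System.err.println("Fatal error in " + t.getName() + ": " + e);
            stopServices(); // shut down instead of getting stuck
        });
        worker.start();
        try {
            worker.join(); // the handler has run by the time join() returns
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
        return servicesStopped;
    }

    public static void main(String[] args) {
        // A shutdown hook covers orderly exits (e.g. SIGTERM) as well.
        Runtime.getRuntime().addShutdownHook(new Thread(FatalErrorShutdown::stopServices));
        System.out.println("services stopped after fatal error: " + runAndHandle());
    }
}
```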
[jira] [Commented] (HIVE-10934) Restore support for DROP PARTITION PURGE
[ https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573730#comment-14573730 ] Hive QA commented on HIVE-10934: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12737678/HIVE-10934.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 9001 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4180/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4180/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4180/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12737678 - PreCommit-HIVE-TRUNK-Build Restore support for DROP PARTITION PURGE Key: HIVE-10934 URL: https://issues.apache.org/jira/browse/HIVE-10934 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10934.patch HIVE-9086 added support for PURGE in {noformat} ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = sayonara) IGNORE PROTECTION PURGE; {noformat} looks like this was accidentally lost in HIVE-10228 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument
[ https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573812#comment-14573812 ] Chao Sun commented on HIVE-10427: - Yes, it should also work for branch-1. I'll commit it to that branch later and update the tag. collect_list() and collect_set() should accept struct types as argument --- Key: HIVE-10427 URL: https://issues.apache.org/jira/browse/HIVE-10427 Project: Hive Issue Type: Wish Components: UDF Reporter: Alexander Behm Assignee: Chao Sun Labels: TODOC2.0 Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, HIVE-10427.3.patch, HIVE-10427.4.patch The collect_list() and collect_set() functions currently only accept scalar argument types. It would be very useful if these functions could also accept struct argument types for creating nested data from flat data. For example, suppose I wanted to create a nested customers/orders table from two flat tables, customers and orders. Then it'd be very convenient to write something like this: {code} insert into table nested_customers_orders select c.*, collect_list(named_struct(oid, o.oid, order_date: o.date...)) from customers c inner join orders o on (c.cid = o.oid) group by c.cid {code} Thank you for your consideration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10934) Restore support for DROP PARTITION PURGE
[ https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573665#comment-14573665 ] Gunther Hagleitner commented on HIVE-10934: --- +1 Restore support for DROP PARTITION PURGE Key: HIVE-10934 URL: https://issues.apache.org/jira/browse/HIVE-10934 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10934.patch HIVE-9086 added support for PURGE in {noformat} ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = sayonara) IGNORE PROTECTION PURGE; {noformat} looks like this was accidentally lost in HIVE-10228 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10779) LLAP: Daemons should shutdown in case of fatal errors
[ https://issues.apache.org/jira/browse/HIVE-10779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-10779. --- Resolution: Fixed Fix Version/s: llap Committed to the llap branch. LLAP: Daemons should shutdown in case of fatal errors - Key: HIVE-10779 URL: https://issues.apache.org/jira/browse/HIVE-10779 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10779.1.txt For example, the scheduler loop exiting. Currently they end up getting stuck - while still accepting new work. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10934) Restore support for DROP PARTITION PURGE
[ https://issues.apache.org/jira/browse/HIVE-10934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573757#comment-14573757 ] Gunther Hagleitner commented on HIVE-10934: --- Test failures are unrelated. Restore support for DROP PARTITION PURGE Key: HIVE-10934 URL: https://issues.apache.org/jira/browse/HIVE-10934 Project: Hive Issue Type: Bug Components: Parser Affects Versions: 1.2.0 Reporter: Eugene Koifman Assignee: Eugene Koifman Attachments: HIVE-10934.patch HIVE-9086 added support for PURGE in {noformat} ALTER TABLE my_doomed_table DROP IF EXISTS PARTITION (part_key = sayonara) IGNORE PROTECTION PURGE; {noformat} looks like this was accidentally lost in HIVE-10228 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10551) OOM when running query_89 with vectorization on hybridgrace=false
[ https://issues.apache.org/jira/browse/HIVE-10551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-10551: -- Assignee: Matt McCline OOM when running query_89 with vectorization on hybridgrace=false --- Key: HIVE-10551 URL: https://issues.apache.org/jira/browse/HIVE-10551 Project: Hive Issue Type: Bug Reporter: Rajesh Balamohan Assignee: Matt McCline Attachments: HIVE-10551-explain-plan.log, hive-10551.png, hive_10551.png - TPC-DS Query_89 @ 10 TB scale - Trunk version of Hive + Tez 0.7.0-SNAPSHOT - Additional settings ( hive.vectorized.groupby.maxentries=1024 , tez.runtime.io.sort.factor=200 tez.runtime.io.sort.mb=1800 hive.tez.container.size=4096 ,hive.mapjoin.hybridgrace.hashtable=false ) Will attach the profiler snapshot asap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-7292: - Assignee: Xuefu Zhang (was: dutianmin) Hive on Spark - Key: HIVE-7292 URL: https://issues.apache.org/jira/browse/HIVE-7292 Project: Hive Issue Type: Improvement Components: Spark Reporter: Xuefu Zhang Assignee: Xuefu Zhang Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5 Attachments: Hive-on-Spark.pdf Spark as an open-source data analytics cluster computing framework has gained significant momentum recently. Many Hive users already have Spark installed as their computing backbone. To take advantage of Hive, they still need to have either MapReduce or Tez on their cluster. This initiative will provide users a new alternative so that they can consolidate their backends. Secondly, providing such an alternative further increases Hive's adoption as it exposes Spark users to a viable, feature-rich, de facto standard SQL tool on Hadoop. Finally, allowing Hive to run on Spark also has performance benefits. Hive queries, especially those involving multiple reducer stages, will run faster, thus improving user experience as Tez does. This is an umbrella JIRA which will cover many coming subtasks. Design doc will be attached here shortly, and will be on the wiki as well. Feedback from the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728
[ https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573177#comment-14573177 ] Ashutosh Chauhan commented on HIVE-10932: - +1 Unit test udf_nondeterministic failure due to HIVE-10728 Key: HIVE-10932 URL: https://issues.apache.org/jira/browse/HIVE-10932 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10932.patch The test udf_nondeterministic.q failed due to the change in HIVE-10728, in which unix_timestamp() is now marked as deterministic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10736) HiveServer2 shutdown of cached tez app-masters is not clean
[ https://issues.apache.org/jira/browse/HIVE-10736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-10736: -- Issue Type: Bug (was: Sub-task) Parent: (was: HIVE-7926) HiveServer2 shutdown of cached tez app-masters is not clean --- Key: HIVE-10736 URL: https://issues.apache.org/jira/browse/HIVE-10736 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Gopal V Assignee: Vikram Dixit K Attachments: HIVE-10736.1.patch, HIVE-10736.2.patch The shutdown process throws concurrent modification exceptions and fails to clean up the app masters per queue.
{code}
2015-05-17 20:24:00,464 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
2015-05-17 20:24:00,464 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
2015-05-17 20:24:00,464 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:CLIService is stopped.
2015-05-17 20:24:00,465 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:HiveServer2 is stopped.
2015-05-17 20:24:00,465 INFO [Thread-6()]: tez.TezSessionState (TezSessionState.java:close(332)) - Closing Tez Session
2015-05-17 20:24:00,466 INFO [Thread-6()]: client.TezClient (TezClient.java:stop(495)) - Shutting down Tez Session, sessionName=HIVE-94cc629d-63bc-490a-a135-af85c0cc0f2e, applicationId=application_1431919257083_0012
2015-05-17 20:24:00,570 ERROR [Thread-6()]: server.HiveServer2 (HiveServer2.java:stop(322)) - Tez session pool manager stop had an error during stop of HiveServer2. Shutting down HiveServer2 anyway.
java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
    at java.util.LinkedList$ListItr.next(LinkedList.java:888)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:187)
    at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:320)
    at org.apache.hive.service.server.HiveServer2$1.run(HiveServer2.java:107)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
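The stack trace points at a list being structurally modified while a live iterator is walking it. A minimal, self-contained reproduction and the usual fix follow — this is illustrative of the failure mode, not the actual HIVE-10736 patch; the session list and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class CmeDemo {
    // Broken: removing from the list inside a for-each invalidates the
    // iterator's expected modCount, so the next it.next() throws CME.
    static boolean closeAllBroken(List<String> sessions) {
        try {
            for (String s : sessions) {
                sessions.remove(s); // structural change behind the iterator's back
            }
            return true;
        } catch (ConcurrentModificationException e) {
            return false; // the iteration did not survive
        }
    }

    // Fixed: remove through the iterator itself (iterating over a copy
    // of the list would work as well).
    static void closeAllFixed(List<String> sessions) {
        for (Iterator<String> it = sessions.iterator(); it.hasNext(); ) {
            it.next();
            it.remove(); // iterator-mediated removal keeps the traversal valid
        }
    }

    public static void main(String[] args) {
        List<String> a = new LinkedList<>(Arrays.asList("s1", "s2", "s3"));
        System.out.println("broken iteration survived: " + closeAllBroken(a));
        List<String> b = new LinkedList<>(Arrays.asList("s1", "s2", "s3"));
        closeAllFixed(b);
        System.out.println("sessions left after fixed close: " + b.size());
    }
}
```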
[jira] [Commented] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values
[ https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573267#comment-14573267 ] Gunther Hagleitner commented on HIVE-10929: --- +1 patch looks good. Test failures: udaf_histogram_numeric, ql_rewrite_gbtoidx_cbo_2, udf_nondeterministic are unrelated. The other differences look like they might actually be correct ... can you validate what the stats should be here? In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values -- Key: HIVE-10929 URL: https://issues.apache.org/jira/browse/HIVE-10929 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10929.1.patch
{code}
create table dummy(i int);
insert into table dummy values (1);
select * from dummy;
create table partunion1(id1 int) partitioned by (part1 string);
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.execution.engine=tez;
explain insert into table partunion1 partition(part1)
select temps.* from (
  select 1 as id1, '2014' as part1 from dummy
  union all
  select 2 as id1, '2014' as part1 from dummy
) temps;
insert into table partunion1 partition(part1)
select temps.* from (
  select 1 as id1, '2014' as part1 from dummy
  union all
  select 2 as id1, '2014' as part1 from dummy
) temps;
select * from partunion1;
{code}
fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak
[ https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573347#comment-14573347 ] Vaibhav Gumashta commented on HIVE-10925: - Test failures are unrelated. Non-static threadlocals in metastore code can potentially cause memory leak --- Key: HIVE-10925 URL: https://issues.apache.org/jira/browse/HIVE-10925 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-10925.1.patch There are many places where non-static threadlocals are used. I can't find a good reason for using them. However, they can potentially result in leaking objects if, for example, they are created in a long-running thread every time the thread handles a new session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
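Why a non-static ThreadLocal accumulates entries: the ThreadLocal object itself is the key into each thread's thread-local map, so every instance carries its own key and therefore its own per-thread entry. A long-lived pool thread that keeps constructing such objects (say, one per session) gains one map entry per instance, which only goes away when the instance becomes unreachable. A self-contained sketch of the mechanism — illustrative, not Hive metastore code:

```java
public class ThreadLocalDemo {
    // One key shared by all instances: at most one entry per thread.
    static final ThreadLocal<Integer> STATIC_TL = ThreadLocal.withInitial(() -> 0);

    // A distinct key per instance: one entry per thread *per instance*.
    // Creating many instances on one long-lived thread accumulates entries.
    final ThreadLocal<Integer> instanceTl = ThreadLocal.withInitial(() -> 0);

    public static void main(String[] args) {
        ThreadLocalDemo a = new ThreadLocalDemo();
        ThreadLocalDemo b = new ThreadLocalDemo();
        a.instanceTl.set(1);                    // entry keyed by a.instanceTl
        System.out.println(b.instanceTl.get()); // 0 — b's key is a separate entry
        STATIC_TL.set(42);
        System.out.println(STATIC_TL.get());    // 42 — the single shared key
    }
}
```

Making the ThreadLocal static collapses all those per-instance keys into one, which is the fix direction the patch describes.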
[jira] [Commented] (HIVE-10427) collect_list() and collect_set() should accept struct types as argument
[ https://issues.apache.org/jira/browse/HIVE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573363#comment-14573363 ] Alexander Pivovarov commented on HIVE-10427: +1 collect_list() and collect_set() should accept struct types as argument --- Key: HIVE-10427 URL: https://issues.apache.org/jira/browse/HIVE-10427 Project: Hive Issue Type: Wish Components: UDF Reporter: Alexander Behm Assignee: Chao Sun Attachments: HIVE-10427.1.patch, HIVE-10427.2.patch, HIVE-10427.3.patch, HIVE-10427.4.patch The collect_list() and collect_set() functions currently only accept scalar argument types. It would be very useful if these functions could also accept struct argument types for creating nested data from flat data. For example, suppose I wanted to create a nested customers/orders table from two flat tables, customers and orders. Then it'd be very convenient to write something like this: {code} insert into table nested_customers_orders select c.*, collect_list(named_struct(oid, o.oid, order_date: o.date...)) from customers c inner join orders o on (c.cid = o.oid) group by c.cid {code} Thank you for your consideration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728
[ https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573238#comment-14573238 ] Hive QA commented on HIVE-10932: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12737565/HIVE-10932.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8998 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2 org.apache.hive.jdbc.TestSSL.testSSLFetchHttp {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4177/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4177/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4177/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12737565 - PreCommit-HIVE-TRUNK-Build Unit test udf_nondeterministic failure due to HIVE-10728 Key: HIVE-10932 URL: https://issues.apache.org/jira/browse/HIVE-10932 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10932.patch The test udf_nondeterministic.q failed due to the change in HIVE-10728, in which unix_timestamp() is now marked as deterministic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10736) HiveServer2 shutdown of cached tez app-masters is not clean
[ https://issues.apache.org/jira/browse/HIVE-10736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gunther Hagleitner updated HIVE-10736: -- Summary: HiveServer2 shutdown of cached tez app-masters is not clean (was: LLAP: HiveServer2 shutdown of cached tez app-masters is not clean) HiveServer2 shutdown of cached tez app-masters is not clean --- Key: HIVE-10736 URL: https://issues.apache.org/jira/browse/HIVE-10736 Project: Hive Issue Type: Sub-task Components: HiveServer2 Reporter: Gopal V Assignee: Vikram Dixit K Attachments: HIVE-10736.1.patch, HIVE-10736.2.patch The shutdown process throws concurrent modification exceptions and fails to clean up the app masters per queue.
{code}
2015-05-17 20:24:00,464 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
2015-05-17 20:24:00,464 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
2015-05-17 20:24:00,464 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:CLIService is stopped.
2015-05-17 20:24:00,465 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:HiveServer2 is stopped.
2015-05-17 20:24:00,465 INFO [Thread-6()]: tez.TezSessionState (TezSessionState.java:close(332)) - Closing Tez Session
2015-05-17 20:24:00,466 INFO [Thread-6()]: client.TezClient (TezClient.java:stop(495)) - Shutting down Tez Session, sessionName=HIVE-94cc629d-63bc-490a-a135-af85c0cc0f2e, applicationId=application_1431919257083_0012
2015-05-17 20:24:00,570 ERROR [Thread-6()]: server.HiveServer2 (HiveServer2.java:stop(322)) - Tez session pool manager stop had an error during stop of HiveServer2. Shutting down HiveServer2 anyway.
java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
    at java.util.LinkedList$ListItr.next(LinkedList.java:888)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:187)
    at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:320)
    at org.apache.hive.service.server.HiveServer2$1.run(HiveServer2.java:107)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10869) fold_case.q failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10869: Fix Version/s: 1.2.1 fold_case.q failing on trunk Key: HIVE-10869 URL: https://issues.apache.org/jira/browse/HIVE-10869 Project: Hive Issue Type: Test Components: Tests Affects Versions: 1.2.1 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.3.0, 1.2.1 Attachments: HIVE-10869.patch Race condition of commits between HIVE-10716 HIVE-10812 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10869) fold_case.q failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10869: Affects Version/s: (was: 1.3.0) 1.2.1 fold_case.q failing on trunk Key: HIVE-10869 URL: https://issues.apache.org/jira/browse/HIVE-10869 Project: Hive Issue Type: Test Components: Tests Affects Versions: 1.2.1 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.3.0, 1.2.1 Attachments: HIVE-10869.patch Race condition of commits between HIVE-10716 HIVE-10812 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10920) LLAP: elevator reads some useless data even if all RGs are eliminated by SARG
[ https://issues.apache.org/jira/browse/HIVE-10920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10920. - Resolution: Fixed committed to branch... {noformat} TezTaskRunner_attempt_1431919257083_3541_2_00_000803_0(attempt_1431919257083_3541_2_00_000803_0)] INFO org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl: Fragment counters for [cn041-10.l42scl.hortonworks.com/172.19.128.41, tpch_orc_snappy_1000.lineitem, 7895722, 3,2]: [ NUM_VECTOR_BATCHES=0, NUM_DECODED_BATCHES=0, SELECTED_ROWGROUPS=0, NUM_ERRORS=0, ROWS_EMITTED=0, METADATA_CACHE_HIT=3, METADATA_CACHE_MISS=0, CACHE_HIT_BYTES=0, CACHE_MISS_BYTES=0, ALLOCATED_BYTES=0, ALLOCATED_USED_BYTES=0, TOTAL_IO_TIME_US=284922, DECODE_TIME_US=0, HDFS_TIME_US=0, CONSUMER_TIME_US=934 ] {noformat} LLAP: elevator reads some useless data even if all RGs are eliminated by SARG - Key: HIVE-10920 URL: https://issues.apache.org/jira/browse/HIVE-10920 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-10920) LLAP: elevator reads some useless data even if all RGs are eliminated by SARG
[ https://issues.apache.org/jira/browse/HIVE-10920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573336#comment-14573336 ] Sergey Shelukhin edited comment on HIVE-10920 at 6/4/15 6:31 PM: - committed to branch... {noformat} TezTaskRunner_attempt_1431919257083_3541_2_00_000803_0(attempt_1431919257083_3541_2_00_000803_0)] INFO org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl: Fragment counters for [snip, tpch_orc_snappy_1000.lineitem, 7895722, 3,2]: [ NUM_VECTOR_BATCHES=0, NUM_DECODED_BATCHES=0, SELECTED_ROWGROUPS=0, NUM_ERRORS=0, ROWS_EMITTED=0, METADATA_CACHE_HIT=3, METADATA_CACHE_MISS=0, CACHE_HIT_BYTES=0, CACHE_MISS_BYTES=0, ALLOCATED_BYTES=0, ALLOCATED_USED_BYTES=0, TOTAL_IO_TIME_US=284922, DECODE_TIME_US=0, HDFS_TIME_US=0, CONSUMER_TIME_US=934 ] {noformat} was (Author: sershe): committed to branch... {noformat} TezTaskRunner_attempt_1431919257083_3541_2_00_000803_0(attempt_1431919257083_3541_2_00_000803_0)] INFO org.apache.hadoop.hive.llap.io.api.impl.LlapIoImpl: Fragment counters for [cn041-10.l42scl.hortonworks.com/172.19.128.41, tpch_orc_snappy_1000.lineitem, 7895722, 3,2]: [ NUM_VECTOR_BATCHES=0, NUM_DECODED_BATCHES=0, SELECTED_ROWGROUPS=0, NUM_ERRORS=0, ROWS_EMITTED=0, METADATA_CACHE_HIT=3, METADATA_CACHE_MISS=0, CACHE_HIT_BYTES=0, CACHE_MISS_BYTES=0, ALLOCATED_BYTES=0, ALLOCATED_USED_BYTES=0, TOTAL_IO_TIME_US=284922, DECODE_TIME_US=0, HDFS_TIME_US=0, CONSUMER_TIME_US=934 ] {noformat} LLAP: elevator reads some useless data even if all RGs are eliminated by SARG - Key: HIVE-10920 URL: https://issues.apache.org/jira/browse/HIVE-10920 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10907) Hive on Tez: Classcast exception in some cases with SMB joins
[ https://issues.apache.org/jira/browse/HIVE-10907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573367#comment-14573367 ] Gunther Hagleitner commented on HIVE-10907: --- I think the check is too restrictive? (i.e. all sides need to have same size of rs) - the commented out code looks better :-) Hive on Tez: Classcast exception in some cases with SMB joins - Key: HIVE-10907 URL: https://issues.apache.org/jira/browse/HIVE-10907 Project: Hive Issue Type: Bug Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10907.1.patch, HIVE-10907.2.patch, HIVE-10907.3.patch In cases where there is a mix of Map side work and reduce side work, we get a classcast exception because we assume homogeneity in the code. We need to fix this correctly. For now this is a workaround. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10736) HiveServer2 shutdown of cached tez app-masters is not clean
[ https://issues.apache.org/jira/browse/HIVE-10736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573274#comment-14573274 ] Gunther Hagleitner commented on HIVE-10736: --- +1 branch-1.2 as well? HiveServer2 shutdown of cached tez app-masters is not clean --- Key: HIVE-10736 URL: https://issues.apache.org/jira/browse/HIVE-10736 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Gopal V Assignee: Vikram Dixit K Attachments: HIVE-10736.1.patch, HIVE-10736.2.patch The shutdown process throws concurrent modification exceptions and fails to clean up the app masters per queue.
{code}
2015-05-17 20:24:00,464 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:OperationManager is stopped.
2015-05-17 20:24:00,464 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:SessionManager is stopped.
2015-05-17 20:24:00,464 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-9()]: tez.TezSessionPoolManager (TezSessionPoolManager.java:close(175)) - Closing tez session default? true
2015-05-17 20:24:00,465 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:CLIService is stopped.
2015-05-17 20:24:00,465 INFO [Thread-6()]: service.AbstractService (AbstractService.java:stop(125)) - Service:HiveServer2 is stopped.
2015-05-17 20:24:00,465 INFO [Thread-6()]: tez.TezSessionState (TezSessionState.java:close(332)) - Closing Tez Session
2015-05-17 20:24:00,466 INFO [Thread-6()]: client.TezClient (TezClient.java:stop(495)) - Shutting down Tez Session, sessionName=HIVE-94cc629d-63bc-490a-a135-af85c0cc0f2e, applicationId=application_1431919257083_0012
2015-05-17 20:24:00,570 ERROR [Thread-6()]: server.HiveServer2 (HiveServer2.java:stop(322)) - Tez session pool manager stop had an error during stop of HiveServer2. Shutting down HiveServer2 anyway.
java.util.ConcurrentModificationException
    at java.util.LinkedList$ListItr.checkForComodification(LinkedList.java:966)
    at java.util.LinkedList$ListItr.next(LinkedList.java:888)
    at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.stop(TezSessionPoolManager.java:187)
    at org.apache.hive.service.server.HiveServer2.stop(HiveServer2.java:320)
    at org.apache.hive.service.server.HiveServer2$1.run(HiveServer2.java:107)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive
[ https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573309#comment-14573309 ] Sushanth Sowmyan commented on HIVE-10761: - This looks very useful, thanks! Also, I would suggest deprecating the current metrics system, targeting removal in a couple of releases. I don't think it has been used much outside of Yahoo - [~mithun] can clarify if they still care about it. Create codahale-based metrics system for Hive - Key: HIVE-10761 URL: https://issues.apache.org/jira/browse/HIVE-10761 Project: Hive Issue Type: New Feature Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Fix For: 1.3.0 Attachments: HIVE-10761.2.patch, HIVE-10761.3.patch, HIVE-10761.4.patch, HIVE-10761.5.patch, HIVE-10761.6.patch, HIVE-10761.patch, hms-metrics.json There is a current Hive metrics system that hooks up to JMX reporting, but all its measurements and models are custom. This is to make another metrics system based on Codahale (i.e. Yammer, Dropwizard), which has the following advantages: * Well-defined metric model for frequently-needed metrics (i.e. JVM metrics) * Well-defined measurements for all metrics (i.e. max, mean, stddev, mean_rate, etc.) * Built-in reporting frameworks like JMX, Console, Log, and a JSON webserver. It is used by many projects, including several Apache projects like Oozie. Overall, monitoring tools should find it easier to understand these common metric, measurement, and reporting models. The existing metric subsystem will be kept and can be enabled if backward compatibility is desired. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
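For readers unfamiliar with the proposal, the point of a standard metric model is that every metric exposes a fixed set of measurements that any reporter (JMX, console, JSON servlet) can publish uniformly. The toy class below illustrates that model in miniature; it is plain JDK code, deliberately not the Codahale API, and the metric name is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of a metric that carries well-defined measurements
// (count, max, mean) which a reporter could publish without knowing
// anything about the thing being measured.
public class MetricSketch {
    private final List<Long> samples = new ArrayList<>();

    public void update(long value) { samples.add(value); }

    public long count() { return samples.size(); }

    public long max() {
        long m = Long.MIN_VALUE;
        for (long v : samples) m = Math.max(m, v);
        return m;
    }

    public double mean() {
        long sum = 0;
        for (long v : samples) sum += v;
        return samples.isEmpty() ? 0.0 : (double) sum / samples.size();
    }

    public static void main(String[] args) {
        MetricSketch apiCallLatency = new MetricSketch();   // hypothetical metric
        apiCallLatency.update(10);
        apiCallLatency.update(30);
        System.out.println(apiCallLatency.count() + " " + apiCallLatency.max()
            + " " + apiCallLatency.mean());   // 2 30 20.0
    }
}
```

The real library adds rates, percentiles, reservoirs, and pluggable reporters on top of this same idea.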
[jira] [Resolved] (HIVE-10914) LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build
[ https://issues.apache.org/jira/browse/HIVE-10914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin resolved HIVE-10914. - Resolution: Fixed the 2nd part committed to branch... hadoop-1 build now works for me even with clean maven repo LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build - Key: HIVE-10914 URL: https://issues.apache.org/jira/browse/HIVE-10914 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap LLAP won't ever work with hadoop 1, so no point in building it -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak
[ https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573361#comment-14573361 ] Vaibhav Gumashta commented on HIVE-10925: - cc [~ekoifman] [~alangates] I'm making the transaction handler a static threadlocal in this patch. Can you review that change? Non-static threadlocals in metastore code can potentially cause memory leak --- Key: HIVE-10925 URL: https://issues.apache.org/jira/browse/HIVE-10925 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-10925.1.patch There are many places where non-static threadlocals are used. I can't seem to find a good logic for using them. However, they can potentially result in leaking objects if for example they are created in a long running thread every time the thread handles a new session. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
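A minimal demonstration of the leak mechanism (hypothetical names, not the metastore code): a non-static ThreadLocal is a distinct ThreadLocalMap key per instance, so a long-running handler thread accumulates one entry, and one retained value, per instance it touches; a static ThreadLocal caps that at one entry per thread. Counting initializer invocations makes the difference visible:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalSketch {
    public static final AtomicInteger created = new AtomicInteger();

    public static class PerInstance {
        // Non-static: a fresh ThreadLocal object per instance, so a long-lived
        // thread gains one ThreadLocalMap entry per instance it touches, and
        // those entries (and their values) linger until the keys are GC'd.
        public final ThreadLocal<int[]> buf = ThreadLocal.withInitial(() -> {
            created.incrementAndGet();
            return new int[1024];
        });
    }

    public static class Shared {
        // Static: one ThreadLocal key for the whole class, so at most one
        // entry per thread regardless of how many Shared instances exist.
        public static final ThreadLocal<int[]> buf = ThreadLocal.withInitial(() -> {
            created.incrementAndGet();
            return new int[1024];
        });
    }

    public static void main(String[] args) {
        for (int i = 0; i < 100; i++) new PerInstance().buf.get();
        System.out.println("non-static initializations: " + created.get()); // 100
        created.set(0);
        for (int i = 0; i < 100; i++) { new Shared(); Shared.buf.get(); }
        System.out.println("static initializations: " + created.get());     // 1
    }
}
```

This is exactly the "created in a long running thread every time the thread handles a new session" scenario the issue describes: with the non-static form, each session's value stays pinned in the thread's map.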
[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573232#comment-14573232 ] Mithun Radhakrishnan commented on HIVE-10754: - Hello, Aihua. I'm all for switching from the deprecated {{Job}} constructor to using {{Job.getInstance()}}. But I am unable to understand how this changes/fixes anything. Both {{new Job(Configuration)}} and {{Job.getInstance(Configuration)}} seem to eventually use the package-private {{Job(JobConf)}} constructor. No later references to {{clone}} or {{job}} have been modified in {{HCatLoader.setLocation()}}. Could you please explain your intention? Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader Key: HIVE-10754 URL: https://issues.apache.org/jira/browse/HIVE-10754 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10754.patch {noformat} Create table tbl1 (key string, value string) stored as rcfile; Create table tbl2 (key string, value string); insert into tbl1 values( '1', '111'); insert into tbl2 values('1', '2'); {noformat} Pig script: {noformat} src_tbl1 = FILTER tbl1 BY (key == '1'); prj_tbl1 = FOREACH src_tbl1 GENERATE key as tbl1_key, value as tbl1_value, '333' as tbl1_v1; src_tbl2 = FILTER tbl2 BY (key == '1'); prj_tbl2 = FOREACH src_tbl2 GENERATE key as tbl2_key, value as tbl2_value; dump prj_tbl1; dump prj_tbl2; result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key); prj_result = FOREACH result GENERATE prj_tbl1::tbl1_key AS key1, prj_tbl1::tbl1_value AS value1, prj_tbl1::tbl1_v1 AS v1, prj_tbl2::tbl2_key AS key2, prj_tbl2::tbl2_value AS value2; dump prj_result; {noformat} The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2). We need to clone the job instance in HCatLoader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
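The underlying cloning question is about aliasing: if two loaders configure the same mutable settings object, the second call clobbers the first loader's settings, which matches the wrong-table symptom in the repro. A toy sketch of that failure and of the defensive-copy fix, using plain JDK maps as stand-ins (this is NOT the real Hadoop Job/Configuration API, and the key name is hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

public class JobCloneSketch {
    public static Map<String, String> configureShared(Map<String, String> conf, String table) {
        conf.put("mapreduce.input.table", table);  // hypothetical key
        return conf;                               // aliased: later calls clobber it
    }

    public static Map<String, String> configureCloned(Map<String, String> conf, String table) {
        Map<String, String> copy = new HashMap<>(conf);  // the "clone the Job" idea
        copy.put("mapreduce.input.table", table);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        Map<String, String> a = configureShared(conf, "tbl1");
        Map<String, String> b = configureShared(conf, "tbl2");
        // a and b are the same object: tbl1's setting is gone.
        System.out.println("shared: " + a.get("mapreduce.input.table"));

        Map<String, String> c = configureCloned(conf, "tbl1");
        Map<String, String> d = configureCloned(conf, "tbl2");
        System.out.println("cloned: " + c.get("mapreduce.input.table"));
    }
}
```

Mithun's review point stands independently of this sketch: swapping the constructor only helps if it actually changes where the copy happens.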
[jira] [Commented] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728
[ https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573255#comment-14573255 ] Aihua Xu commented on HIVE-10932: - Those failures are unrelated to the patch. Unit test udf_nondeterministic failure due to HIVE-10728 Key: HIVE-10932 URL: https://issues.apache.org/jira/browse/HIVE-10932 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10932.patch The test udf_nondeterministic.q failed due to the change in HIVE-10728, in which unix_timestamp() is now marked as deterministic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10869) fold_case.q failing on trunk
[ https://issues.apache.org/jira/browse/HIVE-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573307#comment-14573307 ] Ashutosh Chauhan commented on HIVE-10869: - Cherry-picked on 1.2 as well. fold_case.q failing on trunk Key: HIVE-10869 URL: https://issues.apache.org/jira/browse/HIVE-10869 Project: Hive Issue Type: Test Components: Tests Affects Versions: 1.2.1 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.3.0, 1.2.1 Attachments: HIVE-10869.patch Race condition of commits between HIVE-10716 HIVE-10812 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-10914) LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build
[ https://issues.apache.org/jira/browse/HIVE-10914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reopened HIVE-10914: - I should clean my maven repo before testing this locally :( LLAP: fix hadoop-1 build for good by removing llap-server from hadoop-1 build - Key: HIVE-10914 URL: https://issues.apache.org/jira/browse/HIVE-10914 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: llap LLAP won't ever work with hadoop 1, so no point in building it -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10933) Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns()
[ https://issues.apache.org/jira/browse/HIVE-10933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chaoyu Tang reassigned HIVE-10933: -- Assignee: Chaoyu Tang Hive 0.13 returns precision 0 for varchar(32) from DatabaseMetadata.getColumns() Key: HIVE-10933 URL: https://issues.apache.org/jira/browse/HIVE-10933 Project: Hive Issue Type: Bug Components: API Affects Versions: 0.13.0 Reporter: Son Nguyen Assignee: Chaoyu Tang DatabaseMetadata.getColumns() returns COLUMN_SIZE as 0 for a column defined as varchar(32) or char(32), while ResultSetMetaData.getPrecision() returns the correct value, 32. Here is the program segment that reproduces the issue:
{code}
try {
    statement = connection.createStatement();
    statement.execute("drop table if exists son_table");
    statement.execute("create table son_table( col1 varchar(32) )");
    statement.close();
} catch (Exception e) {
    return;
}
// get column info using metadata
try {
    DatabaseMetaData dmd = null;
    ResultSet resultSet = null;
    dmd = connection.getMetaData();
    resultSet = dmd.getColumns(null, null, "son_table", "col1");
    if (resultSet.next()) {
        String tabName = resultSet.getString("TABLE_NAME");
        String colName = resultSet.getString("COLUMN_NAME");
        String dataType = resultSet.getString("DATA_TYPE");
        String typeName = resultSet.getString("TYPE_NAME");
        int precision = resultSet.getInt("COLUMN_SIZE");
        // output is: colName = col1, dataType = 12, typeName = VARCHAR, precision = 0.
        System.out.format("colName = %s, dataType = %s, typeName = %s, precision = %d.",
            colName, dataType, typeName, precision);
    }
} catch (Exception e) {
    return;
}
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements
[ https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573342#comment-14573342 ] Alexander Pivovarov commented on HIVE-10841: Better to put code/SQL/plans into \{code\}...\{code\} blocks; it will be easier to read. [WHERE col is not null] does not work sometimes for queries with many JOIN statements - Key: HIVE-10841 URL: https://issues.apache.org/jira/browse/HIVE-10841 Project: Hive Issue Type: Bug Components: Query Planning, Query Processor Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0 Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Attachments: HIVE-10841.patch The result from the following SELECT query is 3 rows, but it should be 1 row. I checked it in MySQL - it returned 1 row. To reproduce the issue in Hive: 1. prepare tables
{code}
drop table if exists L;
drop table if exists LA;
drop table if exists FR;
drop table if exists A;
drop table if exists PI;
drop table if exists acct;
create table L as select 4436 id;
create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
create table FR as select 4436 loan_id;
create table A as select 4748 id;
create table PI as select 4415 id;
create table acct as select 4748 aid, 10 acc_n, 122 brn;
insert into table acct values(4748, null, null);
insert into table acct values(4748, null, null);
{code}
2. run SELECT query
{code}
select acct.ACC_N, acct.brn
FROM L
JOIN LA ON L.id = LA.loan_id
JOIN FR ON L.id = FR.loan_id
JOIN A ON LA.aid = A.id
JOIN PI ON PI.id = LA.pi_id
JOIN acct ON A.id = acct.aid
WHERE L.id = 4436 and acct.brn is not null;
{code}
the result is 3 rows
{code}
10	122
NULL	NULL
NULL	NULL
{code}
but it should be 1 row
{code}
10	122
{code}
2.1 explain select
output for hive-1.3.0 MR {code} STAGE DEPENDENCIES: Stage-12 is a root stage Stage-9 depends on stages: Stage-12 Stage-0 depends on stages: Stage-9 STAGE PLANS: Stage: Stage-12 Map Reduce Local Work Alias - Map Local Tables: a Fetch Operator limit: -1 acct Fetch Operator limit: -1 fr Fetch Operator limit: -1 l Fetch Operator limit: -1 pi Fetch Operator limit: -1 Alias - Map Local Operator Tree: a TableScan alias: a Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) acct TableScan alias: acct Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: aid is not null (type: boolean) Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) fr TableScan alias: fr Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (loan_id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) l TableScan alias: l Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) pi TableScan alias: pi Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE
[jira] [Updated] (HIVE-10919) Windows: create table with JsonSerDe failed via beeline unless you add hcatalog core jar to classpath
[ https://issues.apache.org/jira/browse/HIVE-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10919: - Fix Version/s: 1.2.1 Windows: create table with JsonSerDe failed via beeline unless you add hcatalog core jar to classpath - Key: HIVE-10919 URL: https://issues.apache.org/jira/browse/HIVE-10919 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.3.0, 1.2.1 Attachments: HIVE-10919.1.patch NO PRECOMMIT TESTS Before we run HiveServer2 tests, we create tables via beeline, and 'create table' with JsonSerDe failed on Windows. It works on Linux:
{noformat}
0: jdbc:hive2://localhost:10001> create external table all100kjson(
0: jdbc:hive2://localhost:10001> s string,
0: jdbc:hive2://localhost:10001> i int,
0: jdbc:hive2://localhost:10001> d double,
0: jdbc:hive2://localhost:10001> m map<string, string>,
0: jdbc:hive2://localhost:10001> bb array<struct<a: int, b: string>>,
0: jdbc:hive2://localhost:10001> t timestamp)
0: jdbc:hive2://localhost:10001> row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
0: jdbc:hive2://localhost:10001> WITH SERDEPROPERTIES ('timestamp.formats'='yyyy-MM-dd\'T\'HH:mm:ss')
0: jdbc:hive2://localhost:10001> STORED AS TEXTFILE
0: jdbc:hive2://localhost:10001> location '/user/hcat/tests/data/all100kjson';
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe (state=08S01,code=1)
{noformat}
hive.log shows:
{noformat}
2015-05-21 21:59:17,004 ERROR operation.Operation (SQLOperation.java:run(209)) - Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156) at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe at org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3871) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4011) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) ... 
11 more Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101) at org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3865) ... 21 more {noformat} If you do add the hcatalog jar to the classpath, it works: {noformat}0: jdbc:hive2://localhost:10001> add jar hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar; INFO : converting to local hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar INFO : Added
[jira] [Commented] (HIVE-10919) Windows: create table with JsonSerDe failed via beeline unless you add hcatalog core jar to classpath
[ https://issues.apache.org/jira/browse/HIVE-10919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573499#comment-14573499 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-10919: -- Committed to branch-1.2 as well. Windows: create table with JsonSerDe failed via beeline unless you add hcatalog core jar to classpath - Key: HIVE-10919 URL: https://issues.apache.org/jira/browse/HIVE-10919 Project: Hive Issue Type: Bug Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Fix For: 1.3.0, 1.2.1 Attachments: HIVE-10919.1.patch NO PRECOMMIT TESTS Before we run HiveServer2 tests, we create tables via beeline, and 'create table' with JsonSerDe failed on Windows. It works on Linux:
{noformat}
0: jdbc:hive2://localhost:10001> create external table all100kjson(
0: jdbc:hive2://localhost:10001> s string,
0: jdbc:hive2://localhost:10001> i int,
0: jdbc:hive2://localhost:10001> d double,
0: jdbc:hive2://localhost:10001> m map<string, string>,
0: jdbc:hive2://localhost:10001> bb array<struct<a: int, b: string>>,
0: jdbc:hive2://localhost:10001> t timestamp)
0: jdbc:hive2://localhost:10001> row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
0: jdbc:hive2://localhost:10001> WITH SERDEPROPERTIES ('timestamp.formats'='yyyy-MM-dd\'T\'HH:mm:ss')
0: jdbc:hive2://localhost:10001> STORED AS TEXTFILE
0: jdbc:hive2://localhost:10001> location '/user/hcat/tests/data/all100kjson';
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe (state=08S01,code=1)
{noformat}
hive.log shows:
{noformat}
2015-05-21 21:59:17,004 ERROR operation.Operation (SQLOperation.java:run(209)) - Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask.
Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:315) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:156) at org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:71) at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:206) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:218) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Cannot validate serde: org.apache.hive.hcatalog.data.JsonSerDe at org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3871) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4011) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:306) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1650) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1409) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:154) ... 
11 more Caused by: java.lang.ClassNotFoundException: Class org.apache.hive.hcatalog.data.JsonSerDe not found at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101) at org.apache.hadoop.hive.ql.exec.DDLTask.validateSerDe(DDLTask.java:3865) ... 21 more {noformat} If you do add the hcatalog jar to the classpath, it works: {noformat}0: jdbc:hive2://localhost:10001> add jar hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar; INFO : converting to local hdfs:///tmp/testjars/hive-hcatalog-core-1.2.0.2.3.0.0-2079.jar INFO :
[jira] [Assigned] (HIVE-10935) LLAP: merge master to branch
[ https://issues.apache.org/jira/browse/HIVE-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin reassigned HIVE-10935: --- Assignee: Sergey Shelukhin LLAP: merge master to branch Key: HIVE-10935 URL: https://issues.apache.org/jira/browse/HIVE-10935 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10929) In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values
[ https://issues.apache.org/jira/browse/HIVE-10929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573072#comment-14573072 ] Hive QA commented on HIVE-10929: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12737504/HIVE-10929.1.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8999 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union4 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union6 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_leftsemi_mapjoin org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_multi_insert org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_outer_join4 org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4174/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4174/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4174/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing 
org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12737504 - PreCommit-HIVE-TRUNK-Build In Tez mode,dynamic partitioning query with union all fails at moveTask,Invalid partition key values -- Key: HIVE-10929 URL: https://issues.apache.org/jira/browse/HIVE-10929 Project: Hive Issue Type: Bug Components: Tez Affects Versions: 1.2.0 Reporter: Vikram Dixit K Assignee: Vikram Dixit K Attachments: HIVE-10929.1.patch {code} create table dummy(i int); insert into table dummy values (1); select * from dummy; create table partunion1(id1 int) partitioned by (part1 string); set hive.exec.dynamic.partition.mode=nonstrict; set hive.execution.engine=tez; explain insert into table partunion1 partition(part1) select temps.* from ( select 1 as id1, '2014' as part1 from dummy union all select 2 as id1, '2014' as part1 from dummy ) temps; insert into table partunion1 partition(part1) select temps.* from ( select 1 as id1, '2014' as part1 from dummy union all select 2 as id1, '2014' as part1 from dummy ) temps; select * from partunion1; {code} fails. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10410) Apparent race condition in HiveServer2 causing intermittent query failures
[ https://issues.apache.org/jira/browse/HIVE-10410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572981#comment-14572981 ] Chaoyu Tang commented on HIVE-10410: I think not only the Hive object, but also the SessionState and HiveConf shared in child threads, may cause the race issue. Apparent race condition in HiveServer2 causing intermittent query failures -- Key: HIVE-10410 URL: https://issues.apache.org/jira/browse/HIVE-10410 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.1 Environment: CDH 5.3.3 CentOS 6.4 Reporter: Richard Williams On our secure Hadoop cluster, queries submitted to HiveServer2 through JDBC occasionally trigger odd Thrift exceptions with messages such as "Read a negative frame size (-2147418110)!" or "out of sequence response" in HiveServer2's connections to the metastore. For certain metastore calls (for example, showDatabases), these Thrift exceptions are converted to MetaExceptions in HiveMetaStoreClient, which prevents RetryingMetaStoreClient from retrying these calls and thus causes the failure to bubble out to the JDBC client. Note that as far as we can tell, this issue appears to only affect queries that are submitted with the runAsync flag on TExecuteStatementReq set to true (which, in practice, seems to mean all JDBC queries), and it appears to only manifest when HiveServer2 is using the new HTTP transport mechanism. When both these conditions hold, we are able to fairly reliably reproduce the issue by spawning about 100 simple, concurrent hive queries (we have been using show databases), two or three of which typically fail. However, when either of these conditions do not hold, we are no longer able to reproduce the issue. Some example stack traces from the HiveServer2 logs: {noformat} 2015-04-16 13:54:55,486 ERROR hive.log: Got exception: org.apache.thrift.transport.TTransportException Read a negative frame size (-2147418110)!
org.apache.thrift.transport.TTransportException: Read a negative frame size (-2147418110)! at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:435) at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:414) at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37) at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84) at org.apache.hadoop.hive.thrift.TFilterTransport.readAll(TFilterTransport.java:62) at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378) at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297) at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204) at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_databases(ThriftHiveMetastore.java:600) at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_databases(ThriftHiveMetastore.java:587) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getDatabases(HiveMetaStoreClient.java:837) at org.apache.sentry.binding.metastore.SentryHiveMetaStoreClient.getDatabases(SentryHiveMetaStoreClient.java:60) at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:90) at com.sun.proxy.$Proxy6.getDatabases(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getDatabasesByPattern(Hive.java:1139) at org.apache.hadoop.hive.ql.exec.DDLTask.showDatabases(DDLTask.java:2445) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:364) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) at 
org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1554) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1321) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1139) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:962) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:957) at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145) at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69) at
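The failure mode in this issue, multiple async-pool threads driving one non-thread-safe Thrift client, can be modeled in a few lines. The sketch below (toy classes, not the real Thrift or metastore code) uses a fake client whose unsynchronized sequence-id bookkeeping breaks under sharing, which is the flavor of the "out of sequence response" errors; giving each thread its own client, analogous to not propagating the parent thread's Hive object, removes the interleaving:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedClientSketch {
    static class FakeClient {       // NOT thread-safe, like the Thrift client
        private int seqId = 0;
        boolean call() {
            int sent = ++seqId;     // read-modify-write race when shared
            Thread.yield();         // widen the race window
            return sent == seqId;   // false mimics "out of sequence response"
        }
    }

    public static int outOfSequence(boolean perThread, int calls) {
        final FakeClient shared = new FakeClient();
        final ThreadLocal<FakeClient> local = ThreadLocal.withInitial(FakeClient::new);
        final AtomicInteger failures = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < calls; i++) {
            pool.submit(() -> {
                FakeClient c = perThread ? local.get() : shared;
                if (!c.call()) failures.incrementAndGet();
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        // Per-thread clients never interleave, so this is always 0.
        System.out.println("per-thread failures: " + outOfSequence(true, 10000));
        // The shared count is nondeterministic, but typically non-zero.
        System.out.println("shared failures: " + outOfSequence(false, 10000));
    }
}
```

As Chaoyu notes elsewhere in the thread, a shared SessionState or HiveConf is the same class of hazard: any mutable per-call state aliased across the pool threads can corrupt.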
[jira] [Updated] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10754: Description: {noformat} Create table tbl1 (key string, value string) stored as rcfile; Create table tbl2 (key string, value string); insert into tbl1 values( '1', '111'); insert into tbl2 values('1', '2'); {noformat} Pig script: {noformat} src_tbl1 = FILTER tbl1 BY (key == '1'); prj_tbl1 = FOREACH src_tbl1 GENERATE key as tbl1_key, value as tbl1_value, '333' as tbl1_v1; src_tbl2 = FILTER tbl2 BY (key == '1'); prj_tbl2 = FOREACH src_tbl2 GENERATE key as tbl2_key, value as tbl2_value; dump prj_tbl1; dump prj_tbl2; result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key); prj_result = FOREACH result GENERATE prj_tbl1::tbl1_key AS key1, prj_tbl1::tbl1_value AS value1, prj_tbl1::tbl1_v1 AS v1, prj_tbl2::tbl2_key AS key2, prj_tbl2::tbl2_value AS value2; dump prj_result; {noformat} The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2). We need to clone the job instance in HCatLoader. 
Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader Key: HIVE-10754 URL: https://issues.apache.org/jira/browse/HIVE-10754 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10754.patch {noformat} Create table tbl1 (key string, value string) stored as rcfile; Create table tbl2 (key string, value string); insert into tbl1 values( '1', '111'); insert into tbl2 values('1', '2'); {noformat} Pig script: {noformat} src_tbl1 = FILTER tbl1 BY (key == '1'); prj_tbl1 = FOREACH src_tbl1 GENERATE key as tbl1_key, value as tbl1_value, '333' as tbl1_v1; src_tbl2 = FILTER tbl2 BY (key == '1'); prj_tbl2 = FOREACH src_tbl2 GENERATE key as tbl2_key, value as tbl2_value; dump prj_tbl1; dump prj_tbl2; result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key); prj_result = FOREACH result GENERATE prj_tbl1::tbl1_key AS key1, prj_tbl1::tbl1_value AS value1, prj_tbl1::tbl1_v1 AS v1, prj_tbl2::tbl2_key AS key2, prj_tbl2::tbl2_value AS value2; dump prj_result; {noformat} The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2). We need to clone the job instance in HCatLoader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10754) Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader
[ https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573060#comment-14573060 ] Aihua Xu commented on HIVE-10754: - [~mithun] Can you help review the change? Thanks. Pig+Hcatalog doesn't work properly since we need to clone the Job instance in HCatLoader Key: HIVE-10754 URL: https://issues.apache.org/jira/browse/HIVE-10754 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10754.patch
[jira] [Commented] (HIVE-10921) Change trunk pom version to reflect the branch-1 split
[ https://issues.apache.org/jira/browse/HIVE-10921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572961#comment-14572961 ] Alan Gates commented on HIVE-10921: --- +1 Change trunk pom version to reflect the branch-1 split -- Key: HIVE-10921 URL: https://issues.apache.org/jira/browse/HIVE-10921 Project: Hive Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Fix For: 2.0.0 Attachments: HIVE-10921.patch
[jira] [Commented] (HIVE-10880) The bucket number is not respected in insert overwrite.
[ https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572804#comment-14572804 ] Yongzhi Chen commented on HIVE-10880: - The implementation of the method private static String replaceTaskId(String taskId, int bucketNum) does not look right. Since this code has been in the source for a while, I am not very confident about that. The attached patch 3 fixes that issue too. If the tests pass, we should use patch 3; otherwise keep patch 2. The bucket number is not respected in insert overwrite. --- Key: HIVE-10880 URL: https://issues.apache.org/jira/browse/HIVE-10880 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Priority: Blocker Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, HIVE-10880.3.patch When hive.enforce.bucketing is true, the bucket number defined in the table is no longer respected in current master and 1.2. This is a regression. Reproduce:
{noformat}
CREATE TABLE IF NOT EXISTS buckettestinput( data string ) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
CREATE TABLE IF NOT EXISTS buckettestoutput1( data string ) CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
CREATE TABLE IF NOT EXISTS buckettestoutput2( data string ) CLUSTERED BY(data) INTO 2 BUCKETS ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
{noformat}
Then I inserted the following data into the buckettestinput table:
{noformat}
firstinsert1
firstinsert2
firstinsert3
firstinsert4
firstinsert5
firstinsert6
firstinsert7
firstinsert8
secondinsert1
secondinsert2
secondinsert3
secondinsert4
secondinsert5
secondinsert6
secondinsert7
secondinsert8
{noformat}
{noformat}
set hive.enforce.bucketing = true;
set hive.enforce.sorting=true;
insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%';
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
select * from buckettestoutput1 a join buckettestoutput2 b on (a.data=b.data);

Error: Error while compiling statement: FAILED: SemanticException [Error 10141]: Bucketed table metadata is not correct. Fix the metadata or don't use bucketed mapjoin, by setting hive.enforce.bucketmapjoin to false. The number of buckets for table buckettestoutput1 is 2, whereas the number of files is 1 (state=42000,code=10141)
{noformat}
The debug information related to the insert overwrite:
{noformat}
0: jdbc:hive2://localhost:1 insert overwrite table buckettestoutput1 select * from buckettestinput where data like 'first%';
INFO : Number of reduce tasks determined at compile time: 2
INFO : In order to change the average load for a reducer (in bytes):
INFO : set hive.exec.reducers.bytes.per.reducer=number
INFO : In order to limit the maximum number of reducers:
INFO : set hive.exec.reducers.max=number
INFO : In order to set a constant number of reducers:
INFO : set mapred.reduce.tasks=number
INFO : Job running in-process (local Hadoop)
INFO : 2015-06-01 11:09:29,650 Stage-1 map = 86%, reduce = 100%
INFO : Ended Job = job_local107155352_0001
INFO : Loading data to table default.buckettestoutput1 from file:/user/hive/warehouse/buckettestoutput1/.hive-staging_hive_2015-06-01_11-09-28_166_3109203968904090801-1/-ext-1
INFO : Table default.buckettestoutput1 stats: [numFiles=1, numRows=4, totalSize=52, rawDataSize=48]
No rows affected (1.692 seconds)
{noformat}
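For readers following along, the contract that replaceTaskId is expected to satisfy is splicing the bucket number into a task-id-style file name while preserving the zero padding, so that bucket 2 maps a name like 000001_0 to 000002_0 and the bucketed-table file count lines up with the declared bucket count. A hedged Java sketch of that inferred contract (an illustration only, not the Hive implementation being questioned above):

```java
public class TaskIdUtil {
    // Replace the leading numeric task-id portion (e.g. "000001" in
    // "000001_0") with bucketNum, keeping the original zero padding and
    // any "_0"-style attempt suffix.
    static String replaceTaskId(String taskId, int bucketNum) {
        int end = 0;
        while (end < taskId.length() && Character.isDigit(taskId.charAt(end))) {
            end++;
        }
        String digits = taskId.substring(0, end);   // "000001"
        String suffix = taskId.substring(end);      // "_0" (possibly empty)
        String bucket = String.valueOf(bucketNum);
        if (bucket.length() < digits.length()) {
            bucket = "0".repeat(digits.length() - bucket.length()) + bucket;
        }
        return bucket + suffix;
    }

    public static void main(String[] args) {
        System.out.println(replaceTaskId("000001_0", 2)); // 000002_0
    }
}
```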
[jira] [Updated] (HIVE-10880) The bucket number is not respected in insert overwrite.
[ https://issues.apache.org/jira/browse/HIVE-10880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-10880: Attachment: HIVE-10880.3.patch The bucket number is not respected in insert overwrite. --- Key: HIVE-10880 URL: https://issues.apache.org/jira/browse/HIVE-10880 Project: Hive Issue Type: Bug Affects Versions: 1.2.0 Reporter: Yongzhi Chen Assignee: Yongzhi Chen Priority: Blocker Attachments: HIVE-10880.1.patch, HIVE-10880.2.patch, HIVE-10880.3.patch
[jira] [Commented] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak
[ https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572905#comment-14572905 ] Hive QA commented on HIVE-10925: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12737489/HIVE-10925.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8998 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4173/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4173/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4173/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12737489 - PreCommit-HIVE-TRUNK-Build Non-static threadlocals in metastore code can potentially cause memory leak --- Key: HIVE-10925 URL: https://issues.apache.org/jira/browse/HIVE-10925 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-10925.1.patch There are many places where non-static threadlocals are used. I can't see a good reason for using them. However, they can potentially result in leaked objects if, for example, they are created in a long-running thread every time the thread handles a new session.
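The leak pattern is easy to demonstrate in isolation: a non-static ThreadLocal field means every new owning instance registers a fresh entry in each thread's thread-local map, so a long-lived pool thread that creates one instance per session accumulates one entry per session until the instances are collected. A minimal sketch with hypothetical classes (not the metastore code itself):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ThreadLocalDemo {
    static final AtomicInteger SLOTS = new AtomicInteger();

    // Anti-pattern: one ThreadLocal *per handler instance*. A pool thread
    // that builds a new handler per session populates a new thread-local
    // slot every time, even though it is always the same thread.
    static class PerSessionHandler {
        final ThreadLocal<Object> conf = ThreadLocal.withInitial(() -> {
            SLOTS.incrementAndGet();   // count how many slots get populated
            return new Object();
        });
        void handle() { conf.get(); }
    }

    // Simulate one long-lived thread handling N sessions, one new handler
    // instance each. A static ThreadLocal would populate exactly 1 slot.
    static int slotsAfterSessions(int sessions) {
        SLOTS.set(0);
        for (int i = 0; i < sessions; i++) {
            new PerSessionHandler().handle();
        }
        return SLOTS.get();
    }

    public static void main(String[] args) {
        System.out.println(slotsAfterSessions(100)); // one slot per session
    }
}
```

Making the field static collapses this to a single slot per thread, which is what the patch is after.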
[jira] [Updated] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728
[ https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10932: Attachment: HIVE-10932.patch Unit test udf_nondeterministic failure due to HIVE-10728 Key: HIVE-10932 URL: https://issues.apache.org/jira/browse/HIVE-10932 Project: Hive Issue Type: Bug Components: Tests Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10932.patch The test udf_nondeterministic.q failed due to the change in HIVE-10728, in which unix_timestamp() is now marked as deterministic.
[jira] [Commented] (HIVE-10932) Unit test udf_nondeterministic failure due to HIVE-10728
[ https://issues.apache.org/jira/browse/HIVE-10932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572882#comment-14572882 ] Aihua Xu commented on HIVE-10932: - [~ashutoshc] Can you help review the test code? Unit test udf_nondeterministic failure due to HIVE-10728 Key: HIVE-10932 URL: https://issues.apache.org/jira/browse/HIVE-10932 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10932.patch
[jira] [Commented] (HIVE-10922) In HS2 doAs=false mode, file system related errors in one query causes other failures
[ https://issues.apache.org/jira/browse/HIVE-10922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572287#comment-14572287 ] Hive QA commented on HIVE-10922: {color:red}Overall{color}: -1 at least one test failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12737416/HIVE-10922.1.patch {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 8992 tests executed *Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_autogen_colalias
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_histogram_numeric
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_nondeterministic
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx_cbo_2
{noformat}
Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4167/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/4167/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-4167/ Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}
This message is automatically generated. ATTACHMENT ID: 12737416 - PreCommit-HIVE-TRUNK-Build In HS2 doAs=false mode, file system related errors in one query causes other failures - Key: HIVE-10922 URL: https://issues.apache.org/jira/browse/HIVE-10922 Project: Hive Issue Type: Bug Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-10922.1.patch The Warehouse class has a few methods that close the file system object on errors. With doAs=false, since all queries use the same HS2 ugi, the filesystem object is shared across queries/threads. When close() is called on one filesystem object, the filesystem objects used in other threads are also closed, and any files registered for deletion on exit are deleted. There is also no close being done in the happy code path.
[jira] [Commented] (HIVE-10922) In HS2 doAs=false mode, file system related errors in one query causes other failures
[ https://issues.apache.org/jira/browse/HIVE-10922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572327#comment-14572327 ] Gunther Hagleitner commented on HIVE-10922: --- Test failures are unrelated. In HS2 doAs=false mode, file system related errors in one query causes other failures - Key: HIVE-10922 URL: https://issues.apache.org/jira/browse/HIVE-10922 Project: Hive Issue Type: Bug Affects Versions: 1.0.0, 1.2.0, 1.1.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-10922.1.patch
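The hazard in HIVE-10922 generalizes to any per-identity cache of closeable handles: when every thread resolves to the same cache key, closing "your" handle closes everyone's. A hedged Java sketch (the `CachedFs` class is a hypothetical stand-in; Hadoop's real FileSystem cache keyed by UGI behaves analogously when doAs=false maps all queries to one HS2 user):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a filesystem cache keyed by user identity.
class CachedFs {
    private static final Map<String, CachedFs> CACHE = new HashMap<>();
    private boolean closed;

    static synchronized CachedFs get(String user) {
        // Same user => same shared instance, like FileSystem.get per UGI.
        return CACHE.computeIfAbsent(user, u -> new CachedFs());
    }
    void close() { closed = true; }
    String read(String path) {
        if (closed) throw new IllegalStateException("Filesystem closed");
        return "contents of " + path;
    }
    static synchronized void reset() { CACHE.clear(); }
}

public class SharedFsDemo {
    public static void main(String[] args) {
        // doAs=false: every query runs as the same HS2 user.
        CachedFs query1 = CachedFs.get("hive");
        CachedFs query2 = CachedFs.get("hive");
        query1.close();                  // error handling in query 1...
        try {
            query2.read("/warehouse/t"); // ...breaks an unrelated query 2
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The fix direction described in the issue is simply to stop closing the shared cached handle on a per-query error path.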
[jira] [Commented] (HIVE-10904) Use beeline-log4j.properties for migrated CLI [beeline-cli Branch]
[ https://issues.apache.org/jira/browse/HIVE-10904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572241#comment-14572241 ] Chinna Rao Lalam commented on HIVE-10904: - Thanks [~leftylev], I have linked this issue to HIVE-10810. Use beeline-log4j.properties for migrated CLI [beeline-cli Branch] -- Key: HIVE-10904 URL: https://issues.apache.org/jira/browse/HIVE-10904 Project: Hive Issue Type: Sub-task Affects Versions: beeline-cli-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10904.patch The updated CLI prints logs on the console. Use beeline-log4j.properties to redirect them to a file.
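As background, a log4j 1.x properties file that routes logging away from the console and into a file typically looks like the sketch below (illustrative names and paths; the actual beeline-log4j.properties shipped with the patch may differ):

```properties
# Send everything at WARN and above to a file appender instead of the console.
log4j.rootLogger=WARN, file

log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=${java.io.tmpdir}/beeline.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ISO8601} %-5p %c{2}: %m%n
```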
[jira] [Commented] (HIVE-10555) Improve windowing spec of range based windowing to support additional range formats
[ https://issues.apache.org/jira/browse/HIVE-10555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572250#comment-14572250 ] Lefty Leverenz commented on HIVE-10555: --- Doc note: Subtasks that need documentation have been marked with TODOC1.3 labels. Improve windowing spec of range based windowing to support additional range formats --- Key: HIVE-10555 URL: https://issues.apache.org/jira/browse/HIVE-10555 Project: Hive Issue Type: Improvement Components: PTF-Windowing Affects Versions: 1.3.0 Reporter: Aihua Xu Assignee: Aihua Xu Fix For: 1.3.0 Currently the windowing functions only support the formats {{x preceding and current}}, {{x preceding and y following}}, and {{current and y following}}. Windowing of {{x preceding and y preceding}} and {{x following and y following}} doesn't work properly. The following functions should be supported: first_value(), last_value(), sum(), avg(), count(), min(), max()
[jira] [Updated] (HIVE-10928) Concurrent Beeline Connections can not work on different databases
[ https://issues.apache.org/jira/browse/HIVE-10928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chirag aggarwal updated HIVE-10928: --- Summary: Concurrent Beeline Connections can not work on different databases (was: Concurrent Beeline Connections can not work different databases) Concurrent Beeline Connections can not work on different databases -- Key: HIVE-10928 URL: https://issues.apache.org/jira/browse/HIVE-10928 Project: Hive Issue Type: Bug Components: Beeline Affects Versions: 0.14.0 Reporter: chirag aggarwal Concurrent beeline connections are not able to work on different databases. If one connection executes 'use abc', then all connections start working on database 'abc'.
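That symptom is consistent with per-connection session state (the current database) being held somewhere shared rather than per session. A minimal Java illustration of the two shapes (hypothetical classes, not the actual HS2/Beeline code):

```java
public class SessionDbDemo {
    // Buggy shape: current-database state held in one place shared by every
    // connection, so one connection's USE statement leaks to all of them.
    static class SharedState {
        static String currentDb = "default";
    }

    // Correct shape: each connection owns its own session state.
    static class Session {
        private String currentDb = "default";
        void use(String db) { currentDb = db; }
        String currentDb() { return currentDb; }
    }

    public static void main(String[] args) {
        // Shared static state: connection 1's "use abc" changes everyone.
        SharedState.currentDb = "abc";
        System.out.println("shared state now: " + SharedState.currentDb);

        // Per-session state: connection 2 stays on its own database.
        Session conn1 = new Session();
        Session conn2 = new Session();
        conn1.use("abc");
        System.out.println("conn2 still on: " + conn2.currentDb());
    }
}
```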
[jira] [Updated] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak
[ https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-10925: Affects Version/s: (was: 0.12.0) (was: 0.11.0) Non-static threadlocals in metastore code can potentially cause memory leak --- Key: HIVE-10925 URL: https://issues.apache.org/jira/browse/HIVE-10925 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-10925.1.patch
[jira] [Commented] (HIVE-10904) Use beeline-log4j.properties for migrated CLI [beeline-cli Branch]
[ https://issues.apache.org/jira/browse/HIVE-10904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572231#comment-14572231 ] Lefty Leverenz commented on HIVE-10904: --- Should this be documented? If so, please link it to HIVE-10810 (Document Beeline/CLI changes). Use beeline-log4j.properties for migrated CLI [beeline-cli Branch] -- Key: HIVE-10904 URL: https://issues.apache.org/jira/browse/HIVE-10904 Project: Hive Issue Type: Sub-task Affects Versions: beeline-cli-branch Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-10904.patch
[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements
[ https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572263#comment-14572263 ] Laljo John Pullokkaran commented on HIVE-10841: --- [~apivovarov] I see that predicate is being pushed down with the patch. See the attached explain below: hive explain select acct.ACC_N, acct.brn FROM L JOIN LA ON L.id = LA.loan_id JOIN FR ON L.id = FR.loan_id JOIN A ON LA.aid = A.id JOIN PI ON PI.id = LA.pi_id JOIN acct ON A.id = acct.aid and acct.brn is not null WHERE L.id = 4436; OK STAGE DEPENDENCIES: Stage-12 is a root stage Stage-9 depends on stages: Stage-12 Stage-0 depends on stages: Stage-9 STAGE PLANS: Stage: Stage-12 Map Reduce Local Work Alias - Map Local Tables: a Fetch Operator limit: -1 acct Fetch Operator limit: -1 fr Fetch Operator limit: -1 l Fetch Operator limit: -1 pi Fetch Operator limit: -1 Alias - Map Local Operator Tree: a TableScan alias: a filterExpr: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) acct TableScan alias: acct filterExpr: (brn is not null and aid is not null) (type: boolean) Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (brn is not null and aid is not null) (type: boolean) Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col5 (type: int) 1 id (type: int) 2 aid (type: int) fr TableScan alias: fr filterExpr: (loan_id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (loan_id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink 
Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) l TableScan alias: l filterExpr: (id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (id = 4436) (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) pi TableScan alias: pi filterExpr: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: id is not null (type: boolean) Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col6 (type: int) 1 id (type: int) Stage: Stage-9 Map Reduce Map Operator Tree: TableScan alias: la filterExpr: (((loan_id is not null and aid is not null) and pi_id is not null) and (loan_id = 4436)) (type: boolean) Statistics: Num rows: 1 Data size: 14 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: (((loan_id is not null and aid is not null) and pi_id is not null) and (loan_id = 4436)) (type: boolean) Statistics: Num rows: 1 Data size: 14 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: Inner Join 0 to 1 Inner Join 0 to 2 keys: 0 4436 (type: int) 1 4436 (type: int) 2 4436 (type: int) outputColumnNames: _col5, _col6 Statistics: Num rows: 2 Data size: 8 Basic stats: COMPLETE Column stats: NONE Filter Operator predicate: _col5 is not null (type: boolean)
[jira] [Updated] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak
[ https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-10925: Attachment: HIVE-10925.1.patch cc [~thejas] [~sushanth] Non-static threadlocals in metastore code can potentially cause memory leak --- Key: HIVE-10925 URL: https://issues.apache.org/jira/browse/HIVE-10925 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0, 0.12.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13 Reporter: Vaibhav Gumashta Assignee: Vaibhav Gumashta Attachments: HIVE-10925.1.patch
[jira] [Updated] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing
[ https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-10826: -- Labels: TODOC1.3 (was: ) Support min()/max() functions over x preceding and y preceding windowing - Key: HIVE-10826 URL: https://issues.apache.org/jira/browse/HIVE-10826 Project: Hive Issue Type: Sub-task Components: PTF-Windowing Reporter: Aihua Xu Assignee: Aihua Xu Labels: TODOC1.3 Fix For: 1.3.0 Attachments: HIVE-10826.patch Currently the query
{noformat}
select key, value, min(value) over (partition by key order by value rows between 1 preceding and 1 preceding) from small;
{noformat}
doesn't work. It failed with
{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:2},value:{_col0:500}}
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:2},value:{_col0:500}}
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
	... 3 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: cannot generate all output rows for a Partition
	at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
	at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
	at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
{noformat}
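The semantics the patch has to support can be checked against a straightforward reference implementation: for row i, `rows between p preceding and q preceding` is the frame of rows [i-p, i-q], which excludes the current row and is empty (yielding NULL) near the start of the partition. A hedged Java sketch of that frame arithmetic (illustration only, not Hive's windowing code):

```java
import java.util.ArrayList;
import java.util.List;

public class FrameMinDemo {
    // min(value) OVER (ROWS BETWEEN p PRECEDING AND q PRECEDING), computed
    // per row of one partition. Emits null where the frame is empty (e.g.
    // the first row when q >= 1).
    static List<Integer> minOverPrecedingFrame(List<Integer> rows, int p, int q) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            int start = Math.max(0, i - p);
            int end = i - q;              // inclusive; < start means empty frame
            Integer min = null;
            for (int j = start; j <= end; j++) {
                if (min == null || rows.get(j) < min) min = rows.get(j);
            }
            out.add(min);
        }
        return out;
    }

    public static void main(String[] args) {
        // Frame [i-2, i-1]: the first row has no frame at all.
        System.out.println(minOverPrecedingFrame(List.of(5, 3, 8, 1), 2, 1));
    }
}
```

Tracing a 4-row partition [5, 3, 8, 1] with the frame 2 preceding to 1 preceding gives [null, 5, 3, 3], which is the behavior a fixed WindowingTableFunction should reproduce.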
[jira] [Commented] (HIVE-10834) Support First_value()/last_value() over x preceding and y preceding windowing
[ https://issues.apache.org/jira/browse/HIVE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14572246#comment-14572246 ] Lefty Leverenz commented on HIVE-10834: --- Doc note: This needs to be documented in the wiki for the 1.3.0 release. * [Windowing and Analytics -- WINDOW clause | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-WINDOWclause] Support First_value()/last_value() over x preceding and y preceding windowing - Key: HIVE-10834 URL: https://issues.apache.org/jira/browse/HIVE-10834 Project: Hive Issue Type: Sub-task Components: PTF-Windowing Reporter: Aihua Xu Assignee: Aihua Xu Labels: TODOC1.3 Fix For: 1.3.0 Attachments: HIVE-10834.patch Currently the following query
{noformat}
select ts, f, first_value(f) over (partition by ts order by t rows between 2 preceding and 1 preceding) from over10k limit 100;
{noformat}
throws exception:
{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:2013-03-01 09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:2013-03-01 09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
	... 3 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: cannot generate all output rows for a Partition
	at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:519)
	at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
	at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
{noformat}
[jira] [Updated] (HIVE-10834) Support First_value()/last_value() over x preceding and y preceding windowing
[ https://issues.apache.org/jira/browse/HIVE-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lefty Leverenz updated HIVE-10834:
----------------------------------
    Labels: TODOC1.3  (was: )

Support First_value()/last_value() over x preceding and y preceding windowing
-----------------------------------------------------------------------------
                Key: HIVE-10834
                URL: https://issues.apache.org/jira/browse/HIVE-10834
            Project: Hive
         Issue Type: Sub-task
         Components: PTF-Windowing
           Reporter: Aihua Xu
           Assignee: Aihua Xu
             Labels: TODOC1.3
            Fix For: 1.3.0
        Attachments: HIVE-10834.patch

Currently the following query
{noformat}
select ts, f, first_value(f) over (partition by ts order by t rows between 2 preceding and 1 preceding)
from over10k limit 100;
{noformat}
throws an exception:
{noformat}
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:2013-03-01 09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {key:{reducesinkkey0:2013-03-01 09:11:58.703071,reducesinkkey1:-3},value:{_col3:0.83}}
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
	... 3 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: cannot generate all output rows for a Partition
	at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:519)
	at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
	at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
	at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
	at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
{noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
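The frame semantics the patch enables can be checked outside Hive. The sketch below is plain Python with invented sample values (not data from the issue): for first_value() over ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING, the frame for row i is rows [i-2, i-1], so the first row has an empty frame and yields NULL.

```python
# Minimal sketch of the "2 preceding and 1 preceding" frame, outside Hive.
# Sample values are invented for illustration.

def first_value_2p_1p(values):
    """For each row i, return the first value in the frame [i-2, i-1]
    (clipped to the partition), or None when the frame is empty."""
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - 2), i - 1   # inclusive frame bounds
        out.append(values[lo] if hi >= lo else None)
    return out

print(first_value_2p_1p([0.83, 0.24, 0.57, 0.61]))
# [None, 0.83, 0.83, 0.24]
```

Row 0 has no preceding rows, so the frame is empty; rows further in pick the value two back when it exists, otherwise the partition start.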
[jira] [Commented] (HIVE-10885) with vectorization enabled join operation involving interval_day_time fails
[ https://issues.apache.org/jira/browse/HIVE-10885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572292#comment-14572292 ]

Lefty Leverenz commented on HIVE-10885:
---------------------------------------

Note: The commits have the wrong JIRA number -- they say HIVE-10855 instead of HIVE-10885.
* Commit to master: 09100831adff7589ee48e735a4beac6ebb25cb3e
* Commit to branch-1.2: f3ab5fda6af57afff31c29ad048d906fd095d5fb

with vectorization enabled join operation involving interval_day_time fails
---------------------------------------------------------------------------
                Key: HIVE-10885
                URL: https://issues.apache.org/jira/browse/HIVE-10885
            Project: Hive
         Issue Type: Bug
   Affects Versions: 1.2.0
           Reporter: Jagruti Varia
           Assignee: Matt McCline
            Fix For: 1.2.1
        Attachments: HIVE-10885.01.patch, HIVE-10885.02.patch, HIVE-10885.03.patch

When vectorization is on, a join operation involving the interval_day_time type throws the following error:
{noformat}
Status: Failed
Vertex failed, vertexName=Map 2, vertexId=vertex_1432858236614_0247_1_01, diagnostics=[Task failed, taskId=task_1432858236614_0247_1_01_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Map operator initialization failed
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:229)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:147)
	... 14 more
Caused by: java.lang.RuntimeException: Cannot allocate vector copy row for interval_day_time
	at org.apache.hadoop.hive.ql.exec.vector.VectorCopyRow.init(VectorCopyRow.java:213)
	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.initializeOp(VectorMapJoinCommonOperator.java:581)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:362)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:481)
	at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:438)
	at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:375)
	at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:214)
	... 15 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171)
	at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
	at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
	at
[jira] [Commented] (HIVE-10841) [WHERE col is not null] does not work sometimes for queries with many JOIN statements
[ https://issues.apache.org/jira/browse/HIVE-10841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572310#comment-14572310 ]

Laljo John Pullokkaran commented on HIVE-10841:
-----------------------------------------------

Never mind, that's the wrong query. I think I can report the filter not getting into the mapper with the original query.

[WHERE col is not null] does not work sometimes for queries with many JOIN statements
-------------------------------------------------------------------------------------
                Key: HIVE-10841
                URL: https://issues.apache.org/jira/browse/HIVE-10841
            Project: Hive
         Issue Type: Bug
         Components: Query Planning, Query Processor
   Affects Versions: 0.13.0, 0.14.0, 0.13.1, 1.2.0
           Reporter: Alexander Pivovarov
           Assignee: Alexander Pivovarov
        Attachments: HIVE-10841.patch

The result from the following SELECT query is 3 rows but it should be 1 row. I checked it in MySQL - it returned 1 row.

To reproduce the issue in Hive:

1. Prepare tables
{code}
drop table if exists L;
drop table if exists LA;
drop table if exists FR;
drop table if exists A;
drop table if exists PI;
drop table if exists acct;

create table L as select 4436 id;
create table LA as select 4436 loan_id, 4748 aid, 4415 pi_id;
create table FR as select 4436 loan_id;
create table A as select 4748 id;
create table PI as select 4415 id;
create table acct as select 4748 aid, 10 acc_n, 122 brn;
insert into table acct values(4748, null, null);
insert into table acct values(4748, null, null);
{code}
2. Run the SELECT query
{code}
select acct.ACC_N, acct.brn
FROM L
JOIN LA ON L.id = LA.loan_id
JOIN FR ON L.id = FR.loan_id
JOIN A ON LA.aid = A.id
JOIN PI ON PI.id = LA.pi_id
JOIN acct ON A.id = acct.aid
WHERE L.id = 4436 and acct.brn is not null;
{code}
the result is 3 rows
{code}
10	122
NULL	NULL
NULL	NULL
{code}
but it should be 1 row
{code}
10	122
{code}
2.1 explain select ... output for hive-1.3.0 MR
{code}
STAGE DEPENDENCIES:
  Stage-12 is a root stage
  Stage-9 depends on stages: Stage-12
  Stage-0 depends on stages: Stage-9

STAGE PLANS:
  Stage: Stage-12
    Map Reduce Local Work
      Alias -> Map Local Tables:
        a
          Fetch Operator
            limit: -1
        acct
          Fetch Operator
            limit: -1
        fr
          Fetch Operator
            limit: -1
        l
          Fetch Operator
            limit: -1
        pi
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        a
          TableScan
            alias: a
            Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: id is not null (type: boolean)
              Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                keys:
                  0 _col5 (type: int)
                  1 id (type: int)
                  2 aid (type: int)
        acct
          TableScan
            alias: acct
            Statistics: Num rows: 3 Data size: 31 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: aid is not null (type: boolean)
              Statistics: Num rows: 2 Data size: 20 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                keys:
                  0 _col5 (type: int)
                  1 id (type: int)
                  2 aid (type: int)
        fr
          TableScan
            alias: fr
            Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (loan_id = 4436) (type: boolean)
              Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                keys:
                  0 4436 (type: int)
                  1 4436 (type: int)
                  2 4436 (type: int)
        l
          TableScan
            alias: l
            Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (id = 4436) (type: boolean)
              Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
              HashTable Sink Operator
                keys:
                  0 4436 (type: int)
                  1 4436 (type: int)
                  2 4436 (type: int)
        pi
          TableScan
            alias: pi
            Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: id is not null (type: boolean)
              Statistics: Num rows: 1 Data
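Setting Hive aside, the expected answer can be verified by replaying the joins and the brn IS NOT NULL predicate over the same literal rows in plain Python; exactly one acct row should survive.

```python
# Sketch reproducing the expected result of the HIVE-10841 query outside
# Hive, using the literal rows from the repro steps above.

L    = [4436]                         # id
LA   = [(4436, 4748, 4415)]           # (loan_id, aid, pi_id)
FR   = [4436]                         # loan_id
A    = [4748]                         # id
PI   = [4415]                         # id
acct = [(4748, 10, 122),              # (aid, acc_n, brn)
        (4748, None, None),
        (4748, None, None)]

result = [(acc_n, brn)
          for l in L
          for (loan_id, aid, pi_id) in LA if l == loan_id
          for fr in FR if l == fr
          for a in A if aid == a
          for pi in PI if pi == pi_id
          for (a_aid, acc_n, brn) in acct if a == a_aid
          if l == 4436 and brn is not None]

print(result)
# [(10, 122)] -- one row, matching MySQL
```

The two acct rows with NULL brn are filtered out by the IS NOT NULL predicate, which is exactly the filtering the reported plan fails to apply.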
[jira] [Commented] (HIVE-9664) Hive add jar command should be able to download and add jars from a repository
[ https://issues.apache.org/jira/browse/HIVE-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572528#comment-14572528 ]

Anant Nag commented on HIVE-9664:
---------------------------------

Yes, this can be done. I'll update the wiki to make it clearer.

Hive add jar command should be able to download and add jars from a repository
------------------------------------------------------------------------------
                Key: HIVE-9664
                URL: https://issues.apache.org/jira/browse/HIVE-9664
            Project: Hive
         Issue Type: Improvement
   Affects Versions: 0.14.0
           Reporter: Anant Nag
           Assignee: Anant Nag
             Labels: TODOC1.2, hive, patch
            Fix For: 1.2.0
        Attachments: HIVE-9664.4.patch, HIVE-9664.5.patch, HIVE-9664.patch, HIVE-9664.patch, HIVE-9664.patch

Currently, Hive's add jar command takes a local path to the dependency jar. This clutters the local file system, as users may forget to remove the jar later.

It would be nice if Hive supported a Gradle-like notation to download the jar from a repository.

Example: add jar org:module:version

It should also be backward compatible and should take a jar from the local file system as well.

RB: https://reviews.apache.org/r/31628/
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
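A rough sketch of the dispatch the request describes: if the argument parses as org:module:version, treat it as a repository coordinate; otherwise fall back to the local-path behavior. The function name and the exact check are illustrative assumptions, not Hive's actual implementation.

```python
# Hypothetical dispatch for "add jar <arg>": coordinate vs. local path.
# This is an illustration of the proposed notation, not Hive code.
import os.path

def classify_add_jar_arg(arg):
    parts = arg.split(":")
    # Three non-empty colon-separated parts and no path separator
    # -> assume a repository coordinate (backward compatible otherwise).
    if len(parts) == 3 and all(parts) and os.path.sep not in arg:
        org, module, version = parts
        return ("coordinate", org, module, version)
    return ("local_path", arg)

print(classify_add_jar_arg("com.example:my-udfs:1.0"))
# ('coordinate', 'com.example', 'my-udfs', '1.0')
print(classify_add_jar_arg("/tmp/my-udfs.jar"))
# ('local_path', '/tmp/my-udfs.jar')
```

A plain path still takes the existing local-file branch, which matches the backward-compatibility requirement in the issue.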
[jira] [Commented] (HIVE-10925) Non-static threadlocals in metastore code can potentially cause memory leak
[ https://issues.apache.org/jira/browse/HIVE-10925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573437#comment-14573437 ]

Alan Gates commented on HIVE-10925:
-----------------------------------

The changes making the transaction handler thread local static look good.

Non-static threadlocals in metastore code can potentially cause memory leak
---------------------------------------------------------------------------
                Key: HIVE-10925
                URL: https://issues.apache.org/jira/browse/HIVE-10925
            Project: Hive
         Issue Type: Bug
         Components: Metastore
   Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 0.13
           Reporter: Vaibhav Gumashta
           Assignee: Vaibhav Gumashta
        Attachments: HIVE-10925.1.patch

There are many places where non-static threadlocals are used. I can't seem to find a good reason for using them. However, they can potentially result in leaked objects if, for example, they are created in a long-running thread every time the thread handles a new session.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
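The leak pattern is easy to model. In the toy sketch below (a simplified Python stand-in for java.lang.ThreadLocal, not Hive or JDK code), a long-lived thread that handles many sessions accumulates one map entry per handler instance when the thread local is per-instance; a single static (class-level) ThreadLocal would reuse one key.

```python
# Toy model of why a non-static ThreadLocal field can accumulate entries:
# every new handler instance is a fresh ThreadLocal key, and a long-lived
# thread keeps one map entry per key. Names here are illustrative only.
import threading

class ToyThreadLocal:
    """Stand-in for java.lang.ThreadLocal: values live in a per-thread
    map keyed by the ThreadLocal instance itself."""
    _maps = {}  # thread name -> {ToyThreadLocal instance: value}

    def set(self, value):
        per_thread = self._maps.setdefault(threading.current_thread().name, {})
        per_thread[self] = value

class Handler:
    def __init__(self):
        self.conn = ToyThreadLocal()   # non-static: a new key per handler

def serve_sessions(n):
    # One long-lived worker thread handling n sessions, each with a
    # fresh Handler, as described in the issue.
    for _ in range(n):
        Handler().conn.set("connection")
    return len(ToyThreadLocal._maps[threading.current_thread().name])

print(serve_sessions(100))
# 100 entries linger; a static ThreadLocal would leave just one
```

With a class-level (static) ToyThreadLocal shared by all handlers, the same loop would touch a single map key, which mirrors the fix the comment approves.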
[jira] [Updated] (HIVE-10910) Alter table drop partition queries in encrypted zone failing to remove data from HDFS
[ https://issues.apache.org/jira/browse/HIVE-10910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-10910:
----------------------------------
    Issue Type: Sub-task  (was: Bug)
        Parent: HIVE-8065

Alter table drop partition queries in encrypted zone failing to remove data from HDFS
-------------------------------------------------------------------------------------
                Key: HIVE-10910
                URL: https://issues.apache.org/jira/browse/HIVE-10910
            Project: Hive
         Issue Type: Sub-task
         Components: Hive
   Affects Versions: 1.2.0
           Reporter: Aswathy Chellammal Sreekumar
           Assignee: Eugene Koifman

An alter table query trying to drop a partition removes the partition's metadata but fails to remove the data from HDFS:

hive> create table table_1(name string, age int, gpa double) partitioned by (b string) stored as textfile;
OK
Time taken: 0.732 seconds
hive> alter table table_1 add partition (b='2010-10-10');
OK
Time taken: 0.496 seconds
hive> show partitions table_1;
OK
b=2010-10-10
Time taken: 0.781 seconds, Fetched: 1 row(s)
hive> alter table table_1 drop partition (b='2010-10-10');
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Got exception: java.io.IOException Failed to move to trash: hdfs://ip-address:8020/warehouse-dir/table_1/b=2010-10-10
hive> show partitions table_1;
OK
Time taken: 0.622 seconds
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)