[jira] [Commented] (HIVE-10744) LLAP: dags get stuck in yet another way
[ https://issues.apache.org/jira/browse/HIVE-10744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549863#comment-14549863 ] Prasanth Jayachandran commented on HIVE-10744: -- [~sseth] Can you take a look at the patch? The task scheduler is much simpler now; all book-keeping data structures have been removed. LLAP: dags get stuck in yet another way --- Key: HIVE-10744 URL: https://issues.apache.org/jira/browse/HIVE-10744 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Prasanth Jayachandran Attachments: HIVE-10744.patch The DAG gets stuck when a number of tasks that is a multiple of the number of containers on the machine (6, 12, ... in my case) fails to finish at the end of the stage (I am running a job with 500-1000 maps). The status just hangs forever (beyond the 5 min timeout) with some tasks shown as running. It happened twice on the 3rd DAG with a 1000-map job (TPCH Q1); then, when I reduced to 500 maps, it happened on the 7th DAG so far. [~sseth] has the details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10744) LLAP: dags get stuck in yet another way
[ https://issues.apache.org/jira/browse/HIVE-10744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10744: - Attachment: (was: HIVE-10744.patch) LLAP: dags get stuck in yet another way --- Key: HIVE-10744 URL: https://issues.apache.org/jira/browse/HIVE-10744 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Prasanth Jayachandran Attachments: HIVE-10744.patch DAG gets stuck when number of tasks that is multiple of number of containers on machine (6, 12, ... in my case) fails to finish at the end of the stage (I am running a job with 500-1000 maps). Status just hangs forever (beyond 5 min timeout) with some tasks shown as running. Happened twice on 3rd DAG with 1000-map job (TPCH Q1), then when I reduced to 500 happened on 7th DAG so far. [~sseth] has the details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10744) LLAP: dags get stuck in yet another way
[ https://issues.apache.org/jira/browse/HIVE-10744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10744: - Attachment: HIVE-10744.patch LLAP: dags get stuck in yet another way --- Key: HIVE-10744 URL: https://issues.apache.org/jira/browse/HIVE-10744 Project: Hive Issue Type: Sub-task Reporter: Sergey Shelukhin Assignee: Prasanth Jayachandran Attachments: HIVE-10744.patch DAG gets stuck when number of tasks that is multiple of number of containers on machine (6, 12, ... in my case) fails to finish at the end of the stage (I am running a job with 500-1000 maps). Status just hangs forever (beyond 5 min timeout) with some tasks shown as running. Happened twice on 3rd DAG with 1000-map job (TPCH Q1), then when I reduced to 500 happened on 7th DAG so far. [~sseth] has the details. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10732) Hive JDBC driver does not close operation for metadata queries
[ https://issues.apache.org/jira/browse/HIVE-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549958#comment-14549958 ] Hive QA commented on HIVE-10732: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12733702/HIVE-10732.patch {color:red}ERROR:{color} -1 due to 25 failed/errored test(s), 8945 tests executed *Failed tests:* {noformat} TestCustomAuthentication - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver_load_hdfs_file_with_space_in_the_name org.apache.hive.beeline.TestBeeLineWithArgs.testLastLineCmdInScriptFile org.apache.hive.jdbc.TestJdbcDriver2.testBuiltInUDFCol org.apache.hive.jdbc.TestJdbcDriver2.testCloseResultSet org.apache.hive.jdbc.TestJdbcDriver2.testDuplicateColumnNameOrder org.apache.hive.jdbc.TestJdbcDriver2.testExprCol org.apache.hive.jdbc.TestJdbcDriver2.testParentReferences org.apache.hive.jdbc.TestJdbcDriver2.testPostClose org.apache.hive.jdbc.TestJdbcDriver2.testPrepareStatement org.apache.hive.jdbc.TestJdbcDriver2.testSetCommand org.apache.hive.jdbc.TestJdbcDriver2.testShowGrant org.apache.hive.jdbc.TestJdbcDriver2.testShowRoleGrant org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testNonSparkQuery org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testSparkQuery org.apache.hive.jdbc.TestJdbcWithLocalClusterSpark.testTempTable org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConnection org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConnectionSchemaAPIs org.apache.hive.jdbc.TestJdbcWithMiniMr.testMrQuery org.apache.hive.jdbc.TestJdbcWithMiniMr.testNonMrQuery org.apache.hive.jdbc.TestJdbcWithMiniMr.testTempTable org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testNonSparkQuery 
org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery org.apache.hive.jdbc.miniHS2.TestMiniHS2.testConfInSession {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3940/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3940/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3940/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 25 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12733702 - PreCommit-HIVE-TRUNK-Build Hive JDBC driver does not close operation for metadata queries -- Key: HIVE-10732 URL: https://issues.apache.org/jira/browse/HIVE-10732 Project: Hive Issue Type: Bug Components: JDBC Reporter: Mala Chikka Kempanna Assignee: Chaoyu Tang Attachments: HIVE-10732.patch In the following file http://github.mtv.cloudera.com/CDH/hive/blob/cdh5-0.14.1/jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java Line 315 implements the ResultSet.close() method. Because a DatabaseMetaData operation doesn't have a statement, it doesn't close the operation. However, regardless of whether it has a statement or not, it should close the operation through the stmtHandle. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
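The fix the description calls for can be sketched as follows. This is a minimal illustration, not Hive's actual JDBC code: `OperationHandle`, `RpcClient`, and `QueryResultSet` are stand-ins for the real Thrift/JDBC types.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-in for the server-side operation handle (hypothetical, not Hive's type).
class OperationHandle {
    final AtomicBoolean closed = new AtomicBoolean(false);
}

// Stand-in for the Thrift client; closeOperation mimics the CloseOperation RPC.
class RpcClient {
    void closeOperation(OperationHandle h) {
        h.closed.set(true);
    }
}

class QueryResultSet {
    private final Object statement;           // null for DatabaseMetaData result sets
    private final OperationHandle stmtHandle;
    private final RpcClient client;
    private boolean isClosed = false;

    QueryResultSet(Object statement, OperationHandle stmtHandle, RpcClient client) {
        this.statement = statement;
        this.stmtHandle = stmtHandle;
        this.client = client;
    }

    // Before the fix, the operation was only closed when 'statement' was non-null,
    // so metadata queries leaked their server-side operation. The fix: close via
    // the operation handle regardless of whether a parent Statement exists.
    void close() {
        if (stmtHandle != null) {
            client.closeOperation(stmtHandle);
        }
        isClosed = true;
    }

    boolean isClosed() { return isClosed; }
}
```

Closing a metadata result set (constructed with a null statement) now still closes the underlying operation.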
[jira] [Updated] (HIVE-10256) Filter row groups based on the block statistics in Parquet
[ https://issues.apache.org/jira/browse/HIVE-10256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Chen updated HIVE-10256: - Attachment: HIVE-10256-parquet.2.patch Patch rebased Filter row groups based on the block statistics in Parquet -- Key: HIVE-10256 URL: https://issues.apache.org/jira/browse/HIVE-10256 Project: Hive Issue Type: Sub-task Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-10256-parquet.1.patch, HIVE-10256-parquet.2.patch, HIVE-10256-parquet.patch In Parquet PPD, the not matched row groups should be eliminated. See {{TestOrcSplitElimination}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10741) count distinct rewrite is not firing
[ https://issues.apache.org/jira/browse/HIVE-10741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550057#comment-14550057 ] Hive QA commented on HIVE-10741: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12733706/HIVE-10741.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8946 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_context_ngrams org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_parquet_types org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3941/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3941/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3941/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12733706 - PreCommit-HIVE-TRUNK-Build count distinct rewrite is not firing Key: HIVE-10741 URL: https://issues.apache.org/jira/browse/HIVE-10741 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 1.2.0 Reporter: Mostafa Mokhtar Assignee: Ashutosh Chauhan Attachments: HIVE-10741.1.patch, HIVE-10741.patch Rewrite introduced in HIVE-10568 is not effective outside of test environment -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10257) Ensure Parquet Hive has null optimization
[ https://issues.apache.org/jira/browse/HIVE-10257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dong Chen updated HIVE-10257: - Attachment: HIVE-10257-parquet.2.patch Sure, thanks! Patch updated. Since this renaming was fixed in HIVE-10256, I updated this patch based on the code there, so this patch has to be merged after that one. Sorry for the inconvenience. Ensure Parquet Hive has null optimization - Key: HIVE-10257 URL: https://issues.apache.org/jira/browse/HIVE-10257 Project: Hive Issue Type: Sub-task Components: Tests Reporter: Dong Chen Assignee: Dong Chen Attachments: HIVE-10257-parquet.1.patch, HIVE-10257-parquet.2.patch, HIVE-10257-parquet.patch In Parquet statistics, a boolean value {{hasNonNullValue}} is used for each column chunk. Hive could use this value to skip a column, avoid null-checking logic, and speed up vectorization as in HIVE-4478 (in the future; Parquet vectorization is not complete yet). In this JIRA we could check whether this null optimization works, and make changes if needed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-7193) Hive should support additional LDAP authentication parameters
[ https://issues.apache.org/jira/browse/HIVE-7193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550189#comment-14550189 ] Hive QA commented on HIVE-7193: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12733705/HIVE-7193.patch {color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 8946 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3942/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3942/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3942/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 1 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12733705 - PreCommit-HIVE-TRUNK-Build Hive should support additional LDAP authentication parameters - Key: HIVE-7193 URL: https://issues.apache.org/jira/browse/HIVE-7193 Project: Hive Issue Type: Bug Affects Versions: 0.10.0 Reporter: Mala Chikka Kempanna Assignee: Naveen Gangam Attachments: HIVE-7193.patch, LDAPAuthentication_Design_Doc.docx Currently hive has only following authenticator parameters for LDAP authentication for hiveserver2. 
<property> <name>hive.server2.authentication</name> <value>LDAP</value> </property> <property> <name>hive.server2.authentication.ldap.url</name> <value>ldap://our_ldap_address</value> </property> We need to include other LDAP properties as part of hive-LDAP authentication, like the below: a group search base - dc=domain,dc=com a group search filter - member={0} a user search base - dc=domain,dc=com a user search filter - sAMAccountName={0} a list of valid user groups - group1,group2,group3 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10550) Dynamic RDD caching optimization for HoS.[Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550208#comment-14550208 ] Hive QA commented on HIVE-10550: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12733759/HIVE-10550.3-spark.patch {color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 8721 tests executed *Failed tests:* {noformat} TestMinimrCliDriver-bucket6.q-scriptfile1_win.q-quotedid_smb.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-bucketizedhiveinputformat.q-empty_dir_in_table.q - did not produce a TEST-*.xml file TestMinimrCliDriver-groupby2.q-infer_bucket_sort_map_operators.q-load_hdfs_file_with_space_in_the_name.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-import_exported_table.q-truncate_column_buckets.q-bucket_num_reducers2.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-index_bitmap3.q-infer_bucket_sort_num_buckets.q-parallel_orderby.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-infer_bucket_sort_reducers_power_two.q-join1.q-infer_bucket_sort_bucketed_table.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-leftsemijoin_mr.q-bucket5.q-infer_bucket_sort_merge.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-list_bucket_dml_10.q-input16_cc.q-temp_table_external.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-ql_rewrite_gbtoidx.q-bucket_num_reducers.q-scriptfile1.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-ql_rewrite_gbtoidx_cbo_2.q-bucketmapjoin6.q-bucket4.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-reduce_deduplicate.q-infer_bucket_sort_dyn_part.q-udf_using.q-and-1-more - did not produce a TEST-*.xml file TestMinimrCliDriver-schemeAuthority2.q-uber_reduce.q-ql_rewrite_gbtoidx_cbo_1.q-and-1-more 
- did not produce a TEST-*.xml file TestMinimrCliDriver-stats_counter_partitioned.q-external_table_with_space_in_location_path.q-disable_merge_for_bucketing.q-and-1-more - did not produce a TEST-*.xml file {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/861/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/861/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-861/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 13 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12733759 - PreCommit-HIVE-SPARK-Build Dynamic RDD caching optimization for HoS.[Spark Branch] --- Key: HIVE-10550 URL: https://issues.apache.org/jira/browse/HIVE-10550 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Chengxiang Li Assignee: Chengxiang Li Attachments: HIVE-10550.1-spark.patch, HIVE-10550.1.patch, HIVE-10550.2-spark.patch, HIVE-10550.3-spark.patch A Hive query may scan the same table multiple times, e.g. with a self-join, a self-union, or a shared subquery; [TPC-DS Q39|https://github.com/hortonworks/hive-testbench/blob/hive14/sample-queries-tpcds/query39.sql] is an example. As you may know, Spark supports caching RDD data: Spark puts the computed RDD data in memory and reads it from memory directly the next time, which avoids recomputing that RDD (and all of its dependencies) at the cost of more memory usage. By analyzing the query context, we should be able to tell which parts of the query can be shared, so that we can reuse the cached RDD in the generated Spark job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
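The optimization above relies on Spark's `rdd.cache()`, but the core idea (compute a shared sub-result once, then serve repeated scans from memory) can be sketched with plain Java memoization. All names here are illustrative, not Hive or Spark APIs:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch: a cache keyed by scan identity, mirroring the effect of
// rdd.cache() — the scan cost is paid once and later branches reuse the result.
class SharedScanCache {
    private final Map<String, List<String>> cache = new ConcurrentHashMap<>();
    int computations = 0;  // counts actual (non-cached) scans, for illustration

    // Returns the cached result for 'scanId', computing and caching it on first use.
    List<String> scan(String scanId, Supplier<List<String>> compute) {
        return cache.computeIfAbsent(scanId, id -> {
            computations++;
            return compute.get();
        });
    }
}
```

In a self-join, both branches would call `scan("store_sales", ...)`; only the first call performs the scan, the second reuses the in-memory result, trading memory for recomputation exactly as the description outlines.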
[jira] [Commented] (HIVE-10458) Enable parallel order by for spark [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-10458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550283#comment-14550283 ] Rui Li commented on HIVE-10458: --- Hi [~xuefuz], we won't do a double sample for approach a1, because the {{TotalOrderPartitioner}} is MR-specific. One interesting thing I found is the qtest {{parallel_orderby.q}}. As I mentioned above, when sorted data is stored in multiple files, we have to read these files in a proper order to maintain the global sort. It seems that when retrieving the results in FetchOperator, we rely on InputFormat::getSplits, which depends on the underlying FileSystem and doesn't guarantee an order. So if I run {{parallel_orderby}} in local-cluster mode (TestSparkCliDriver), the FS used is LocalFileSystem and the result is not correct (in fact we do produce the correct results, but we don't read them in the proper order). However, if I run {{parallel_orderby}} in yarn mode (TestMiniSparkOnYarnCliDriver), the FS used is DistributedFileSystem and the result is correct. I also tried sorting the splits in FetchOperator, and then both modes work fine. Maybe we should verify and fix this in a separate JIRA. What do you think? Enable parallel order by for spark [Spark Branch] - Key: HIVE-10458 URL: https://issues.apache.org/jira/browse/HIVE-10458 Project: Hive Issue Type: Sub-task Components: Spark Reporter: Rui Li Assignee: Rui Li Attachments: HIVE-10458.1-spark.patch, HIVE-10458.2-spark.patch, HIVE-10458.3-spark.patch We don't have to force the reducer count to 1, as Spark supports parallel sorting. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
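The "sorting the splits" workaround Rui Li mentions can be sketched as below: since `getSplits` makes no ordering guarantee, impose a deterministic order by file path so sorted output files (e.g. part-00000, part-00001, ...) are read in sequence. `FileSplit` here is a minimal stand-in, not the Hadoop class:

```java
import java.util.Arrays;
import java.util.Comparator;

// Minimal stand-in for a file split; only the path matters for ordering.
class FileSplit {
    final String path;
    FileSplit(String path) { this.path = path; }
}

class SplitOrdering {
    // getSplits() ordering depends on the underlying FileSystem, so sort the
    // splits by path before fetching to preserve the global sort order.
    static FileSplit[] sortByPath(FileSplit[] splits) {
        FileSplit[] sorted = splits.clone();
        Arrays.sort(sorted, Comparator.comparing(s -> s.path));
        return sorted;
    }
}
```

This is only a sketch of the idea, assuming part-file names sort lexicographically in the intended read order; the real fix would live in FetchOperator.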
[jira] [Commented] (HIVE-6867) Bucketized Table feature fails in some cases
[ https://issues.apache.org/jira/browse/HIVE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551864#comment-14551864 ] Pengcheng Xiong commented on HIVE-6867: --- [~jpullokkaran], could you please take a look? The failed test is not related. Bucketized Table feature fails in some cases Key: HIVE-6867 URL: https://issues.apache.org/jira/browse/HIVE-6867 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Pengcheng Xiong Attachments: HIVE-6867.01.patch, HIVE-6867.02.patch Bucketized Table feature fails in some cases: if the destination is bucketed on the same key, and the actual data in the source is not bucketed (because the data was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed while writing to the destination. Example -- CREATE TABLE P1(key STRING, val STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1; -- perform an insert to make sure there are 2 files INSERT OVERWRITE TABLE P1 select key, val from P1; -- This is not a regression; it has never worked. It was only discovered due to Hadoop2 changes. In Hadoop1, in local mode, the number of reducers will always be 1, regardless of what is requested by the app. Hadoop2 now honors the requested number of reducers in local mode (by spawning threads). The long-term solution seems to be to prevent LOAD DATA for bucketed tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-10767) LLAP: Improve the way task finishable information is processed
[ https://issues.apache.org/jira/browse/HIVE-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-10767. --- Resolution: Fixed Fix Version/s: llap LLAP: Improve the way task finishable information is processed -- Key: HIVE-10767 URL: https://issues.apache.org/jira/browse/HIVE-10767 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10767.1.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HIVE-10764) LLAP: Wait queue scheduler goes into tight loop
[ https://issues.apache.org/jira/browse/HIVE-10764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth reopened HIVE-10764: --- LLAP: Wait queue scheduler goes into tight loop --- Key: HIVE-10764 URL: https://issues.apache.org/jira/browse/HIVE-10764 Project: Hive Issue Type: Sub-task Affects Versions: llap Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: llap Attachments: HIVE-10764.patch {code} if (!task.canFinish() || numSlotsAvailable.get() == 0) { {code} This condition makes it run into a tight loop if no slots are available and the task is finishable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
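The quoted condition busy-loops when the task is finishable but no executor slots are free. A common remedy is to block on the slot count instead of polling it; a minimal stdlib sketch using a `Semaphore` (the real LLAP scheduler's structure differs, and `SlotScheduler` is a name invented here):

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch: block until a slot is free rather than spinning on a
// numSlotsAvailable counter in a tight loop.
class SlotScheduler {
    private final Semaphore slots;

    SlotScheduler(int numSlots) {
        this.slots = new Semaphore(numSlots);
    }

    // acquire() parks the scheduler thread until release() frees a slot,
    // eliminating the busy-wait described in the issue.
    void schedule(Runnable task) throws InterruptedException {
        slots.acquire();
        try {
            task.run();
        } finally {
            slots.release();  // hand the slot back for the next task
        }
    }
}
```

The key design point is that slot availability becomes a blocking primitive, so the wait-queue thread consumes no CPU while all slots are occupied.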
[jira] [Issue Comment Deleted] (HIVE-10404) hive.exec.parallel=true causes out of sequence response and SocketTimeoutException: Read timed out
[ https://issues.apache.org/jira/browse/HIVE-10404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10404: Comment: was deleted (was: After discussing with [~ashutoshc], we would like to estimate the efforts needed if we would like to set hive.exec.parallel=true as default.) hive.exec.parallel=true causes out of sequence response and SocketTimeoutException: Read timed out Key: HIVE-10404 URL: https://issues.apache.org/jira/browse/HIVE-10404 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Eugene Koifman With hive.exec.parallel=true, Driver.launchTask() calls Task.initialize() from one thread on several Tasks. It then starts new threads to run those tasks. Task.initialize() gets an instance of Hive and holds on to it. Hive.java internally uses a ThreadLocal to hand out instances, but since Task.initialize() is called by a single thread from the Driver, multiple tasks share an instance of Hive. Each Hive instance has a single instance of MetaStoreClient; the latter is not thread-safe. With hive.exec.parallel=true, different threads actually execute the tasks, so different threads end up sharing the same MetaStoreClient. If you make two concurrent calls, for example Hive.getTable(String), the Thrift responses may return to the wrong caller. Thus the first caller gets an out-of-sequence response, drops this message, and reconnects. If the timing is right, it will consume the other caller's response, but the other caller will block for hive.metastore.client.socket.timeout, since its response message has now been lost. This is just one concrete example. One possible fix is to make Task.db use ThreadLocal. This could be related to HIVE-6893 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
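The proposed fix ("make Task.db use ThreadLocal") boils down to resolving the client instance in the thread that executes the task, not the thread that initialized it. A minimal sketch of the pattern; `MetaClient` and `TaskDb` are illustrative names, not Hive's real API:

```java
// Stand-in for the non-thread-safe MetaStoreClient.
class MetaClient { }

class TaskDb {
    // Each executing thread lazily gets its own client, so two tasks running
    // in parallel can no longer interleave requests on one Thrift connection.
    private static final ThreadLocal<MetaClient> DB =
        ThreadLocal.withInitial(MetaClient::new);

    static MetaClient get() {
        return DB.get();
    }
}
```

With this pattern the instance is bound at first use per thread, so the bug described above (one `Hive` instance captured during single-threaded `Task.initialize()` and then shared by several task threads) cannot occur.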
[jira] [Resolved] (HIVE-10764) LLAP: Wait queue scheduler goes into tight loop
[ https://issues.apache.org/jira/browse/HIVE-10764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-10764. --- Resolution: Implemented Done as part of HIVE-10767. The patch here was reverted. LLAP: Wait queue scheduler goes into tight loop --- Key: HIVE-10764 URL: https://issues.apache.org/jira/browse/HIVE-10764 Project: Hive Issue Type: Sub-task Affects Versions: llap Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Fix For: llap Attachments: HIVE-10764.patch {code} if (!task.canFinish() || numSlotsAvailable.get() == 0) { {code} This condition makes it run into a tight loop if no slots are available and the task is finishable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10767) LLAP: Improve the way task finishable information is processed
[ https://issues.apache.org/jira/browse/HIVE-10767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-10767: -- Attachment: HIVE-10767.1.txt LLAP: Improve the way task finishable information is processed -- Key: HIVE-10767 URL: https://issues.apache.org/jira/browse/HIVE-10767 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: HIVE-10767.1.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8529) HiveSessionImpl#fetchResults should not try to fetch operation log when hive.server2.logging.operation.enabled is false.
[ https://issues.apache.org/jira/browse/HIVE-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551497#comment-14551497 ] Hive QA commented on HIVE-8529: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12733936/HIVE-8529.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8945 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler.org.apache.hive.hcatalog.hbase.TestPigHBaseStorageHandler {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3949/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3949/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3949/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12733936 - PreCommit-HIVE-TRUNK-Build HiveSessionImpl#fetchResults should not try to fetch operation log when hive.server2.logging.operation.enabled is false. 
Key: HIVE-8529 URL: https://issues.apache.org/jira/browse/HIVE-8529 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0 Reporter: Vaibhav Gumashta Assignee: Yongzhi Chen Attachments: HIVE-8529.1.patch, HIVE-8529.2.patch Throws this even when it is disabled: {code} 14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: DEBUG security.UserGroupInformation: PrivilegedActionException as:vgumashta (auth:SIMPLE) cause:org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b3d05ca6-e3e8-4bef-b869-0ea0732c3ac5] 14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: WARN thrift.ThriftCLIService: Error fetching results: org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b3d05ca6-e3e8-4bef-b869-0ea0732c3ac5] at org.apache.hive.service.cli.operation.OperationManager.getOperationLogRowSet(OperationManager.java:240) at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:665) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79) at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37) at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at 
org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60) at com.sun.proxy.$Proxy20.fetchResults(Unknown Source) at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:427) at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:582) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at
[jira] [Commented] (HIVE-10732) Hive JDBC driver does not close operation for metadata queries
[ https://issues.apache.org/jira/browse/HIVE-10732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551499#comment-14551499 ] Xuefu Zhang commented on HIVE-10732: [~ctang.ma], could you explain (as I don't quite understand) why you made the change from patch #0 to #1? I understand it's related to test failures. Hive JDBC driver does not close operation for metadata queries -- Key: HIVE-10732 URL: https://issues.apache.org/jira/browse/HIVE-10732 Project: Hive Issue Type: Bug Components: JDBC Reporter: Mala Chikka Kempanna Assignee: Chaoyu Tang Attachments: HIVE-10732.1.patch, HIVE-10732.patch In the following file http://github.mtv.cloudera.com/CDH/hive/blob/cdh5-0.14.1/jdbc/src/java/org/apache/hive/jdbc/HiveQueryResultSet.java Line 315 implements the ResultSet.close() method. Because a DatabaseMetaData operation doesn't have a statement, it doesn't close the operation. However, regardless of whether it has a statement or not, it should close the operation through the stmtHandle. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10761) Create codahale-based metrics system for Hive
[ https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Szehon Ho updated HIVE-10761: - Attachment: HIVE-10761.patch Review board: [https://reviews.apache.org/r/34447/|https://reviews.apache.org/r/34447/] This adds a Codahale-based metrics system to HiveServer2 and HiveMetastore. The metrics implementation is now internally pluggable, and the existing metrics system can be re-enabled by configuration if desired for backward compatibility. The following metrics are supported by the new metrics system: 1. JVMPauseMonitor (previously called Hadoop's internal implementation; now forked off to integrate with the metrics system) 2. HMS API calls 3. Standard JVM metrics (only for the new implementation, as it comes free with Codahale). The following metrics reporters are supported by the new system (exposed via configuration): 1. JMX 2. CONSOLE 3. JSON_FILE (a periodic file of metrics that gets overwritten). An eventual goal is to add a web server that exposes the JSON metrics, but this is deferred to a later JIRA. Create codahale-based metrics system for Hive - Key: HIVE-10761 URL: https://issues.apache.org/jira/browse/HIVE-10761 Project: Hive Issue Type: New Feature Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10761.patch, hms-metrics.json There is a current Hive metrics system that hooks up to JMX reporting, but all its measurements and models are custom. This is to make another metrics system based on Codahale (i.e. Yammer/Dropwizard), which has the following advantages: * Well-defined metric model for frequently-needed metrics (e.g. JVM metrics) * Well-defined measurements for all metrics (e.g. max, mean, stddev, mean_rate, etc.) * Built-in reporting frameworks like JMX, Console, Log, JSON webserver It is used by many projects, including several Apache projects like Oozie. Overall, monitoring tools should find it easier to understand these common metric, measurement, and reporting models.
The existing metric subsystem will be kept and can be enabled if backward compatibility is desired. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
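The "well-defined measurements for all metrics" model described above (every metric exposing count, max, mean, etc., rather than one custom value) can be sketched in plain Java. This is a stdlib-only illustration of the idea, not Hive's or Codahale's actual API; the `TimerSketch` class and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Stdlib-only sketch of a codahale-style timer metric: one metric,
// several well-defined measurements (count, max, mean).
class TimerSketch {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong totalNanos = new AtomicLong();
    private final AtomicLong maxNanos = new AtomicLong();

    // Record one observed duration; all measurements update together.
    void update(long durationNanos) {
        count.incrementAndGet();
        totalNanos.addAndGet(durationNanos);
        maxNanos.accumulateAndGet(durationNanos, Math::max);
    }

    long getCount() { return count.get(); }
    long getMax()   { return maxNanos.get(); }

    double getMean() {
        long c = count.get();
        return c == 0 ? 0.0 : (double) totalNanos.get() / c;
    }
}
```

A reporter (JMX, console, JSON file) would then periodically read these measurements off each registered metric, which is exactly the split the codahale model formalizes.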
[jira] [Commented] (HIVE-10761) Create codahale-based metrics system for Hive
[ https://issues.apache.org/jira/browse/HIVE-10761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551541#comment-14551541 ] Thejas M Nair commented on HIVE-10761: -- This looks very useful! Thanks for working on it. Better monitoring capabilities will really help to improve the server uptimes! Create codahale-based metrics system for Hive - Key: HIVE-10761 URL: https://issues.apache.org/jira/browse/HIVE-10761 Project: Hive Issue Type: New Feature Components: Diagnosability Reporter: Szehon Ho Assignee: Szehon Ho Attachments: HIVE-10761.patch, hms-metrics.json -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10244) Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when hive.vectorized.execution.reduce.enabled is enabled
[ https://issues.apache.org/jira/browse/HIVE-10244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Laljo John Pullokkaran updated HIVE-10244: -- Assignee: Matt McCline (was: Jesus Camacho Rodriguez) Vectorization : TPC-DS Q80 fails with java.lang.ClassCastException when hive.vectorized.execution.reduce.enabled is enabled --- Key: HIVE-10244 URL: https://issues.apache.org/jira/browse/HIVE-10244 Project: Hive Issue Type: Bug Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Matt McCline Attachments: explain_q80_vectorized_reduce_on.txt Query {code} set hive.vectorized.execution.reduce.enabled=true; with ssr as (select s_store_id as store_id, sum(ss_ext_sales_price) as sales, sum(coalesce(sr_return_amt, 0)) as returns, sum(ss_net_profit - coalesce(sr_net_loss, 0)) as profit from store_sales left outer join store_returns on (ss_item_sk = sr_item_sk and ss_ticket_number = sr_ticket_number), date_dim, store, item, promotion where ss_sold_date_sk = d_date_sk and d_date between cast('1998-08-04' as date) and (cast('1998-09-04' as date)) and ss_store_sk = s_store_sk and ss_item_sk = i_item_sk and i_current_price > 50 and ss_promo_sk = p_promo_sk and p_channel_tv = 'N' group by s_store_id) , csr as (select cp_catalog_page_id as catalog_page_id, sum(cs_ext_sales_price) as sales, sum(coalesce(cr_return_amount, 0)) as returns, sum(cs_net_profit - coalesce(cr_net_loss, 0)) as profit from catalog_sales left outer join catalog_returns on (cs_item_sk = cr_item_sk and cs_order_number = cr_order_number), date_dim, catalog_page, item, promotion where cs_sold_date_sk = d_date_sk and d_date between cast('1998-08-04' as date) and (cast('1998-09-04' as date)) and cs_catalog_page_sk = cp_catalog_page_sk and cs_item_sk = i_item_sk and i_current_price > 50 and cs_promo_sk = p_promo_sk and p_channel_tv = 'N' group by cp_catalog_page_id) , wsr as (select web_site_id, sum(ws_ext_sales_price) as sales, sum(coalesce(wr_return_amt, 0)) as returns, sum(ws_net_profit - 
coalesce(wr_net_loss, 0)) as profit from web_sales left outer join web_returns on (ws_item_sk = wr_item_sk and ws_order_number = wr_order_number), date_dim, web_site, item, promotion where ws_sold_date_sk = d_date_sk and d_date between cast('1998-08-04' as date) and (cast('1998-09-04' as date)) and ws_web_site_sk = web_site_sk and ws_item_sk = i_item_sk and i_current_price > 50 and ws_promo_sk = p_promo_sk and p_channel_tv = 'N' group by web_site_id) select channel , id , sum(sales) as sales , sum(returns) as returns , sum(profit) as profit from (select 'store channel' as channel , concat('store', store_id) as id , sales , returns , profit from ssr union all select 'catalog channel' as channel , concat('catalog_page', catalog_page_id) as id , sales , returns , profit from csr union all select 'web channel' as channel , concat('web_site', web_site_id) as id , sales , returns , profit from wsr ) x group by channel, id with rollup order by channel ,id limit 100 {code} Exception {code} Vertex failed, vertexName=Reducer 5, vertexId=vertex_1426707664723_1377_1_22, diagnostics=[Task failed, taskId=task_1426707664723_1377_1_22_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing vector batch (tag=0) \N\N09.285817653506076E84.639990363237801E7-1.1814318134887291E8 \N\N04.682909323885761E82.2415242712669864E7-5.966176123188091E7 \N\N01.2847032699693155E96.300096113768728E7-5.94963316209578E8 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:171) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:137) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:330) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179) at
[jira] [Updated] (HIVE-10764) LLAP: Wait queue scheduler goes into tight loop
[ https://issues.apache.org/jira/browse/HIVE-10764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-10764: - Attachment: HIVE-10764.patch LLAP: Wait queue scheduler goes into tight loop --- Key: HIVE-10764 URL: https://issues.apache.org/jira/browse/HIVE-10764 Project: Hive Issue Type: Sub-task Affects Versions: llap Reporter: Prasanth Jayachandran Assignee: Prasanth Jayachandran Attachments: HIVE-10764.patch {code} if (!task.canFinish() || numSlotsAvailable.get() == 0) { {code} This condition makes the scheduler run in a tight loop when no slots are available and the task is finishable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
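The spin described above happens because the scheduler re-tests `numSlotsAvailable` in a loop without waiting for it to change. One generic way to avoid such a tight loop is to block on the slot count instead of polling it. A hedged, stdlib-only sketch follows; `SlotGate` and `tryRun` are illustrative names, not classes from Hive's actual wait-queue scheduler.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Sketch: represent execution slots as semaphore permits so a scheduler
// thread parks (instead of spinning) until a slot frees up.
class SlotGate {
    private final Semaphore slots;

    SlotGate(int numSlots) {
        this.slots = new Semaphore(numSlots);
    }

    /** Waits up to timeoutMs for a free slot instead of busy-polling. */
    boolean tryRun(Runnable task, long timeoutMs) {
        boolean acquired = false;
        try {
            acquired = slots.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        if (!acquired) {
            return false; // no slot freed up in time; caller can requeue the task
        }
        try {
            task.run();
            return true;
        } finally {
            slots.release(); // releasing wakes exactly one parked waiter
        }
    }
}
```

The key property is that a waiter consumes no CPU between `release()` calls, whereas the quoted condition re-evaluates continuously when no slot is available.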
[jira] [Commented] (HIVE-10753) hs2 jdbc url - wrong connection string cause error on beeline/jdbc/odbc client, misleading message
[ https://issues.apache.org/jira/browse/HIVE-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551425#comment-14551425 ] Hari Sankar Sivarama Subramaniyan commented on HIVE-10753: -- [~thejas] Thanks for the review, I noticed that OOM does not happen with master branch and it happens only with 0.14.0, most likely the OOM error was resolved with HIVE-6468. However, I still get a connection error message like this : {code} localhost:bin hsubramaniyan$ ./beeline --verbose=true Beeline version 1.3.0-SNAPSHOT by Apache Hive beeline !connect jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http Connecting to jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http Enter username for jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http: scott Enter password for jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http: * Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http: Invalid status 72 (state=08S01,code=0) java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http: Invalid status 72 at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:228) at org.apache.hive.jdbc.HiveConnection.init(HiveConnection.java:175) at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) at java.sql.DriverManager.getConnection(DriverManager.java:571) at java.sql.DriverManager.getConnection(DriverManager.java:187) at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:142) at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:207) at org.apache.hive.beeline.Commands.connect(Commands.java:1139) at org.apache.hive.beeline.Commands.connect(Commands.java:1060) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:52) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:976) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:815) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:772) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:485) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:468) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: org.apache.thrift.transport.TTransportException: Invalid status 72 at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232) at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184) at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:307) at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37) at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:203) ... 24 more {code} I will upload a new patch which improves the error message i.e. the point 1 you mentioned above. 
Thanks Hari hs2 jdbc url - wrong connection string cause error on beeline/jdbc/odbc client, misleading message --- Key: HIVE-10753 URL: https://issues.apache.org/jira/browse/HIVE-10753 Project: Hive Issue Type: Bug Components: Beeline, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10753.1.patch {noformat} beeline -u 'jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http' -n hdiuser scan complete in 15ms Connecting to jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http Java heap space Beeline version 0.14.0.2.2.4.1-1 by Apache Hive 0: jdbc:hive2://localhost:10001/default (closed) ^Chdiuser@headnode0:~$ But it works if I use the deprecated param - hdiuser@headnode0:~$ beeline -u
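The root cause in this issue is a single-character mistake: HiveServer2 JDBC URLs separate session and config parts with `;`, while the failing URL used `?`. The misleading-error complaint could be addressed by detecting that pattern up front. A hedged, illustrative checker is sketched below; it is not the real parser in the Hive JDBC driver, and the class name is hypothetical.

```java
// Sketch: flag the common '?'-separator mistake in a hive2 JDBC URL and
// suggest the ';'-separated form before any connection is attempted.
class Hive2UrlCheck {
    /** Returns a warning message for a suspicious URL, or null if it looks OK. */
    static String check(String url) {
        if (!url.startsWith("jdbc:hive2://")) {
            return "not a hive2 JDBC URL";
        }
        if (url.contains("?")) {
            return "hive2 URLs separate parameters with ';', not '?'; e.g. "
                 + "jdbc:hive2://host:port/db;transportMode=http;httpPath=/";
        }
        return null; // no obvious separator problem
    }
}
```

With a check like this, the client can print an actionable message instead of the opaque "Invalid status 72" (the raw first byte of an unexpected HTTP response) seen in the transcript above.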
[jira] [Commented] (HIVE-10725) Better resource management in HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-10725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551539#comment-14551539 ] Thejas M Nair commented on HIVE-10725: -- The work in HIVE-10761 will also help to improve HS2 uptime, by making it easier to monitor Better resource management in HiveServer2 - Key: HIVE-10725 URL: https://issues.apache.org/jira/browse/HIVE-10725 Project: Hive Issue Type: Improvement Components: HiveServer2, JDBC Affects Versions: 1.3.0 Reporter: Vaibhav Gumashta We have various ways to control the number of queries that can be run on one HS2 instance (max threads, thread pool queuing etc). We also have ways to run multiple HS2 instances using dynamic service discovery. We should do a better job at: 1. Monitoring resource utilization (sessions, ophandles, memory, threads etc). 2. Being upfront to the client when we cannot accept new queries. 3. Throttle among different server instances in case dynamic service discovery is used. 4. Consolidate existing ways to control #queries into a simpler model. 5. See if we can recommend reasonable values for OS resources or provide alerts if we run out of those. 6. Health reports, server status API (to get number of queries, sessions etc). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10709) Update Avro version to 1.7.7
[ https://issues.apache.org/jira/browse/HIVE-10709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550423#comment-14550423 ] Hive QA commented on HIVE-10709: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12733725/HIVE-10790.3.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8946 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_percentile_approx_23 org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hive.hcatalog.streaming.TestStreaming.testTransactionBatchCommit_Json {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3944/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3944/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3944/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12733725 - PreCommit-HIVE-TRUNK-Build Update Avro version to 1.7.7 Key: HIVE-10709 URL: https://issues.apache.org/jira/browse/HIVE-10709 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Swarnim Kulkarni Assignee: Swarnim Kulkarni Attachments: HIVE-10709.1.patch, HIVE-10709.2.patch, HIVE-10709.2.patch, HIVE-10790.3.patch We should update the Avro version to 1.7.7 to consume some of the nicer compatibility features. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10665) Continue to make udaf_percentile_approx_23.q test more stable
[ https://issues.apache.org/jira/browse/HIVE-10665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550487#comment-14550487 ] Swarnim Kulkarni commented on HIVE-10665: - +1. Just ran into this failure on HIVE-10709 Continue to make udaf_percentile_approx_23.q test more stable - Key: HIVE-10665 URL: https://issues.apache.org/jira/browse/HIVE-10665 Project: Hive Issue Type: Bug Components: UDF Reporter: Alexander Pivovarov Assignee: Alexander Pivovarov Priority: Minor Attachments: HIVE-10665.1.patch HIVE-10059 fixed line 628 in the q.out. A similar issue exists on line 567 and should be fixed as well. {code}
Running: diff -a /home/hiveptest/54.159.254.207-hiveptest-2/apache-github-source-source/itests/qtest/../../itests/qtest/target/qfile-results/clientpositive/udaf_percentile_approx_23.q.out /home/hiveptest/54.159.254.207-hiveptest-2/apache-github-source-source/itests/qtest/../../ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out
567c567
< 342.0
---
> 341.5
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
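The mismatch above (342.0 vs 341.5) is the classic exact-string comparison failing on an approximate aggregate. The actual fix in this JIRA updates the expected q.out value; a more general technique for such flaky numeric checks is tolerance-based comparison, sketched below as an illustration (this is not code from the patch).

```java
// Sketch: compare approximate results with a relative tolerance instead
// of exact equality, so small estimation jitter does not fail a test.
class ApproxCompare {
    static boolean closeEnough(double expected, double actual, double relTol) {
        // Guard against division by zero when expected is 0.
        double denom = Math.max(Math.abs(expected), 1e-12);
        return Math.abs(expected - actual) / denom <= relTol;
    }
}
```

For the values in the diff, a 1% relative tolerance would accept 341.5 against an expected 342.0, while still rejecting genuinely wrong results.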
[jira] [Assigned] (HIVE-9880) Support configurable username attribute for HiveServer2 LDAP authentication
[ https://issues.apache.org/jira/browse/HIVE-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam reassigned HIVE-9880: --- Assignee: Naveen Gangam Support configurable username attribute for HiveServer2 LDAP authentication --- Key: HIVE-9880 URL: https://issues.apache.org/jira/browse/HIVE-9880 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.13.0 Reporter: Jaime Murillo Assignee: Naveen Gangam Attachments: HIVE-9880-1.patch OpenLDAP requires that, when bind authenticating, the DN supplied must be the DN the account was created with. Since OpenLDAP allows any attribute to be used when creating a DN for an account, organizations that don't use the hardcoded *uid* attribute won't be able to use HiveServer2 LDAP authentication. HiveServer2 should support a configurable username attribute when constructing the bind DN. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
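The configurable-attribute idea amounts to building the bind DN from a template that carries the attribute (uid, cn, or anything else) in configuration rather than in code. A hedged, stdlib-only sketch of that substitution follows; the class and template syntax are illustrative (modeled on Knox's `{0}` placeholder style), not the actual HIVE-9880 patch.

```java
// Sketch: the DN pattern, including its attribute (uid vs cn), comes from
// configuration; the server only substitutes the login into the template.
class BindDnBuilder {
    static String bindDn(String userDnTemplate, String login) {
        // A real implementation would also escape LDAP-special characters
        // in 'login' before substituting it into the DN.
        return userDnTemplate.replace("{0}", login);
    }
}
```

With a template like `cn={0},ou=people,dc=example,dc=com` (a hypothetical base DN), organizations keyed on `cn` can authenticate without any code change.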
[jira] [Updated] (HIVE-10752) Revert HIVE-5193
[ https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-10752: Attachment: HIVE-10752.patch Revert HIVE-5193. Please note that we have an additional problem even after reverting; I will address it later. I didn't include that fix in this patch, to keep the work separate. Revert HIVE-5193 Key: HIVE-10752 URL: https://issues.apache.org/jira/browse/HIVE-10752 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Attachments: HIVE-10752.patch Revert HIVE-5193, since it causes pig+hcatalog to stop working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10735) LLAP: Cached plan race condition - VectorMapJoinCommonOperator has no closeOp()
[ https://issues.apache.org/jira/browse/HIVE-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551135#comment-14551135 ] Mostafa Mokhtar commented on HIVE-10735: [~gopalv] [~hagleitn] Can you add the query and the plan? LLAP: Cached plan race condition - VectorMapJoinCommonOperator has no closeOp() --- Key: HIVE-10735 URL: https://issues.apache.org/jira/browse/HIVE-10735 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Gopal V Assignee: Matt McCline Priority: Critical It looks like some state is mutated during execution across threads in LLAP. Either we can't share the operator objects across threads because they are tied to per-invocation data objects, or this is missing a closeOp() that resets the common setup between reuses. {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.process(VectorMapJoinInnerBigOnlyLongOperator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:164) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) ... 
18 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:379) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:599) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.generateHashMultiSetResultRepeatedAll(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:304) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.finishInnerBigOnlyRepeated(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:328) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.process(VectorMapJoinInnerBigOnlyLongOperator.java:201) ... 24 more Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:152) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow$StringReaderByValue.apply(VectorDeserializeRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:688) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultSingleValue(VectorMapJoinGenerateResultOperator.java:177) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:201) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:359) ... 29 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
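The two alternatives named in the description (don't share mutable operator state across threads, or reset it between reuses via closeOp()) can be sketched generically. This is an illustrative, stdlib-only example of the pattern, assuming a hypothetical `ScratchState` class; it is not the VectorMapJoin code.

```java
// Sketch: mutable per-batch scratch state. If two threads reuse one
// instance, a stale 'used' offset from a previous run can index past the
// buffer, which is exactly how an ArrayIndexOutOfBoundsException appears.
class ScratchState {
    int[] outputBuffer = new int[8];
    int used = 0;

    // Option 1: give each thread its own copy of the mutable state,
    // so nothing is shared between concurrent executions.
    static final ThreadLocal<ScratchState> PER_THREAD =
        ThreadLocal.withInitial(ScratchState::new);

    // Option 2: reset shared setup between plan reuses -- the role a
    // closeOp() would play in the cached-plan scenario above.
    void closeOp() {
        used = 0;
    }
}
```

Either option removes the race: option 1 by isolation, option 2 by guaranteeing a clean state at every reuse boundary.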
[jira] [Updated] (HIVE-8529) HiveSessionImpl#fetchResults should not try to fetch operation log when hive.server2.logging.operation.enabled is false.
[ https://issues.apache.org/jira/browse/HIVE-8529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-8529: --- Attachment: HIVE-8529.2.patch HiveSessionImpl#fetchResults should not try to fetch operation log when hive.server2.logging.operation.enabled is false. Key: HIVE-8529 URL: https://issues.apache.org/jira/browse/HIVE-8529 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 1.0.0, 1.2.0, 1.1.0, 1.3.0 Reporter: Vaibhav Gumashta Assignee: Yongzhi Chen Attachments: HIVE-8529.1.patch, HIVE-8529.2.patch Throws this even when it is disabled: {code} 14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: DEBUG security.UserGroupInformation: PrivilegedActionException as:vgumashta (auth:SIMPLE) cause:org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b3d05ca6-e3e8-4bef-b869-0ea0732c3ac5] 14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: WARN thrift.ThriftCLIService: Error fetching results: org.apache.hive.service.cli.HiveSQLException: Couldn't find log associated with operation handle: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b3d05ca6-e3e8-4bef-b869-0ea0732c3ac5] at org.apache.hive.service.cli.operation.OperationManager.getOperationLogRowSet(OperationManager.java:240) at org.apache.hive.service.cli.session.HiveSessionImpl.fetchResults(HiveSessionImpl.java:665) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79) at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37) at 
org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:394) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508) at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60) at com.sun.proxy.$Proxy20.fetchResults(Unknown Source) at org.apache.hive.service.cli.CLIService.fetchResults(CLIService.java:427) at org.apache.hive.service.cli.thrift.ThriftCLIService.FetchResults(ThriftCLIService.java:582) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1553) at org.apache.hive.service.cli.thrift.TCLIService$Processor$FetchResults.getResult(TCLIService.java:1538) at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) at java.lang.Thread.run(Thread.java:695) 14/10/20 15:53:14 [HiveServer2-Handler-Pool: Thread-53]: DEBUG transport.TSaslTransport: writing data length: 2525 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-8190) LDAP user match for authentication on hiveserver2
[ https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551272#comment-14551272 ] Naveen Gangam commented on HIVE-8190: - I just uploaded a patch for HIVE-7193. There is also a design doc attached. The new enhancements should make it more flexible for users to configure LDAP for authentication. Filter support (user and group) has been added. Let me know if you have any questions or feedback. Thanks LDAP user match for authentication on hiveserver2 - Key: HIVE-8190 URL: https://issues.apache.org/jira/browse/HIVE-8190 Project: Hive Issue Type: Improvement Components: Authorization, Clients Affects Versions: 0.13.1 Environment: Centos 6.5 Reporter: LINTE Assignee: Naveen Gangam Some LDAP directories use CN rather than UID as the user component. So when you try to authenticate, the LDAP authentication module of Hive tries to authenticate with the following string: uid=$login,basedn. Some AD installations have user objects keyed by cn rather than uid, so it is important to be able to customize the kind of object the authentication module looks for in LDAP. As an example, in Knox's LDAP module configuration the parameter main.ldapRealm.userDnTemplate can be configured to look for uid ('uid={0}, basedn') or cn ('cn={0}, basedn'). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10753) hs2 jdbc url - wrong connection string cause OOM error on beeline/jdbc/odbc client, misleading message
[ https://issues.apache.org/jira/browse/HIVE-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551191#comment-14551191 ] Hive QA commented on HIVE-10753: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12733904/HIVE-10753.1.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 8946 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hive.jdbc.TestJdbcDriver2.testSetOnConnection org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3947/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3947/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3947/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12733904 - PreCommit-HIVE-TRUNK-Build hs2 jdbc url - wrong connection string cause OOM error on beeline/jdbc/odbc client, misleading message -- Key: HIVE-10753 URL: https://issues.apache.org/jira/browse/HIVE-10753 Project: Hive Issue Type: Bug Components: Beeline, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10753.1.patch {noformat} beeline -u 'jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http' -n hdiuser scan complete in 15ms Connecting to jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http Java heap space Beeline version 0.14.0.2.2.4.1-1 by Apache Hive 0: jdbc:hive2://localhost:10001/default (closed) ^Chdiuser@headnode0:~$ But it works if I use the deprecated param - hdiuser@headnode0:~$ beeline -u 'jdbc:hive2://localhost:10001/default?hive.server2.transport.mode=http;httpPath=/' -n hdiuser scan complete in 12ms Connecting to jdbc:hive2://localhost:10001/default?hive.server2.transport.mode=http;httpPath=/ 15/04/28 23:16:46 [main]: WARN jdbc.Utils: * JDBC param deprecation * 15/04/28 23:16:46 [main]: WARN jdbc.Utils: The use of hive.server2.transport.mode is deprecated. 15/04/28 23:16:46 [main]: WARN jdbc.Utils: Please use transportMode like so: jdbc:hive2://host:port/dbName;transportMode=transport_mode_value Connected to: Apache Hive (version 0.14.0.2.2.4.1-1) Driver: Hive JDBC (version 0.14.0.2.2.4.1-1) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 0.14.0.2.2.4.1-1 by Apache Hive 0: jdbc:hive2://localhost:10001/default show tables; +--+--+ | tab_name | +--+--+ | hivesampletable | +--+--+ 1 row selected (18.181 seconds) 0: jdbc:hive2://localhost:10001/default ^Chdiuser@headnode0:~$ ^C {noformat} The reason for the above message is : The url is wrong. Correct one: {code} beeline -u 'jdbc:hive2://localhost:10001/default;httpPath=/;transportMode=http' -n hdiuser {code} Note the ; instead of ?. 
The deprecation msg prints the format as well: {code} Please use transportMode like so: jdbc:hive2://host:port/dbName;transportMode=transport_mode_value {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6867) Bucketized Table feature fails in some cases
[ https://issues.apache.org/jira/browse/HIVE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-6867: -- Attachment: HIVE-6867.02.patch with q files updated Bucketized Table feature fails in some cases Key: HIVE-6867 URL: https://issues.apache.org/jira/browse/HIVE-6867 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Pengcheng Xiong Attachments: HIVE-6867.01.patch, HIVE-6867.02.patch The bucketized table feature fails in some cases: if the destination is bucketed on the same key as the source, and the actual data in the source is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed while writing to the destination. Example -- CREATE TABLE P1(key STRING, val STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1; -- perform an insert to make sure there are 2 files INSERT OVERWRITE TABLE P1 select key, val from P1; -- This is not a regression; it has never worked. It was only discovered due to Hadoop2 changes. In Hadoop1, in local mode, the number of reducers was always 1, regardless of what the app requested. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads). The long-term solution seems to be to prevent LOAD DATA for bucketed tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
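The reason LOAD DATA breaks bucketing is that it only moves files into place, while a bucketed INSERT routes each row to a bucket file by hashing the clustering key. The routing idea is sketched below as a simplification; Hive's actual bucketing hashes typed objects across all column types, so treat this as illustrative rather than the exact production formula.

```java
// Sketch: assign a row to one of numBuckets bucket files by hashing the
// clustering key. Masking with Integer.MAX_VALUE keeps the value
// non-negative before the modulo.
class BucketRouter {
    static int bucketFor(String clusteringKey, int numBuckets) {
        return (clusteringKey.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }
}
```

Because LOAD DATA never runs this routing step, readers that assume "bucket N contains exactly the keys hashing to N" get wrong results, which is why the long-term suggestion above is to reject LOAD DATA into bucketed tables.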
[jira] [Commented] (HIVE-10756) LLAP: Misc changes to daemon scheduling
[ https://issues.apache.org/jira/browse/HIVE-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551224#comment-14551224 ] Prasanth Jayachandran commented on HIVE-10756: -- LGTM, +1 LLAP: Misc changes to daemon scheduling --- Key: HIVE-10756 URL: https://issues.apache.org/jira/browse/HIVE-10756 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10756.1.txt Running the completion callback in a separate thread to avoid potentially unnecessary preemptions. Sending out a kill to the AM only if the task was actually killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-10677) hive.exec.parallel=true has problem when it is used for analyze table column stats
[ https://issues.apache.org/jira/browse/HIVE-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong reassigned HIVE-10677: -- Assignee: Pengcheng Xiong hive.exec.parallel=true has problem when it is used for analyze table column stats -- Key: HIVE-10677 URL: https://issues.apache.org/jira/browse/HIVE-10677 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong To reproduce it, in q tests. {code} hive set hive.exec.parallel; hive.exec.parallel=true hive analyze table src compute statistics for columns; FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ColumnStatsTask java.lang.RuntimeException: Error caching map.xml: java.io.IOException: java.lang.InterruptedException at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:747) at org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:682) at org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:674) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:375) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75) Caused by: java.io.IOException: java.lang.InterruptedException at org.apache.hadoop.util.Shell.runCommand(Shell.java:541) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702) at org.apache.hadoop.util.Shell.execCommand(Shell.java:791) at org.apache.hadoop.util.Shell.execCommand(Shell.java:774) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646) at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:472) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:460) at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:426) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773) at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:715) ... 7 more hive Job Submission failed with exception 'java.lang.RuntimeException(Error caching map.xml: java.io.IOException: java.lang.InterruptedException)' {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10760) Templeton: HCatalog Get Column for Non-existent column returns Server Error (500) rather than Not Found(404)
[ https://issues.apache.org/jira/browse/HIVE-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lekha Thota updated HIVE-10760: --- Attachment: 0001-Change-HCatalog-Get-Column-error-for-Non-existent-co.patch Templeton: HCatalog Get Column for Non-existent column returns Server Error (500) rather than Not Found(404) Key: HIVE-10760 URL: https://issues.apache.org/jira/browse/HIVE-10760 Project: Hive Issue Type: Bug Components: HCatalog, Hive, WebHCat Reporter: Lekha Thota Assignee: Lekha Thota Priority: Minor Attachments: 0001-Change-HCatalog-Get-Column-error-for-Non-existent-co.patch Apache Jira for https://hwxmonarch.atlassian.net/browse/HIVE-578 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-9069) Simplify filter predicates for CBO
[ https://issues.apache.org/jira/browse/HIVE-9069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-9069: -- Attachment: HIVE-9069.08.patch Simplify filter predicates for CBO -- Key: HIVE-9069 URL: https://issues.apache.org/jira/browse/HIVE-9069 Project: Hive Issue Type: Bug Components: CBO Affects Versions: 0.14.0 Reporter: Mostafa Mokhtar Assignee: Jesus Camacho Rodriguez Fix For: 0.14.1 Attachments: HIVE-9069.01.patch, HIVE-9069.02.patch, HIVE-9069.03.patch, HIVE-9069.04.patch, HIVE-9069.05.patch, HIVE-9069.06.patch, HIVE-9069.07.patch, HIVE-9069.08.patch, HIVE-9069.patch Simplify predicates for disjunctive predicates so that can get pushed down to the scan. Looks like this is still an issue, some of the filters can be pushed down to the scan. {code} set hive.cbo.enable=true set hive.stats.fetch.column.stats=true set hive.exec.dynamic.partition.mode=nonstrict set hive.tez.auto.reducer.parallelism=true set hive.auto.convert.join.noconditionaltask.size=32000 set hive.exec.reducers.bytes.per.reducer=1 set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager set hive.support.concurrency=false set hive.tez.exec.print.summary=true explain select substr(r_reason_desc,1,20) as r ,avg(ws_quantity) wq ,avg(wr_refunded_cash) ref ,avg(wr_fee) fee from web_sales, web_returns, web_page, customer_demographics cd1, customer_demographics cd2, customer_address, date_dim, reason where web_sales.ws_web_page_sk = web_page.wp_web_page_sk and web_sales.ws_item_sk = web_returns.wr_item_sk and web_sales.ws_order_number = web_returns.wr_order_number and web_sales.ws_sold_date_sk = date_dim.d_date_sk and d_year = 1998 and cd1.cd_demo_sk = web_returns.wr_refunded_cdemo_sk and cd2.cd_demo_sk = web_returns.wr_returning_cdemo_sk and customer_address.ca_address_sk = web_returns.wr_refunded_addr_sk and reason.r_reason_sk = web_returns.wr_reason_sk and ( ( cd1.cd_marital_status = 'M' and cd1.cd_marital_status = 
cd2.cd_marital_status and cd1.cd_education_status = '4 yr Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 100.00 and 150.00 ) or ( cd1.cd_marital_status = 'D' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Primary' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 50.00 and 100.00 ) or ( cd1.cd_marital_status = 'U' and cd1.cd_marital_status = cd2.cd_marital_status and cd1.cd_education_status = 'Advanced Degree' and cd1.cd_education_status = cd2.cd_education_status and ws_sales_price between 150.00 and 200.00 ) ) and ( ( ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM') and ws_net_profit between 100 and 200 ) or ( ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN') and ws_net_profit between 150 and 300 ) or ( ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV') and ws_net_profit between 50 and 250 ) ) group by r_reason_desc order by r, wq, ref, fee limit 100 OK STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 STAGE PLANS: Stage: Stage-1 Tez Edges: Map 9 - Map 1 (BROADCAST_EDGE) Reducer 3 - Map 13 (SIMPLE_EDGE), Map 2 (SIMPLE_EDGE) Reducer 4 - Map 9 (SIMPLE_EDGE), Reducer 3 (SIMPLE_EDGE) Reducer 5 - Map 14 (SIMPLE_EDGE), Reducer 4 (SIMPLE_EDGE) Reducer 6 - Map 10 (SIMPLE_EDGE), Map 11 (BROADCAST_EDGE), Map 12 (BROADCAST_EDGE), Reducer 5 (SIMPLE_EDGE) Reducer 7 - Reducer 6 (SIMPLE_EDGE) Reducer 8 - Reducer 7 (SIMPLE_EDGE) DagName: mmokhtar_2014161818_f5fd23ba-d783-4b13-8507-7faa65851798:1 Vertices: Map 1 Map Operator Tree: TableScan alias: web_page filterExpr: wp_web_page_sk is not null (type: boolean) Statistics: Num rows: 4602 Data size: 2696178 Basic stats: COMPLETE Column stats: COMPLETE Filter Operator predicate: wp_web_page_sk is not null (type: boolean) Statistics: Num rows: 4602 Data size: 18408 Basic stats: COMPLETE Column stats: COMPLETE Select Operator expressions: wp_web_page_sk (type: 
int)
[jira] [Updated] (HIVE-10404) hive.exec.parallel=true causes out of sequence response and SocketTimeoutException: Read timed out
[ https://issues.apache.org/jira/browse/HIVE-10404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-10404: --- Attachment: HIVE-10404.01.patch After discussing with [~ashutoshc], we would like to estimate the effort needed to set hive.exec.parallel=true as the default. hive.exec.parallel=true causes out of sequence response and SocketTimeoutException: Read timed out Key: HIVE-10404 URL: https://issues.apache.org/jira/browse/HIVE-10404 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Eugene Koifman Attachments: HIVE-10404.01.patch With hive.exec.parallel=true, Driver.launchTask() calls Task.initialize() from one thread on several Tasks. It then starts new threads to run those tasks. Task.initialize() gets an instance of Hive and holds on to it. Hive.java internally uses a ThreadLocal to hand out instances, but since Task.initialize() is called by a single thread from the Driver, multiple tasks share an instance of Hive. Each Hive instance has a single instance of MetaStoreClient; the latter is not thread-safe. With hive.exec.parallel=true, different threads actually execute the tasks, so different threads end up sharing the same MetaStoreClient. If you make two concurrent calls, for example Hive.getTable(String), the Thrift responses may return to the wrong caller. Thus the first caller gets an out-of-sequence response, drops this message and reconnects. If the timing is right, it will consume the other's response, but the other caller will block for hive.metastore.client.socket.timeout since its response message has now been lost. This is just one concrete example. One possible fix is to make Task.db use a ThreadLocal. This could be related to HIVE-6893 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
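The ThreadLocal fix proposed above can be sketched as follows — a hypothetical Java illustration (FakeClient and all names here are invented stand-ins, not Hive's actual classes) of giving each executing thread its own non-thread-safe client instead of the single instance captured at initialize() time:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: hold the non-thread-safe client in a ThreadLocal
// so each task-executing thread lazily gets its own instance.
public class ThreadLocalClientDemo {
    // Stand-in for a non-thread-safe client such as the metastore client.
    static class FakeClient {
        private static final AtomicInteger created = new AtomicInteger();
        final int id = created.incrementAndGet();
    }

    // withInitial creates a fresh client the first time each thread asks.
    private static final ThreadLocal<FakeClient> CLIENT =
            ThreadLocal.withInitial(FakeClient::new);

    // Returns true if two threads observed distinct client instances.
    static boolean distinctPerThread() throws InterruptedException {
        int[] ids = new int[2];
        Thread t1 = new Thread(() -> ids[0] = CLIENT.get().id);
        Thread t2 = new Thread(() -> ids[1] = CLIENT.get().id);
        t1.start(); t2.start();
        t1.join(); t2.join();
        return ids[0] != ids[1];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(distinctPerThread()); // prints "true"
    }
}
```

Because each thread owns its client, no two concurrent Thrift calls can interleave on one connection, which is exactly the out-of-sequence-response hazard the report describes.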
[jira] [Commented] (HIVE-10677) hive.exec.parallel=true has problem when it is used for analyze table column stats
[ https://issues.apache.org/jira/browse/HIVE-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551398#comment-14551398 ] Ashutosh Chauhan commented on HIVE-10677: - +1 hive.exec.parallel=true has problem when it is used for analyze table column stats -- Key: HIVE-10677 URL: https://issues.apache.org/jira/browse/HIVE-10677 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-10677.01.patch To reproduce it, in q tests. {code} hive set hive.exec.parallel; hive.exec.parallel=true hive analyze table src compute statistics for columns; FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ColumnStatsTask java.lang.RuntimeException: Error caching map.xml: java.io.IOException: java.lang.InterruptedException at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:747) at org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:682) at org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:674) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:375) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75) Caused by: java.io.IOException: java.lang.InterruptedException at org.apache.hadoop.util.Shell.runCommand(Shell.java:541) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702) at org.apache.hadoop.util.Shell.execCommand(Shell.java:791) at org.apache.hadoop.util.Shell.execCommand(Shell.java:774) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646) at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:472) at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:460) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:426) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773) at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:715) ... 7 more hive Job Submission failed with exception 'java.lang.RuntimeException(Error caching map.xml: java.io.IOException: java.lang.InterruptedException)' {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10735) LLAP: Cached plan race condition - VectorMapJoinCommonOperator has no closeOp()
[ https://issues.apache.org/jira/browse/HIVE-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551330#comment-14551330 ] Sergey Shelukhin commented on HIVE-10735: - Yeah column vectors from VRBs are pooled and reused. We can remove that if needed... see swapColumnVector in LlapRecordReader. LLAP: Cached plan race condition - VectorMapJoinCommonOperator has no closeOp() --- Key: HIVE-10735 URL: https://issues.apache.org/jira/browse/HIVE-10735 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Gopal V Assignee: Matt McCline Priority: Critical Looks like some state is mutated during execution across threads in LLAP. Either we can't share the operator objects across threads, because they are tied to the data objects per invocation or this is missing a closeOp() which resets the common-setup between reuses. {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.process(VectorMapJoinInnerBigOnlyLongOperator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:164) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) ... 
18 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:379) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:599) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.generateHashMultiSetResultRepeatedAll(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:304) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.finishInnerBigOnlyRepeated(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:328) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.process(VectorMapJoinInnerBigOnlyLongOperator.java:201) ... 24 more Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:152) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow$StringReaderByValue.apply(VectorDeserializeRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:688) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultSingleValue(VectorMapJoinGenerateResultOperator.java:177) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:201) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:359) ... 29 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10756) LLAP: Misc changes to daemon scheduling
[ https://issues.apache.org/jira/browse/HIVE-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-10756: -- Attachment: HIVE-10756.1.txt [~prasanth_j] - could you take a quick look, please? LLAP: Misc changes to daemon scheduling --- Key: HIVE-10756 URL: https://issues.apache.org/jira/browse/HIVE-10756 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10756.1.txt Running the completion callback in a separate thread to avoid potentially unnecessary preemptions. Sending out a kill to the AM only if the task was actually killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10735) LLAP: Cached plan race condition - VectorMapJoinCommonOperator has no closeOp()
[ https://issues.apache.org/jira/browse/HIVE-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551263#comment-14551263 ] Matt McCline commented on HIVE-10735: - The closeOp is in the VectorMapJoinGenerateResultOperator class, which overrides MapJoinOperator's closeOp. I knew we cached and shared hash tables, but was not aware we shared operators across threads? LLAP: Cached plan race condition - VectorMapJoinCommonOperator has no closeOp() --- Key: HIVE-10735 URL: https://issues.apache.org/jira/browse/HIVE-10735 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Gopal V Assignee: Matt McCline Priority: Critical Looks like some state is mutated during execution across threads in LLAP. Either we can't share the operator objects across threads, because they are tied to the data objects per invocation or this is missing a closeOp() which resets the common-setup between reuses. {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.process(VectorMapJoinInnerBigOnlyLongOperator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:164) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) ... 
18 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:379) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:599) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.generateHashMultiSetResultRepeatedAll(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:304) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.finishInnerBigOnlyRepeated(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:328) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.process(VectorMapJoinInnerBigOnlyLongOperator.java:201) ... 24 more Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:152) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow$StringReaderByValue.apply(VectorDeserializeRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:688) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultSingleValue(VectorMapJoinGenerateResultOperator.java:177) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:201) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:359) ... 29 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10722) external table creation with msck in Hive can create unusable partition
[ https://issues.apache.org/jira/browse/HIVE-10722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-10722: Attachment: HIVE-10722.patch This fixes the standard formatter to provide proper output, and adds validation to msck to make sure the metastore will actually create matching partitions. external table creation with msck in Hive can create unusable partition --- Key: HIVE-10722 URL: https://issues.apache.org/jira/browse/HIVE-10722 Project: Hive Issue Type: Bug Affects Versions: 0.14.1, 1.0.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Attachments: HIVE-10722.patch There can be directories in HDFS containing unprintable characters; when doing hadoop fs -ls, these characters are not even visible, and can only be seen, for example, if the output is piped through od. When these are loaded via msck, they are stored in e.g. mysql as ? (a literal question mark, findable via LIKE '%?%' in the db) and show accordingly in Hive. However, datanucleus appears to encode it as %3F; this causes the partition to be unusable - it cannot be dropped, and other operations like drop table get stuck (didn't investigate in detail why; drop table got unstuck as soon as the partition was removed from the metastore). We should probably have a 2-way option for such cases - error out on load (default), or convert to '?'/drop such characters (and have a partition that actually works, too). We should also check if partitions with '?' inserted explicitly work at all with datanucleus. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
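The validation idea mentioned in the patch summary can be sketched as follows — a hypothetical Java helper (not the actual patch code; the class and method names are invented) that flags candidate partition directory names containing control characters before they would be registered in the metastore:

```java
// Hypothetical sketch: reject partition names with control characters,
// which are invisible in `hadoop fs -ls` output but do not round-trip
// cleanly through the metastore.
public class PartitionNameCheck {
    static boolean isUsable(String name) {
        for (int i = 0; i < name.length(); i++) {
            // isISOControl covers the unprintable range (e.g. \u0007).
            if (Character.isISOControl(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isUsable("country=US"));       // prints "true"
        System.out.println(isUsable("country=US\u0007")); // prints "false"
    }
}
```

Under the "2-way option" suggested above, a name failing this check would either raise an error at load time (the default) or be sanitized before the partition is created.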
[jira] [Resolved] (HIVE-8222) CBO Trunk Merge: Fix Check Style issues
[ https://issues.apache.org/jira/browse/HIVE-8222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Francke resolved HIVE-8222. Resolution: Won't Fix CBO Trunk Merge: Fix Check Style issues --- Key: HIVE-8222 URL: https://issues.apache.org/jira/browse/HIVE-8222 Project: Hive Issue Type: Sub-task Components: CBO Reporter: Laljo John Pullokkaran Assignee: Lars Francke Attachments: HIVE-8222.1.patch, HIVE-8222.2.patch, HIVE-8222.3.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10747) Enable the cleanup of side effect for the Encryption related qfile test
[ https://issues.apache.org/jira/browse/HIVE-10747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551188#comment-14551188 ] Eugene Koifman commented on HIVE-10747: --- I tried the same change earlier. It fixes the leak problem. Enable the cleanup of side effect for the Encryption related qfile test --- Key: HIVE-10747 URL: https://issues.apache.org/jira/browse/HIVE-10747 Project: Hive Issue Type: Sub-task Components: Testing Infrastructure Reporter: Ferdinand Xu Assignee: Ferdinand Xu Attachments: HIVE-10747.patch The hive conf is not reset in the clearTestSideEffects method introduced in HIVE-8900. This will pollute other qfiles' settings when run by TestEncryptedHDFSCliDriver. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
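The cleanup being discussed can be sketched as follows — a hypothetical Java illustration (a plain map stands in for HiveConf; none of these names come from the actual test harness) of snapshotting configuration before a qfile test and restoring it in clearTestSideEffects-style teardown, so one test's settings cannot leak into the next:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: snapshot the shared conf before each test and
// restore it afterwards.
public class ConfResetDemo {
    static Map<String, String> conf = new HashMap<>();

    static Map<String, String> snapshot() {
        return new HashMap<>(conf);
    }

    static void restore(Map<String, String> saved) {
        conf.clear();
        conf.putAll(saved);
    }

    public static void main(String[] args) {
        conf.put("hive.exec.parallel", "false");
        Map<String, String> saved = snapshot();

        // A test mutates the shared conf...
        conf.put("hive.exec.parallel", "true");

        // ...and teardown restores the pre-test state.
        restore(saved);
        System.out.println(conf.get("hive.exec.parallel")); // prints "false"
    }
}
```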
[jira] [Resolved] (HIVE-10756) LLAP: Misc changes to daemon scheduling
[ https://issues.apache.org/jira/browse/HIVE-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth resolved HIVE-10756. --- Resolution: Fixed Thanks. Committed. LLAP: Misc changes to daemon scheduling --- Key: HIVE-10756 URL: https://issues.apache.org/jira/browse/HIVE-10756 Project: Hive Issue Type: Sub-task Reporter: Siddharth Seth Assignee: Siddharth Seth Fix For: llap Attachments: HIVE-10756.1.txt Running the completion callback in a separate thread to avoid potentially unnecessary preemptions. Sending out a kill to the AM only if the task was actually killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10735) LLAP: Cached plan race condition - VectorMapJoinCommonOperator has no closeOp()
[ https://issues.apache.org/jira/browse/HIVE-10735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551349#comment-14551349 ] Gopal V commented on HIVE-10735: No, the pooling is not the bug - the vector row-batch has a definite lifetime till the end of the return of processOp(). the setVal() up there is the right solution for the issue (is hard to verify though) - the issue is that the unit-tests we run do not trigger the switch between files and the re-creation of new vector row-batches between invocations. LLAP: Cached plan race condition - VectorMapJoinCommonOperator has no closeOp() --- Key: HIVE-10735 URL: https://issues.apache.org/jira/browse/HIVE-10735 Project: Hive Issue Type: Sub-task Components: Vectorization Reporter: Gopal V Assignee: Matt McCline Priority: Critical Looks like some state is mutated during execution across threads in LLAP. Either we can't share the operator objects across threads, because they are tied to the data objects per invocation or this is missing a closeOp() which resets the common-setup between reuses. {code} Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.process(VectorMapJoinInnerBigOnlyLongOperator.java:380) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:114) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:97) at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:164) at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45) ... 
18 more Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:379) at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:850) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.forwardBigTableBatch(VectorMapJoinGenerateResultOperator.java:599) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.generateHashMultiSetResultRepeatedAll(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:304) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyGenerateResultOperator.finishInnerBigOnlyRepeated(VectorMapJoinInnerBigOnlyGenerateResultOperator.java:328) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerBigOnlyLongOperator.process(VectorMapJoinInnerBigOnlyLongOperator.java:201) ... 24 more Caused by: java.lang.ArrayIndexOutOfBoundsException at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.setVal(BytesColumnVector.java:152) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow$StringReaderByValue.apply(VectorDeserializeRow.java:349) at org.apache.hadoop.hive.ql.exec.vector.VectorDeserializeRow.deserializeByValue(VectorDeserializeRow.java:688) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinGenerateResultOperator.generateHashMapResultSingleValue(VectorMapJoinGenerateResultOperator.java:177) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.finishInner(VectorMapJoinInnerGenerateResultOperator.java:201) at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerLongOperator.process(VectorMapJoinInnerLongOperator.java:359) ... 29 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
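The setVal()-versus-aliasing point in the comment above can be sketched as follows — a hypothetical Java illustration (plain byte arrays stand in for pooled column vectors; this is not Hive code) of why copying bytes survives buffer reuse while merely holding a reference does not:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch: aliasing a pooled buffer (setRef-style) breaks once
// the pool recycles it, while copying the bytes (setVal-style) stays valid.
public class CopyVsAliasDemo {
    static String[] run() {
        byte[] pooled = "first".getBytes(StandardCharsets.UTF_8);
        byte[] aliased = pooled;                              // shares the buffer
        byte[] copied = Arrays.copyOf(pooled, pooled.length); // owns its bytes

        // The pool recycles the buffer for the next batch.
        Arrays.fill(pooled, (byte) 'x');

        return new String[] {
            new String(aliased, StandardCharsets.UTF_8),
            new String(copied, StandardCharsets.UTF_8)
        };
    }

    public static void main(String[] args) {
        String[] out = run();
        System.out.println(out[0]); // prints "xxxxx"
        System.out.println(out[1]); // prints "first"
    }
}
```

This is why a copy-by-value deserialization path is safe against batch reuse even when the source batch's lifetime ends at the return of processOp().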
[jira] [Commented] (HIVE-10745) Better null handling by Vectorizer
[ https://issues.apache.org/jira/browse/HIVE-10745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551350#comment-14551350 ] Hive QA commented on HIVE-10745: {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12733919/HIVE-10745.2.patch {color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 8946 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_static org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2 {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3948/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/3948/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-3948/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 2 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12733919 - PreCommit-HIVE-TRUNK-Build Better null handling by Vectorizer -- Key: HIVE-10745 URL: https://issues.apache.org/jira/browse/HIVE-10745 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0 Reporter: Jagruti Varia Assignee: Ashutosh Chauhan Attachments: HIVE-10745.1.patch, HIVE-10745.2.patch, HIVE-10745.patch Minor refactoring around null handling in Vectorization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10722) external table creation with msck in Hive can create unusable partition
[ https://issues.apache.org/jira/browse/HIVE-10722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551170#comment-14551170 ] Sergey Shelukhin commented on HIVE-10722: - cannot post RB, fails to validate diff... probably because of unprintable characters external table creation with msck in Hive can create unusable partition --- Key: HIVE-10722 URL: https://issues.apache.org/jira/browse/HIVE-10722 Project: Hive Issue Type: Bug Affects Versions: 0.14.1, 1.0.0 Reporter: Sergey Shelukhin Assignee: Sergey Shelukhin Priority: Critical Attachments: HIVE-10722.patch There can be directories in HDFS containing unprintable characters; when doing hadoop fs -ls, these characters are not even visible, and can only be seen for example if output is piped through od. When these are loaded via msck, they are stored in e.g. mysql as ? (literal question mark, findable via LIKE '%?%' in db) and show accordingly in Hive. However, datanucleus appears to encode it as %3F; this causes the partition to be unusable - it cannot be dropped, and other operations like drop table get stuck (didn't investigate in detail why; drop table got unstuck as soon as the partition was removed from metastore). We should probably have a 2-way option for such cases - error out on load (default), or convert to '?'/drop such characters (and have partition that actually works, too). We should also check if partitions with '?' inserted explicitly work at all with datanucleus. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-8007) Use proper Thrift comments
[ https://issues.apache.org/jira/browse/HIVE-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Francke updated HIVE-8007: --- Attachment: HIVE-8007.2.patch Use proper Thrift comments -- Key: HIVE-8007 URL: https://issues.apache.org/jira/browse/HIVE-8007 Project: Hive Issue Type: Improvement Reporter: Lars Francke Assignee: Lars Francke Priority: Minor Attachments: HIVE-8007.1.patch, HIVE-8007.2.patch Currently the thrift file uses {{//}} to denote comments. Thrift understands the {{/** ... */}} syntax and converts that into documentation in the generated code. This patch changes the syntax. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10327) Remove ExprNodeNullDesc
[ https://issues.apache.org/jira/browse/HIVE-10327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10327: Fix Version/s: (was: 1.2.0) 1.2.1 Remove ExprNodeNullDesc --- Key: HIVE-10327 URL: https://issues.apache.org/jira/browse/HIVE-10327 Project: Hive Issue Type: Task Components: Query Planning Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.2.1 Attachments: HIVE-10327.1.patch, HIVE-10327.2.patch, HIVE-10327.patch Its purpose can be served by ExprNodeConstantDesc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10760) Templeton: HCatalog Get Column for Non-existent column returns Server Error (500) rather than Not Found(404)
[ https://issues.apache.org/jira/browse/HIVE-10760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lekha Thota updated HIVE-10760: --- Description: Apache Jira for https://hwxmonarch.atlassian.net/browse/HIVE-578 (was: Apache Jira for HIVE-578) Templeton: HCatalog Get Column for Non-existent column returns Server Error (500) rather than Not Found(404) Key: HIVE-10760 URL: https://issues.apache.org/jira/browse/HIVE-10760 Project: Hive Issue Type: Bug Components: HCatalog, Hive, WebHCat Reporter: Lekha Thota Assignee: Lekha Thota Priority: Minor Apache Jira for https://hwxmonarch.atlassian.net/browse/HIVE-578 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10745) Better null handling by Vectorizer
[ https://issues.apache.org/jira/browse/HIVE-10745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551087#comment-14551087 ] Swarnim Kulkarni commented on HIVE-10745: - [~ashutoshc] Want to update the RB real quick? I can help review this then. Better null handling by Vectorizer -- Key: HIVE-10745 URL: https://issues.apache.org/jira/browse/HIVE-10745 Project: Hive Issue Type: Bug Components: Vectorization Affects Versions: 1.2.0 Reporter: Jagruti Varia Assignee: Ashutosh Chauhan Attachments: HIVE-10745.1.patch, HIVE-10745.2.patch, HIVE-10745.patch Minor refactoring around null handling in Vectorization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-6867) Bucketized Table feature fails in some cases
[ https://issues.apache.org/jira/browse/HIVE-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-6867: -- Attachment: (was: HIVE-6867.02.patch) Bucketized Table feature fails in some cases Key: HIVE-6867 URL: https://issues.apache.org/jira/browse/HIVE-6867 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.12.0 Reporter: Laljo John Pullokkaran Assignee: Pengcheng Xiong Attachments: HIVE-6867.01.patch, HIVE-6867.02.patch Bucketized Table feature fails in some cases. If the destination is bucketed on the same key as the source, and the actual data in the source is not bucketed (because the data was loaded using LOAD DATA LOCAL INPATH), then the data won't be bucketed while writing to the destination. Example -- CREATE TABLE P1(key STRING, val STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE; LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE P1; -- perform an insert to make sure there are 2 files INSERT OVERWRITE TABLE P1 select key, val from P1; -- This is not a regression. This has never worked. This got only discovered due to Hadoop2 changes. In Hadoop1, in local mode, the number of reducers will always be 1, regardless of what is requested by the app. Hadoop2 now honors the number-of-reducers setting in local mode (by spawning threads). The long-term solution seems to be to prevent LOAD DATA for bucketed tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10636) CASE comparison operator rotation optimization
[ https://issues.apache.org/jira/browse/HIVE-10636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-10636: Fix Version/s: (was: 1.2.0) 1.2.1 CASE comparison operator rotation optimization -- Key: HIVE-10636 URL: https://issues.apache.org/jira/browse/HIVE-10636 Project: Hive Issue Type: New Feature Components: Logical Optimizer Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Fix For: 1.2.1 Attachments: HIVE-10636.1.patch, HIVE-10636.2.patch, HIVE-10636.3.patch, HIVE-10636.patch Step 1 as outlined in description of HIVE-9644 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10677) hive.exec.parallel=true has problem when it is used for analyze table column stats
[ https://issues.apache.org/jira/browse/HIVE-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pengcheng Xiong updated HIVE-10677: --- Attachment: HIVE-10677.01.patch hive.exec.parallel=true has problem when it is used for analyze table column stats -- Key: HIVE-10677 URL: https://issues.apache.org/jira/browse/HIVE-10677 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-10677.01.patch To reproduce it, in q tests. {code} hive set hive.exec.parallel; hive.exec.parallel=true hive analyze table src compute statistics for columns; FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ColumnStatsTask java.lang.RuntimeException: Error caching map.xml: java.io.IOException: java.lang.InterruptedException at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:747) at org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:682) at org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:674) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:375) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75) Caused by: java.io.IOException: java.lang.InterruptedException at org.apache.hadoop.util.Shell.runCommand(Shell.java:541) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702) at org.apache.hadoop.util.Shell.execCommand(Shell.java:791) at org.apache.hadoop.util.Shell.execCommand(Shell.java:774) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646) at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:472) at 
org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:460) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:426) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773) at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:715) ... 7 more hive Job Submission failed with exception 'java.lang.RuntimeException(Error caching map.xml: java.io.IOException: java.lang.InterruptedException)' {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
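The trace above shows a task failing inside Utilities.setBaseWork while caching the serialized plan (map.xml) when hive.exec.parallel launches tasks concurrently. As a language-neutral illustration only (Python here; this is a general pattern for avoiding shared-file races between parallel tasks, not the actual HIVE-10677 fix, which may take a different approach), giving each concurrent task its own serialization path keeps one task's plan write from colliding with another's:

```python
# Sketch: two "tasks" caching their plans concurrently. Because each task
# writes to a unique per-task path rather than one shared map.xml, the
# parallel writes cannot clobber or interrupt each other.
import os
import tempfile
import threading

def cache_plan(scratch_dir, task_id, plan_xml):
    # Hypothetical helper name; one plan file per task instead of a shared one.
    path = os.path.join(scratch_dir, "map_%d.xml" % task_id)
    with open(path, "w") as f:
        f.write(plan_xml)
    return path

scratch = tempfile.mkdtemp()
paths = [None, None]
threads = [
    threading.Thread(
        target=lambda i=i: paths.__setitem__(
            i, cache_plan(scratch, i, "<mapWork id='%d'/>" % i)))
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```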
[jira] [Commented] (HIVE-10677) hive.exec.parallel=true has problem when it is used for analyze table column stats
[ https://issues.apache.org/jira/browse/HIVE-10677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14551310#comment-14551310 ] Pengcheng Xiong commented on HIVE-10677: [~ashutoshc] and [~jpullokkaran], could you please review the patch? Thanks. hive.exec.parallel=true has problem when it is used for analyze table column stats -- Key: HIVE-10677 URL: https://issues.apache.org/jira/browse/HIVE-10677 Project: Hive Issue Type: Bug Reporter: Pengcheng Xiong Assignee: Pengcheng Xiong Attachments: HIVE-10677.01.patch To reproduce it, in q tests. {code} hive set hive.exec.parallel; hive.exec.parallel=true hive analyze table src compute statistics for columns; FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.ColumnStatsTask java.lang.RuntimeException: Error caching map.xml: java.io.IOException: java.lang.InterruptedException at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:747) at org.apache.hadoop.hive.ql.exec.Utilities.setMapWork(Utilities.java:682) at org.apache.hadoop.hive.ql.exec.Utilities.setMapRedWork(Utilities.java:674) at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:375) at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88) at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75) Caused by: java.io.IOException: java.lang.InterruptedException at org.apache.hadoop.util.Shell.runCommand(Shell.java:541) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702) at org.apache.hadoop.util.Shell.execCommand(Shell.java:791) at org.apache.hadoop.util.Shell.execCommand(Shell.java:774) at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646) at 
org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:472) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:460) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:426) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773) at org.apache.hadoop.hive.ql.exec.Utilities.setBaseWork(Utilities.java:715) ... 7 more hive Job Submission failed with exception 'java.lang.RuntimeException(Error caching map.xml: java.io.IOException: java.lang.InterruptedException)' {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-10593) Support creating table from a file schema: CREATE TABLE ... LIKE file_format '/path/to/file'
[ https://issues.apache.org/jira/browse/HIVE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550725#comment-14550725 ] Ryan Blue commented on HIVE-10593: -- +1 to file only. Merging file schemas is way out of scope. I like the syntax, though it would be nice if we could detect the file type in most cases by magic bytes. Avro and Parquet work that way, so we could have a map of magic -> format. Anything else, like delimited text, doesn't work with this feature anyway. Support creating table from a file schema: CREATE TABLE ... LIKE file_format '/path/to/file' -- Key: HIVE-10593 URL: https://issues.apache.org/jira/browse/HIVE-10593 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 1.2.0 Reporter: Lenni Kuff It would be useful if Hive could infer the column definitions in a create table statement from the underlying data file. For example: CREATE TABLE new_tbl LIKE PARQUET '/path/to/file.parquet'; If the targeted file is not the specified file format, the statement should fail analysis. In addition to PARQUET, it would be useful to support other formats such as AVRO, JSON, and ORC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
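The "map of magic -> format" idea from the comment is easy to sketch. This is an illustration in Python, not Hive code; the magic values come from the format specifications (Parquet files begin with the 4 bytes `PAR1`, Avro object container files with `Obj` plus a version byte of 1, ORC files with `ORC`), while the function name is a hypothetical:

```python
# Map of leading magic bytes to the file format they identify.
MAGIC_TO_FORMAT = {
    b"PAR1": "PARQUET",     # Parquet: "PAR1" at file start (and end)
    b"Obj\x01": "AVRO",     # Avro object container: "Obj" + version byte 1
    b"ORC": "ORC",          # ORC: 3-byte "ORC" header magic
}

def sniff_format(header_bytes):
    """Guess the file format from its leading bytes, or None if unknown."""
    for magic, fmt in MAGIC_TO_FORMAT.items():
        if header_bytes.startswith(magic):
            return fmt
    return None  # e.g. delimited text has no magic, as the comment notes
```

A format without a magic number (delimited text, JSON) simply falls through to None, which matches the comment's point that such formats don't work with this feature anyway.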
[jira] [Commented] (HIVE-10731) NullPointerException in HiveParser.g
[ https://issues.apache.org/jira/browse/HIVE-10731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550917#comment-14550917 ] Laljo John Pullokkaran commented on HIVE-10731: --- +1 NullPointerException in HiveParser.g Key: HIVE-10731 URL: https://issues.apache.org/jira/browse/HIVE-10731 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 1.2.0 Reporter: Xiu Assignee: Pengcheng Xiong Priority: Minor Attachments: HIVE-10731.01.patch In HiveParser.g: {code:Java} protected boolean useSQL11ReservedKeywordsForIdentifier() { return !HiveConf.getBoolVar(hiveConf, HiveConf.ConfVars.HIVE_SUPPORT_SQL11_RESERVED_KEYWORDS); } {code} NullPointerException is thrown when hiveConf is not set. Stack trace: {code:Java} java.lang.NullPointerException at org.apache.hadoop.hive.conf.HiveConf.getBoolVar(HiveConf.java:2583) at org.apache.hadoop.hive.ql.parse.HiveParser.useSQL11ReservedKeywordsForIdentifier(HiveParser.java:1000) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.useSQL11ReservedKeywordsForIdentifier(HiveParser_IdentifiersParser.java:726) at org.apache.hadoop.hive.ql.parse.HiveParser_IdentifiersParser.identifier(HiveParser_IdentifiersParser.java:10922) at org.apache.hadoop.hive.ql.parse.HiveParser.identifier(HiveParser.java:45808) at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameType(HiveParser.java:38008) at org.apache.hadoop.hive.ql.parse.HiveParser.columnNameTypeList(HiveParser.java:36167) at org.apache.hadoop.hive.ql.parse.HiveParser.createTableStatement(HiveParser.java:5214) at org.apache.hadoop.hive.ql.parse.HiveParser.ddlStatement(HiveParser.java:2640) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:1650) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1109) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:202) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166) at 
org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:161) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
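The patch itself isn't shown in the comment, but one plausible shape of the guard can be rendered in Python as a sketch (a dict stands in for HiveConf; names and the fallback behavior here are illustrative assumptions, not necessarily what HIVE-10731.01.patch does). `hive.support.sql11.reserved.keywords` defaults to true, so a missing conf should behave like that default rather than throw:

```python
# Sketch of a null-guard for the NPE above; not the actual HIVE-10731 patch.
DEFAULT_RESERVED = True  # default of hive.support.sql11.reserved.keywords

def use_sql11_reserved_keywords_for_identifier(hive_conf):
    if hive_conf is None:
        # The unguarded version dereferenced hiveConf here and threw the NPE;
        # fall back to the configuration default instead.
        return not DEFAULT_RESERVED
    return not hive_conf.get("hive.support.sql11.reserved.keywords",
                             DEFAULT_RESERVED)
```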
[jira] [Commented] (HIVE-10752) Revert HIVE-5193
[ https://issues.apache.org/jira/browse/HIVE-10752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550921#comment-14550921 ] Gopal V commented on HIVE-10752: Since this is a significant performance hit, is there a pig+hcatalog bug report with a backtrace or logs? Revert HIVE-5193 Key: HIVE-10752 URL: https://issues.apache.org/jira/browse/HIVE-10752 Project: Hive Issue Type: Sub-task Components: HCatalog Affects Versions: 1.2.0 Reporter: Aihua Xu Assignee: Aihua Xu Revert HIVE-5193 since it causes pig+hcatalog to stop working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-10753) hs2 jdbc url - wrong connection string cause OOM error on beeline/jdbc/odbc client, misleading message
[ https://issues.apache.org/jira/browse/HIVE-10753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-10753: - Attachment: HIVE-10753.1.patch [~vgumashta] / [~thejas] Can you please review patch 1? Please let me know if there is a better way of telling the user that they entered a bad URL. Thanks Hari hs2 jdbc url - wrong connection string cause OOM error on beeline/jdbc/odbc client, misleading message -- Key: HIVE-10753 URL: https://issues.apache.org/jira/browse/HIVE-10753 Project: Hive Issue Type: Bug Components: Beeline, JDBC Reporter: Hari Sankar Sivarama Subramaniyan Assignee: Hari Sankar Sivarama Subramaniyan Attachments: HIVE-10753.1.patch {noformat} beeline -u 'jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http' -n hdiuser scan complete in 15ms Connecting to jdbc:hive2://localhost:10001/default?httpPath=/;transportMode=http Java heap space Beeline version 0.14.0.2.2.4.1-1 by Apache Hive 0: jdbc:hive2://localhost:10001/default (closed) ^Chdiuser@headnode0:~$ But it works if I use the deprecated param - hdiuser@headnode0:~$ beeline -u 'jdbc:hive2://localhost:10001/default?hive.server2.transport.mode=http;httpPath=/' -n hdiuser scan complete in 12ms Connecting to jdbc:hive2://localhost:10001/default?hive.server2.transport.mode=http;httpPath=/ 15/04/28 23:16:46 [main]: WARN jdbc.Utils: * JDBC param deprecation * 15/04/28 23:16:46 [main]: WARN jdbc.Utils: The use of hive.server2.transport.mode is deprecated. 
15/04/28 23:16:46 [main]: WARN jdbc.Utils: Please use transportMode like so: jdbc:hive2://host:port/dbName;transportMode=transport_mode_value Connected to: Apache Hive (version 0.14.0.2.2.4.1-1) Driver: Hive JDBC (version 0.14.0.2.2.4.1-1) Transaction isolation: TRANSACTION_REPEATABLE_READ Beeline version 0.14.0.2.2.4.1-1 by Apache Hive 0: jdbc:hive2://localhost:10001/default show tables; +--+--+ | tab_name | +--+--+ | hivesampletable | +--+--+ 1 row selected (18.181 seconds) 0: jdbc:hive2://localhost:10001/default ^Chdiuser@headnode0:~$ ^C {noformat} The reason for the above message is : The url is wrong. Correct one: {code} beeline -u 'jdbc:hive2://localhost:10001/default;httpPath=/;transportMode=http' -n hdiuser {code} Note the ; instead of ?. The deprecation msg prints the format as well: {code} Please use transportMode like so: jdbc:hive2://host:port/dbName;transportMode=transport_mode_value {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
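The failure above comes down to URL grammar: the HiveServer2 JDBC URL has the form jdbc:hive2://host:port/dbName;sess_var_list?hive_conf_list#hive_var_list, so session settings such as transportMode and httpPath must sit in the ';' segment; written after '?', they land in the Hive-conf segment and never select the HTTP transport, which is what leads to the misleading error. A toy Python splitter (an illustration only, not the driver's actual parser) makes the difference visible:

```python
def split_hive2_url(url):
    """Split a HiveServer2 JDBC URL into its segments.

    Toy sketch of the grammar base;sess_vars?hive_confs#hive_vars;
    the real driver's parsing is more involved.
    """
    rest = url
    hive_vars = hive_confs = ""
    if "#" in rest:
        rest, hive_vars = rest.split("#", 1)
    if "?" in rest:
        rest, hive_confs = rest.split("?", 1)
    parts = rest.split(";")
    base, sess_vars = parts[0], parts[1:]
    return base, sess_vars, hive_confs, hive_vars
```

Run against the two URLs from the report, the correct one (`;httpPath=/;transportMode=http`) places transportMode among the session variables, while the broken one (`?httpPath=/;transportMode=http`) leaves the session-variable list empty.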