[jira] [Created] (HIVE-26218) Hive standalone metastore release versions are not available after 3.0.0 in official download location
Nagarajan Selvaraj created HIVE-26218: - Summary: Hive standalone metastore release versions are not available after 3.0.0 in official download location Key: HIVE-26218 URL: https://issues.apache.org/jira/browse/HIVE-26218 Project: Hive Issue Type: Bug Components: Standalone Metastore Affects Versions: 3.1.3, 3.1.2, 4.0.0, 4.0.0-alpha-1 Reporter: Nagarajan Selvaraj Hive standalone metastore release versions are not available after 3.0.0 in the official download location https://www.apache.org/dyn/closer.cgi/hive/ -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26217) Make CTAS use Direct Insert Semantics
Sourabh Badhya created HIVE-26217: - Summary: Make CTAS use Direct Insert Semantics Key: HIVE-26217 URL: https://issues.apache.org/jira/browse/HIVE-26217 Project: Hive Issue Type: Bug Reporter: Sourabh Badhya Assignee: Sourabh Badhya -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26216) CTAS tries to use committed table information when queries are run concurrently
Sourabh Badhya created HIVE-26216: - Summary: CTAS tries to use committed table information when queries are run concurrently Key: HIVE-26216 URL: https://issues.apache.org/jira/browse/HIVE-26216 Project: Hive Issue Type: Bug Components: Hive Reporter: Sourabh Badhya Assignee: Sourabh Badhya -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26215) Expose the MIN_HISTORY_LEVEL table through Hive sys database
Simhadri G created HIVE-26215: - Summary: Expose the MIN_HISTORY_LEVEL table through Hive sys database Key: HIVE-26215 URL: https://issues.apache.org/jira/browse/HIVE-26215 Project: Hive Issue Type: Improvement Reporter: Simhadri G Assignee: Simhadri G While we still (partially) use MIN_HISTORY_LEVEL for the cleaner, we should expose it as a sys table so we can see what might be blocking the Cleaner thread. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26214) Hive 3.1.3 Release Notes
Anmol Sundaram created HIVE-26214: - Summary: Hive 3.1.3 Release Notes Key: HIVE-26214 URL: https://issues.apache.org/jira/browse/HIVE-26214 Project: Hive Issue Type: Improvement Components: Documentation, Hive Affects Versions: 3.1.3 Reporter: Anmol Sundaram The Hive Release Notes as mentioned [here|https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12346277&styleName=Html&projectId=12310843] do not seem to be accurate when compared with the [commit logs|https://github.com/apache/hive/commits/rel/release-3.1.3]. Can we please get this updated, if applicable? -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26213) "hive.limit.pushdown.memory.usage" should not be set to 1.0, otherwise it will raise an error
Jingxuan Fu created HIVE-26213: -- Summary: "hive.limit.pushdown.memory.usage" should not be set to 1.0, otherwise it will raise an error Key: HIVE-26213 URL: https://issues.apache.org/jira/browse/HIVE-26213 Project: Hive Issue Type: Bug Affects Versions: 3.1.2 Environment: Hive 3.1.2 os.name=Linux os.arch=amd64 os.version=5.4.0-72-generic java.version=1.8.0_162 java.vendor=Oracle Corporation Reporter: Jingxuan Fu Assignee: Jingxuan Fu In hive-default.xml.template {code:java} hive.limit.pushdown.memory.usage 0.1 Expects value between 0.0f and 1.0f. The fraction of available memory to be used for buffering rows in Reducesink operator for limit pushdown optimization. {code} According to this description, hive.limit.pushdown.memory.usage expects a value between 0.0 and 1.0. Setting it to 1.0 should therefore mean that all available memory may be used for buffering rows for the limit pushdown optimization, and HiveServer2 does start successfully with this value. Next, use the Java JDBC API to write a client program that connects to Hive, using JDBCDemo as an example. {code:java} import demo.utils.JDBCUtils; public class JDBCDemo{ public static void main(String[] args) throws Exception { JDBCUtils.init(); JDBCUtils.createDatabase(); JDBCUtils.showDatabases(); JDBCUtils.createTable(); JDBCUtils.showTables(); JDBCUtils.descTable(); JDBCUtils.loadData(); JDBCUtils.selectData(); JDBCUtils.countData(); JDBCUtils.dropDatabase(); JDBCUtils.dropTable(); JDBCUtils.destory(); } } {code} After running the client program, both the client and HiveServer2 throw exceptions. {code:java} 2022-05-09 19:05:36: Starting HiveServer2 SLF4J: Class path contains multiple SLF4J bindings. 
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Hive Session ID = 67a6db8d-f957-4d5d-ac18-28403adab7f3 Hive Session ID = f9f8772c-5765-4c3e-bcff-ca605c667be7 OK OK OK OK OK OK OK Loading data to table default.emp OK FAILED: SemanticException Invalid memory usage value 1.0 for hive.limit.pushdown.memory.usage{code} {code:java} liky@ljq1:~/hive_jdbc_test$ ./startJDBC_0.sh SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/home/liky/.m2/repository/org/apache/logging/log4j/log4j-slf4j-impl/2.17.1/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/home/liky/.m2/repository/org/slf4j/slf4j-log4j12/1.7.25/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Running: drop database if exists hive_jdbc_test Running: create database hive_jdbc_test Running: show databases default hive_jdbc_test Running: drop table if exists emp Running: create table emp( empno int, ename string, job string, mgr int, hiredate string, sal double, comm double, deptno int ) row format delimited fields terminated by '\t' Running: show tables emp Running: desc emp empno int ename string job string mgr int hiredate string sal double comm double deptno int Running: load data local inpath '/home/liky/hiveJDBCTestData/data.txt' overwrite into table emp Running: select * from emp Exception in thread "main" org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException Invalid memory usage value 1.0 for hive.limit.pushdown.memory.usage at org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:380) at org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:366) at org.apache.hive.jdbc.HiveStatement.runAsyncOnServer(HiveStatement.java:354) at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:293) at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:509) at demo.utils.JDBCUtils.selectData(JDBCUtils.java:98) at demo.test.JDBCDemo.main(JDBCDemo.java:19){code} Setting hive.limit.pushdown.memory.usage to 0.0 raises no exception. So setting hive.limit.pushdown.memory.usage to 1.0 is not desirable; *the description in hive-default.xml.template is not clear about the boundary values, and it would be better to state the valid interval explicitly as [0.0, 1.0).* -- This message was sent by Atlassian Jira (v8.20.7#820007)
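The behaviour reported above is consistent with a boundary check that accepts 0.0 but rejects 1.0. A minimal sketch of such a check follows; the class and method names are hypothetical, not Hive's actual SemanticAnalyzer code:

```java
// Hedged sketch of the kind of validation implied by the error above.
// Class and method names are hypothetical, not Hive's actual code.
public class LimitPushdownCheck {
    // 0.0 disables the optimization and is accepted; anything >= 1.0 is
    // rejected, so the effective valid interval is [0.0, 1.0).
    public static boolean isValid(float usage) {
        return usage >= 0.0f && usage < 1.0f;
    }

    public static float validate(float usage) {
        if (!isValid(usage)) {
            throw new IllegalArgumentException(
                "Invalid memory usage value " + usage
                + " for hive.limit.pushdown.memory.usage");
        }
        return usage;
    }
}
```

Under this reading, the template text "between 0.0f and 1.0f" should be understood as the half-open interval [0.0, 1.0).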
[jira] [Created] (HIVE-26212) hive fetch data timeout
royal created HIVE-26212: Summary: hive fetch data timeout Key: HIVE-26212 URL: https://issues.apache.org/jira/browse/HIVE-26212 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 1.1.0 Reporter: royal -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26211) "hive.server2.webui.max.historic.queries" should not be set too large, otherwise it will cause blocking
Jingxuan Fu created HIVE-26211: -- Summary: "hive.server2.webui.max.historic.queries" should not be set too large, otherwise it will cause blocking Key: HIVE-26211 URL: https://issues.apache.org/jira/browse/HIVE-26211 Project: Hive Issue Type: Bug Affects Versions: 3.1.2 Environment: Hive 3.1.2 os.name=Linux os.arch=amd64 os.version=5.4.0-72-generic java.version=1.8.0_162 java.vendor=Oracle Corporation Reporter: Jingxuan Fu Assignee: Jingxuan Fu In hive-default.xml.template hive.server2.webui.max.historic.queries 25 The maximum number of past queries to show in HiveServer2 WebUI. Set hive.server2.webui.max.historic.queries to a relatively large value, for example 2000, and start hiveserver2; HiveServer2 starts normally and logs no exceptions. liky@ljq1:/usr/local/hive/conf$ hiveserver2 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] 2022-05-09 20:03:41: Starting HiveServer2 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Hive Session ID = 0b419706-4026-4a8b-80fe-b79fecbccd4f Hive Session ID = 0f9e28d7-0081-4b2f-a743-4093c38c152d Next, use beeline as a client to connect to Hive and issue database-related operations. For example, after successfully executing "show databases", beeline blocks and no other operations can be performed. liky@ljq1:/opt/hive$ beeline SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Beeline version 3.1.2 by Apache Hive beeline> !connect jdbc:hive2://192.168.1.194:1/default Connecting to jdbc:hive2://192.168.1.194:1/default Enter username for jdbc:hive2://192.168.1.194:1/default: hive Enter password for jdbc:hive2://192.168.1.194:1/default: * Connected to: Apache Hive (version 3.1.2) Driver: Hive JDBC (version 3.1.2) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:hive2://192.168.1.194:1/default> show databases . . . . . . . . . . . . . . . . . . . . . 
.> ; INFO : Compiling command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): show databases INFO : Concurrency mode is disabled, not creating a lock manager INFO : Semantic Analysis Completed (retrial = false) INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null) INFO : Completed compiling command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); Time taken: 0.393 seconds INFO : Concurrency mode is disabled, not creating a lock manager INFO : Executing command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b): show databases INFO : Starting task [Stage-0:DDL] in serial mode INFO : Completed executing command(queryId=liky_20220509202542_15382019-f07b-40ff-840d-1f720df77d8b); Time taken: 0.109 seconds INFO : OK INFO : Concurrency mode is disabled, not creating a lock manager ++ | database_name | ++ | default | ++ 1 row selected (1.374 seconds) Also, on the HiveServer2 side, a runtime NullPointerException is thrown, while the log shows no warnings or errors. liky@ljq1:/usr/local/hive/conf$ hiveserver2 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See ht
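For context, the cap behaves like a size-bounded history buffer of finished queries. The following is a hedged sketch of such a buffer (illustrative only, not HiveServer2's actual implementation); with a very large cap, the structure, and any work done while holding its lock, grows accordingly:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hedged sketch (not Hive's actual code): a size-bounded history of
// finished queries, the kind of buffer that
// hive.server2.webui.max.historic.queries caps.
public class QueryHistory {
    private final int maxSize;
    private final Deque<String> entries = new ArrayDeque<>();

    public QueryHistory(int maxSize) {
        this.maxSize = maxSize;
    }

    public synchronized void add(String queryId) {
        if (entries.size() == maxSize) {
            entries.removeFirst();   // evict the oldest finished query
        }
        entries.addLast(queryId);
    }

    public synchronized int size() {
        return entries.size();
    }
}
```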
[jira] [Created] (HIVE-26210) Fix tests for Cleaner failed attempt threshold
László Végh created HIVE-26210: -- Summary: Fix tests for Cleaner failed attempt threshold Key: HIVE-26210 URL: https://issues.apache.org/jira/browse/HIVE-26210 Project: Hive Issue Type: Bug Reporter: László Végh -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26209) HIVE CBO. Wrong results with Hive SQL query with MULTIPLE IN conditions in where clause if CBO enabled
Danylo Krupnyk created HIVE-26209: - Summary: HIVE CBO. Wrong results with Hive SQL query with MULTIPLE IN conditions in where clause if CBO enabled Key: HIVE-26209 URL: https://issues.apache.org/jira/browse/HIVE-26209 Project: Hive Issue Type: Bug Components: CBO, Hive Affects Versions: 2.3.6 Reporter: Danylo Krupnyk I am running one SQL query in Hive and it gives different results with CBO enabled and disabled. The results are wrong when CBO is enabled (set hive.cbo.enable=true;). {*}Prerequisites{*}: Apache Hadoop 2.10.1 + Apache Hive 2.3.6 installed. (I tried to reproduce the issue with Apache Hive 3+ and Hadoop 3+ versions and they work fine.) Steps to reproduce can be found in the Stack Overflow question here - https://stackoverflow.com/questions/71825360/hive-cbo-wrong-results-with-hive-sql-query-with-multiple-in-conditions-in-where -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26208) Exception in Vectorization with Decimal64 to Decimal casting
Steve Carlin created HIVE-26208: --- Summary: Exception in Vectorization with Decimal64 to Decimal casting Key: HIVE-26208 URL: https://issues.apache.org/jira/browse/HIVE-26208 Project: Hive Issue Type: Bug Components: Vectorization Reporter: Steve Carlin The following query fails: {code:java} select count(*) from int_txt where (( 1.0 * i) / ( 1.0 * i)) > 1.2; {code} with the following exception: {code:java} Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.vector.Decimal64ColumnVector cannot be cast to org.apache.hadoop.hive.ql.exec.vector.DecimalColumnVector at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.DecimalColDivideDecimalColumn.evaluate(DecimalColDivideDecimalColumn.java:59) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.vector.expressions.VectorExpression.evaluateChildren(VectorExpression.java:334) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.vector.expressions.gen.FilterDecimalColGreaterDecimalScalar.evaluate(FilterDecimalColGreaterDecimalScalar.java:62) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:125) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:919) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:171) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.deliverVectorizedRowBatch(VectorMapOperator.java:809) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:900) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT] ... 
19 more {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
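For background, Decimal64 stores a decimal value as a scaled 64-bit long (e.g. 1.20 with scale 2 is the long 120), while DecimalColumnVector holds boxed decimal objects, so an operator compiled against one representation cannot consume the other without a conversion step, which is what the ClassCastException above indicates is missing in the plan. A plain-Java illustration of the scaled-long idea (not Hive's vectorization classes):

```java
import java.math.BigDecimal;

// Plain-Java illustration of the Decimal64 representation; this is not
// Hive's Decimal64ColumnVector code, just the underlying idea.
public class Decimal64Demo {
    // Convert a scaled long back to an exact decimal value:
    // scaled=120, scale=2 represents 1.20.
    public static BigDecimal fromDecimal64(long scaled, int scale) {
        return BigDecimal.valueOf(scaled, scale);
    }
}
```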
[jira] [Created] (HIVE-26207) information_schema.tables : new field is table partitioned
Simon AUBERT created HIVE-26207: --- Summary: information_schema.tables : new field is table partitioned Key: HIVE-26207 URL: https://issues.apache.org/jira/browse/HIVE-26207 Project: Hive Issue Type: Improvement Reporter: Simon AUBERT Hello, It would be useful to add a new field to the information_schema.tables table indicating whether the table is partitioned. Best regards, Simon -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26206) information schema columns : new field is column a partition key?
Simon AUBERT created HIVE-26206: --- Summary: information schema columns : new field is column a partition key? Key: HIVE-26206 URL: https://issues.apache.org/jira/browse/HIVE-26206 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Simon AUBERT Hello all, I think a new field is needed on information_schema.columns: a flag indicating whether the column is a partition key. Very easy and very useful. Best regards, Simon -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26205) Remove the incorrect org.slf4j dependency in kafka-handler
Wechar created HIVE-26205: - Summary: Remove the incorrect org.slf4j dependency in kafka-handler Key: HIVE-26205 URL: https://issues.apache.org/jira/browse/HIVE-26205 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 4.0.0-alpha-2 Reporter: Wechar Assignee: Wechar Fix For: 4.0.0-alpha-2 A compile error occurs while executing: {code:bash} mvn clean install -DskipTests {code} The error message is: {code:bash} [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project kafka-handler: Compilation failure: Compilation failure: [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[53,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[54,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaStorageHandler.java:[73,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.KafkaStorageHandler [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/VectorizedKafkaRecordReader.java:[37,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/VectorizedKafkaRecordReader.java:[47,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.VectorizedKafkaRecordReader [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaJsonSerDe.java:[63,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/SimpleKafkaWriter.java:[35,17] package org.slf4j does not exist [ERROR] 
/ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/SimpleKafkaWriter.java:[50,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.SimpleKafkaWriter [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaOutputFormat.java:[34,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaOutputFormat.java:[43,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.KafkaOutputFormat [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/RetryUtils.java:[24,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/RetryUtils.java:[34,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.RetryUtils [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaScanTrimmer.java:[51,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaScanTrimmer.java:[65,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.KafkaScanTrimmer [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/TransactionalKafkaWriter.java:[45,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/TransactionalKafkaWriter.java:[65,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.TransactionalKafkaWriter [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/HiveKafkaProducer.java:[37,17] package org.slf4j does 
not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/HiveKafkaProducer.java:[59,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.HiveKafkaProducer [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaRecordIterator.java:[30,17] package org.slf4j does not exist [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaRecordIterator.java:[56,24] cannot find symbol [ERROR] symbol: class Logger [ERROR] location: class org.apache.hadoop.hive.kafka.KafkaRecordIterator [ERROR] /ldap_home/weiqiang.yu/forked-hive/kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaUtils.java:[52,17] package org.slf4j does not exist
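{code}

For reference, "package org.slf4j does not exist" means slf4j-api is missing from the module's compile classpath. Below is a hedged sketch of the kind of dependency declaration involved; per the ticket title, the actual fix is to remove an incorrect org.slf4j declaration so that the correct, transitively provided one takes effect, so this snippet is illustrative and not the actual patch:

```xml
<!-- Hedged sketch, not the actual patch: a compile-scope slf4j-api
     declaration of the kind that makes "package org.slf4j" resolvable.
     The version is normally managed in the parent pom. -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-api</artifactId>
</dependency>
```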
[jira] [Created] (HIVE-26204) Unrecognized format specifier error when running hmsbench-jar
Yu-Wen Lai created HIVE-26204: - Summary: Unrecognized format specifier error when running hmsbench-jar Key: HIVE-26204 URL: https://issues.apache.org/jira/browse/HIVE-26204 Project: Hive Issue Type: Improvement Components: Standalone Metastore Reporter: Yu-Wen Lai Assignee: Yu-Wen Lai Here is the logging error. {code:java} ERROR StatusLogger Unrecognized format specifier [d] ERROR StatusLogger Unrecognized conversion specifier [d] starting at position 16 in conversion pattern. ERROR StatusLogger Unrecognized format specifier [thread] ERROR StatusLogger Unrecognized conversion specifier [thread] starting at position 25 in conversion pattern. ERROR StatusLogger Unrecognized format specifier [level] ERROR StatusLogger Unrecognized conversion specifier [level] starting at position 35 in conversion pattern. ERROR StatusLogger Unrecognized format specifier [logger] ERROR StatusLogger Unrecognized conversion specifier [logger] starting at position 47 in conversion pattern. ERROR StatusLogger Unrecognized format specifier [msg] ERROR StatusLogger Unrecognized conversion specifier [msg] starting at position 54 in conversion pattern. ERROR StatusLogger Unrecognized format specifier [n] ERROR StatusLogger Unrecognized conversion specifier [n] starting at position 56 in conversion pattern.{code} The issue arises because only one Log4j2Plugins.dat survives when the jar file is assembled. One way to avoid the issue is to exclude this file and let Log4j load plugins on the fly. -- This message was sent by Atlassian Jira (v8.20.7#820007)
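The workaround described above can be sketched as a maven-shade-plugin filter that drops the merged plugin cache from the assembled jar, so Log4j 2 discovers plugins at runtime instead of reading a single, incomplete Log4j2Plugins.dat. This is an illustrative configuration sketch, not the actual patch:

```xml
<!-- Hedged sketch: exclude the Log4j 2 plugin cache from the shaded jar
     so plugins are discovered at runtime. Exact plugin version and
     placement depend on the module's build. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <filters>
      <filter>
        <artifact>*:*</artifact>
        <excludes>
          <exclude>**/Log4j2Plugins.dat</exclude>
        </excludes>
      </filter>
    </filters>
  </configuration>
</plugin>
```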
[jira] [Created] (HIVE-26203) Implement alter iceberg table metadata location
László Pintér created HIVE-26203: Summary: Implement alter iceberg table metadata location Key: HIVE-26203 URL: https://issues.apache.org/jira/browse/HIVE-26203 Project: Hive Issue Type: Improvement Reporter: László Pintér -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26202) Refactor Iceberg HiveFileWriterFactory
Peter Vary created HIVE-26202: - Summary: Refactor Iceberg HiveFileWriterFactory Key: HIVE-26202 URL: https://issues.apache.org/jira/browse/HIVE-26202 Project: Hive Issue Type: Improvement Reporter: Peter Vary -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26201) QueryResultsCache may have wrong permission if umask is too strict
Zhang Dongsheng created HIVE-26201: -- Summary: QueryResultsCache may have wrong permission if umask is too strict Key: HIVE-26201 URL: https://issues.apache.org/jira/browse/HIVE-26201 Project: Hive Issue Type: Bug Components: Query Processor, Tez Affects Versions: 3.1.3 Reporter: Zhang Dongsheng TezSessionState, QueryResultsCache and Context use mkdirs(path, permission) to create directories with specific permissions. But if the umask is too restrictive, the resulting permissions may not be what was requested, so we need to check whether the permission was actually set as expected. -- This message was sent by Atlassian Jira (v8.20.7#820007)
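The check can be sketched with the plain JDK (not the Hive/HDFS API): directory creation honours the process umask, so the resulting mode can be stricter than requested; setting the permission explicitly after creation and reading it back verifies what was actually applied. Class and method names below are illustrative:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hedged sketch (plain JDK, POSIX systems only): mkdirs-style creation is
// filtered by the umask, so chmod explicitly afterwards and read back the
// result -- the verification step the ticket asks for.
public class UmaskSafeMkdirs {
    public static Set<PosixFilePermission> createWithPerms(Path dir, String perms) {
        try {
            Files.createDirectories(dir);               // mode here is umask-filtered
            Files.setPosixFilePermissions(dir,          // explicit chmod, unaffected by umask
                    PosixFilePermissions.fromString(perms));
            return Files.getPosixFilePermissions(dir);  // verify what we actually got
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```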
[jira] [Created] (HIVE-26200) Add tests for Iceberg DELETE statements for every supported type
Peter Vary created HIVE-26200: - Summary: Add tests for Iceberg DELETE statements for every supported type Key: HIVE-26200 URL: https://issues.apache.org/jira/browse/HIVE-26200 Project: Hive Issue Type: Test Reporter: Peter Vary It would be good to check whether we are able to delete every supported column type. I have found issues with updates, and I think it would be good to have additional tests for delete as well (even though they are working correctly ATM). -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26199) Reduce FileSystem init during user impersonation
Ramesh Kumar Thangarajan created HIVE-26199: --- Summary: Reduce FileSystem init during user impersonation Key: HIVE-26199 URL: https://issues.apache.org/jira/browse/HIVE-26199 Project: Hive Issue Type: Task Reporter: Ramesh Kumar Thangarajan Assignee: Ramesh Kumar Thangarajan -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26198) Hive - Upgrade cron-utils to 9.1.6+ due to CVE-2021-41269
Hongdan Zhu created HIVE-26198: -- Summary: Hive - Upgrade cron-utils to 9.1.6+ due to CVE-2021-41269 Key: HIVE-26198 URL: https://issues.apache.org/jira/browse/HIVE-26198 Project: Hive Issue Type: Bug Reporter: Hongdan Zhu Assignee: Hongdan Zhu Hive is currently pulling in cron-utils 9.1.3, which is vulnerable to CVE-2021-41269. Please upgrade to 9.1.6 or later. -- This message was sent by Atlassian Jira (v8.20.7#820007)
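A hedged sketch of the corresponding dependency bump follows; the coordinates are the standard cron-utils ones, though where exactly the version is managed in Hive's poms is not shown here:

```xml
<!-- Hedged sketch: pin cron-utils at (or above) the version that fixes
     CVE-2021-41269. Placement in Hive's poms may differ. -->
<dependency>
  <groupId>com.cronutils</groupId>
  <artifactId>cron-utils</artifactId>
  <version>9.1.6</version>
</dependency>
```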
[jira] [Created] (HIVE-26197) Hive - Upgrade Ant to 1.10.11 due to CVE-2021-36373 and CVE-2021-36374
Hongdan Zhu created HIVE-26197: -- Summary: Hive - Upgrade Ant to 1.10.11 due to CVE-2021-36373 and CVE-2021-36374 Key: HIVE-26197 URL: https://issues.apache.org/jira/browse/HIVE-26197 Project: Hive Issue Type: Bug Reporter: Hongdan Zhu Assignee: Hongdan Zhu Hive is currently pulling in Ant 1.10.9. Upgrade to 1.10.11 due to CVE-2021-36373 and CVE-2021-36374. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26196) Integrate Sonarcloud analysis for master branch
Alessandro Solimando created HIVE-26196: --- Summary: Integrate Sonarcloud analysis for master branch Key: HIVE-26196 URL: https://issues.apache.org/jira/browse/HIVE-26196 Project: Hive Issue Type: Improvement Components: Build Infrastructure Affects Versions: 4.0.0-alpha-2 Reporter: Alessandro Solimando Assignee: Alessandro Solimando The aim of the ticket is to integrate SonarCloud analysis for the master branch. The ticket does not cover: * test coverage * analysis on PRs and other branches Those aspects can be added in follow-up tickets, if there is enough interest. From preliminary tests, the analysis step requires 30 additional minutes for the pipeline. The idea for this first integration is to track code quality metrics over new commits in the master branch, without any quality gate rules (i.e., the analysis will never fail, independently of the values of the quality metrics). An example of analysis is available in my personal Sonar account: [https://sonarcloud.io/summary/new_code?id=asolimando_hive] ASF offers SonarCloud accounts for Apache projects, and Hive already has one. To complete the present ticket, somebody with admin permissions in that repo should generate an authentication token, which should replace the _SONAR_TOKEN_ secret. -- This message was sent by Atlassian Jira (v8.20.7#820007)
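As a sketch, a SonarCloud analysis step can be driven by the standard sonar-maven-plugin; the organization and project key below are assumptions, and the real pipeline would wire the token from the SONAR_TOKEN secret mentioned above:

```shell
# Hedged sketch of an analysis invocation (sonar-maven-plugin).
# sonar.organization and sonar.projectKey are assumed values, not
# necessarily what the Hive pipeline uses.
mvn -B verify sonar:sonar \
  -Dsonar.host.url=https://sonarcloud.io \
  -Dsonar.organization=apache \
  -Dsonar.projectKey=apache_hive \
  -Dsonar.login="$SONAR_TOKEN"
```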
[jira] [Created] (HIVE-26195) Keep Kafka handler naming style consistent with others
zhangbutao created HIVE-26195: - Summary: Keep Kafka handler naming style consistent with others Key: HIVE-26195 URL: https://issues.apache.org/jira/browse/HIVE-26195 Project: Hive Issue Type: Improvement Components: StorageHandler Affects Versions: 4.0.0-alpha-2 Reporter: zhangbutao Keep Kafka handler naming style consistent with the others (JDBC, HBase, Kudu, Druid). -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26194) Unable to interrupt query in the middle of long compilation
Rajesh Balamohan created HIVE-26194: --- Summary: Unable to interrupt query in the middle of long compilation Key: HIVE-26194 URL: https://issues.apache.org/jira/browse/HIVE-26194 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Rajesh Balamohan *Issue:* * Certain queries can take a lot longer to compile, depending on the number of interactions with HMS. * When a user tries to cancel such queries in the middle of compilation, it doesn't work: the process is interrupted only when the entire compilation phase is complete. * An example is given below (Q66 at 10 TB TPCDS) {noformat} . . . . . . . . . . . . . . . . . . . . . . .>,d_year . . . . . . . . . . . . . . . . . . . . . . .> ) . . . . . . . . . . . . . . . . . . . . . . .> ) x . . . . . . . . . . . . . . . . . . . . . . .> group by . . . . . . . . . . . . . . . . . . . . . . .> w_warehouse_name . . . . . . . . . . . . . . . . . . . . . . .>,w_warehouse_sq_ft . . . . . . . . . . . . . . . . . . . . . . .>,w_city . . . . . . . . . . . . . . . . . . . . . . .>,w_county . . . . . . . . . . . . . . . . . . . . . . .>,w_state . . . . . . . . . . . . . . . . . . . . . . .>,w_country . . . . . . . . . . . . . . . . . . . . . . .>,ship_carriers . . . . . . . . . . . . . . . . . . . . . . .>,year . . . . . . . . . . . . . . . . . . . . . . .> order by w_warehouse_name . . . . . . . . . . . . . . . . . . . . . . .> limit 100; Interrupting... Please be patient this may take some time. Interrupting... Please be patient this may take some time. Interrupting... Please be patient this may take some time. Interrupting... Please be patient this may take some time. Interrupting... Please be patient this may take some time. Interrupting... Please be patient this may take some time. ... ... ... ,w_city ,w_county ,w_state ,w_country ,ship_carriers ,year order by w_warehouse_name limit 100 INFO : Semantic Analysis Completed (retrial = false) ERROR : FAILED: command has been interrupted: after analyzing query. 
INFO : Compiling command(queryId=hive_20220502040541_14c76b6f-f6d2-4ab3-ad82-522f17ede63a) has been interrupted after 32.872 seconds <<<<<<<<<<< Notice that it was interrupted only after the entire compilation finished, at 32 seconds. Error: Query was cancelled. Illegal Operation state transition from CANCELED to ERROR (state=01000,code=0) {noformat} This becomes an issue in a busy cluster. Interrupt handling should be fixed in the compilation phase. -- This message was sent by Atlassian Jira (v8.20.7#820007)
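The fix direction can be illustrated with cooperative cancellation: a long phase made of many sub-steps checks the thread's interrupt flag between steps, so a cancel request takes effect mid-phase instead of only after the whole phase completes. A minimal sketch (illustrative, not HiveServer2 code):

```java
// Hedged sketch of cooperative cancellation, not HiveServer2's code.
// Returns how many sub-steps actually completed: if the thread is
// interrupted, the loop stops between steps rather than running the
// whole phase to the end.
public class InterruptibleCompile {
    public static int run(int steps, Runnable step) {
        int done = 0;
        for (int i = 0; i < steps; i++) {
            if (Thread.currentThread().isInterrupted()) {
                break;  // cancellation observed between sub-steps
            }
            step.run();
            done++;
        }
        return done;
    }
}
```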
[jira] [Created] (HIVE-26193) Fix Iceberg partitioned tables null bucket handling
Peter Vary created HIVE-26193: - Summary: Fix Iceberg partitioned tables null bucket handling Key: HIVE-26193 URL: https://issues.apache.org/jira/browse/HIVE-26193 Project: Hive Issue Type: Bug Reporter: Peter Vary Inserting null values into a partition column should write the rows into null partitions. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26192) JDBC data connector queries occur exception at cbo stage
zhangbutao created HIVE-26192: - Summary: JDBC data connector queries occur exception at cbo stage Key: HIVE-26192 URL: https://issues.apache.org/jira/browse/HIVE-26192 Project: Hive Issue Type: Bug Reporter: zhangbutao -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26191) Missing catalog name in GetTableRequest results in duplicate values in query level cache
Soumyakanti Das created HIVE-26191: -- Summary: Missing catalog name in GetTableRequest results in duplicate values in query level cache Key: HIVE-26191 URL: https://issues.apache.org/jira/browse/HIVE-26191 Project: Hive Issue Type: Bug Reporter: Soumyakanti Das Assignee: Soumyakanti Das Attachments: query_cache.png The {{getTableInternal}} method in {{SessionHiveMetaStoreClient}} accepts a {{GetTableRequest}} object with a {{null}} value for the catalog. Because of this, we can store 2 entries for the same table in the query level cache. !query_cache.png! -- This message was sent by Atlassian Jira (v8.20.7#820007)
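The failure mode above can be sketched outside of Hive: if a cache key embeds a nullable catalog field, a {{null}} catalog and the explicit default catalog produce two distinct keys for the same table. A minimal, hypothetical Java sketch (class and field names are illustrative, not Hive's actual API) of normalizing the catalog before building the key:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: normalize a nullable catalog to the default catalog
// before using it in a cache key, so a null catalog and the explicit default
// catalog do not create two cache entries for the same table.
class TableCacheKey {
    static final String DEFAULT_CATALOG = "hive"; // assumed default name

    final String catalog;
    final String db;
    final String table;

    TableCacheKey(String catalog, String db, String table) {
        // Normalization step: a missing catalog means the default catalog.
        this.catalog = (catalog == null) ? DEFAULT_CATALOG : catalog;
        this.db = db;
        this.table = table;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof TableCacheKey)) return false;
        TableCacheKey k = (TableCacheKey) o;
        return catalog.equals(k.catalog) && db.equals(k.db) && table.equals(k.table);
    }

    @Override
    public int hashCode() {
        return Objects.hash(catalog, db, table);
    }

    public static void main(String[] args) {
        Map<TableCacheKey, String> cache = new HashMap<>();
        cache.put(new TableCacheKey(null, "default", "t1"), "table-object");
        cache.put(new TableCacheKey("hive", "default", "t1"), "table-object");
        // With normalization, both requests map to one entry.
        System.out.println(cache.size()); // prints 1
    }
}
```

Without the normalization in the constructor, the two `put` calls in `main` would leave two entries in the cache, which is the duplication the ticket describes.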
[jira] [Created] (HIVE-26190) Implement create iceberg table with metadata location
László Pintér created HIVE-26190: Summary: Implement create iceberg table with metadata location Key: HIVE-26190 URL: https://issues.apache.org/jira/browse/HIVE-26190 Project: Hive Issue Type: Improvement Reporter: László Pintér Assignee: László Pintér We should support the following syntax:
{code:sql}
CREATE TABLE ... TBLPROPERTIES('metadata_location'='some_location')
{code}
This would allow us to create an Iceberg table by reusing already existing metadata. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26189) Iceberg metadata query throws exceptions after partition evolution
Ádám Szita created HIVE-26189: - Summary: Iceberg metadata query throws exceptions after partition evolution Key: HIVE-26189 URL: https://issues.apache.org/jira/browse/HIVE-26189 Project: Hive Issue Type: Bug Reporter: Ádám Szita The following test case surfaced two issues with metadata table queries:
{code:java}
CREATE EXTERNAL TABLE `partev`(
  `id` int,
  `ts` timestamp,
  `ts2` timestamp)
STORED BY ICEBERG STORED AS ORC;
ALTER TABLE partev SET PARTITION SPEC (id);
INSERT INTO partev VALUES (1, current_timestamp(), current_timestamp());
INSERT INTO partev VALUES (2, current_timestamp(), current_timestamp());
ALTER TABLE partev SET PARTITION SPEC (year(ts));
INSERT INTO partev VALUES (10, current_timestamp(), current_timestamp());
ALTER TABLE partev SET PARTITION SPEC (month(ts));
INSERT INTO partev VALUES (100, current_timestamp(), current_timestamp());
ALTER TABLE partev SET PARTITION SPEC (day(ts));
INSERT INTO partev VALUES (1000, current_timestamp(), current_timestamp());
ALTER TABLE partev SET PARTITION SPEC (hour(ts));
INSERT INTO partev VALUES (1, current_timestamp(), current_timestamp());
ALTER TABLE partev SET PARTITION SPEC (bucket(2,id));
INSERT INTO partev VALUES (10, current_timestamp(), current_timestamp());
select * from default.partev.partitions;
ALTER TABLE partev SET PARTITION SPEC (id, year(ts2));
INSERT INTO partev VALUES (20, current_timestamp(), current_timestamp());
select * from default.partev.partitions;
{code}
NPE for removed partition columns from new specs, and class cast exceptions for the day transform (Integer to LocalDate). -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26188) Query level cache and HMS local cache doesn't work locally and with Explain statements.
Soumyakanti Das created HIVE-26188: -- Summary: Query level cache and HMS local cache doesn't work locally and with Explain statements. Key: HIVE-26188 URL: https://issues.apache.org/jira/browse/HIVE-26188 Project: Hive Issue Type: Bug Reporter: Soumyakanti Das Assignee: Soumyakanti Das {{ExplainSemanticAnalyzer}} should override the {{startAnalysis()}} method that creates the query level cache. This is important because, after https://issues.apache.org/jira/browse/HIVE-25918, the HMS local cache only works if the query level cache is also initialized. Also, the {{data/conf/llap/hive-site.xml}} properties for the HMS cache are incorrect and should be fixed to enable the cache during qtests. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26187) Set operations and time travel is not working
Zoltán Borók-Nagy created HIVE-26187: Summary: Set operations and time travel is not working Key: HIVE-26187 URL: https://issues.apache.org/jira/browse/HIVE-26187 Project: Hive Issue Type: Bug Reporter: Zoltán Borók-Nagy Set operations don't work well with time travel queries. Repro:
{noformat}
select * from t FOR SYSTEM_VERSION AS OF
MINUS
select * from t FOR SYSTEM_VERSION AS OF ;
{noformat}
Returns 0 results because both selects use the same snapshot id, instead of snapshot_id_1 and snapshot_id_2. There are probably issues with other queries as well, when the same table is used multiple times with different snapshot ids. -- This message was sent by Atlassian Jira (v8.20.7#820007)
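The suspected collision can be modeled in isolation (names below are illustrative, not Hive's code): if per-table scan state is keyed by table name alone, two time-travel references to the same table collapse into one snapshot, whereas a (table, snapshot) key keeps them apart:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the failure mode: if per-table scan state is keyed
// by table name only, the second time-travel reference clobbers the first,
// and both branches of the MINUS end up reading the same snapshot.
class SnapshotScanKeys {
    // Buggy variant: the last snapshot id registered for a table wins.
    static Map<String, Long> byTableNameOnly(String[][] refs) {
        Map<String, Long> m = new HashMap<>();
        for (String[] r : refs) m.put(r[0], Long.parseLong(r[1]));
        return m;
    }

    // Fixed variant: key each table reference by (table, snapshotId) so two
    // references to the same table with different snapshots stay distinct.
    static Map<String, Long> byTableAndSnapshot(String[][] refs) {
        Map<String, Long> m = new HashMap<>();
        for (String[] r : refs) m.put(r[0] + "@" + r[1], Long.parseLong(r[1]));
        return m;
    }

    public static void main(String[] args) {
        String[][] refs = { {"t", "1001"}, {"t", "1002"} }; // t at two snapshots
        System.out.println(byTableNameOnly(refs).size());    // prints 1: snapshots collide
        System.out.println(byTableAndSnapshot(refs).size()); // prints 2: kept apart
    }
}
```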
[jira] [Created] (HIVE-26186) Resultset returned by getTables does not order data per JDBC specification
N Campbell created HIVE-26186: - Summary: Resultset returned by getTables does not order data per JDBC specification Key: HIVE-26186 URL: https://issues.apache.org/jira/browse/HIVE-26186 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 3.1.3 Environment: !HiveMeta.png! Reporter: N Campbell Attachments: HiveMeta.png The JDBC specification states that data in this ResultSet must be ordered. A simple Java program issues a request to getTables: ResultSet rs = dbMeta.getTables( {*}null{*}, "cert", "%", {*}null{*}); The ResultSet is not ordered per the JDBC spec: [https://docs.oracle.com/javase/8/docs/api/java/sql/DatabaseMetaData.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-] This happens with various releases, including hive-jdbc-3.1.3000.7.1.7.0-551 and hive-jdbc-3.1.3000.7.1.6.0-297. -- This message was sent by Atlassian Jira (v8.20.7#820007)
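For reference, the javadoc linked above requires getTables results to be ordered by TABLE_TYPE, TABLE_CAT, TABLE_SCHEM and TABLE_NAME. That ordering can be sketched as a Java comparator over rows modeled as String arrays (null placement is an assumption here; the spec leaves it to the driver):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of the ordering getTables() results must follow per the
// DatabaseMetaData javadoc: TABLE_TYPE, TABLE_CAT, TABLE_SCHEM, TABLE_NAME.
// Rows are modeled as {type, catalog, schema, name}; sorting nulls first is
// an assumption, not mandated by the spec.
class GetTablesOrder {
    static final Comparator<String> NULLS_FIRST =
        Comparator.nullsFirst(Comparator.<String>naturalOrder());

    static final Comparator<String[]> SPEC_ORDER =
        Comparator.comparing((String[] r) -> r[0], NULLS_FIRST)  // TABLE_TYPE
            .thenComparing(r -> r[1], NULLS_FIRST)               // TABLE_CAT
            .thenComparing(r -> r[2], NULLS_FIRST)               // TABLE_SCHEM
            .thenComparing(r -> r[3], NULLS_FIRST);              // TABLE_NAME

    static List<String[]> sorted(List<String[]> rows) {
        List<String[]> out = new ArrayList<>(rows);
        out.sort(SPEC_ORDER);
        return out;
    }
}
```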
[jira] [Created] (HIVE-26185) Need support for metadataonly operations with iceberg (e.g select distinct on partition column)
Rajesh Balamohan created HIVE-26185: --- Summary: Need support for metadataonly operations with iceberg (e.g select distinct on partition column) Key: HIVE-26185 URL: https://issues.apache.org/jira/browse/HIVE-26185 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Rajesh Balamohan
{noformat}
select distinct ss_sold_date_sk from store_sales
{noformat}
This query scans only 1800+ rows in Hive ACID, but it takes ages to process with NullScanOptimiser during the compilation phase (https://issues.apache.org/jira/browse/HIVE-24262).
{noformat}
Hive ACID
INFO : Executing command(queryId=hive_20220427233926_282bc9d8-220c-4a09-928d-411601c2ef14): select distinct ss_sold_date_sk from store_sales
INFO : Compute 'ndembla-test2' is active.
INFO : Query ID = hive_20220427233926_282bc9d8-220c-4a09-928d-411601c2ef14
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20220427233926_282bc9d8-220c-4a09-928d-411601c2ef14
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: select distinct ss_sold_date_s...store_sales (Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1651102345385_)
INFO : Status: DAG finished successfully in 1.81 seconds
INFO : DAG ID: dag_1651102345385__5
INFO :
INFO : Query Execution Summary
INFO : --
INFO : OPERATION DURATION
INFO : --
INFO : Compile Query 55.47s
INFO : Prepare Plan 2.32s
INFO : Get Query Coordinator (AM) 0.13s
INFO : Submit Plan 0.03s
INFO : Start DAG 0.09s
INFO : Run DAG 1.80s
INFO : --
INFO :
INFO : Task Execution Summary
INFO : --
INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS OUTPUT_RECORDS
INFO : --
INFO : Map 1 1009.00 0 0 1,824 1,824
INFO : Reducer 2 0.00 0 0 1,824 0
INFO : --
INFO :
{noformat}
However, the same query scans *2.8 billion records* in Iceberg format. This can be fixed.
{noformat}
INFO : Executing command(queryId=hive_20220427233519_cddc6dd1-95a3-4f0e-afa5-e11e9dc5fa72): select distinct ss_sold_date_sk from store_sales
INFO : Compute 'ndembla-test2' is active.
INFO : Query ID = hive_20220427233519_cddc6dd1-95a3-4f0e-afa5-e11e9dc5fa72
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Subscribed to counters: [] for queryId: hive_20220427233519_cddc6dd1-95a3-4f0e-afa5-e11e9dc5fa72
INFO : Tez session hasn't been created yet. Opening session
INFO : Dag name: select distinct ss_sold_date_s...store_sales (Stage-1)
INFO : Status: Running (Executing on YARN cluster with App id application_1651102345385_)
--
VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
--
Map 1 .. llap SUCCEEDED 7141 714100 0 0
Reducer 2 .. llap SUCCEEDED 2 200 0 0
--
VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 18.48 s
--
INFO : Status: DAG finished successfully in 17.97 seconds
INFO : DAG ID: dag_1651102345385__4
INFO :
INFO : Query Execution Summary
INFO : --
INFO : OPERATION DURATION
INFO : --
INFO : Compile Query 1.81s
INFO : Prepare Plan 0.04s
INFO : Get Query Coord
[jira] [Created] (HIVE-26184) COLLECT_SET with GROUP BY is very slow when some keys are highly skewed
okumin created HIVE-26184: - Summary: COLLECT_SET with GROUP BY is very slow when some keys are highly skewed Key: HIVE-26184 URL: https://issues.apache.org/jira/browse/HIVE-26184 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.3, 2.3.8 Reporter: okumin Assignee: okumin I observed some reducers spend 98% of CPU time invoking `java.util.HashMap#clear`. Looking at the details, I found COLLECT_SET reuses a LinkedHashSet, and its `clear` can be quite heavy when a relation has a small number of highly skewed keys. To reproduce the issue, first, we will create rows with a skewed key. {code:java} INSERT INTO test_collect_set SELECT '----' AS key, CAST(UUID() AS VARCHAR) AS value FROM table_with_many_rows LIMIT 10;{code} Then, we will create many non-skewed rows. {code:java} INSERT INTO test_collect_set SELECT UUID() AS key, UUID() AS value FROM sample_datasets.nasdaq LIMIT 500;{code} We can observe the issue when we aggregate values by `key`. {code:java} SELECT key, COLLECT_SET(value) FROM group_by_skew GROUP BY key{code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
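The cost described above comes from the fact that `clear()` on a `HashMap`-backed set walks the entire internal table, whose capacity never shrinks: once one skewed key has grown the reused set, every later tiny group still pays for clearing the large table. A self-contained sketch of the reuse pattern (names are illustrative, not Hive's):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of why the reported pattern is slow: LinkedHashSet.clear() walks
// the whole internal hash table, and that table's capacity never shrinks.
// After one highly skewed key inflates the reused set, every later (tiny)
// group still pays O(large capacity) per clear().
class CollectSetReuse {
    static final Set<String> reused = new LinkedHashSet<>();

    // Simulates COLLECT_SET over one group using a reused buffer.
    static List<String> collectGroup(List<String> values) {
        reused.clear(); // cost is O(capacity), not O(size of this group)
        reused.addAll(values);
        return new ArrayList<>(reused);
    }

    public static void main(String[] args) {
        // A skewed key inflates the internal capacity once...
        List<String> skewed = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) skewed.add("v" + i);
        collectGroup(skewed);
        // ...then every small group still clears the million-slot table.
        List<String> small = collectGroup(List.of("a", "b", "a"));
        System.out.println(small); // prints [a, b]
    }
}
```

Allocating a fresh set per group (or shrinking the buffer after an oversized group) avoids the repeated full-table sweep.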
[jira] [Created] (HIVE-26183) Create delete writer for the UPDATE statements
Peter Vary created HIVE-26183: - Summary: Create delete writer for the UPDATE statements Key: HIVE-26183 URL: https://issues.apache.org/jira/browse/HIVE-26183 Project: Hive Issue Type: Sub-task Reporter: Peter Vary During the investigation of updates of partitioned tables we hit the following issue:
- Iceberg inserts need to be sorted by the new partition keys
- Iceberg deletes need to be sorted by the old partition keys and filenames
These requirements can contradict each other. OTOH Hive updates create a single query and write out the insert/delete records for every row. This would mean plenty of open writers. We might want to create something like https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/io/SortedPosDeleteWriter.java, but we do not want to keep the whole rows in memory. -- This message was sent by Atlassian Jira (v8.20.7#820007)
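The idea in the ticket, buffering only (file, position) pairs instead of whole rows and emitting them sorted, can be sketched as follows (a hypothetical illustration; a real writer would also bound memory and spill to disk):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of a sorted position-delete buffer: only the
// (data file, row position) pairs are retained, never the row payloads,
// and they are emitted sorted by file path and then position.
class PosDeleteBuffer {
    static final class Delete {
        final String file;
        final long pos;
        Delete(String file, long pos) { this.file = file; this.pos = pos; }
    }

    private final List<Delete> pending = new ArrayList<>();

    void delete(String file, long pos) {
        pending.add(new Delete(file, pos)); // cheap: no row payload retained
    }

    // Returns deletes grouped per file, positions ascending, which is the
    // order position-delete files expect.
    List<String> flushSorted() {
        pending.sort(Comparator.comparing((Delete d) -> d.file)
                               .thenComparingLong(d -> d.pos));
        List<String> out = new ArrayList<>();
        for (Delete d : pending) out.add(d.file + ":" + d.pos);
        pending.clear();
        return out;
    }
}
```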
[jira] [Created] (HIVE-26182) Some improvements to make DPP more debuggable
László Bodor created HIVE-26182: --- Summary: Some improvements to make DPP more debuggable Key: HIVE-26182 URL: https://issues.apache.org/jira/browse/HIVE-26182 Project: Hive Issue Type: Improvement Reporter: László Bodor -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26181) Add details on the number of partitions/entries in dynamic partition pruning
Rajesh Balamohan created HIVE-26181: --- Summary: Add details on the number of partitions/entries in dynamic partition pruning Key: HIVE-26181 URL: https://issues.apache.org/jira/browse/HIVE-26181 Project: Hive Issue Type: Bug Reporter: Rajesh Balamohan Related ticket: HIVE-26008 It would be good to print details on the number of partition pruning entries, for debugging and for understanding the efficiency of the query. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26180) Change MySQLConnectorProvider driver from mariadb to mysql
zhangbutao created HIVE-26180: - Summary: Change MySQLConnectorProvider driver from mariadb to mysql Key: HIVE-26180 URL: https://issues.apache.org/jira/browse/HIVE-26180 Project: Hive Issue Type: Bug Components: StorageHandler Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2 Reporter: zhangbutao -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26179) In tez reuse container mode, asyncInitOperations are not clear.
zhengchenyu created HIVE-26179: -- Summary: In tez reuse container mode, asyncInitOperations are not clear. Key: HIVE-26179 URL: https://issues.apache.org/jira/browse/HIVE-26179 Project: Hive Issue Type: Bug Components: Hive, Tez Affects Versions: 1.2.1 Environment: engine: Tez (Note: tez.am.container.reuse.enabled is true) Reporter: zhengchenyu Assignee: zhengchenyu Fix For: 4.0.0 In our cluster, we found an error like this:
{code}
Vertex failed, vertexName=Map 1, vertexId=vertex_1650608671415_321290_1_11, diagnostics=[Task failed, taskId=task_1650608671415_321290_1_11_000422, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1650608671415_321290_1_11_000422_0:java.lang.RuntimeException: java.lang.RuntimeException: Hive Runtime Error while closing operators
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:135)
 at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
 at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
 at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
 at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
 at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
 at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
 at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
 at
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Hive Runtime Error while closing operators
 at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:349)
 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:161)
 ... 16 more
Caused by: java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:488)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:684)
 at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:698)
 at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.close(MapRecordProcessor.java:338)
 ... 17 more
{code}
When Tez container reuse is enabled and a MapJoinOperator is used, an NPE is thrown if different attempts of the same task execute in the same container. From debugging, I found that the second task attempt uses the first attempt's asyncInitOperations. asyncInitOperations is not cleared when the operator is closed, so the second attempt may use the first attempt's mapJoinTables, whose HybridHashTableContainer.HashPartition is already closed, and throw the NPE. We must clear asyncInitOperations when the operator is closed. -- This message was sent by Atlassian Jira (v8.20.7#820007)
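The reported lifecycle bug can be modeled in isolation (all names below are illustrative, not Hive's actual classes): close() marks the async-initialized hash tables closed but leaves them in the field, so a second attempt in the reused container touches a closed table; clearing the field on close is the proposed fix:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of the reported bug: an operator keeps async-initialized
// resources in a field, close() releases the resources but forgets to clear
// the field, and a second attempt running in the same reused container then
// touches closed resources.
class ReusedOperator {
    static final class HashTable {
        boolean closed;
        int probe() {
            if (closed) throw new IllegalStateException("closed hash table");
            return 42;
        }
    }

    private final List<HashTable> asyncInitOperations = new ArrayList<>();

    void initializeAsync() {
        if (asyncInitOperations.isEmpty()) {  // the reuse path skips re-init
            asyncInitOperations.add(new HashTable());
        }
    }

    int run() { return asyncInitOperations.get(0).probe(); }

    void close(boolean clearAsyncState) {
        for (HashTable t : asyncInitOperations) t.closed = true;
        if (clearAsyncState) {
            asyncInitOperations.clear();      // the proposed fix
        }
    }
}
```

With `close(false)` the second attempt re-probes the first attempt's closed table and fails, mirroring the NPE above; with `close(true)` the second attempt initializes fresh state.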
[jira] [Created] (HIVE-26178) Multiple version of woodstox jars found in spark class path
Sai Hemanth Gantasala created HIVE-26178: Summary: Multiple version of woodstox jars found in spark class path Key: HIVE-26178 URL: https://issues.apache.org/jira/browse/HIVE-26178 Project: Hive Issue Type: Bug Components: Hive, Spark Reporter: Sai Hemanth Gantasala Assignee: Sai Hemanth Gantasala In Spark, the woodstox-core jar comes from two sources: - hadoop-client (woodstox-core:jar:5.0.3) - hive-service (woodstox-core:jar:5.2.1), introduced via the xml sec dependency. The woodstox jar is not used in Hive anyway, so we can remove this dependency from Hive. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26177) Create a new connection pool for compaction (DataNucleus)
Antal Sinkovits created HIVE-26177: -- Summary: Create a new connection pool for compaction (DataNucleus) Key: HIVE-26177 URL: https://issues.apache.org/jira/browse/HIVE-26177 Project: Hive Issue Type: Sub-task Reporter: Antal Sinkovits -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26176) Create a new connection pool for compaction (CompactionTxnHandler)
Antal Sinkovits created HIVE-26176: -- Summary: Create a new connection pool for compaction (CompactionTxnHandler) Key: HIVE-26176 URL: https://issues.apache.org/jira/browse/HIVE-26176 Project: Hive Issue Type: Sub-task Reporter: Antal Sinkovits -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26175) single quote in a comment causes parsing errors
renjianting created HIVE-26175: -- Summary: single quote in a comment causes parsing errors Key: HIVE-26175 URL: https://issues.apache.org/jira/browse/HIVE-26175 Project: Hive Issue Type: Improvement Components: CLI, Parser Affects Versions: 3.1.3 Reporter: renjianting Assignee: renjianting Fix For: 4.0.0 A single quote in a comment causes parsing errors, such as:
{code:java}
select 1 -- I'm xxx
from tbl;
{code}
Running a task like this will result in the following error: {code:java} NoViableAltException(377@[201:64: ( ( KW_AS )? alias= identifier )?]) at org.antlr.runtime.DFA.noViableAlt(DFA.java:158) at org.antlr.runtime.DFA.predict(DFA.java:116) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.tableSource(HiveParser_FromClauseParser.java:4220) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.atomjoinSource(HiveParser_FromClauseParser.java:1602) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.joinSource(HiveParser_FromClauseParser.java:1903) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromSource(HiveParser_FromClauseParser.java:1527) at org.apache.hadoop.hive.ql.parse.HiveParser_FromClauseParser.fromClause(HiveParser_FromClauseParser.java:1370) at org.apache.hadoop.hive.ql.parse.HiveParser.fromClause(HiveParser.java:45020) at org.apache.hadoop.hive.ql.parse.HiveParser.atomSelectStatement(HiveParser.java:39792) at org.apache.hadoop.hive.ql.parse.HiveParser.selectStatement(HiveParser.java:40044) at org.apache.hadoop.hive.ql.parse.HiveParser.regularBody(HiveParser.java:39690) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpressionBody(HiveParser.java:38900) at org.apache.hadoop.hive.ql.parse.HiveParser.queryStatementExpression(HiveParser.java:38788) at org.apache.hadoop.hive.ql.parse.HiveParser.execStatement(HiveParser.java:2396) at org.apache.hadoop.hive.ql.parse.HiveParser.statement(HiveParser.java:1420) at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:220) at
org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:74) at org.apache.hadoop.hive.ql.parse.ParseUtils.parse(ParseUtils.java:67) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:616) at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1826) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1773) at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1768) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:214) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:471
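The underlying requirement is that "--" must only start a comment outside of string literals, and a quote that appears inside a comment must not open a string. A minimal quote-aware sketch of that rule (illustrative only; it ignores escaped quotes such as ''):

```java
// Sketch of quote-aware "--" comment stripping: only treat "--" as a comment
// start when we are not inside a single-quoted string, and stop scanning at
// the comment so a quote inside it (as in "I'm") never opens a string.
class CommentStripper {
    static String stripLine(String line) {
        boolean inString = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\'') {
                inString = !inString;            // toggle on quote boundaries
            } else if (!inString && c == '-' && i + 1 < line.length()
                       && line.charAt(i + 1) == '-') {
                return line.substring(0, i);     // comment: drop the rest
            }
        }
        return line;
    }
}
```

With this rule, `select 1 -- I'm xxx` loses only the comment, while `select 'a--b' from t` keeps its string literal intact.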
[jira] [Created] (HIVE-26174) ALTER TABLE RENAME TO should check new db location
Adrian Wang created HIVE-26174: -- Summary: ALTER TABLE RENAME TO should check new db location Key: HIVE-26174 URL: https://issues.apache.org/jira/browse/HIVE-26174 Project: Hive Issue Type: Improvement Reporter: Adrian Wang Assignee: Adrian Wang Currently, if we run ALTER TABLE db1.table1 RENAME TO db2.table2; with `db1` and `db2` on different filesystems, for example `db1` at `"hdfs:/user/hive/warehouse/db1.db"` and `db2` at `"s3://bucket/s3warehouse/db2.db"`, the new `db2.table2` ends up at the location `hdfs:/s3warehouse/db2.db/table2`, which looks quite strange. The idea is to ban this kind of operation, as we seem to have intended, but the check was done after we changed the filesystem scheme, so it always passed. -- This message was sent by Atlassian Jira (v8.20.7#820007)
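The proposed check amounts to comparing the filesystems of the original source and destination locations before any path rewriting happens; once the destination has been rewritten onto the source filesystem, the comparison trivially passes. A sketch using java.net.URI (the method name is illustrative, not Hive's):

```java
import java.net.URI;

// Sketch of the ordering issue: the scheme/authority comparison must run on
// the *original* source and destination locations. If the destination is
// first rewritten onto the source filesystem, the check always succeeds.
class RenameCheck {
    static boolean sameFileSystem(String srcLocation, String dstLocation) {
        URI src = URI.create(srcLocation);
        URI dst = URI.create(dstLocation);
        boolean sameScheme = (src.getScheme() == null)
            ? dst.getScheme() == null
            : src.getScheme().equalsIgnoreCase(dst.getScheme());
        // Compare authority too: hdfs://ns1 and hdfs://ns2 are distinct filesystems.
        boolean sameAuthority = (src.getAuthority() == null)
            ? dst.getAuthority() == null
            : src.getAuthority().equalsIgnoreCase(dst.getAuthority());
        return sameScheme && sameAuthority;
    }
}
```

Applied to the example above, `hdfs:/user/hive/warehouse/db1.db` and `s3://bucket/s3warehouse/db2.db` differ in scheme, so the rename would be rejected up front.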
[jira] [Created] (HIVE-26173) Upgrade derby to 10.14.2.0
Hemanth Boyina created HIVE-26173: - Summary: Upgrade derby to 10.14.2.0 Key: HIVE-26173 URL: https://issues.apache.org/jira/browse/HIVE-26173 Project: Hive Issue Type: Improvement Reporter: Hemanth Boyina Assignee: Hemanth Boyina upgrade derby from 10.14.1.0 to 10.14.2.0, to fix the vulnerability CVE-2018-1313 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26172) Upgrade ant to 1.10.12
Hemanth Boyina created HIVE-26172: - Summary: Upgrade ant to 1.10.12 Key: HIVE-26172 URL: https://issues.apache.org/jira/browse/HIVE-26172 Project: Hive Issue Type: Improvement Reporter: Hemanth Boyina Assignee: Hemanth Boyina Upgrade ant from 1.10.9 to 1.10.12 to fix the vulnerability CVE-2021-36373 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26171) HMSHandler get_all_tables method can not retrieve tables from remote database
zhangbutao created HIVE-26171: - Summary: HMSHandler get_all_tables method can not retrieve tables from remote database Key: HIVE-26171 URL: https://issues.apache.org/jira/browse/HIVE-26171 Project: Hive Issue Type: Bug Components: Standalone Metastore Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2 Reporter: zhangbutao At present, the get_all_tables method in HMSHandler does not get tables from remote databases. However, other components like Presto, as well as some jobs we developed, use this API instead of _get_tables_, which can retrieve tables from both native and remote databases.
{code:java}
// get_all_tables can only get tables from native databases
public List get_all_tables(final String dbname) throws MetaException {
{code}
{code:java}
// get_tables can get tables from both native and remote databases
public List get_tables(final String dbname, final String pattern)
{code}
I think we should fix get_all_tables to make it retrieve tables from remote databases. -- This message was sent by Atlassian Jira (v8.20.7#820007)
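The proposed fix can be illustrated abstractly (this is not Hive's code; the types below are hypothetical): the no-pattern listing should consult the same sources as the pattern-based one, including connector-backed remote databases:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the proposed fix: make the no-pattern listing route
// through the same logic as the pattern-based one, so remote
// (connector-backed) databases are consulted too instead of only the native
// store.
class TableListing {
    interface Source { List<String> tables(String db); }

    final Map<String, Source> nativeDbs;
    final Map<String, Source> remoteDbs;

    TableListing(Map<String, Source> nativeDbs, Map<String, Source> remoteDbs) {
        this.nativeDbs = nativeDbs;
        this.remoteDbs = remoteDbs;
    }

    // Fixed get_all_tables: equivalent to a match-everything get_tables call,
    // rather than consulting only the native store.
    List<String> getAllTables(String db) {
        List<String> out = new ArrayList<>();
        if (nativeDbs.containsKey(db)) out.addAll(nativeDbs.get(db).tables(db));
        if (remoteDbs.containsKey(db)) out.addAll(remoteDbs.get(db).tables(db));
        return out;
    }
}
```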
[jira] [Created] (HIVE-26170) Code cleanup in jdbc dataconnector
zhangbutao created HIVE-26170: - Summary: Code cleanup in jdbc dataconnector Key: HIVE-26170 URL: https://issues.apache.org/jira/browse/HIVE-26170 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 4.0.0-alpha-2 Reporter: zhangbutao Clean up unused imports; fix the incorrect Logger in PostgreSQLConnectorProvider. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26169) Set non-vectorized mode as default when accessing iceberg tables in avro fileformat
László Pintér created HIVE-26169: Summary: Set non-vectorized mode as default when accessing iceberg tables in avro fileformat Key: HIVE-26169 URL: https://issues.apache.org/jira/browse/HIVE-26169 Project: Hive Issue Type: Improvement Reporter: László Pintér Assignee: László Pintér Vectorization for iceberg tables in avro format is not yet supported. We should disable vectorization when we want to read/write avro tables. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26168) EXPLAIN DDL command output is not deterministic
Stamatis Zampetakis created HIVE-26168: -- Summary: EXPLAIN DDL command output is not deterministic Key: HIVE-26168 URL: https://issues.apache.org/jira/browse/HIVE-26168 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Stamatis Zampetakis The EXPLAIN DDL command (HIVE-24596) can be used to recreate the schema for a given query in order to debug planner issues. This is achieved by fetching information from the metastore and outputting a series of DDL commands. The output commands, though, may appear in a different order among runs, since there is no mechanism to enforce an explicit order. Consider for instance the following scenario.
{code:sql}
CREATE TABLE customer (
  `c_custkey` bigint,
  `c_name` string,
  `c_address` string
);
INSERT INTO customer VALUES (1, 'Bob', '12 avenue Mansart'), (2, 'Alice', '24 avenue Mansart');
EXPLAIN DDL SELECT c_custkey FROM customer WHERE c_name = 'Bob';
{code}
+Result 1+
{noformat}
ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAIChJLg1AGD1aCNBg==
{noformat}
+Result 2+
{noformat}
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_custkey SET('lowValue'='1','highValue'='2','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_custkey BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwfO+SIOOofED
ALTER TABLE default.customer UPDATE STATISTICS SET('numRows'='2','rawDataSize'='48' );
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_address SET('avgColLen'='17.0','maxColLen'='17','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_address BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAICwbec/QPAjtBF
ALTER TABLE default.customer UPDATE STATISTICS FOR COLUMN c_name SET('avgColLen'='4.0','maxColLen'='5','numNulls'='0','numDVs'='2' );
-- BIT VECTORS PRESENT FOR default.customer FOR COLUMN c_name BUT THEY ARE NOT SUPPORTED YET. THE BASE64 VALUE FOR THE BITVECTOR IS SExMoAIChJLg1AGD1aCNBg==
{noformat}
The two results are equivalent but the statements appear in a different order. This is not a big issue because the results remain correct, but it may lead to test flakiness, so it might be worth addressing. -- This message was sent by Atlassian Jira (v8.20.7#820007)
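One way to remove the flakiness is to impose an explicit order on the generated statements before emitting them, for example a plain lexicographic sort (a sketch of the general approach, not the actual HIVE-26168 patch):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Sketch: collect the generated DDL statements and sort them before
// printing, instead of emitting them in whatever order the metastore
// returned the column statistics. Any stable, explicit order works.
class DeterministicDdl {
    static List<String> emit(Set<String> generated) {
        List<String> out = new ArrayList<>(generated);
        Collections.sort(out); // lexicographic, but explicit and repeatable
        return out;
    }
}
```

Two runs that produce the same statements in different orders then yield byte-identical output, which is what qtest diffs need.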
[jira] [Created] (HIVE-26167) QueryStateMap in SessionState is not maintained correctly
László Pintér created HIVE-26167: Summary: QueryStateMap in SessionState is not maintained correctly Key: HIVE-26167 URL: https://issues.apache.org/jira/browse/HIVE-26167 Project: Hive Issue Type: Improvement Reporter: László Pintér Assignee: László Pintér When the Driver is created, the QueryStateMap is initialized with the query ID and the current QueryState object. This record is kept in the map until the execution of the query is completed. There are many unit tests that initialise the driver object once during the setup phase and use the same object to execute all the different queries. As a consequence, after the first execution, the QueryStateMap will be cleaned, and all subsequent queries will run into a null pointer exception while trying to fetch the current QueryState from the SessionState. -- This message was sent by Atlassian Jira (v8.20.7#820007)
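The intended lifecycle can be sketched abstractly (names below are illustrative, not Hive's classes): state is registered per query id, looked up during execution, and removed for that query id only, so a reused driver that re-registers before each execution never observes a missing state:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal model of the lifecycle: per-query state is registered before
// execution, looked up during it, and removed afterwards. A lookup after
// cleanup fails the same way the tests in the ticket do, which is why a
// reused driver must re-register before every execution.
class QueryStateRegistry {
    private final Map<String, Object> states = new HashMap<>();

    void register(String queryId, Object state) { states.put(queryId, state); }

    Object require(String queryId) {
        Object s = states.get(queryId);
        if (s == null) throw new NullPointerException("no state for " + queryId);
        return s;
    }

    void finish(String queryId) { states.remove(queryId); } // only this query
}
```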
[jira] [Created] (HIVE-26166) Make website GDPR compliant
Stamatis Zampetakis created HIVE-26166: -- Summary: Make website GDPR compliant Key: HIVE-26166 URL: https://issues.apache.org/jira/browse/HIVE-26166 Project: Hive Issue Type: Task Components: Website Reporter: Stamatis Zampetakis Per the email that was sent out from privacy we need to make the Hive website GDPR compliant. # The link to privacy policy needs to be updated from [https://hive.apache.org/privacy_policy.html] to [https://privacy.apache.org/policies/privacy-policy-public.html] # The google analytics service must be removed -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26165) Remove READ locks for ACID tables with SoftDelete enabled
Denys Kuzmenko created HIVE-26165: - Summary: Remove READ locks for ACID tables with SoftDelete enabled Key: HIVE-26165 URL: https://issues.apache.org/jira/browse/HIVE-26165 Project: Hive Issue Type: Task Reporter: Denys Kuzmenko -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26164) jjhk
Meylis Matiyev created HIVE-26164: - Summary: jjhk Key: HIVE-26164 URL: https://issues.apache.org/jira/browse/HIVE-26164 Project: Hive Issue Type: Bug Reporter: Meylis Matiyev -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26163) Incorrect format in columnstats_columnname_parse.q's insert statement can cause exceptions
Soumyakanti Das created HIVE-26163: -- Summary: Incorrect format in columnstats_columnname_parse.q's insert statement can cause exceptions Key: HIVE-26163 URL: https://issues.apache.org/jira/browse/HIVE-26163 Project: Hive Issue Type: Improvement Reporter: Soumyakanti Das Assignee: Soumyakanti Das Change insert statement to {code:java} insert into table2 partition(t2_col3='2021-01-01') values('1','1'); {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26162) Documentation upgrade
Florian CASTELAIN created HIVE-26162: Summary: Documentation upgrade Key: HIVE-26162 URL: https://issues.apache.org/jira/browse/HIVE-26162 Project: Hive Issue Type: Wish Reporter: Florian CASTELAIN Hello. I have been looking for specific elements in the documentation, more specifically the list of serdeproperties. I was looking for an exhaustive list of serdeproperties and I cannot find one at all. This is very surprising, as one would expect a tool to document all of its features. Is it planned to create such a list? If it already exists, where is it? The official docs do not contain it (or it is well hidden, in which case you should make it more accessible). Thank you. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26161) Use Hive's ORC dependency version when producing file footer for Iceberg
Ádám Szita created HIVE-26161: - Summary: Use Hive's ORC dependency version when producing file footer for Iceberg Key: HIVE-26161 URL: https://issues.apache.org/jira/browse/HIVE-26161 Project: Hive Issue Type: Bug Reporter: Ádám Szita Assignee: Ádám Szita For schema evolution and projection purposes we produce an ORC file footer byte buffer in VectorizedReadUtils. Currently Iceberg's bundled/shaded ORC is used to produce these file footer bytes when dealing with Iceberg/ORC tables. This version of ORC is newer (1.7.3) than what Hive uses (1.6.9). Later on we could face compatibility issues when trying to reconstruct an OrcTail object with a 1.6.9 reader from the bytes that the 1.7.3 reader serialized. We need to invert the direction as we can rely more on backward compatibility than on forward compatibility of ORC. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26160) Materialized View rewrite does not check tables scanned in sub-query expressions
Krisztian Kasa created HIVE-26160: - Summary: Materialized View rewrite does not check tables scanned in sub-query expressions Key: HIVE-26160 URL: https://issues.apache.org/jira/browse/HIVE-26160 Project: Hive Issue Type: Bug Components: CBO, Materialized views Reporter: Krisztian Kasa Assignee: Krisztian Kasa Materialized view rewrite based on exact SQL text match uses the initial CBO plan to explore opportunities to replace the query plan, or part of it, with an MV scan. This algorithm requires the set of tables scanned by the original query plan. If the query contains subquery expressions, the tables scanned by the subqueries are not listed, which can lead to rewriting the original plan to scan an outdated MV. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26159) hive cli is unavailable from hive command
Wechar created HIVE-26159: - Summary: hive cli is unavailable from hive command Key: HIVE-26159 URL: https://issues.apache.org/jira/browse/HIVE-26159 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 4.0.0-alpha-1 Reporter: Wechar Assignee: Wechar Fix For: 4.0.0 Hive CLI is a convenient tool to connect to the Hive metastore service, but it can no longer start, even when the *--service cli* option is used; this appears to be a regression from ticket [HIVE-24348|https://issues.apache.org/jira/browse/HIVE-24348]. *Steps to reproduce:*
{code:bash}
hive@hive:/root$ /usr/share/hive/bin/hive --service cli --hiveconf hive.metastore.uris=thrift://hive:9084
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/apache-hive-4.0.0-alpha-2-SNAPSHOT-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-3.3.1/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 4.0.0-alpha-2-SNAPSHOT by Apache Hive
beeline>
{code}
-- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26158) TRANSLATED_TO_EXTERNAL partition tables cannot query partition data after rename
tanghui created HIVE-26158: -- Summary: TRANSLATED_TO_EXTERNAL partition tables cannot query partition data after rename Key: HIVE-26158 URL: https://issues.apache.org/jira/browse/HIVE-26158 Project: Hive Issue Type: Bug Affects Versions: 4.0.0, 4.0.0-alpha-1, 4.0.0-alpha-2 Reporter: tanghui After the patch, the partitioned table's location and the HDFS data directory are displayed correctly, but the partition locations recorded in the SDS table of the Hive metastore database still point to the old table location, so queries against the partitions return no data.
set hive.create.as.external.legacy=true;
CREATE TABLE part_test(
  c1 string,
  c2 string
) PARTITIONED BY (dat string);
insert into part_test values ("11","th","20220101");
insert into part_test values ("22","th","20220102");
alter table part_test rename to part_test11;
-- this returns no data for the queried partition:
select * from part_test11 where dat="20220101";
SDS in the Hive metastore database:
select SDS.LOCATION from TBLS,SDS where TBLS.TBL_NAME="part_test11" AND TBLS.TBL_ID=SDS.CD_ID;
|LOCATION|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test11|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220101|
|hdfs://nameservice1/warehouse/tablespace/external/hive/part_test/dat=20220102|
We need to update the partition locations of the table in SDS so that query results are correct. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26157) Change Iceberg storage handler authz URI to metadata location
László Pintér created HIVE-26157: Summary: Change Iceberg storage handler authz URI to metadata location Key: HIVE-26157 URL: https://issues.apache.org/jira/browse/HIVE-26157 Project: Hive Issue Type: Improvement Reporter: László Pintér Assignee: László Pintér In HIVE-25964, the authz URI was changed to "iceberg://db.table". It is possible to set the metadata pointers of table A to point to table B, so table B's data can be read by querying table A. {code:sql} alter table A set tblproperties ('metadata_location'='/path/to/B/snapshot.json', 'previous_metadata_location'='/path/to/B/prev_snapshot.json'); {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26156) Iceberg delete writer should handle deleting from old partition specs
Marton Bod created HIVE-26156: - Summary: Iceberg delete writer should handle deleting from old partition specs Key: HIVE-26156 URL: https://issues.apache.org/jira/browse/HIVE-26156 Project: Hive Issue Type: Bug Reporter: Marton Bod Assignee: Marton Bod While {{HiveIcebergRecordWriter}} always writes data out according to the latest spec, the {{HiveIcebergDeleteWriter}} might have to write delete files into partitions that correspond to a variety of specs, both old and new. Therefore we should pass the {{table.specs()}} map into the {{HiveIcebergWriter}} so that the delete writer can choose the appropriate spec on a per-record basis. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26155) Create a new connection pool for compaction
Antal Sinkovits created HIVE-26155: -- Summary: Create a new connection pool for compaction Key: HIVE-26155 URL: https://issues.apache.org/jira/browse/HIVE-26155 Project: Hive Issue Type: Improvement Components: Standalone Metastore Reporter: Antal Sinkovits Assignee: Antal Sinkovits Currently the TxnHandler uses 2 connection pools to communicate with the HMS: the default one and one for mutexing. If compaction is configured incorrectly (e.g. too many Initiators are running on the same db) then compaction can use up all the connections in the default connection pool and all user queries can get stuck. We should have a separate connection pool (configurable size) just for compaction-related activities. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26154) CLONE - Upgrade cron-utils to 9.1.6
Asif Saleh created HIVE-26154: - Summary: CLONE - Upgrade cron-utils to 9.1.6 Key: HIVE-26154 URL: https://issues.apache.org/jira/browse/HIVE-26154 Project: Hive Issue Type: Task Components: Hive Affects Versions: 3.1.3, 4.0.0 Reporter: Asif Saleh To fix [CVE-2021-41269|https://nvd.nist.gov/vuln/detail/CVE-2021-41269] issue. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26153) CVE-2021-27568
Asif Saleh created HIVE-26153: - Summary: CVE-2021-27568 Key: HIVE-26153 URL: https://issues.apache.org/jira/browse/HIVE-26153 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.1.3 Reporter: Asif Saleh Address the vulnerability CVE-2021-27568. The Hive JDBC driver is packaged with a json-smart version that has the above vulnerability. An issue was discovered in netplex json-smart-v1 through 2015-10-23 and json-smart-v2 through 2.4: an exception thrown from a function is not caught, as demonstrated by NumberFormatException, which may cause programs using the library to crash or expose sensitive information. Fix: upgrade {{net.minidev:json-smart}} to version 1.3.2, 2.4.1 or higher. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26152) Compilation fails with Maven 3.8.5
Tony Torralba created HIVE-26152: Summary: Compilation fails with Maven 3.8.5 Key: HIVE-26152 URL: https://issues.apache.org/jira/browse/HIVE-26152 Project: Hive Issue Type: Bug Affects Versions: 3.1.3 Reporter: Tony Torralba When trying to build Hive with Maven 3.8.5 on latest {{master}}, the build fails because of a name clash in a class between {{kryo4}} and {{kryo5}}:
{code:java}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.1:compile (default-compile) on project hive-kryo-registrator: Compilation failure: Compilation failure:
[ERROR] /tmp/hive/kryo-registrator/src/main/java/org/apache/hive/spark/HiveKryoRegistrator.java:[41,18] org.apache.hive.spark.HiveKryoRegistrator.HiveKeySerializer is not abstract and does not override abstract method read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in com.esotericsoftware.kryo.Serializer
[ERROR] /tmp/hive/kryo-registrator/src/main/java/org/apache/hive/spark/HiveKryoRegistrator.java:[49,20] name clash: read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in org.apache.hive.spark.HiveKryoRegistrator.HiveKeySerializer and read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in com.esotericsoftware.kryo.Serializer have the same erasure, yet neither overrides the other
[ERROR] /tmp/hive/kryo-registrator/src/main/java/org/apache/hive/spark/HiveKryoRegistrator.java:[57,10] org.apache.hive.spark.HiveKryoRegistrator.BytesWritableSerializer is not abstract and does not override abstract method read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in com.esotericsoftware.kryo.Serializer
[ERROR] /tmp/hive/kryo-registrator/src/main/java/org/apache/hive/spark/HiveKryoRegistrator.java:[64,26] name clash: read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in org.apache.hive.spark.HiveKryoRegistrator.BytesWritableSerializer and read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in com.esotericsoftware.kryo.Serializer have the same erasure, yet neither overrides the other
[ERROR] /tmp/hive/kryo-registrator/src/main/java/org/apache/hive/spark/NoHashCodeKryoSerializer.java:[51,18] org.apache.hive.spark.NoHashCodeKryoSerializer.HiveKeySerializer is not abstract and does not override abstract method read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in com.esotericsoftware.kryo.Serializer
[ERROR] /tmp/hive/kryo-registrator/src/main/java/org/apache/hive/spark/NoHashCodeKryoSerializer.java:[58,20] name clash: read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in org.apache.hive.spark.NoHashCodeKryoSerializer.HiveKeySerializer and read(com.esotericsoftware.kryo.Kryo,com.esotericsoftware.kryo.io.Input,java.lang.Class) in com.esotericsoftware.kryo.Serializer have the same erasure, yet neither overrides the other
{code}
Build command:
{code:java}
mvn clean package -DskipTests -Dmaven.javadoc.skip=true -Drat.skip=true
{code}
(Had to skip the RAT check because it complained about a lot of files not having approved licenses.) -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Created] (HIVE-26151) Support range-based time travel queries for Iceberg
Marton Bod created HIVE-26151: - Summary: Support range-based time travel queries for Iceberg Key: HIVE-26151 URL: https://issues.apache.org/jira/browse/HIVE-26151 Project: Hive Issue Type: New Feature Reporter: Marton Bod Assignee: Marton Bod Allow querying which records were inserted during a certain time window for Iceberg tables. The Iceberg TableScan API provides an implementation for this, so most of the work would go into adding syntax support and transporting the startTime and endTime parameters to the Iceberg input format. Proposed new syntax:
SELECT * FROM table FOR SYSTEM_TIME FROM '' TO ''
SELECT * FROM table FOR SYSTEM_VERSION FROM TO
(the TO clause is optional in both cases) -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26150) OrcRawRecordMerger reads each row twice
Alessandro Solimando created HIVE-26150: --- Summary: OrcRawRecordMerger reads each row twice Key: HIVE-26150 URL: https://issues.apache.org/jira/browse/HIVE-26150 Project: Hive Issue Type: Bug Components: ORC, Transactions Affects Versions: 4.0.0-alpha-2 Reporter: Alessandro Solimando OrcRawRecordMerger reads each row twice. The issue does not surface because the merger is only used with the "collapseEvents" parameter set to true, which filters out one of the two rows. collapseEvents true and false should produce the same result: in the current acid implementation each event has a distinct rowId, so two identical rows can only appear because of this bug. To reproduce the issue, it is sufficient to set the second parameter to false [here|https://github.com/apache/hive/blob/61d4ff2be48b20df9fd24692c372ee9c2606babe/ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java#L2103-L2106], run the tests in TestOrcRawRecordMerger, and observe two tests failing. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26149) Non blocking DROP DATABASE implementation
Denys Kuzmenko created HIVE-26149: - Summary: Non blocking DROP DATABASE implementation Key: HIVE-26149 URL: https://issues.apache.org/jira/browse/HIVE-26149 Project: Hive Issue Type: Task Reporter: Denys Kuzmenko -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26148) Keep MetaStoreFilterHook interface compatibility after introducing catalogs
Wechar created HIVE-26148: - Summary: Keep MetaStoreFilterHook interface compatibility after introducing catalogs Key: HIVE-26148 URL: https://issues.apache.org/jira/browse/HIVE-26148 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.0.0 Reporter: Wechar Assignee: Wechar Fix For: 4.0.0-alpha-1 Hive 3.0 introduced the catalog concept; when we upgraded our Hive dependency from 2.3 to 3.x, we found that some interfaces of *MetaStoreFilterHook* are not compatible:
{code:bash}
git show ba8a99e115 -- standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
{code}
{code:bash}
--- a/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
+++ b/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreFilterHook.java
 /**
  * Filter given list of tables
- * @param dbName
- * @param tableList
+ * @param catName catalog name
+ * @param dbName database name
+ * @param tableList list of table returned by the metastore
  * @return List of filtered table names
  */
- public List filterTableNames(String dbName, List tableList) throws MetaException;
+ List filterTableNames(String catName, String dbName, List tableList)
+     throws MetaException;
{code}
We can retain the previous interfaces and implement them using the default catalog. -- This message was sent by Atlassian Jira (v8.20.1#820001)
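A minimal sketch of that compatibility idea, using a simplified stand-in interface rather than Hive's real MetaStoreFilterHook (the names below are illustrative only): the old pre-catalog signature is retained as a default method that delegates to the new catalog-aware one with a default catalog name.

```java
import java.util.Arrays;
import java.util.List;

public class FilterHookCompat {
    // Simplified stand-in for the hook interface (not the real Hive type).
    interface FilterHook {
        String DEFAULT_CATALOG = "hive";

        // New catalog-aware signature introduced with the catalog concept.
        List<String> filterTableNames(String catName, String dbName, List<String> tableList);

        // Old pre-catalog signature, retained as a default method so existing
        // callers keep compiling; it delegates using the default catalog.
        default List<String> filterTableNames(String dbName, List<String> tableList) {
            return filterTableNames(DEFAULT_CATALOG, dbName, tableList);
        }
    }

    public static void main(String[] args) {
        // An implementation only needs to provide the catalog-aware method.
        FilterHook hook = (cat, db, tables) -> {
            System.out.println("catalog=" + cat);
            return tables;
        };
        // Old-style two-argument call still works, served by the default catalog.
        List<String> out = hook.filterTableNames("db1", Arrays.asList("t1", "t2"));
        System.out.println(out);
    }
}
```

Existing implementations and callers that only know the two-argument form keep compiling unchanged; only the catalog-aware method must be implemented.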
[jira] [Created] (HIVE-26147) OrcRawRecordMerger throws NPE when hive.acid.key.index is missing for an acid file
Alessandro Solimando created HIVE-26147: --- Summary: OrcRawRecordMerger throws NPE when hive.acid.key.index is missing for an acid file Key: HIVE-26147 URL: https://issues.apache.org/jira/browse/HIVE-26147 Project: Hive Issue Type: Bug Components: ORC, Transactions Affects Versions: 4.0.0-alpha-2 Reporter: Alessandro Solimando Assignee: Alessandro Solimando When _hive.acid.key.index_ is missing for an acid ORC file, _OrcRawRecordMerger_ throws as follows:
{noformat}
Caused by: java.lang.NullPointerException
at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.discoverKeyBounds(OrcRawRecordMerger.java:795) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.orc.OrcRawRecordMerger.(OrcRawRecordMerger.java:1053) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getReader(OrcInputFormat.java:2096) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1991) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.FetchOperator$FetchInputFormatSplit.getRecordReader(FetchOperator.java:769) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:335) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:560) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:529) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:150) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.getFetchingTableResults(Driver.java:719) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:671) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.getResults(ReExecDriver.java:233) ~[hive-exec-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
at org.apache.hive.service.cli.operation.SQLOperation.getNextRowSet(SQLOperation.java:489) ~[hive-service-4.0.0-alpha-2-SNAPSHOT.jar:4.0.0-alpha-2-SNAPSHOT]
... 24 more
{noformat}
For this situation to happen, the ORC file must have more than one stripe, and the offset of the element to seek must be located beyond the first stripe but before the tail one, as the code suggests:
{code:java}
if (firstStripe != 0) {
  minKey = keyIndex[firstStripe - 1];
}
if (!isTail) {
  maxKey = keyIndex[firstStripe + stripeCount - 1];
}
{code}
However, in the context of the original issue, the NPE was triggered even by a simple "select *" over a table with ORC files missing the _hive.acid.key.index_ metadata, while it never failed for ORC files with a single stripe. The file was generated by a major compaction of acid and non-acid data. To force an offset located in a middle stripe, one can use the following query, knowing in which stripe a particular value resides:
{code:sql}
select * from $table where c = $value
{code}
_OrcRawRecordMerger_ should simply leave the min and max keys as null when the _hive.acid.key.index_ metadata is missing. -- This message was sent by Atlassian Jira (v8.20.1#820001)
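A self-contained sketch of the proposed null-guard (hypothetical, simplified types; the real method works with ORC stripe metadata and acid record keys, not strings): when the key index is absent, the min/max keys simply stay null instead of the array being dereferenced.

```java
public class KeyBoundsSketch {
    // Returns {minKey, maxKey}; both stay null when the key index is missing,
    // which is the fix proposed in the ticket (instead of an NPE).
    static String[] discoverKeyBounds(String[] keyIndex, int firstStripe,
                                      int stripeCount, boolean isTail) {
        String minKey = null;
        String maxKey = null;
        if (keyIndex != null) {  // guard added: hive.acid.key.index may be absent
            if (firstStripe != 0) {
                minKey = keyIndex[firstStripe - 1];
            }
            if (!isTail) {
                maxKey = keyIndex[firstStripe + stripeCount - 1];
            }
        }
        return new String[] {minKey, maxKey};
    }

    public static void main(String[] args) {
        // With an index present, the bounds come from the index entries...
        String[] idx = {"k0", "k1", "k2"};
        String[] bounds = discoverKeyBounds(idx, 1, 2, false);
        System.out.println(bounds[0] + " " + bounds[1]);
        // ...with the index missing, both bounds remain null and no NPE occurs.
        String[] none = discoverKeyBounds(null, 1, 2, false);
        System.out.println(none[0] + " " + none[1]);
    }
}
```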
[jira] [Created] (HIVE-26146) Handle missing hive.acid.key.index in the fixacidkeyindex utility
Alessandro Solimando created HIVE-26146: --- Summary: Handle missing hive.acid.key.index in the fixacidkeyindex utility Key: HIVE-26146 URL: https://issues.apache.org/jira/browse/HIVE-26146 Project: Hive Issue Type: Improvement Components: ORC, Transactions Affects Versions: 4.0.0-alpha-2 Reporter: Alessandro Solimando Assignee: Alessandro Solimando There is a utility in Hive which can validate/fix a corrupted _hive.acid.key.index_:
{code:bash}
hive --service fixacidkeyindex
{code}
At the moment the utility throws an NPE if the _hive.acid.key.index_ metadata entry is missing:
{noformat}
ERROR checking /hive-dev-box/multistripe_ko_acid.orc
java.lang.NullPointerException
at org.apache.hadoop.hive.ql.io.orc.FixAcidKeyIndex.validate(FixAcidKeyIndex.java:183)
at org.apache.hadoop.hive.ql.io.orc.FixAcidKeyIndex.checkFile(FixAcidKeyIndex.java:147)
at org.apache.hadoop.hive.ql.io.orc.FixAcidKeyIndex.checkFiles(FixAcidKeyIndex.java:130)
at org.apache.hadoop.hive.ql.io.orc.FixAcidKeyIndex.main(FixAcidKeyIndex.java:106)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:308)
at org.apache.hadoop.util.RunJar.main(RunJar.java:222)
{noformat}
The aim of this ticket is to handle such a case, so that this metadata entry can be re-generated even when it is missing. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26145) Disable notification cleaner if interval is zero
Janos Kovacs created HIVE-26145: --- Summary: Disable notification cleaner if interval is zero Key: HIVE-26145 URL: https://issues.apache.org/jira/browse/HIVE-26145 Project: Hive Issue Type: Sub-task Components: Metastore Reporter: Janos Kovacs Assignee: Janos Kovacs Many housekeeping/background tasks can be turned off when multiple instances run in parallel. Some are controlled via the housekeeping-node configuration; others are not started if their frequency is set to zero. The DB notification cleaner unfortunately lacks this functionality, which makes all instances race for the lock on the backend HMS database. The goal is to allow turning the cleaner off when multiple instances are running (so it can be bound to the housekeeping instance). -- This message was sent by Atlassian Jira (v8.20.1#820001)
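The "not started if the frequency is set to zero" pattern mentioned above can be sketched as follows (hypothetical names, not the actual HMS housekeeping code): the task is simply never scheduled when the configured interval is non-positive.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class CleanerScheduling {
    // Schedules the cleaner only when the configured interval is positive;
    // a zero (or negative) interval disables the task entirely, so this
    // instance never races for the lock on the backend database.
    static boolean scheduleCleaner(ScheduledExecutorService executor,
                                   Runnable cleaner, long intervalSeconds) {
        if (intervalSeconds <= 0) {
            return false;  // disabled by configuration
        }
        executor.scheduleAtFixedRate(cleaner, intervalSeconds, intervalSeconds,
                TimeUnit.SECONDS);
        return true;
    }

    public static void main(String[] args) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        System.out.println(scheduleCleaner(executor, () -> {}, 0));   // interval 0: disabled
        System.out.println(scheduleCleaner(executor, () -> {}, 60));  // interval 60s: scheduled
        executor.shutdownNow();
    }
}
```

With this shape, only the designated housekeeping instance would configure a positive interval; all other instances set it to zero.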
[jira] [Created] (HIVE-26144) Add keys/indexes to support highly concurrent workload
Janos Kovacs created HIVE-26144: --- Summary: Add keys/indexes to support highly concurrent workload Key: HIVE-26144 URL: https://issues.apache.org/jira/browse/HIVE-26144 Project: Hive Issue Type: Sub-task Components: Database/Schema Reporter: Janos Kovacs Assignee: Janos Kovacs The following indexes are added to avoid full table scans in the backend RDBMS:
- primary key for COMPLETED_TXN_COMPONENTS
- primary key for TXN_COMPONENTS
- index for TXN_WRITE_NOTIFICATION_LOG
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26143) Performance enhancements for highly concurrent use-cases
Janos Kovacs created HIVE-26143: --- Summary: Performance enhancements for highly concurrent use-cases Key: HIVE-26143 URL: https://issues.apache.org/jira/browse/HIVE-26143 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Janos Kovacs Assignee: Janos Kovacs These changes come from a recent high-concurrency test run with:
- 4x HS2 (each with a dedicated HMS instance), each with up to 2k sessions
- 1x HS2 (with a dedicated HMS instance) for housekeeping (~all background tasks)
The tests brought two issues to the surface:
- missing indexes caused long-running queries in the backend database (with table scans), which then led to query locks
- the notification log cleaners can't be confined to a single node
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26142) Extend the hidden conf list with webui keystore pwd
Janos Kovacs created HIVE-26142: --- Summary: Extend the hidden conf list with webui keystore pwd Key: HIVE-26142 URL: https://issues.apache.org/jira/browse/HIVE-26142 Project: Hive Issue Type: Bug Components: Security Reporter: Janos Kovacs Assignee: Janos Kovacs The SSL keystore configuration is separated for HS2 itself and the WebUI. The hidden configuration list only contains server2.keystore.password but should also contain server2.webui.keystore.password. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26141) Fix vector_ptf_part_simple_all_datatypes source file
Ayush Saxena created HIVE-26141: --- Summary: Fix vector_ptf_part_simple_all_datatypes source file Key: HIVE-26141 URL: https://issues.apache.org/jira/browse/HIVE-26141 Project: Hive Issue Type: Bug Reporter: Ayush Saxena Assignee: Ayush Saxena The source file has issues while parsing into a Hive table due to tab/space irregularities. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26140) Hive reports IndexOutOfBoundsException when access a database mapped to HBase
Chen He created HIVE-26140: -- Summary: Hive reports IndexOutOfBoundsException when access a database mapped to HBase Key: HIVE-26140 URL: https://issues.apache.org/jira/browse/HIVE-26140 Project: Hive Issue Type: Bug Reporter: Chen He
{noformat}
2022-04-14 02:44:26,587 ERROR [c50cc557-0b28-4d31-80e9-8eae45cd1499 main] exec.Task: Failed to execute tez graph.
java.lang.IndexOutOfBoundsException: Index: 12, Size: 12
at java.util.ArrayList.rangeCheck(ArrayList.java:659)
at java.util.ArrayList.get(ArrayList.java:435)
at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:634)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezTask.java:371)
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:195)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2664)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2335)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2011)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1709)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1703)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157)
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:821)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
{noformat}
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26139) URL Encoding from HIVE-26015 was a bit too aggressive
Steve Carlin created HIVE-26139: --- Summary: URL Encoding from HIVE-26015 was a bit too aggressive Key: HIVE-26139 URL: https://issues.apache.org/jira/browse/HIVE-26139 Project: Hive Issue Type: Bug Reporter: Steve Carlin The fix for HIVE-26015 was a bit too aggressive in the URL encoding. We should only encode space characters for now since this was the bug that was originally reported. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26138) Fix mapjoin_memcheck
Zoltan Haindrich created HIVE-26138: --- Summary: Fix mapjoin_memcheck Key: HIVE-26138 URL: https://issues.apache.org/jira/browse/HIVE-26138 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich this test fails very frequently http://ci.hive.apache.org/job/hive-precommit/job/master/1169/testReport/junit/org.apache.hadoop.hive.cli.split7/TestCliDriver/Testing___split_01___PostProcess___testCliDriver_mapjoin_memcheck_/ -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26137) Optimized transfer of Iceberg residual expressions from AM to execution
Ádám Szita created HIVE-26137: - Summary: Optimized transfer of Iceberg residual expressions from AM to execution Key: HIVE-26137 URL: https://issues.apache.org/jira/browse/HIVE-26137 Project: Hive Issue Type: Improvement Reporter: Ádám Szita HIVE-25967 introduced a hack to prevent Iceberg filter expressions from being serialized into splits. This temporary fix avoided OOM problems on the Tez AM side, but it also prevented predicate pushdown from working on the execution side. This ticket intends to implement the long-term solution. It turns out that the file scan tasks created by Iceberg don't actually contain a "residual" expression, but rather the complete/original one. It becomes residual only when it is evaluated against the task's partition value, which happens only on the execution side. This means the original filter is the same expression for all splits in the Tez AM, so we can transfer it via the job conf instead. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26136) Implement UPDATE statements for Iceberg tables
Peter Vary created HIVE-26136: - Summary: Implement UPDATE statements for Iceberg tables Key: HIVE-26136 URL: https://issues.apache.org/jira/browse/HIVE-26136 Project: Hive Issue Type: Task Reporter: Peter Vary Assignee: Peter Vary -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26135) Invalid Anti join conversion may cause missing results
Zoltan Haindrich created HIVE-26135: --- Summary: Invalid Anti join conversion may cause missing results Key: HIVE-26135 URL: https://issues.apache.org/jira/browse/HIVE-26135 Project: Hive Issue Type: Bug Reporter: Zoltan Haindrich Assignee: Zoltan Haindrich Right now I think the following is needed to trigger the issue: * left outer join * only left-hand-side columns are selected * the condition uses some UDF * the nullness of the UDF result is checked Repro SQL; if the conversion happens, the row with 'a' will be missing:
{code}
drop table if exists t;
drop table if exists n;
create table t(a string) stored as orc;
create table n(a string) stored as orc;
insert into t values ('a'),('1'),('2'),(null);
insert into n values ('a'),('b'),('1'),('3'),(null);
explain select n.* from n left outer join t on (n.a=t.a) where assert_true(t.a is null) is null;
explain select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;
set hive.auto.convert.anti.join=false;
select n.* from n left outer join t on (n.a=t.a) where cast(t.a as float) is null;
{code}
A workaround is to disable the feature:
{code}
set hive.auto.convert.anti.join=false;
{code}
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26133) Insert overwrite on Iceberg tables can result in duplicate entries after partition evolution
László Pintér created HIVE-26133: Summary: Insert overwrite on Iceberg tables can result in duplicate entries after partition evolution Key: HIVE-26133 URL: https://issues.apache.org/jira/browse/HIVE-26133 Project: Hive Issue Type: Improvement Reporter: László Pintér Assignee: László Pintér Insert overwrite commands in Hive only rewrite partitions affected by the query. If we write out a record with specA (e.g. day(ts)), it results in a datafile: "/tableRoot/data/ts_day="2020-10-24"/.orc. If we then change to specB (e.g. day(ts), name), the same record would go to a different partition: "/tableRoot/data/ts_day="2020-10-24"/name="Mike"/.orc. If we then overwrite the table with itself, the two records are detected as belonging to different partitions (as they do), so the original record is not overwritten by the new one, resulting in duplicate entries.
{code:java}
create table testice1000 (a int, b string) stored by iceberg stored as orc location 'file:/tmp/testice1000';
insert into testice1000 values (11, 'ddd'), (22, 'ttt');
alter table testice1000 set partition spec(truncate(2, b));
insert into testice1000 values (33, 'rrfdfdf');
insert overwrite table testice1000 select * from testice1000;

testice1000.a   testice1000.b
11              ddd
11              ddd
22              ttt
22              ttt
33              rrfdfdf
{code}
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26134) Remove Hive on Spark from the main branch
Peter Vary created HIVE-26134: - Summary: Remove Hive on Spark from the main branch branch Key: HIVE-26134 URL: https://issues.apache.org/jira/browse/HIVE-26134 Project: Hive Issue Type: Task Reporter: Peter Vary Based on this discussion [here|https://lists.apache.org/thread/nxg2jpngp72t6clo90407jgqxnmdm5g4] there is no activity on keeping the feature up-to-date. We should remove it from the main line to help ongoing development efforts and keep the testing cheaper/faster. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26132) Schematool upgradeSchema fails with nullPointerException
David created HIVE-26132: Summary: Schematool upgradeSchema fails with nullPointerException Key: HIVE-26132 URL: https://issues.apache.org/jira/browse/HIVE-26132 Project: Hive Issue Type: Bug Reporter: David When running schematool upgradeSchema against a mysql database with a metastore_db, I get a NullPointerException. The command is:
{{schematool -dbType mysql -upgradeSchema -verbose}}
The same exception can be triggered by running the relevant hive upgrade script directly in beeline with the following command:
{{beeline -u jdbc:mysql://mysql:3306/metastore_db -n [USER] -p[PASS] -f /usr/local/hive/scripts/metastore/upgrade/mysql/upgrade-2.3.0-to-3.0.0.mysql.sql}}
Removing the following lines from the sql script fixes this:
{{SELECT 'Upgrading MetaStore schema from 2.3.0 to 3.0.0' AS ' ';}}
{{SELECT 'Finished upgrading MetaStore schema from 2.3.0 to 3.0.0' AS ' ';}}
The beeline exception is:
{quote}Connecting to jdbc:mysql://mysql:3306/metastore_db Connected to: MySQL (version 5.6.51) Driver: MySQL Connector/J (version mysql-connector-java-8.0.28 (Revision: 7ff2161da3899f379fb3171b6538b191b1c5c7e2)) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:mysql://mysql:3306/metastore_db> SELECT 'Finished upgrading MetaStore schema from 2.3.0 to 3.0.0' AS ' '; The statement instance is not HiveStatement type: class com.mysql.cj.jdbc.StatementImpl The statement instance is not HiveStatement type: class com.mysql.cj.jdbc.StatementImpl java.lang.NullPointerException at java.lang.StringBuilder.(StringBuilder.java:112) at org.apache.hive.beeline.ColorBuffer.center(ColorBuffer.java:81) at org.apache.hive.beeline.TableOutputFormat.getOutputString(TableOutputFormat.java:123) at org.apache.hive.beeline.TableOutputFormat.getOutputString(TableOutputFormat.java:108) at org.apache.hive.beeline.TableOutputFormat.print(TableOutputFormat.java:51) at org.apache.hive.beeline.BeeLine.print(BeeLine.java:2257) at org.apache.hive.beeline.Commands.executeInternal(Commands.java:1026) at 
org.apache.hive.beeline.Commands.execute(Commands.java:1201) at org.apache.hive.beeline.Commands.sql(Commands.java:1130) at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1425) at org.apache.hive.beeline.BeeLine.execute(BeeLine.java:1287) at org.apache.hive.beeline.BeeLine.executeFile(BeeLine.java:1261) at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1064) at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:538) at org.apache.hive.beeline.BeeLine.main(BeeLine.java:520) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:226) at org.apache.hadoop.util.RunJar.main(RunJar.java:141) Closing: 0: jdbc:mysql://mysql:3306/metastore_db {quote} -- This message was sent by Atlassian Jira (v8.20.1#820001)
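The trace above points at beeline's table formatter centering a cell while the value (or header) is null. A hypothetical plain-Python sketch of that failure mode and the obvious defensive fix (names illustrative, not Hive's actual code):

```python
def center_unsafe(text, width):
    """Crashes on None, the Python analogue of the NPE in ColorBuffer.center."""
    pad = (width - len(text)) // 2   # len(None) raises TypeError
    return " " * pad + text + " " * (width - pad - len(text))

def center_safe(text, width):
    """Treat a NULL cell as the empty string before centering."""
    text = "" if text is None else str(text)
    return text.center(width)

print(repr(center_safe(None, 5)))   # '     '
print(repr(center_safe("ab", 6)))   # '  ab  '
```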
[jira] [Created] (HIVE-26131) Incorrect OutputFormat when describing jdbc connector table
zhangbutao created HIVE-26131: - Summary: Incorrect OutputFormat when describing jdbc connector table Key: HIVE-26131 URL: https://issues.apache.org/jira/browse/HIVE-26131 Project: Hive Issue Type: Bug Components: JDBC storage handler Affects Versions: 4.0.0-alpha-1, 4.0.0-alpha-2 Reporter: zhangbutao -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26130) Incorrect matching of external table when validating NOT NULL constraints
zhangbutao created HIVE-26130: - Summary: Incorrect matching of external table when validating NOT NULL constraints Key: HIVE-26130 URL: https://issues.apache.org/jira/browse/HIVE-26130 Project: Hive Issue Type: Bug Reporter: zhangbutao Assignee: zhangbutao -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26129) Non blocking DROP CONNECTOR
Denys Kuzmenko created HIVE-26129: - Summary: Non blocking DROP CONNECTOR Key: HIVE-26129 URL: https://issues.apache.org/jira/browse/HIVE-26129 Project: Hive Issue Type: Task Reporter: Denys Kuzmenko Use a less restrictive lock for data connectors; they do not have any dependencies on other tables. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26128) Enabling dynamic runtime filtering in iceberg tables throws exception at runtime
Rajesh Balamohan created HIVE-26128: --- Summary: Enabling dynamic runtime filtering in iceberg tables throws exception at runtime Key: HIVE-26128 URL: https://issues.apache.org/jira/browse/HIVE-26128 Project: Hive Issue Type: Bug Reporter: Rajesh Balamohan E.g. TPCDS Q2 at 10 TB scale fails when run with "hive.disable.unsafe.external.table.operations=false". The Iceberg tables were created as external tables; setting "hive.disable.unsafe.external.table.operations=false" enables dynamic runtime filtering for them, but the query then throws the following error at runtime: {noformat} ]Vertex failed, vertexName=Map 6, vertexId=vertex_1649658279052__1_03, diagnostics=[Vertex vertex_1649658279052__1_03 [Map 6] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: date_dim initializer failed, vertex=vertex_1649658279052__1_03 [Map 6], java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 at java.util.ArrayList.rangeCheck(ArrayList.java:659) at java.util.ArrayList.get(ArrayList.java:435) at org.apache.iceberg.mr.hive.HiveIcebergFilterFactory.translateLeaf(HiveIcebergFilterFactory.java:114) at org.apache.iceberg.mr.hive.HiveIcebergFilterFactory.translate(HiveIcebergFilterFactory.java:86) at org.apache.iceberg.mr.hive.HiveIcebergFilterFactory.translate(HiveIcebergFilterFactory.java:80) at org.apache.iceberg.mr.hive.HiveIcebergFilterFactory.generateFilterExpression(HiveIcebergFilterFactory.java:59) at org.apache.iceberg.mr.hive.HiveIcebergInputFormat.getSplits(HiveIcebergInputFormat.java:92) at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:592) at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:900) at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:274) at org.apache.tez.dag.app.dag.RootInputInitializerManager.lambda$runInitializer$3(RootInputInitializerManager.java:199) at java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898) at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInitializer(RootInputInitializerManager.java:192) at org.apache.tez.dag.app.dag.RootInputInitializerManager.runInitializerAndProcessResult(RootInputInitializerManager.java:173) at org.apache.tez.dag.app.dag.RootInputInitializerManager.lambda$createAndStartInitializing$2(RootInputInitializerManager.java:167) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69) at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) ]Vertex killed, vertexName=Map 13, vertexId=vertex_1649658279052__1_07, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1649658279052__1_07 [Map 13] killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, vertexName=Map 10, vertexId=vertex_1649658279052__1_06, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1649658279052__1_06 [Map 10] killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, vertexName=Map 5, vertexId=vertex_1649658279052__1_04, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1649658279052__1_04 [Map 5] killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, vertexName=Reducer 4, vertexId=vertex_1649658279052__1_11, diagnostics=[Vertex received Kill in NEW state., Vertex vertex_1649658279052__1_11 [Reducer 4] killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, 
vertexName=Reducer 3, vertexId=vertex_1649658279052__1_10, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1649658279052__1_10 [Reducer 3] killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, vertexName=Reducer 12, vertexId=vertex_1649658279052__1_09, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1649658279052__1_09 [Reducer 12] killed/failed due to:OTHER_VERTEX_FAILURE]Vertex killed, vertexName=Map 1, vertexId=vertex_1649658279052__1_08, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1649658279052__1_08 [Map
[jira] [Created] (HIVE-26127) Insert overwrite throws FileNotFound when destination partition is deleted
Yu-Wen Lai created HIVE-26127: - Summary: Insert overwrite throws FileNotFound when destination partition is deleted Key: HIVE-26127 URL: https://issues.apache.org/jira/browse/HIVE-26127 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Yu-Wen Lai Assignee: Yu-Wen Lai Steps to reproduce:
# create external table src (col int) partitioned by (year int);
# create external table dest (col int) partitioned by (year int);
# insert into src partition (year=2022) values (1);
# insert into dest partition (year=2022) values (2);
# hdfs dfs -rm -r ${hive.metastore.warehouse.external.dir}/dest/year=2022
# insert overwrite table dest select * from src;

We get a FileNotFoundException when it tries to call
{code:java}
fs.listStatus(path, pathFilter){code}
We should not fail the insert overwrite because there is nothing to clean up. -- This message was sent by Atlassian Jira (v8.20.1#820001)
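The fix direction suggested above can be sketched in plain Python (hypothetical code, not Hive's actual implementation): when cleaning up the destination of an INSERT OVERWRITE, a missing directory simply means there is nothing to delete, so it should not abort the query.

```python
import os

def list_statuses(path):
    """List a directory, treating a missing path as 'no files to clean up'."""
    try:
        return os.listdir(path)
    except FileNotFoundError:
        return []   # nothing to clean up; don't fail the overwrite

# The deleted destination partition from the repro above:
print(list_statuses("/nonexistent/dest/year=2022"))  # []
```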
[jira] [Created] (HIVE-26126) Allow capturing/validating SQL generated from HMS calls in qtests
Stamatis Zampetakis created HIVE-26126: -- Summary: Allow capturing/validating SQL generated from HMS calls in qtests Key: HIVE-26126 URL: https://issues.apache.org/jira/browse/HIVE-26126 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Reporter: Stamatis Zampetakis Assignee: Stamatis Zampetakis During the compilation/execution of a Hive command there are usually calls to the HiveMetastore (HMS). Most of the time these calls need to connect to the underlying database backend in order to return the requested information, so they trigger the generation and execution of SQL queries. We have a lot of code in Hive which affects the generation and execution of these SQL queries; two vivid examples are the {{MetaStoreDirectSql}} and {{CachedStore}} classes. [MetaStoreDirectSql|https://github.com/apache/hive/blob/e8f3a6cdc22c6a4681af2ea5763c80a5b76e310b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java] is responsible for explicitly building SQL queries for performance reasons. [CachedStore|https://github.com/apache/hive/blob/e8f3a6cdc22c6a4681af2ea5763c80a5b76e310b/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java] is responsible for caching certain requests to avoid going to the database on every call. Ensuring that the generated SQL is the expected one and/or that certain queries are hitting (or not hitting) the DB is valuable for catching regressions and for evaluating the effectiveness of caches. The idea is that for each Hive command/query in some qtest there is an option to include in the output (.q.out) the list of SQL queries that were generated by HMS calls. -- This message was sent by Atlassian Jira (v8.20.1#820001)
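The capture idea can be illustrated with a minimal plain-Python sketch (hypothetical, using sqlite3 as a stand-in backend; none of these names are Hive's): wrap the DB layer so every executed SQL statement is recorded, and append the recorded list to the test output for validation.

```python
import sqlite3

class RecordingConnection:
    """Record the text of every SQL statement executed through this wrapper."""

    def __init__(self, conn):
        self._conn = conn
        self.captured = []          # SQL text, in execution order

    def execute(self, sql, params=()):
        self.captured.append(sql)
        return self._conn.execute(sql, params)

conn = RecordingConnection(sqlite3.connect(":memory:"))
conn.execute("CREATE TABLE tbls (tbl_name TEXT)")
conn.execute("SELECT tbl_name FROM tbls")

# This list is what a qtest could emit into the .q.out for comparison:
print(conn.captured)
```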
[jira] [Created] (HIVE-26125) sysdb fails with mysql as metastore db
Alessandro Solimando created HIVE-26125: --- Summary: sysdb fails with mysql as metastore db Key: HIVE-26125 URL: https://issues.apache.org/jira/browse/HIVE-26125 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 4.0.0-alpha-2 Reporter: Alessandro Solimando --- Test set: org.apache.hadoop.hive.cli.TestMysqlMetastoreCliDriver --- Tests run: 3, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 282.638 s <<< FAILURE! - in org.apache.hadoop.hive.cli.TestMysqlMetastoreCliDriver org.apache.hadoop.hive.cli.TestMysqlMetastoreCliDriver.testCliDriver[strict_managed_tables_sysdb] Time elapsed: 41.104 s <<< FAILURE! java.lang.AssertionError: Client execution failed with error code = 2 running select tbl_name, tbl_type from tbls where tbl_name like 'smt_sysdb%' order by tbl_name fname=strict_managed_tables_sysdb.q See ./ql/target/tmp/log/hive.log or ./itests/qtest/target/tmp/log/hive.log, or check ./ql/target/surefire-reports or ./itests/qtest/target/surefire-reports/ for specific test cases logs. 
org.apache.hadoop.hive.ql.metadata.HiveException: Vertex failed, vertexName=Map 1, vertexId=vertex_1649344918728_0001_33_00, diagnostics=[Task failed, taskId=task_1649344918728_0001_33_00_00, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : attempt_1649344918728_0001_33_00_00_0:java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught exception while trying to execute query:You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '"TBLS"' at line 14 at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:348) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:276) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:381) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:82) at org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:69) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:69) at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:39) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at org.apache.hadoop.hive.llap.daemon.impl.StatsRecordingThreadPool$WrappedCallable.call(StatsRecordingThreadPool.java:118) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at 
java.lang.Thread.run(Thread.java:748) Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught exception while trying to execute query:You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '"TBLS"' at line 14 at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:89) at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:414) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:293) ... 15 more Caused by: java.io.IOException: java.io.IOException: org.apache.hive.storage.jdbc.exception.HiveJdbcDatabaseAccessException: Caught exception while trying to execute query:You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '"TBLS"' at line 14 at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:380) at org.
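The syntax error above stems from the sysdb queries quoting identifiers ANSI-style as "TBLS", which MySQL rejects by default (it expects backticks unless sql_mode includes ANSI_QUOTES). A hypothetical dialect-aware quoting helper (plain Python, not Hive's actual code) illustrates the fix direction:

```python
def quote_identifier(name, dialect):
    """Quote a SQL identifier for the given backend dialect."""
    if dialect == "mysql":
        # MySQL's default identifier quote is the backtick.
        return "`%s`" % name.replace("`", "``")
    # Derby, Postgres, Oracle, and ANSI SQL use double quotes.
    return '"%s"' % name.replace('"', '""')

print(quote_identifier("TBLS", "mysql"))  # `TBLS`
print(quote_identifier("TBLS", "derby"))  # "TBLS"
```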
[jira] [Created] (HIVE-26124) Upgrade HBase from 2.0.0-alpha4 to 2.0.0
Peter Vary created HIVE-26124: - Summary: Upgrade HBase from 2.0.0-alpha4 to 2.0.0 Key: HIVE-26124 URL: https://issues.apache.org/jira/browse/HIVE-26124 Project: Hive Issue Type: Task Reporter: Peter Vary We should move from the alpha version to the stable one. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26123) Introduce test coverage for sysdb for the different metastores
Alessandro Solimando created HIVE-26123: --- Summary: Introduce test coverage for sysdb for the different metastores Key: HIVE-26123 URL: https://issues.apache.org/jira/browse/HIVE-26123 Project: Hive Issue Type: Test Components: Testing Infrastructure Affects Versions: 4.0.0-alpha-1 Reporter: Alessandro Solimando Assignee: Alessandro Solimando Fix For: 4.0.0-alpha-2 _sysdb_ provides a view over (some) metastore tables from Hive via JDBC queries. Existing tests run only against Derby, meaning that changes to the sysdb query mapping are not covered by CI. The present ticket aims at bridging this gap by introducing test coverage for the different supported metastores for sysdb. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26122) Factorize out common docker code between DatabaseRule and AbstractExternalDB
Alessandro Solimando created HIVE-26122: --- Summary: Factorize out common docker code between DatabaseRule and AbstractExternalDB Key: HIVE-26122 URL: https://issues.apache.org/jira/browse/HIVE-26122 Project: Hive Issue Type: Improvement Components: Testing Infrastructure Affects Versions: 4.0.0-alpha-1 Reporter: Alessandro Solimando Assignee: Alessandro Solimando Fix For: 4.0.0-alpha-2 Currently there is a lot of shared code between the two classes, which could be extracted into a utility class called DockerUtils, since all this code pertains to docker. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26121) Hive transaction rollback should be thread-safe
Denys Kuzmenko created HIVE-26121: - Summary: Hive transaction rollback should be thread-safe Key: HIVE-26121 URL: https://issues.apache.org/jira/browse/HIVE-26121 Project: Hive Issue Type: Task Reporter: Denys Kuzmenko -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (HIVE-26120) beeline return 0 when Could not open connection to the HS2 server
MK created HIVE-26120: - Summary: beeline return 0 when Could not open connection to the HS2 server Key: HIVE-26120 URL: https://issues.apache.org/jira/browse/HIVE-26120 Project: Hive Issue Type: Bug Components: Beeline Reporter: MK When executing: beeline -u 'jdbc:hive2://bigdata-hs111:10003' -n 'etl' -p '**' -f /opt/project/DWD/SPD/xxx.sql and bigdata-hs111 doesn't exist or can't be connected to, the command's return code is 0, NOT a non-zero value. SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/data/programs/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.17.0.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/data/programs/hadoop-3.1.4/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Connecting to jdbc:hive2://bigdata-hs111:10003 2022-04-06T17:28:04,247 WARN [main] org.apache.hive.jdbc.Utils - Could not retrieve canonical hostname for bigdata-hs111 java.net.UnknownHostException: bigdata-hs111: Name or service not known at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[?:1.8.0_191] at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) ~[?:1.8.0_191] at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) ~[?:1.8.0_191] at java.net.InetAddress.getAllByName0(InetAddress.java:1277) ~[?:1.8.0_191] at java.net.InetAddress.getAllByName(InetAddress.java:1193) ~[?:1.8.0_191] at java.net.InetAddress.getAllByName(InetAddress.java:1127) ~[?:1.8.0_191] at java.net.InetAddress.getByName(InetAddress.java:1077) ~[?:1.8.0_191] at org.apache.hive.jdbc.Utils.getCanonicalHostName(Utils.java:701) [hive-jdbc-3.1.2.jar:3.1.2] at org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:178) [hive-jdbc-3.1.2.jar:3.1.2] at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107) 
[hive-jdbc-3.1.2.jar:3.1.2] at java.sql.DriverManager.getConnection(DriverManager.java:664) [?:1.8.0_191] at java.sql.DriverManager.getConnection(DriverManager.java:208) [?:1.8.0_191] at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:209) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.Commands.connect(Commands.java:1641) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.Commands.connect(Commands.java:1536) [hive-beeline-3.1.2.jar:3.1.2] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_191] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_191] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_191] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_191] at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:56) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1384) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1423) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.BeeLine.connectUsingArgs(BeeLine.java:900) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:795) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1048) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:538) [hive-beeline-3.1.2.jar:3.1.2] at org.apache.hive.beeline.BeeLine.main(BeeLine.java:520) [hive-beeline-3.1.2.jar:3.1.2] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_191] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_191] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_191] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_191] at org.apache.hadoop.util.RunJar.run(RunJar.java:318) [hadoop-common-3.1.4.jar:?] at org.apache.hadoop.util.RunJar.main(RunJar.java:232) [hadoop-common-3.1.4.jar:?] 2022-04-06T17:28:04,335 WARN [main] org.apache.hive.jdbc.HiveConnection - Failed to connect to bigdata-hs111:10003 Could not open connection to the HS2 server. Please check the server URI and if the URI is correct, then ask the administrator to check the server status
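The expected behaviour can be sketched in plain Python (hypothetical code, not beeline's implementation): a CLI should surface a failed connection as a non-zero exit status so that shell scripts and schedulers can detect it.

```python
import sys

def run_cli(connect):
    """Run a connect step and translate failure into a non-zero exit code."""
    try:
        connect()
    except ConnectionError as e:
        print("Could not open connection: %s" % e, file=sys.stderr)
        return 1          # non-zero: callers can see the failure
    return 0

def unreachable():
    raise ConnectionError("bigdata-hs111: Name or service not known")

exit_code = run_cli(unreachable)
print(exit_code)  # 1
```

In a real CLI the return value would be passed to sys.exit(), so `echo $?` after the command would show the failure.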
[jira] [Created] (HIVE-26119) Remove unnecessary Exceptions from DDLPlanUtils
Soumyakanti Das created HIVE-26119: -- Summary: Remove unnecessary Exceptions from DDLPlanUtils Key: HIVE-26119 URL: https://issues.apache.org/jira/browse/HIVE-26119 Project: Hive Issue Type: Improvement Reporter: Soumyakanti Das Assignee: Soumyakanti Das There are a few {{HiveException}}s which were added to methods like {{getCreateTableCommand}}, {{getColumns}}, {{formatType}}, etc., which can be removed. Some related methods in {{ExplainTask}} can also be cleaned up. -- This message was sent by Atlassian Jira (v8.20.1#820001)