[jira] [Created] (HIVE-21686) Brute Force eviction can lead to a random uncontrolled eviction pattern.
slim bouguerra created HIVE-21686: - Summary: Brute Force eviction can lead to a random uncontrolled eviction pattern. Key: HIVE-21686 URL: https://issues.apache.org/jira/browse/HIVE-21686 Project: Hive Issue Type: Bug Reporter: slim bouguerra Assignee: slim bouguerra The current logic used by brute-force eviction can lead to a perpetual random eviction pattern. For instance, if the cache builds a small pocket of free memory whose total size is greater than the incoming allocation request, the allocator will randomly evict blocks that fit a particular size. This can happen over and over, so all evictions become random. In addition, this random eviction leads to a leak in the linked list maintained by the policy, since the policy no longer knows what has been evicted and what has not. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
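The bookkeeping leak described above can be sketched as a toy model. This is a hypothetical illustration, not Hive's actual LLAP allocator or policy classes: a cache whose eviction policy tracks blocks in its own linked list, while brute-force eviction frees blocks directly without telling the policy, leaving stale entries behind.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch (not Hive's classes): the policy keeps its own linked
// list of cached blocks; brute-force eviction bypasses it entirely.
public class BruteForceEvictionSketch {
    final Set<String> cachedBlocks = new HashSet<>();
    final Deque<String> policyList = new ArrayDeque<>(); // order maintained by the policy

    void cacheBlock(String block) {
        cachedBlocks.add(block);
        policyList.addLast(block); // the policy is told about every insert
    }

    // Brute-force eviction: frees the block but never informs the policy,
    // so the policy's list accumulates entries for blocks that are gone.
    void bruteForceEvict(String block) {
        cachedBlocks.remove(block);
        // policyList still references `block` -- the leak from the report
    }

    public static void main(String[] args) {
        BruteForceEvictionSketch cache = new BruteForceEvictionSketch();
        cache.cacheBlock("b1");
        cache.cacheBlock("b2");
        cache.bruteForceEvict("b1");
        if (cache.cachedBlocks.contains("b1")) throw new AssertionError();
        // The stale entry: the policy still believes b1 is cached.
        if (!cache.policyList.contains("b1")) throw new AssertionError();
        System.out.println("policy list leaked an entry for evicted block b1");
    }
}
```

A fix would route every eviction through the policy so its list stays consistent with the cache contents.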
[jira] [Created] (HIVE-21685) Wrong simplification in query with multiple IN clauses
Jesus Camacho Rodriguez created HIVE-21685: -- Summary: Wrong simplification in query with multiple IN clauses Key: HIVE-21685 URL: https://issues.apache.org/jira/browse/HIVE-21685 Project: Hive Issue Type: Bug Components: CBO Reporter: Oliver Draese Assignee: Jesus Camacho Rodriguez Simple test to reproduce: {code} select * from table1 where name IN('g','r') AND name IN('a','b'); {code}
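For reference, the expected semantics of the conjunction: a row satisfies both IN clauses only if its value lies in the intersection of the two value lists. Here {'g','r'} and {'a','b'} are disjoint, so the predicate can only correctly simplify to FALSE, never to either list alone. A minimal sketch of that set reasoning (illustrative, not the CBO's simplification code):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch: AND of two IN lists is membership in their intersection.
public class InClauseIntersection {
    static Set<String> intersect(Set<String> a, Set<String> b) {
        Set<String> out = new HashSet<>(a);
        out.retainAll(b); // keep only values present in both lists
        return out;
    }

    public static void main(String[] args) {
        Set<String> first = new HashSet<>(Arrays.asList("g", "r"));
        Set<String> second = new HashSet<>(Arrays.asList("a", "b"));
        Set<String> allowed = intersect(first, second);
        if (!allowed.isEmpty()) throw new AssertionError();
        System.out.println("IN('g','r') AND IN('a','b') can only simplify to FALSE");
    }
}
```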
[jira] [Created] (HIVE-21684) tmp table space directory should be removed on session close
Rajesh Balamohan created HIVE-21684: --- Summary: tmp table space directory should be removed on session close Key: HIVE-21684 URL: https://issues.apache.org/jira/browse/HIVE-21684 Project: Hive Issue Type: Bug Reporter: Rajesh Balamohan `_tmp_space.db` folder should be deleted on session close. {noformat} org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException): The directory item limit of... {noformat}
[jira] [Created] (HIVE-21683) ProxyFileSystem breaks with Hadoop trunk
Todd Lipcon created HIVE-21683: -- Summary: ProxyFileSystem breaks with Hadoop trunk Key: HIVE-21683 URL: https://issues.apache.org/jira/browse/HIVE-21683 Project: Hive Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon When trying to run with a recent build of Hadoop which includes HADOOP-15229 I ran into the following stack:
{code}
Caused by: java.lang.IllegalArgumentException: Wrong FS: pfile:/src/hive/itests/qtest/target/warehouse/src/kv1.txt, expected: file:///
 at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:793) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.ChecksumFileSystem.lambda$openFileWithOptions$0(ChecksumFileSystem.java:846) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.ChecksumFileSystem.openFileWithOptions(ChecksumFileSystem.java:845) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4522) ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
 at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:115) ~[hadoop-mapreduce-client-core-3.1.1.6.0.99.0-135.jar:?]
{code}
We need to add appropriate path-swizzling wrappers for the new APIs in ProxyFileSystem23.
[jira] [Created] (HIVE-21682) Concurrent queries in tez local mode fail
Todd Lipcon created HIVE-21682: -- Summary: Concurrent queries in tez local mode fail Key: HIVE-21682 URL: https://issues.apache.org/jira/browse/HIVE-21682 Project: Hive Issue Type: Bug Reporter: Todd Lipcon Assignee: Todd Lipcon As noted in TEZ-3420, Hive running with Tez local mode breaks if multiple queries are submitted concurrently. As I noted [there|https://issues.apache.org/jira/browse/TEZ-3420?focusedCommentId=16831937&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16831937], it seems part of the problem is Hive's use of static global state for IOContext in the case of Tez. Another issue is the use of a JVM-wide ObjectRegistry.
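Why static global state breaks under concurrent queries can be shown with a toy model. This is a hypothetical illustration, not Hive's actual IOContext: with a single JVM-wide field, the last-writing query clobbers the other's state, whereas a ThreadLocal keeps each query's state isolated (the general direction a fix would take).

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: two "queries" each record their state, wait until
// both writes have happened, then read it back.
public class StaticStateSketch {
    static volatile String sharedState;                        // JVM-wide: last writer wins
    static final ThreadLocal<String> perQueryState = new ThreadLocal<>();

    // Returns true only if both threads still see the value they wrote.
    static boolean runQueries(final boolean useThreadLocal) throws InterruptedException {
        final boolean[] sawOwnValue = new boolean[2];
        final CountDownLatch bothWrote = new CountDownLatch(2);
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                String mine = "query-" + id;
                if (useThreadLocal) perQueryState.set(mine); else sharedState = mine;
                bothWrote.countDown();
                try { bothWrote.await(); } catch (InterruptedException e) { return; }
                String seen = useThreadLocal ? perQueryState.get() : sharedState;
                sawOwnValue[id] = mine.equals(seen);
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        return sawOwnValue[0] && sawOwnValue[1];
    }

    public static void main(String[] args) throws InterruptedException {
        if (!runQueries(true)) throw new AssertionError("ThreadLocal should isolate state");
        if (runQueries(false)) throw new AssertionError("static field should be clobbered");
        System.out.println("static state is clobbered across queries; ThreadLocal stays isolated");
    }
}
```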
[jira] [Created] (HIVE-21681) Describe formatted shows incorrect information for multiple primary keys
Adam Szita created HIVE-21681: - Summary: Describe formatted shows incorrect information for multiple primary keys Key: HIVE-21681 URL: https://issues.apache.org/jira/browse/HIVE-21681 Project: Hive Issue Type: Bug Reporter: Adam Szita Assignee: Adam Szita For tables with a primary key spanning multiple columns, 'describe formatted' shows at most two column names. In the three-column ASCII art table it will show: {{Column name|p1|p2}} Example queries:
{code:java}
CREATE TABLE test (
  p1 string,
  p2 string,
  p3 string,
  c0 int,
  PRIMARY KEY(p1,p2,p3) DISABLE NOVALIDATE
);
describe formatted test;
{code}
I propose we fix this so that primary key columns get listed one by one in separate rows, similar to how foreign keys are listed.
[jira] [Created] (HIVE-21680) Backport HIVE-17644 to branch-2 and branch-2.3
Yuming Wang created HIVE-21680: -- Summary: Backport HIVE-17644 to branch-2 and branch-2.3 Key: HIVE-21680 URL: https://issues.apache.org/jira/browse/HIVE-21680 Project: Hive Issue Type: Bug Reporter: Yuming Wang Assignee: Yuming Wang
{code:scala}
test("get statistics when not analyzed in Hive or Spark") {
  val tabName = "tab1"
  withTable(tabName) {
    createNonPartitionedTable(tabName, analyzedByHive = false, analyzedBySpark = false)
    checkTableStats(tabName, hasSizeInBytes = true, expectedRowCounts = None)

    // ALTER TABLE SET TBLPROPERTIES invalidates some contents of Hive specific statistics
    // This is triggered by the Hive alterTable API
    val describeResult = hiveClient.runSqlHive(s"DESCRIBE FORMATTED $tabName")

    val rawDataSize = extractStatsPropValues(describeResult, "rawDataSize")
    val numRows = extractStatsPropValues(describeResult, "numRows")
    val totalSize = extractStatsPropValues(describeResult, "totalSize")
    assert(rawDataSize.isEmpty, "rawDataSize should not be shown without table analysis")
    assert(numRows.isEmpty, "numRows should not be shown without table analysis")
    assert(totalSize.isDefined && totalSize.get > 0, "totalSize is lost")
  }
}
// https://github.com/apache/spark/blob/43dcb91a4cb25aa7e1cc5967194f098029a0361e/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala#L789-L806
{code}
{noformat} 06:23:46.103 WARN org.apache.hadoop.hive.metastore.MetaStoreDirectSql: Failed to execute [SELECT "DBS"."NAME", "TBLS"."TBL_NAME", "COLUMNS_V2"."COLUMN_NAME","KEY_CONSTRAINTS"."POSITION", "KEY_CONSTRAINTS"."CONSTRAINT_NAME", "KEY_CONSTRAINTS"."ENABLE_VALIDATE_RELY" FROM "TBLS" INNER JOIN "KEY_CONSTRAINTS" ON "TBLS"."TBL_ID" = "KEY_CONSTRAINTS"."PARENT_TBL_ID" INNER JOIN "DBS" ON "TBLS"."DB_ID" = "DBS"."DB_ID" INNER JOIN "COLUMNS_V2" ON "COLUMNS_V2"."CD_ID" = "KEY_CONSTRAINTS"."PARENT_CD_ID" AND "COLUMNS_V2"."INTEGER_IDX" = "KEY_CONSTRAINTS"."PARENT_INTEGER_IDX" WHERE "KEY_CONSTRAINTS"."CONSTRAINT_TYPE" = 0 AND "DBS"."NAME" = ?
AND "TBLS"."TBL_NAME" = ?] with parameters [default, tab1] javax.jdo.JDODataStoreException: Error executing SQL query "SELECT "DBS"."NAME", "TBLS"."TBL_NAME", "COLUMNS_V2"."COLUMN_NAME","KEY_CONSTRAINTS"."POSITION", "KEY_CONSTRAINTS"."CONSTRAINT_NAME", "KEY_CONSTRAINTS"."ENABLE_VALIDATE_RELY" FROM "TBLS" INNER JOIN "KEY_CONSTRAINTS" ON "TBLS"."TBL_ID" = "KEY_CONSTRAINTS"."PARENT_TBL_ID" INNER JOIN "DBS" ON "TBLS"."DB_ID" = "DBS"."DB_ID" INNER JOIN "COLUMNS_V2" ON "COLUMNS_V2"."CD_ID" = "KEY_CONSTRAINTS"."PARENT_CD_ID" AND "COLUMNS_V2"."INTEGER_IDX" = "KEY_CONSTRAINTS"."PARENT_INTEGER_IDX" WHERE "KEY_CONSTRAINTS"."CONSTRAINT_TYPE" = 0 AND "DBS"."NAME" = ? AND "TBLS"."TBL_NAME" = ?". at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543) at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:1750) at org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPrimaryKeys(MetaStoreDirectSql.java:1939) at org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:8213) at org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:8209) at org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2719) at org.apache.hadoop.hive.metastore.ObjectStore.getPrimaryKeysInternal(ObjectStore.java:8221) at org.apache.hadoop.hive.metastore.ObjectStore.getPrimaryKeys(ObjectStore.java:8199) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101) at com.sun.proxy.$Proxy24.getPrimaryKeys(Unknown 
Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_primary_keys(HiveMetaStore.java:6830) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148) at
[jira] [Created] (HIVE-21679) Replicating a CTAS event creating an MM partitioned table fails
Ashutosh Bapat created HIVE-21679: - Summary: Replicating a CTAS event creating an MM partitioned table fails Key: HIVE-21679 URL: https://issues.apache.org/jira/browse/HIVE-21679 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 4.0.0 Reporter: Ashutosh Bapat Steps to reproduce:
{code:sql}
use dumpdb;
create table t1 (a int, b int);
insert into t1 values (1, 2), (3, 4);
create table t6_mm_part partitioned by (a) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only") as select * from t1;
create table t6_mm stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only") as select * from t1;
repl dump dumpdb;
create table t6_mm_part_2 partitioned by (a) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only") as select * from t1;
create table t6_mm_2 partitioned by (a) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only") as select * from t1;
repl dump dumpdb from
repl load loaddb from '/tmp/dump/next';
{code}
ERROR : failed replication org.apache.hadoop.hive.ql.parse.SemanticException: Invalid table name loaddb.dumpdb.t6_mm_part_2 at org.apache.hadoop.hive.ql.exec.Utilities.getDbTableName(Utilities.java:2253) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Utilities.getDbTableName(Utilities.java:2239) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.plan.AlterTableDesc.setOldName(AlterTableDesc.java:419) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.tableUpdateReplStateTask(IncrementalLoadTasksBuilder.java:286) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.addUpdateReplStateTasks(IncrementalLoadTasksBuilder.java:371) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.analyzeEventLoad(IncrementalLoadTasksBuilder.java:244) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.build(IncrementalLoadTasksBuilder.java:139) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.repl.ReplLoadTask.executeIncrementalLoad(ReplLoadTask.java:488) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.repl.ReplLoadTask.execute(ReplLoadTask.java:102) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162) ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:233) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:88) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:332) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at 
java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_191] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_191] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688) ~[hadoop-common-3.1.0.3.0.0.0-1634.jar:?] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:350) ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_191] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_191] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_191] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_191] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191] ERROR : FAILED: Execution Error, return
[jira] [Created] (HIVE-21678) CTAS creating a partitioned table fails because of no writeId
Ashutosh Bapat created HIVE-21678: - Summary: CTAS creating a partitioned table fails because of no writeId Key: HIVE-21678 URL: https://issues.apache.org/jira/browse/HIVE-21678 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 4.0.0 Reporter: Ashutosh Bapat Steps to reproduce:
{code:sql}
create table t1(a int, b int);
insert into t1 values (1, 2), (3, 4);
create table t6_part partitioned by (a) stored as orc tblproperties ("transactional"="true") as select * from t1;
{code}
{noformat}
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. MoveTask : Write id is not set in the config by open txn task for migration
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. MoveTask : Write id is not set in the config by open txn task for migration (state=08S01,code=1)
{noformat}
[jira] [Created] (HIVE-21677) Using strict managed tables for ACID table testing (Replication tests)
Ashutosh Bapat created HIVE-21677: - Summary: Using strict managed tables for ACID table testing (Replication tests) Key: HIVE-21677 URL: https://issues.apache.org/jira/browse/HIVE-21677 Project: Hive Issue Type: Bug Components: HiveServer2, repl Affects Versions: 4.0.0 Reporter: Ashutosh Bapat The replication tests which exclusively test ACID table replication add transactional properties to the create table/alter table statements when creating the table. Instead, they should use hive.strict.managed.tables = true in those tests. Tests derived from BaseReplicationScenariosAcidTables and org.apache.hadoop.hive.ql.parse.TestReplicationScenariosIncrementalLoadAcidTables are examples of those. Change all such tests to use hive.strict.managed.tables = true. Some of these tests create non-ACID tables for testing, which will then require an explicit 'transactional'='false' to be set when creating the tables. With this change we might see some test failures (see subtasks). Please create subtasks for those so that they can be tracked within this JIRA.