[jira] [Created] (HIVE-21686) Brute Force eviction can lead to a random uncontrolled eviction pattern.

2019-05-02 Thread slim bouguerra (JIRA)
slim bouguerra created HIVE-21686:
-

 Summary: Brute Force eviction can lead to a random uncontrolled 
eviction pattern.
 Key: HIVE-21686
 URL: https://issues.apache.org/jira/browse/HIVE-21686
 Project: Hive
  Issue Type: Bug
Reporter: slim bouguerra
Assignee: slim bouguerra


The current brute-force eviction logic can lead to a perpetual, random eviction pattern.
For instance, if the cache builds up a small pocket of free memory whose total size is greater than the incoming allocation request, the allocator will randomly evict a block that fits that particular size. This can happen over and over, so every eviction ends up being random.
In addition, this random eviction leaks entries in the linked list maintained by the eviction policy, since the policy no longer knows what has been evicted and what has not.
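
For illustration, a minimal standalone sketch (hypothetical names, not the LLAP allocator itself) of how size-based eviction done behind the policy's back desynchronizes the policy's bookkeeping:
{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Random;

public class BruteForceEvictionSketch {
  static final class Block {
    final int id;
    final int size;
    Block(int id, int size) { this.id = id; this.size = size; }
  }

  public static void main(String[] args) {
    Deque<Block> policyLru = new ArrayDeque<>(); // what the eviction policy believes is cached
    List<Block> cache = new ArrayList<>();       // what is actually cached
    for (int i = 0; i < 8; i++) {
      Block b = new Block(i, 64);
      policyLru.addLast(b);
      cache.add(b);
    }

    // Brute-force eviction: free any block that fits the incoming request,
    // effectively at random, without notifying the policy.
    Random rnd = new Random(42);
    Block victim = cache.remove(rnd.nextInt(cache.size()));
    System.out.println("evicted block " + victim.id + " behind the policy's back");

    // The policy still tracks the victim, so its linked list now "leaks" a
    // stale entry and every future decision is based on wrong state.
    System.out.println("policy tracks " + policyLru.size()
        + " blocks, cache holds " + cache.size());
  }
}
{code}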




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21685) Wrong simplification in query with multiple IN clauses

2019-05-02 Thread Jesus Camacho Rodriguez (JIRA)
Jesus Camacho Rodriguez created HIVE-21685:
--

 Summary: Wrong simplification in query with multiple IN clauses
 Key: HIVE-21685
 URL: https://issues.apache.org/jira/browse/HIVE-21685
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Oliver Draese
Assignee: Jesus Camacho Rodriguez


Simple test to reproduce (the two IN lists are disjoint, so the predicate should simplify to false and the query should return no rows):
{code}
select * from table1 where name IN('g','r') AND name IN('a','b');
{code}
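
For reference, a small sketch (not Calcite's simplifier) of the set semantics the simplification should respect: conjoined IN lists intersect, and a disjoint intersection folds to FALSE.
{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class InClauseIntersectionSketch {
  public static void main(String[] args) {
    // name IN ('g','r') AND name IN ('a','b')  ==  name IN (intersection)
    Set<String> first = new HashSet<>(List.of("g", "r"));
    Set<String> second = new HashSet<>(List.of("a", "b"));
    first.retainAll(second); // intersection of the two IN lists
    System.out.println(first.isEmpty()
        ? "predicate simplifies to FALSE (no rows)"
        : "predicate simplifies to name IN " + first);
  }
}
{code}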



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21684) tmp table space directory should be removed on session close

2019-05-02 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-21684:
---

 Summary: tmp table space directory should be removed on session 
close
 Key: HIVE-21684
 URL: https://issues.apache.org/jira/browse/HIVE-21684
 Project: Hive
  Issue Type: Bug
Reporter: Rajesh Balamohan


The `_tmp_space.db` folder should be deleted on session close; otherwise leftover directories accumulate until HDFS's directory item limit is exceeded:

{noformat}
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
 The directory item limit of...
{noformat}
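
For illustration, a hedged sketch (not the actual Hive session code; the directory argument is a stand-in for the session's tmp table space path) of the cleanup a session-close hook could perform with the Hadoop FileSystem API:
{code:java}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public final class TmpSpaceCleanupSketch {
  /** Recursively delete the session's tmp table space directory, if present. */
  public static void deleteTmpSpace(Configuration conf, Path tmpSpaceDir) throws IOException {
    FileSystem fs = tmpSpaceDir.getFileSystem(conf);
    if (fs.exists(tmpSpaceDir)) {
      fs.delete(tmpSpaceDir, true /* recursive */);
    }
  }
}
{code}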



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21683) ProxyFileSystem breaks with Hadoop trunk

2019-05-02 Thread Todd Lipcon (JIRA)
Todd Lipcon created HIVE-21683:
--

 Summary: ProxyFileSystem breaks with Hadoop trunk
 Key: HIVE-21683
 URL: https://issues.apache.org/jira/browse/HIVE-21683
 Project: Hive
  Issue Type: Bug
Reporter: Todd Lipcon
Assignee: Todd Lipcon


When trying to run with a recent build of Hadoop that includes HADOOP-15229, I ran into the following stack trace:
{code}
Caused by: java.lang.IllegalArgumentException: Wrong FS: 
pfile:/src/hive/itests/qtest/target/warehouse/src/kv1.txt, expected: file:///
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:793) 
~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86) 
~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
 ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
 ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
 ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456) 
~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
 ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354) 
~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.ChecksumFileSystem.lambda$openFileWithOptions$0(ChecksumFileSystem.java:846)
 ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52) 
~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.ChecksumFileSystem.openFileWithOptions(ChecksumFileSystem.java:845)
 ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4522)
 ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
at 
org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:115) 
~[hadoop-mapreduce-client-core-3.1.1.6.0.99.0-135.jar:?]{code}

We need to add the appropriate path-swizzling wrappers for the new APIs in 
ProxyFileSystem23; see the sketch below.
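
For illustration, a sketch of one such wrapper. The exact openFileWithOptions signature is an assumption here (it has changed across Hadoop versions since HADOOP-15229), but the pattern is the same path swizzling the existing ProxyFileSystem overrides already do:
{code:java}
// Inside ProxyFileSystem23; this signature follows one HADOOP-15229-era
// shape and may differ on current Hadoop trunk.
@Override
protected CompletableFuture<FSDataInputStream> openFileWithOptions(
    Path path, Set<String> mandatoryKeys, Configuration options, int bufferSize)
    throws IOException {
  // Swizzle pfile:/... back to the underlying scheme before delegating,
  // just like the existing open()/getFileStatus() wrappers.
  return super.openFileWithOptions(swizzleParamPath(path), mandatoryKeys, options, bufferSize);
}
{code}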



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21682) Concurrent queries in tez local mode fail

2019-05-02 Thread Todd Lipcon (JIRA)
Todd Lipcon created HIVE-21682:
--

 Summary: Concurrent queries in tez local mode fail
 Key: HIVE-21682
 URL: https://issues.apache.org/jira/browse/HIVE-21682
 Project: Hive
  Issue Type: Bug
Reporter: Todd Lipcon
Assignee: Todd Lipcon


As noted in TEZ-3420, Hive running with Tez local mode breaks if multiple 
queries are submitted concurrently. As I noted 
[there|https://issues.apache.org/jira/browse/TEZ-3420?focusedCommentId=16831937&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16831937], 
part of the problem seems to be Hive's use of static global state for 
IOContext in the case of Tez. Another issue is the use of a JVM-wide 
ObjectRegistry.
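
To make the hazard concrete, a minimal standalone sketch (hypothetical names, not Hive's IOContext) of why JVM-wide mutable state breaks two concurrent queries:
{code:java}
public class GlobalStateHazardSketch {
  // One slot for the whole JVM, like a static IOContext/ObjectRegistry.
  private static volatile String currentQuery;

  public static void main(String[] args) throws InterruptedException {
    Runnable query = () -> {
      String mine = Thread.currentThread().getName();
      currentQuery = mine; // this "query" publishes its state globally...
      try { Thread.sleep(10); } catch (InterruptedException ignored) { }
      // ...and may read back state written by the other concurrent query.
      if (!mine.equals(currentQuery)) {
        System.out.println(mine + " saw state belonging to " + currentQuery);
      }
    };
    Thread a = new Thread(query, "query-A");
    Thread b = new Thread(query, "query-B");
    a.start(); b.start();
    a.join(); b.join();
  }
}
{code}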



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21681) Describe formatted shows incorrect information for multiple primary keys

2019-05-02 Thread Adam Szita (JIRA)
Adam Szita created HIVE-21681:
-

 Summary: Describe formatted shows incorrect information for 
multiple primary keys
 Key: HIVE-21681
 URL: https://issues.apache.org/jira/browse/HIVE-21681
 Project: Hive
  Issue Type: Bug
Reporter: Adam Szita
Assignee: Adam Szita


In tables with a primary key spanning multiple columns, 'describe formatted' 
shows a maximum of two column names. For the three-column key in the example 
below, the ASCII-art table shows only:

{{Column name|p1|p2}}

Example queries:
{code:java}
CREATE TABLE test (
  p1 string,
  p2 string,
  p3 string,
  c0 int,
  PRIMARY KEY(p1,p2,p3) DISABLE NOVALIDATE
);
describe formatted test;{code}
I propose we fix this so that primary key columns get listed one by one in 
separate rows, similar to how foreign keys are listed.
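
For illustration only, one possible layout (exact formatting open to discussion):
{noformat}
Column Name         	p1
                    	p2
                    	p3
{noformat}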



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21680) Backport HIVE-17644 to branch-2 and branch-2.3

2019-05-02 Thread Yuming Wang (JIRA)
Yuming Wang created HIVE-21680:
--

 Summary: Backport HIVE-17644 to branch-2 and branch-2.3
 Key: HIVE-21680
 URL: https://issues.apache.org/jira/browse/HIVE-21680
 Project: Hive
  Issue Type: Bug
Reporter: Yuming Wang
Assignee: Yuming Wang



{code:scala}
  test("get statistics when not analyzed in Hive or Spark") {
    val tabName = "tab1"
    withTable(tabName) {
      createNonPartitionedTable(tabName, analyzedByHive = false, analyzedBySpark = false)
      checkTableStats(tabName, hasSizeInBytes = true, expectedRowCounts = None)

      // ALTER TABLE SET TBLPROPERTIES invalidates some contents of Hive specific statistics
      // This is triggered by the Hive alterTable API
      val describeResult = hiveClient.runSqlHive(s"DESCRIBE FORMATTED $tabName")

      val rawDataSize = extractStatsPropValues(describeResult, "rawDataSize")
      val numRows = extractStatsPropValues(describeResult, "numRows")
      val totalSize = extractStatsPropValues(describeResult, "totalSize")
      assert(rawDataSize.isEmpty, "rawDataSize should not be shown without table analysis")
      assert(numRows.isEmpty, "numRows should not be shown without table analysis")
      assert(totalSize.isDefined && totalSize.get > 0, "totalSize is lost")
    }
  }
// https://github.com/apache/spark/blob/43dcb91a4cb25aa7e1cc5967194f098029a0361e/sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala#L789-L806
{code}

{noformat}
06:23:46.103 WARN org.apache.hadoop.hive.metastore.MetaStoreDirectSql: Failed 
to execute [SELECT "DBS"."NAME", "TBLS"."TBL_NAME", 
"COLUMNS_V2"."COLUMN_NAME","KEY_CONSTRAINTS"."POSITION", 
"KEY_CONSTRAINTS"."CONSTRAINT_NAME", "KEY_CONSTRAINTS"."ENABLE_VALIDATE_RELY"  
FROM  "TBLS"  INNER  JOIN "KEY_CONSTRAINTS" ON "TBLS"."TBL_ID" = 
"KEY_CONSTRAINTS"."PARENT_TBL_ID"  INNER JOIN "DBS" ON "TBLS"."DB_ID" = 
"DBS"."DB_ID"  INNER JOIN "COLUMNS_V2" ON "COLUMNS_V2"."CD_ID" = 
"KEY_CONSTRAINTS"."PARENT_CD_ID" AND  "COLUMNS_V2"."INTEGER_IDX" = 
"KEY_CONSTRAINTS"."PARENT_INTEGER_IDX"  WHERE 
"KEY_CONSTRAINTS"."CONSTRAINT_TYPE" = 0 AND "DBS"."NAME" = ? AND 
"TBLS"."TBL_NAME" = ?] with parameters [default, tab1]
javax.jdo.JDODataStoreException: Error executing SQL query "SELECT 
"DBS"."NAME", "TBLS"."TBL_NAME", 
"COLUMNS_V2"."COLUMN_NAME","KEY_CONSTRAINTS"."POSITION", 
"KEY_CONSTRAINTS"."CONSTRAINT_NAME", "KEY_CONSTRAINTS"."ENABLE_VALIDATE_RELY"  
FROM  "TBLS"  INNER  JOIN "KEY_CONSTRAINTS" ON "TBLS"."TBL_ID" = 
"KEY_CONSTRAINTS"."PARENT_TBL_ID"  INNER JOIN "DBS" ON "TBLS"."DB_ID" = 
"DBS"."DB_ID"  INNER JOIN "COLUMNS_V2" ON "COLUMNS_V2"."CD_ID" = 
"KEY_CONSTRAINTS"."PARENT_CD_ID" AND  "COLUMNS_V2"."INTEGER_IDX" = 
"KEY_CONSTRAINTS"."PARENT_INTEGER_IDX"  WHERE 
"KEY_CONSTRAINTS"."CONSTRAINT_TYPE" = 0 AND "DBS"."NAME" = ? AND 
"TBLS"."TBL_NAME" = ?".
at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391)
at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:1750)
at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPrimaryKeys(MetaStoreDirectSql.java:1939)
at 
org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:8213)
at 
org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:8209)
at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:2719)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPrimaryKeysInternal(ObjectStore.java:8221)
at 
org.apache.hadoop.hive.metastore.ObjectStore.getPrimaryKeys(ObjectStore.java:8199)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
at com.sun.proxy.$Proxy24.getPrimaryKeys(Unknown Source)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_primary_keys(HiveMetaStore.java:6830)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
at ...
{noformat}

[jira] [Created] (HIVE-21679) Replicating a CTAS event creating an MM partitioned table fails

2019-05-02 Thread Ashutosh Bapat (JIRA)
Ashutosh Bapat created HIVE-21679:
-

 Summary: Replicating a CTAS event creating an MM partitioned table 
fails
 Key: HIVE-21679
 URL: https://issues.apache.org/jira/browse/HIVE-21679
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2, repl
Affects Versions: 4.0.0
Reporter: Ashutosh Bapat


{code:sql}
use dumpdb;
create table t1 (a int, b int);
insert into t1 values (1, 2), (3, 4);
create table t6_mm_part partitioned by (a) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only") as select * from t1;
create table t6_mm stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only") as select * from t1;
repl dump dumpdb;
create table t6_mm_part_2 partitioned by (a) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only") as select * from t1;
create table t6_mm_2 partitioned by (a) stored as orc tblproperties ("transactional"="true", "transactional_properties"="insert_only") as select * from t1;
repl dump dumpdb from ...
repl load loaddb from '/tmp/dump/next';
{code}
{noformat}
ERROR : failed replication
org.apache.hadoop.hive.ql.parse.SemanticException: Invalid table name 
loaddb.dumpdb.t6_mm_part_2
 at 
org.apache.hadoop.hive.ql.exec.Utilities.getDbTableName(Utilities.java:2253) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.Utilities.getDbTableName(Utilities.java:2239) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.plan.AlterTableDesc.setOldName(AlterTableDesc.java:419)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.tableUpdateReplStateTask(IncrementalLoadTasksBuilder.java:286)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.addUpdateReplStateTasks(IncrementalLoadTasksBuilder.java:371)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.analyzeEventLoad(IncrementalLoadTasksBuilder.java:244)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.repl.incremental.IncrementalLoadTasksBuilder.build(IncrementalLoadTasksBuilder.java:139)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.repl.ReplLoadTask.executeIncrementalLoad(ReplLoadTask.java:488)
 ~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hadoop.hive.ql.exec.repl.ReplLoadTask.execute(ReplLoadTask.java:102) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162) 
~[hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:233)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hive.service.cli.operation.SQLOperation.access$600(SQLOperation.java:88)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:332)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_191]
 at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_191]
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1688)
 ~[hadoop-common-3.1.0.3.0.0.0-1634.jar:?]
 at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:350)
 ~[hive-service-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
~[?:1.8.0_191]
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_191]
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
~[?:1.8.0_191]
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
~[?:1.8.0_191]
 at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
ERROR : FAILED: Execution Error, return ...
{noformat}

[jira] [Created] (HIVE-21678) CTAS creating a partitioned table fails because of no writeId

2019-05-02 Thread Ashutosh Bapat (JIRA)
Ashutosh Bapat created HIVE-21678:
-

 Summary: CTAS creating a partitioned table fails because of no 
writeId
 Key: HIVE-21678
 URL: https://issues.apache.org/jira/browse/HIVE-21678
 Project: Hive
  Issue Type: Sub-task
  Components: HiveServer2, repl
Affects Versions: 4.0.0
Reporter: Ashutosh Bapat


{code:sql}
create table t1 (a int, b int);
insert into t1 values (1, 2), (3, 4);
create table t6_part partitioned by (a) stored as orc tblproperties ("transactional"="true") as select * from t1;
{code}

{noformat}
ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. MoveTask : Write id is not set in the config by open txn task for migration
Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. MoveTask : Write id is not set in the config by open txn task for migration (state=08S01,code=1)
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21677) Using strict managed tables for ACID table testing (Replication tests)

2019-05-02 Thread Ashutosh Bapat (JIRA)
Ashutosh Bapat created HIVE-21677:
-

 Summary: Using strict managed tables for ACID table testing 
(Replication tests)
 Key: HIVE-21677
 URL: https://issues.apache.org/jira/browse/HIVE-21677
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, repl
Affects Versions: 4.0.0
Reporter: Ashutosh Bapat


The replication tests that exclusively test ACID table replication add 
transactional properties to the create table/alter table statements when 
creating tables. Instead, those tests should set hive.strict.managed.tables = 
true; tests derived from BaseReplicationScenariosAcidTables and 
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosIncrementalLoadAcidTables 
are examples. Change all such tests to use hive.strict.managed.tables = true. 
Some of these tests create non-ACID tables, which will then require an 
explicit 'transactional'='false' when creating the tables.
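
A minimal sketch of that setup, with hypothetical names ("runSql" stands in for whatever helper the test harness provides):
{code:java}
import java.util.function.Consumer;

import org.apache.hadoop.hive.conf.HiveConf;

public class StrictManagedTestSetupSketch {
  static void configure(HiveConf conf) {
    // Drive ACID behavior through the global switch instead of repeating
    // transactional properties in every CREATE TABLE.
    conf.set("hive.strict.managed.tables", "true");
  }

  static void createTables(Consumer<String> runSql) {
    // Managed table: transactional by default under strict managed tables.
    runSql.accept("create table acid_t (a int) stored as orc");
    // Non-ACID test table: needs the explicit opt-out the description mentions.
    runSql.accept("create table flat_t (a int) stored as orc "
        + "tblproperties ('transactional'='false')");
  }
}
{code}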

With this change we might see some test failures (see subtasks). Please create 
subtasks for those so that they can be tracked within this JIRA.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)