[jira] [Created] (HIVE-25446) VectorMapJoinFastHashTable.validateCapacity AssertionError: Capacity must be a power of two

2021-08-11 Thread Matt McCline (Jira)
Matt McCline created HIVE-25446:
---

 Summary: VectorMapJoinFastHashTable.validateCapacity 
AssertionError: Capacity must be a power of two
 Key: HIVE-25446
 URL: https://issues.apache.org/jira/browse/HIVE-25446
 Project: Hive
  Issue Type: Bug
 Environment: Encountered this in a very large query:

Caused by: java.lang.AssertionError: Capacity must be a power of two
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.validateCapacity(VectorMapJoinFastHashTable.java:60)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTable.<init>(VectorMapJoinFastHashTable.java:77)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashTable.<init>(VectorMapJoinFastBytesHashTable.java:132)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastBytesHashMap.<init>(VectorMapJoinFastBytesHashMap.java:166)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastStringHashMap.<init>(VectorMapJoinFastStringHashMap.java:43)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.createHashTable(VectorMapJoinFastTableContainer.java:137)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.<init>(VectorMapJoinFastTableContainer.java:86)
   at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:122)
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTableInternal(MapJoinOperator.java:344)
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:413)
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.lambda$initializeOp$0(MapJoinOperator.java:215)
   at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:96)
   at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:113)
   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
Reporter: Matt McCline
Assignee: Matt McCline
 Fix For: 4.0.0
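
For readers unfamiliar with this kind of assertion, the following is a minimal sketch of a power-of-two capacity check, plus one possible way to round a requested capacity up to the next power of two. Hash tables commonly require this so that a slot index can be computed as hash & (capacity - 1). The class and method names are hypothetical and this is not the code in VectorMapJoinFastHashTable; the issue does not say how the invalid capacity arose or how it was fixed.

```java
// Hypothetical illustration only; not Hive's implementation.
final class PowerOfTwoCapacitySketch {
    // Largest power of two that fits in a positive int.
    static final int MAX_CAPACITY = 1 << 30;

    // A positive int is a power of two exactly when it has a single bit set.
    static boolean isPowerOfTwo(int capacity) {
        return capacity > 0 && (capacity & (capacity - 1)) == 0;
    }

    // Rounds a requested capacity up to the nearest power of two,
    // clamping to the range [1, MAX_CAPACITY].
    static int roundUpToPowerOfTwo(int requested) {
        if (requested <= 1) {
            return 1;
        }
        if (requested >= MAX_CAPACITY) {
            return MAX_CAPACITY;
        }
        int highest = Integer.highestOneBit(requested);
        return highest == requested ? requested : highest << 1;
    }

    public static void main(String[] args) {
        int requested = 1_000_000;
        System.out.println(isPowerOfTwo(requested));        // prints false
        System.out.println(roundUpToPowerOfTwo(requested)); // prints 1048576
    }
}
```

A capacity derived from an estimated row count, as in the very large query mentioned above, would fail the first check unless it is normalized in some such way before the table is constructed.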






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-25445) Enable JdbcStorageHandler to get password from AWS Secrets Service.

2021-08-11 Thread Harish JP (Jira)
Harish JP created HIVE-25445:


 Summary: Enable JdbcStorageHandler to get password from AWS 
Secrets Service.
 Key: HIVE-25445
 URL: https://issues.apache.org/jira/browse/HIVE-25445
 Project: Hive
  Issue Type: New Feature
  Components: HiveServer2
Reporter: Harish JP
Assignee: Harish JP


Currently, the password for JdbcStorageHandler can be set only via the password 
field or a keystore. This Jira adds a framework to fetch the password from any 
source and implements AWS Secrets Manager as one such source.

The approach taken is to use a new table property, dbcp.password.uri, which will 
be used if the password and keyfile are not available.
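
The fallback order described above could be sketched roughly as follows. Only dbcp.password.uri comes from this issue; the other two property keys, the class name, and the two lookup functions are hypothetical placeholders, and no real AWS call is made here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical illustration of the lookup order; not the JdbcStorageHandler code.
final class PasswordResolutionSketch {
    // Assumed key names for the existing mechanisms (placeholders).
    static final String PASSWORD = "dbcp.password";
    static final String KEYFILE = "dbcp.password.keyfile";
    // Property name proposed in this issue.
    static final String PASSWORD_URI = "dbcp.password.uri";

    // Tries the inline password first, then the keyfile, then the URI.
    static Optional<String> resolve(Map<String, String> tableProps,
                                    Function<String, String> keyfileReader,
                                    Function<String, String> uriFetcher) {
        String inline = tableProps.get(PASSWORD);
        if (inline != null) {
            return Optional.of(inline);
        }
        String keyfile = tableProps.get(KEYFILE);
        if (keyfile != null) {
            return Optional.ofNullable(keyfileReader.apply(keyfile));
        }
        String uri = tableProps.get(PASSWORD_URI);
        if (uri != null) {
            // A real fetcher would dispatch on the URI scheme, for example
            // to a secrets service client.
            return Optional.ofNullable(uriFetcher.apply(uri));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put(PASSWORD_URI, "example-scheme:///my-secret"); // made-up URI
        Optional<String> password =
            resolve(props, path -> "from-keyfile", uri -> "from-uri");
        System.out.println(password.orElse("<none>")); // prints from-uri
    }
}
```

Passing the keyfile reader and the URI fetcher as functions keeps the ordering logic independent of any particular secret store.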





[jira] [Created] (HIVE-25444) Use a config to disable authorization on storage handlers by default.

2021-08-11 Thread Sai Hemanth Gantasala (Jira)
Sai Hemanth Gantasala created HIVE-25444:


 Summary: Use a config to disable authorization on storage handlers 
by default.
 Key: HIVE-25444
 URL: https://issues.apache.org/jira/browse/HIVE-25444
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Reporter: Sai Hemanth Gantasala
Assignee: Sai Hemanth Gantasala


With the config "hive.security.authorization.tables.on.storagehandlers" 
defaulting to false, authorization on storage handlers is disabled by default. 
Authorization is enabled only if this config is set to true.
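
As a rough illustration of the intended behavior, the sketch below reads the flag with a default of false. It uses java.util.Properties as a stand-in for the Hive configuration object, and the class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical illustration; java.util.Properties stands in for HiveConf.
final class StorageHandlerAuthFlagSketch {
    static final String KEY =
        "hive.security.authorization.tables.on.storagehandlers";

    // Returns true only when the flag is explicitly set to "true";
    // an unset flag falls back to the default of false.
    static boolean isAuthorizationEnabled(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(KEY, "false"));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(isAuthorizationEnabled(conf)); // prints false
        conf.setProperty(KEY, "true");
        System.out.println(isAuthorizationEnabled(conf)); // prints true
    }
}
```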





[jira] [Created] (HIVE-25443) Arrow SerDe Cannot serialize/deserialize complex data types When there are more than 1024 values

2021-08-11 Thread Syed Shameerur Rahman (Jira)
Syed Shameerur Rahman created HIVE-25443:


 Summary: Arrow SerDe Cannot serialize/deserialize complex data 
types When there are more than 1024 values
 Key: HIVE-25443
 URL: https://issues.apache.org/jira/browse/HIVE-25443
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 3.1.2, 3.1.1, 3.0.0, 3.1.0
Reporter: Syed Shameerur Rahman
Assignee: Syed Shameerur Rahman
 Fix For: 4.0.0


Complex data types such as MAP and STRUCT cannot be serialized/deserialized 
using Arrow SerDe when there are more than 1024 values. This happens because 
the ColumnVector is always initialized with a size of 1024.

Issue #1 : 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/ArrowColumnarBatchSerDe.java#L213

Issue #2 : 
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/arrow/ArrowColumnarBatchSerDe.java#L215
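
The failure mode can be illustrated with a toy vector whose backing array is fixed at 1024 entries. This is a simplified sketch with hypothetical names, not the Hive ColumnVector or the ArrowColumnarBatchSerDe code; it only shows why writing a 1025th value fails unless the vector is grown first.

```java
import java.util.Arrays;

// Hypothetical toy vector; not org.apache.hadoop.hive.ql.exec.vector.ColumnVector.
final class GrowableVectorSketch {
    static final int DEFAULT_SIZE = 1024;

    boolean[] values = new boolean[DEFAULT_SIZE];

    // Grows the backing array, preserving existing data, when more room is needed.
    void ensureSize(int size) {
        if (size > values.length) {
            values = Arrays.copyOf(values, size);
        }
    }

    // Copies without growing: throws ArrayIndexOutOfBoundsException
    // once src has more than DEFAULT_SIZE elements.
    static GrowableVectorSketch copyWithoutResize(boolean[] src) {
        GrowableVectorSketch vector = new GrowableVectorSketch();
        for (int i = 0; i < src.length; i++) {
            vector.values[i] = src[i];
        }
        return vector;
    }

    // Copies after sizing the vector to fit the source.
    static GrowableVectorSketch copyWithResize(boolean[] src) {
        GrowableVectorSketch vector = new GrowableVectorSketch();
        vector.ensureSize(src.length);
        for (int i = 0; i < src.length; i++) {
            vector.values[i] = src[i];
        }
        return vector;
    }

    public static void main(String[] args) {
        boolean[] src = new boolean[1025];
        Arrays.fill(src, true);
        System.out.println(copyWithResize(src).values.length); // prints 1025
    }
}
```

Sizing the vector from the actual number of values, rather than from a fixed default, is one plausible direction for a fix; the issue text does not state which approach was taken.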

Sample unit test to reproduce the case in TestArrowColumnarBatchSerDe :


{code:java}
@Test
public void testListBooleanWithMoreThan1024Values() throws SerDeException {
  // A single list column; 1025 is one more than the 1024 the ColumnVector starts with.
  String[][] schema = {
    {"boolean_list", "array<boolean>"},
  };

  Object[][] rows = new Object[1025][1];
  for (int i = 0; i < 1025; i++) {
    rows[i][0] = new BooleanWritable(true);
  }

  initAndSerializeAndDeserialize(schema, toList(rows));
}
{code}




