[jira] [Created] (HIVE-23479) Avoid regenerating JdbcSchema for every table in a query

2020-05-15 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created HIVE-23479:
--

 Summary: Avoid regenerating JdbcSchema for every table in a query
 Key: HIVE-23479
 URL: https://issues.apache.org/jira/browse/HIVE-23479
 Project: Hive
  Issue Type: Improvement
  Components: Query Planning
Reporter: Stamatis Zampetakis


Currently {{CalcitePlanner}} generates a complete {{JdbcSchema}} for every 
{{JdbcTable}} in the query.

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java#L3174

This wastes some resources since every call to {{JdbcSchema#getTable}} needs to 
communicate with the database to bring back the tables belonging to the schema. 
Moreover, the fact that a schema is created during planning is 
counter-intuitive since in principle the schema shouldn't change.  
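
A minimal sketch of the idea, assuming the schema can be looked up by some stable key such as the data source URL. The class name {{SchemaCacheSketch}}, the {{loader}} function and the string key are hypothetical and are not taken from Hive or Calcite; the loader stands in for whatever expensive call builds the schema.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Hive or Calcite: builds a schema object at
// most once per key and reuses it for every further table of the same source.
final class SchemaCacheSketch<S> {
    private final Map<String, S> cache = new HashMap<>();
    private final Function<String, S> loader;

    SchemaCacheSketch(Function<String, S> loader) {
        this.loader = loader; // stand-in for the expensive schema creation
    }

    // Returns the cached schema for the key, invoking the loader only on a miss.
    S schemaFor(String dataSourceKey) {
        return cache.computeIfAbsent(dataSourceKey, loader);
    }
}
```

With such a cache scoped to a single query, two tables from the same data source would trigger one schema creation instead of two. Whether the cache should live longer than one query is left open here.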



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HIVE-23478) Fix flaky special_character_in_tabnames_quotes_1 test

2020-05-15 Thread John Sherman (Jira)
John Sherman created HIVE-23478:
---

 Summary: Fix flaky special_character_in_tabnames_quotes_1 test
 Key: HIVE-23478
 URL: https://issues.apache.org/jira/browse/HIVE-23478
 Project: Hive
  Issue Type: Improvement
  Components: Tests
Affects Versions: 4.0.0
Reporter: John Sherman
Assignee: John Sherman
 Fix For: 4.0.0


While testing https://issues.apache.org/jira/browse/HIVE-23354 
special_character_in_tabnames_quotes_1 failed. Searching for the test, it seems 
other patches have also had failures. I noticed that 
special_character_in_tabnames_1 and special_character_in_tabnames_quotes_1 use 
the same database/table names. I suspect this is responsible for some of the 
flakiness.





[jira] [Created] (HIVE-23477) [LLAP] mmap allocation interruptions fails to notify other threads

2020-05-15 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23477:


 Summary: [LLAP] mmap allocation interruptions fails to notify 
other threads
 Key: HIVE-23477
 URL: https://issues.apache.org/jira/browse/HIVE-23477
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


BuddyAllocator always uses lazy allocation if mmap is enabled. If a query 
fragment is interrupted at the time of arena allocation, a 
ClosedByInterruptException is thrown. This exception artificially triggers 
an allocator OutOfMemoryError and fails to notify other threads waiting to 
allocate arenas. 
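
The general pattern at stake can be sketched as follows. This is not the BuddyAllocator code and not necessarily how the issue gets fixed; {{ArenaInitSketch}}, {{ArenaMapper}} and {{initOrWait}} are hypothetical names. The sketch only assumes that one thread initializes an arena while others wait, and shows the waiters being woken in a finally block so that a failed mapping does not leave them blocked.

```java
// Hypothetical sketch, not the BuddyAllocator implementation.
final class ArenaInitSketch {
    /** Stand-in for the step that memory-maps the arena buffer. */
    interface ArenaMapper {
        void map() throws Exception;
    }

    private final Object lock = new Object();
    private boolean inProgress;
    private boolean initialized;

    /** Returns true if the arena is usable when the call returns. */
    boolean initOrWait(ArenaMapper mapper) {
        synchronized (lock) {
            while (inProgress) {
                try {
                    lock.wait(); // another thread is initializing the arena
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
            if (initialized) {
                return true;
            }
            inProgress = true; // this thread now owns the initialization
        }
        boolean ok = false;
        try {
            mapper.map();
            ok = true;
        } catch (Exception e) {
            // e.g. ClosedByInterruptException: report the failure via the return value
        } finally {
            synchronized (lock) {
                inProgress = false;
                initialized = ok;
                lock.notifyAll(); // wake waiters on success and on failure
            }
        }
        return ok;
    }
}
```

Because the notification happens in the finally block, a thread that calls initOrWait after a failed attempt can retry the mapping itself instead of waiting indefinitely.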
{code:java}
2020-05-15 00:03:23.254  WARN [TezTR-128417_1_3_1_1_0] LlapIoImpl: Failed 
trying to allocate memory mapped arena
java.nio.channels.ClosedByInterruptException
at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:970)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.preallocateArenaBuffer(BuddyAllocator.java:867)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.access$1100(BuddyAllocator.java:69)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.init(BuddyAllocator.java:900)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.allocateWithExpand(BuddyAllocator.java:1458)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator$Arena.access$800(BuddyAllocator.java:884)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateWithExpand(BuddyAllocator.java:740)
at 
org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:330)
at 
org.apache.hadoop.hive.llap.io.metadata.MetadataCache.wrapBbForFile(MetadataCache.java:257)
at 
org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:216)
at 
org.apache.hadoop.hive.llap.io.metadata.MetadataCache.putFileMetadata(MetadataCache.java:49)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.readSplitFooter(VectorizedParquetRecordReader.java:343)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.initialize(VectorizedParquetRecordReader.java:238)
at 
org.apache.hadoop.hive.ql.io.parquet.vector.VectorizedParquetRecordReader.<init>(VectorizedParquetRecordReader.java:160)
at 
org.apache.hadoop.hive.ql.io.parquet.VectorizedParquetInputFormat.getRecordReader(VectorizedParquetInputFormat.java:50)
at 
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:87)
at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:427)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.initNextRecordReader(TezGroupedSplitsInputFormat.java:203)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.<init>(TezGroupedSplitsInputFormat.java:145)
at 
org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat.getRecordReader(TezGroupedSplitsInputFormat.java:111)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.setupOldRecordReader(MRReaderMapred.java:156)
at 
org.apache.tez.mapreduce.lib.MRReaderMapred.setSplit(MRReaderMapred.java:82)
at 
org.apache.tez.mapreduce.input.MRInput.initFromEventInternal(MRInput.java:703)
at 
org.apache.tez.mapreduce.input.MRInput.initFromEvent(MRInput.java:662)
at 
org.apache.tez.mapreduce.input.MRInputLegacy.checkAndAwaitRecordReaderInitialization(MRInputLegacy.java:150)
at 
org.apache.tez.mapreduce.input.MRInputLegacy.init(MRInputLegacy.java:114)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.getMRInput(MapRecordProcessor.java:532)
at 
org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:178)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:266)
at 
org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
at 
org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:75)
at 
org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:62)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:62)
at 
org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:38

[jira] [Created] (HIVE-23476) [LLAP] Preallocate arenas for mmap case as well

2020-05-15 Thread Prasanth Jayachandran (Jira)
Prasanth Jayachandran created HIVE-23476:


 Summary: [LLAP] Preallocate arenas for mmap case as well
 Key: HIVE-23476
 URL: https://issues.apache.org/jira/browse/HIVE-23476
 Project: Hive
  Issue Type: Bug
Reporter: Prasanth Jayachandran
Assignee: Prasanth Jayachandran


BuddyAllocator pre-allocation of arenas does not happen for the mmap cache case. 
Since we are not filling up the mmap'ed buffers, the upfront allocation in the 
constructor is cheap. This can avoid lock-free allocation of arenas later in 
the code. 





[jira] [Created] (HIVE-23475) Track MJ HashTable mem usage

2020-05-15 Thread Panagiotis Garefalakis (Jira)
Panagiotis Garefalakis created HIVE-23475:
-

 Summary: Track MJ HashTable mem usage
 Key: HIVE-23475
 URL: https://issues.apache.org/jira/browse/HIVE-23475
 Project: Hive
  Issue Type: Improvement
Reporter: Panagiotis Garefalakis
Assignee: Panagiotis Garefalakis








[jira] [Created] (HIVE-23474) Deny Repl Dump if the database is a target of replication

2020-05-15 Thread Aasha Medhi (Jira)
Aasha Medhi created HIVE-23474:
--

 Summary: Deny Repl Dump if the database is a target of replication
 Key: HIVE-23474
 URL: https://issues.apache.org/jira/browse/HIVE-23474
 Project: Hive
  Issue Type: Task
Reporter: Aasha Medhi
Assignee: Aasha Medhi








Re: Review Request 72462: MSCK REPAIR cannot discover partitions with upper case directory names

2020-05-15 Thread Adesh Rao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72462/
---

(Updated May 15, 2020, 12:28 p.m.)


Review request for hive and Sankar Hariappan.


Changes
---

Validate that each partition corresponds to a single path in the file system.


Repository: hive-git


Description
---

The fix converts the partition keys found in the HDFS directory names to 
lowercase, but stores the HDFS directory as is for the partition path.
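
A rough sketch of that behaviour, under simplifying assumptions: paths are relative to the table location, components look like key=value, and escaping of special characters is ignored. {{PartitionPathSketch}} and its methods are hypothetical names, not the code in this patch. The second method illustrates the single-path check: directories that differ only in the case of their keys end up under the same partition spec.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the HiveMetaStoreChecker implementation.
final class PartitionPathSketch {
    /** Lowercases partition keys; values keep their original case. */
    static Map<String, String> partitionSpec(String relativePath) {
        Map<String, String> spec = new LinkedHashMap<>();
        for (String component : relativePath.split("/")) {
            int eq = component.indexOf('=');
            if (eq <= 0) {
                continue; // not a key=value directory
            }
            spec.put(component.substring(0, eq).toLowerCase(Locale.ROOT),
                     component.substring(eq + 1));
        }
        return spec;
    }

    /** Groups the original, unmodified paths by their normalized spec. */
    static Map<Map<String, String>, List<String>> pathsBySpec(List<String> relativePaths) {
        Map<Map<String, String>, List<String>> grouped = new LinkedHashMap<>();
        for (String path : relativePaths) {
            grouped.computeIfAbsent(partitionSpec(path), k -> new ArrayList<>()).add(path);
        }
        return grouped;
    }
}
```

Any spec that maps to more than one path in the result of pathsBySpec would be a candidate for rejection, since the partition could not be tied to a single directory.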


Diffs (updated)
-

  ql/src/test/queries/clientnegative/msck_repair_5.q PRE-CREATION 
  ql/src/test/queries/clientnegative/msck_repair_6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/msck_repair_4.q PRE-CREATION 
  ql/src/test/queries/clientpositive/msck_repair_5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/msck_repair_6.q PRE-CREATION 
  ql/src/test/results/clientnegative/msck_repair_5.q.out PRE-CREATION 
  ql/src/test/results/clientnegative/msck_repair_6.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/msck_repair_4.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/msck_repair_5.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/msck_repair_6.q.out PRE-CREATION 
  
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/CheckResult.java
 5287f47e21 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreChecker.java
 6f4400a8ef 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/Msck.java
 f4e109d1b0 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
 92d10cd0e1 


Diff: https://reviews.apache.org/r/72462/diff/3/

Changes: https://reviews.apache.org/r/72462/diff/2-3/


Testing
---


Thanks,

Adesh Rao



[jira] [Created] (HIVE-23473) Handle NPE when ObjectCache is null while getting DynamicValue during ORC split generation

2020-05-15 Thread Ganesha Shreedhara (Jira)
Ganesha Shreedhara created HIVE-23473:
-

 Summary: Handle NPE when ObjectCache is null while getting 
DynamicValue during ORC split generation
 Key: HIVE-23473
 URL: https://issues.apache.org/jira/browse/HIVE-23473
 Project: Hive
  Issue Type: Bug
Reporter: Ganesha Shreedhara
Assignee: Ganesha Shreedhara


NullPointerException is thrown in the following flow.

 

 
{code:java}
java.lang.RuntimeException: ORC split generation failed with exception: 
java.lang.NullPointerException
Caused by: java.lang.NullPointerException
at 
org.apache.orc.impl.RecordReaderImpl.compareToRange(RecordReaderImpl.java:312)
at 
org.apache.orc.impl.RecordReaderImpl.evaluatePredicateMinMax(RecordReaderImpl.java:559)
at 
org.apache.orc.impl.RecordReaderImpl.evaluatePredicateRange(RecordReaderImpl.java:463)
at 
org.apache.orc.impl.RecordReaderImpl.evaluatePredicate(RecordReaderImpl.java:440)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.isStripeSatisfyPredicate(OrcInputFormat.java:2214)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.pickStripesInternal(OrcInputFormat.java:2190)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.pickStripes(OrcInputFormat.java:2182)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.access$3000(OrcInputFormat.java:186)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.callInternal(OrcInputFormat.java:1477)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.access$2700(OrcInputFormat.java:1265)
at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator$1.run(OrcInputFormat.java:1446)
.
.
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1809)
 at 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1895)
 at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:526)
 at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:649)
 at 
org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:206)
{code}
 

Shouldn't we just throw NoDynamicValuesException when 
[ObjectCache|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/DynamicValue.java#L119]
 is null, instead of returning it, similar to how we handle the cases where 
[conf|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/DynamicValue.java#L110]
 or 
[DynamicValueRegistry|https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/plan/DynamicValue.java#L125]
 is null while getting the dynamic value?
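
To make the proposal concrete, here is a simplified sketch. It is not the DynamicValue code: the maps stand in for the real ObjectCache and DynamicValueRegistry, the nested exception class only mirrors the name NoDynamicValuesException, and {{REGISTRY_KEY}} is a made-up constant. The point is only that all three null cases are handled the same way.

```java
import java.util.Map;

// Hypothetical, simplified stand-in for the lookup discussed above.
final class DynamicValueSketch {
    /** Mirrors the name used in Hive; defined locally for this sketch only. */
    static final class NoDynamicValuesException extends RuntimeException {
        NoDynamicValuesException(String message) {
            super(message);
        }
    }

    static final String REGISTRY_KEY = "registry"; // made-up cache key

    static Object getValue(String id, Object conf,
                           Map<String, Map<String, Object>> objectCache) {
        if (conf == null) {
            throw new NoDynamicValuesException("No conf set for dynamic value " + id);
        }
        if (objectCache == null) {
            // Proposed change: treat a missing cache like the other missing pieces.
            throw new NoDynamicValuesException("ObjectCache not available for " + id);
        }
        Map<String, Object> registry = objectCache.get(REGISTRY_KEY);
        if (registry == null) {
            throw new NoDynamicValuesException("DynamicValueRegistry not available");
        }
        return registry.get(id);
    }
}
```

A caller that already catches NoDynamicValuesException for the conf and registry cases would then cover the missing-cache case too, rather than receiving a null it does not expect.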

 





Re: Review Request 72112: HIVE-22869 - Add locking benchmark to metastore-tools/metastore-benchmarks

2020-05-15 Thread Zoltan Chovan via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72112/
---

(Updated May 15, 2020, 9:12 a.m.)


Review request for hive, Denys Kuzmenko, Aron Hamvas, Marton Bod, and Peter 
Vary.


Repository: hive-git


Description
---

Add the possibility to run benchmarks on opening lock in the HMS. Currently 
this change only introduces single-threaded/single client testing. I'm planning 
to add multi-client support in a separate change.

Example parametrisation is as follows:
hbench -M ".*Lock.*" -N 10 -d hive_test --params 10 --params 100 -d hive_test

This will create N number (10) of locks for first --params number of tables 
(10) with second --params number of partitions (100) on T (8) threads where 
each thread will start an HMS client and it'll use the -d (hive_test) database.


Diffs
-

  
standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkTool.java
 2ab9388301 
  
standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
 d80c290b60 
  
standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkSuite.java
 5211082a7d 
  
standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java
 4e75edeae6 
  
standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
 101d6759c5 


Diff: https://reviews.apache.org/r/72112/diff/2/


Testing
---


File Attachments (updated)


HIVE-22869.2.patch
  
https://reviews.apache.org/media/uploaded/files/2020/04/02/5e35e835-f383-495f-9964-e66773fd6a90__HIVE-22869.2.patch
HIVE-22869.3.patch
  
https://reviews.apache.org/media/uploaded/files/2020/04/09/458beaa7-4743-40fb-a213-1ae4527be823__HIVE-22869.3.patch
HIVE-22869.4.patch
  
https://reviews.apache.org/media/uploaded/files/2020/04/23/423c45d7-911e-4dd2-80b8-c6d3ad90633c__HIVE-22869.4.patch
HIVE-22869.5.patch
  
https://reviews.apache.org/media/uploaded/files/2020/05/12/a06f3b8c-f4ca-4067-a079-e0b6185266d4__HIVE-22869.5.patch
HIVE-22869.6.patch
  
https://reviews.apache.org/media/uploaded/files/2020/05/15/01254e94-1a8d-496d-ab31-628bd5584193__HIVE-22869.6.patch


Thanks,

Zoltan Chovan



Re: Review Request 72510: Move TestCliDriver tests to TestMiniTezCliDriver if they are failing with TestMiniLlapLocalCliDriver

2020-05-15 Thread Zoltan Haindrich


> On May 14, 2020, 7:56 p.m., Jesús Camacho Rodríguez wrote:
> > ql/src/test/results/clientpositive/tez/infer_bucket_sort.q.out
> > Line 53 (original), 53 (patched)
> > 
> >
> > Yeah... @Zoltan, what is the expectation here? Is this happening 
> > because there was a single bucket? Is behavior anyhow different in Tez vs 
> > MR after your patch went in?

I don't think this is related to my change; but this test seems to have lost 
the bucketing by key - I'm not familiar with that feature; but I'm sure it 
doesn't work correctly with tez


- Zoltan


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72510/#review220766
---


On May 14, 2020, 6:18 p.m., Miklos Gergely wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72510/
> ---
> 
> (Updated May 14, 2020, 6:18 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-23470
> https://issues.apache.org/jira/browse/HIVE-23470
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Move TestCliDriver tests to TestMiniTezCliDriver if they are failing with 
> TestMiniLlapLocalCliDriver
> 
> 
> Diffs
> -
> 
>   ql/src/test/results/clientpositive/tez/autoColumnStats_6.q.out ff708cb6b0 
>   ql/src/test/results/clientpositive/tez/binary_output_format.q.out 
> b414360855 
>   ql/src/test/results/clientpositive/tez/create_genericudaf.q.out 85d7850888 
>   ql/src/test/results/clientpositive/tez/create_udaf.q.out 7bfce125f0 
>   ql/src/test/results/clientpositive/tez/create_view.q.out 9a251fcd2f 
>   ql/src/test/results/clientpositive/tez/f_is_null.q.out e6862180b6 
>   ql/src/test/results/clientpositive/tez/gen_udf_example_add10.q.out 
> bfe313967b 
>   ql/src/test/results/clientpositive/tez/groupby_bigdata.q.out 90ccc8cdfb 
>   ql/src/test/results/clientpositive/tez/infer_bucket_sort.q.out bfdc84e24e 
>   ql/src/test/results/clientpositive/tez/input14.q.out 0e61434791 
>   ql/src/test/results/clientpositive/tez/input14_limit.q.out fe9d907663 
>   ql/src/test/results/clientpositive/tez/input17.q.out 9c03f5b0af 
>   ql/src/test/results/clientpositive/tez/input18.q.out ce731e6b2b 
>   ql/src/test/results/clientpositive/tez/input20.q.out d90b9083c3 
>   ql/src/test/results/clientpositive/tez/input33.q.out c8df2efede 
>   ql/src/test/results/clientpositive/tez/input34.q.out 00dd35d803 
>   ql/src/test/results/clientpositive/tez/input35.q.out cee491fc82 
>   ql/src/test/results/clientpositive/tez/input36.q.out 45289b2143 
>   ql/src/test/results/clientpositive/tez/input38.q.out d46ddf03ca 
>   ql/src/test/results/clientpositive/tez/input5.q.out becfc1876a 
>   ql/src/test/results/clientpositive/tez/insert_into3.q.out 60fd42d6fe 
>   ql/src/test/results/clientpositive/tez/insert_into4.q.out 031d562a43 
>   ql/src/test/results/clientpositive/tez/insert_into5.q.out 8ca94ee136 
>   ql/src/test/results/clientpositive/tez/insert_into6.q.out 2c6cab53e6 
>   ql/src/test/results/clientpositive/tez/load_binary_data.q.out b0d5c634b5 
>   ql/src/test/results/clientpositive/tez/localtimezone.q.out 6f85d87c18 
>   ql/src/test/results/clientpositive/tez/macro_1.q.out 28230f90e5 
>   ql/src/test/results/clientpositive/tez/macro_duplicate.q.out 9598126c92 
>   ql/src/test/results/clientpositive/tez/mapreduce3.q.out 9c0157c923 
>   ql/src/test/results/clientpositive/tez/mapreduce4.q.out a606df0894 
>   ql/src/test/results/clientpositive/tez/mapreduce7.q.out ab369e667b 
>   ql/src/test/results/clientpositive/tez/mapreduce8.q.out d00ede826b 
>   ql/src/test/results/clientpositive/tez/merge_test_dummy_operator.q.out 
> 31d4ae16f7 
>   ql/src/test/results/clientpositive/tez/newline.q.out bea4e6ce1c 
>   
> ql/src/test/results/clientpositive/tez/nonreserved_keywords_insert_into1.q.out
>  6435e8b5a3 
>   ql/src/test/results/clientpositive/tez/nullscript.q.out cd926aa170 
>   ql/src/test/results/clientpositive/tez/orc_createas1.q.out 6884e8654e 
>   ql/src/test/results/clientpositive/tez/partcols1.q.out edd7db2357 
>   ql/src/test/results/clientpositive/tez/partition_vs_table_metadata.q.out 
> 1b576ee10a 
>   ql/src/test/results/clientpositive/tez/ppd_transform.q.out a38042c6fe 
>   ql/src/test/results/clientpositive/tez/query_with_semi.q.out 93da006251 
>   ql/src/test/results/clientpositive/tez/rcfile_bigdata.q.out c1ada45ad0 
>   ql/src/test/results/clientpositive/tez/regexp_extract.q.out 95f7c22bc9 
>   ql/src/test/results/clientpositive/tez/script_env_var1.q.out c1181b2635 
>   ql/src/test/results/clientpositive/tez/script_env_var2.q.out 58a0936858 
>   ql/src/test/results/clientpositive/tez/script_pipe.q.out f56107ebb1 
>   ql/src/test/results/clientpositive/tez/scriptfile1.q.o