Re: Review Request 71091: ACID: getAcidState() should cache a recursive dir listing locally

2019-07-22 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71091/
---

(Updated July 22, 2019, 5:35 p.m.)


Review request for hive, Gopal V and Vineet Garg.


Bugs: HIVE-21225
https://issues.apache.org/jira/browse/HIVE-21225


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-21225


Diffs (updated)
-

  
  hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java 4dc04f46fd
  hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/StreamingAssert.java 78cae7263b
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java 9d5ba3d310
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java cff7e04b9a
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 707e38c321
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java b1ede0556f
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java 15f1f945ce
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 9d631ed43d
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 57eb506996
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 67a5e6de46
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 6168fc0f79
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d4abf4277b
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java ea31557741
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java b5958fa9cc
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java 8451462023
  streaming/src/test/org/apache/hive/streaming/TestStreaming.java c6d7e7f27c


Diff: https://reviews.apache.org/r/71091/diff/3/

Changes: https://reviews.apache.org/r/71091/diff/2-3/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 71091: ACID: getAcidState() should cache a recursive dir listing locally

2019-07-18 Thread Vaibhav Gumashta


(Updated July 18, 2019, 10:21 p.m.)


Review request for hive, Gopal V and Vineet Garg.


Bugs: HIVE-21225
https://issues.apache.org/jira/browse/HIVE-21225


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-21225


Diffs (updated)
-

  
  hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java 4dc04f46fd
  hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/StreamingAssert.java 78cae7263b
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java 9d5ba3d310
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java cff7e04b9a
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 707e38c321
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java b1ede0556f
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java 15f1f945ce
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 9d631ed43d
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 57eb506996
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 67a5e6de46
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 6168fc0f79
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d4abf4277b
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java c5faec5e95
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java b5958fa9cc
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java 8451462023
  streaming/src/test/org/apache/hive/streaming/TestStreaming.java c6d7e7f27c


Diff: https://reviews.apache.org/r/71091/diff/2/

Changes: https://reviews.apache.org/r/71091/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Review Request 71091: ACID: getAcidState() should cache a recursive dir listing locally

2019-07-17 Thread Vaibhav Gumashta


Review request for hive, Gopal V and Vineet Garg.


Bugs: HIVE-21225
https://issues.apache.org/jira/browse/HIVE-21225


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-21225


Diffs
-

  
  hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java 4dc04f46fd
  hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/StreamingAssert.java 78cae7263b
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/HdfsUtils.java 9d5ba3d310
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java cff7e04b9a
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 707e38c321
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java b1ede0556f
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java 15f1f945ce
  ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 9d631ed43d
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 57eb506996
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 67a5e6de46
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 6168fc0f79
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java d4abf4277b
  ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java c5faec5e95
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java b5958fa9cc
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java 8451462023
  streaming/src/test/org/apache/hive/streaming/TestStreaming.java c6d7e7f27c


Diff: https://reviews.apache.org/r/71091/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 71044: ACID: explore how we can avoid a move step during inserts/compaction

2019-07-12 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71044/
---

(Updated July 12, 2019, 9:40 p.m.)


Review request for hive and Gopal V.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-21164


Diffs (updated)
-

  
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 5fd0ef9161
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java bb89f803d5
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 695d08bbe2
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 437266355a
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 3fa61d3560
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 691f3ee2e9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 5d6143d6a4
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d47457857c
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 757cb7af4d
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 61ea28a5f5
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 78f25856a4
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java a75103d60d
  ql/src/test/results/clientpositive/acid_subquery.q.out 1dc1775557
  ql/src/test/results/clientpositive/create_transactional_full_acid.q.out e324d5ec43
  ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out 61b0057adb
  ql/src/test/results/clientpositive/llap/acid_no_buckets.q.out ae1d97fa21
  ql/src/test/results/clientpositive/llap/insert_overwrite.q.out fbc3326b39
  ql/src/test/results/clientpositive/llap/mm_all.q.out 6cb34e2c79
  ql/src/test/results/clientpositive/mm_all.q.out 2c0247a539


Diff: https://reviews.apache.org/r/71044/diff/4/

Changes: https://reviews.apache.org/r/71044/diff/3-4/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-21990) ACID: remove any difference between an mm table insert and full acid table insert

2019-07-12 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21990:
---

 Summary: ACID: remove any difference between an mm table insert 
and full acid table insert
 Key: HIVE-21990
 URL: https://issues.apache.org/jira/browse/HIVE-21990
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 4.0.0
Reporter: Vaibhav Gumashta


HIVE-21164 makes ACID inserts work like MM-table inserts by writing directly to 
the destination and using manifest files to track the files committed by each 
task and by the job. After that change, the insert code paths should be 
identical, but some differences may remain (e.g. HIVE-17695). 
This JIRA will investigate any such differences and fix them.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: Review Request 71044: ACID: explore how we can avoid a move step during inserts/compaction

2019-07-12 Thread Vaibhav Gumashta


(Updated July 12, 2019, 10:58 a.m.)


Review request for hive and Gopal V.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-21164


Diffs (updated)
-

  
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 5fd0ef9161
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java bb89f803d5
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 695d08bbe2
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 437266355a
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 3fa61d3560
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 691f3ee2e9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 5d6143d6a4
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d47457857c
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 757cb7af4d
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 61ea28a5f5
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 78f25856a4
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java a75103d60d


Diff: https://reviews.apache.org/r/71044/diff/3/

Changes: https://reviews.apache.org/r/71044/diff/2-3/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 71044: ACID: explore how we can avoid a move step during inserts/compaction

2019-07-10 Thread Vaibhav Gumashta


(Updated July 11, 2019, 5:02 a.m.)


Review request for hive and Gopal V.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-21164


Diffs (updated)
-

  
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 5fd0ef9161
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java bb89f803d5
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 695d08bbe2
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 437266355a
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 3fa61d3560
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 691f3ee2e9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 5d6143d6a4
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d47457857c
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 757cb7af4d
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 61ea28a5f5
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 78f25856a4
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java a75103d60d


Diff: https://reviews.apache.org/r/71044/diff/2/

Changes: https://reviews.apache.org/r/71044/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Review Request 71044: ACID: explore how we can avoid a move step during inserts/compaction

2019-07-10 Thread Vaibhav Gumashta


Review request for hive and Gopal V.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-21164


Diffs
-

  
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 5fd0ef9161
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java d59cfe51e9
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java bb89f803d5
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 695d08bbe2
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1346bed5a7
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 295fe7cbd0
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 3fa61d3560
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 691f3ee2e9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 5d6143d6a4
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7c58072413
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 757cb7af4d
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java 61ea28a5f5
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 78f25856a4
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java a75103d60d


Diff: https://reviews.apache.org/r/71044/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-21757) ACID: use a new write id for compaction's output instead of the visibility id

2019-05-20 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21757:
---

 Summary: ACID: use a new write id for compaction's output instead 
of the visibility id
 Key: HIVE-21757
 URL: https://issues.apache.org/jira/browse/HIVE-21757
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 4.0.0
Reporter: Vaibhav Gumashta


HIVE-20823 added support for running compaction within a transaction. To 
control the visibility of the output directory, it uses 
base_writeId_visibilityId, where visibilityId is the transaction id of the 
transaction that the compactor ran in. Perhaps we can keep using the 
base_writeId format, by allocating a new writeId for the compactor and creating 
the new base/delta with that.
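
To make the two naming schemes concrete, here is a minimal Java sketch. It is 
not Hive's code: the class BaseDirName and its methods are hypothetical, and 
the zero-padded width and the "_v" marker before the visibility id are 
assumptions made for illustration (the description above only names the 
pattern base_writeId_visibilityId).

```java
// Hypothetical helper, not part of Hive. The 7-digit padding and the "_v"
// marker are assumptions for illustration only.
final class BaseDirName {
    private static final String PREFIX = "base_";
    private static final String VISIBILITY_MARKER = "_v";

    private BaseDirName() {}

    // Current scheme: the write id plus the id of the compactor's transaction.
    static String withVisibility(long writeId, long visibilityTxnId) {
        return String.format("base_%07d_v%07d", writeId, visibilityTxnId);
    }

    // Proposed scheme: the compactor allocates its own write id, so the
    // directory name needs no visibility suffix.
    static String withCompactorWriteId(long compactorWriteId) {
        return String.format("base_%07d", compactorWriteId);
    }

    // Extracts the write id from either form of the name.
    static long parseWriteId(String dirName) {
        if (!dirName.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a base directory: " + dirName);
        }
        String rest = dirName.substring(PREFIX.length());
        int suffix = rest.indexOf(VISIBILITY_MARKER);
        String digits = suffix >= 0 ? rest.substring(0, suffix) : rest;
        return Long.parseLong(digits);
    }

    public static void main(String[] args) {
        System.out.println(withVisibility(5, 10));
        System.out.println(withCompactorWriteId(6));
    }
}
```

With the second form, a reader that already understands base_writeId would 
not need to know about the suffix at all, which is the appeal of the proposal.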



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21749) ACID: Provide an option to run Cleaner thread from Hive client

2019-05-17 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21749:
---

 Summary: ACID: Provide an option to run Cleaner thread from Hive 
client
 Key: HIVE-21749
 URL: https://issues.apache.org/jira/browse/HIVE-21749
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 4.0.0
Reporter: Vaibhav Gumashta


In some cases, it could be useful to trigger the cleaner thread manually. We 
should provide an option for that.





[jira] [Created] (HIVE-21513) ACID: Running merge concurrently with minor compaction causes a later select * to throw exception

2019-03-26 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21513:
---

 Summary: ACID: Running merge concurrently with minor compaction 
causes a later select * to throw exception 
 Key: HIVE-21513
 URL: https://issues.apache.org/jira/browse/HIVE-21513
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


Repro steps:

- Create table 
- Load some data 
- Run merge so that records get updated and delete_delta dirs are created
- Manually initiate minor compaction: ALTER TABLE ... COMPACT 'minor';
- While the compaction is running, keep executing the merge statement
- After some time, try to run a simple select *;





[jira] [Created] (HIVE-21470) ACID: Optimize RecordReader creation when SearchArgument is provided

2019-03-18 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21470:
---

 Summary: ACID: Optimize RecordReader creation when SearchArgument 
is provided
 Key: HIVE-21470
 URL: https://issues.apache.org/jira/browse/HIVE-21470
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 2.3.4, 3.1.1
Reporter: Vaibhav Gumashta


Consider the following query:
{code}
select col1 from tbl1 where year_partition=2019;
{code}
 
If the table has a lot of columns, currently we end up creating a TreeReader 
for each column, even when it won't pass the SearchArgument:
{code}
TreeReaderFactory.createTreeReader(TypeDescription, TreeReaderFactory$Context) line: 2339
TreeReaderFactory$StructTreeReader.<init>(int, TypeDescription, TreeReaderFactory$Context) line: 1974
TreeReaderFactory.createTreeReader(TypeDescription, TreeReaderFactory$Context) line: 2390
RecordReaderImpl(RecordReaderImpl).<init>(ReaderImpl, Reader$Options) line: 267
RecordReaderImpl.<init>(ReaderImpl, Reader$Options, Configuration) line: 67
ReaderImpl.rowsOptions(Reader$Options, Configuration) line: 83
OrcRawRecordMerger$OriginalReaderPairToRead.<init>(OrcRawRecordMerger$ReaderKey, Reader, int, RecordIdentifier, RecordIdentifier, Reader$Options, OrcRawRecordMerger$Options, Configuration, ValidWriteIdList, int) line: 446
OrcRawRecordMerger.<init>(Configuration, boolean, Reader, boolean, int, ValidWriteIdList, Reader$Options, Path[], OrcRawRecordMerger$Options) line: 1057
OrcInputFormat.getReader(InputSplit, Options) line: 2108
OrcInputFormat.getRecordReader(InputSplit, JobConf, Reporter) line: 2006
FetchOperator$FetchInputFormatSplit.getRecordReader(JobConf) line: 776
{code}

If the table has 1000 columns and spans N splits, we will end up creating 
1000*N TreeReader objects when we might need only N (one per split).
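
One way to picture the saving is a holder that builds a column reader only 
when that column is first requested. The Java sketch below is a hedged 
illustration of that idea, not the ORC or Hive implementation: 
LazyColumnReaders and its factory parameter are hypothetical names, and a 
plain String stands in for a real TreeReader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical illustration; these names are not part of ORC or Hive.
final class LazyColumnReaders<R> {
    private final IntFunction<R> factory;
    private final Map<Integer, R> created = new HashMap<>();

    LazyColumnReaders(IntFunction<R> factory) {
        this.factory = factory;
    }

    // Builds the reader for a column on first use and reuses it afterwards.
    R get(int columnId) {
        return created.computeIfAbsent(columnId, factory::apply);
    }

    // Number of readers actually built so far.
    int createdCount() {
        return created.size();
    }

    public static void main(String[] args) {
        // A String stands in for a real reader object in this sketch.
        LazyColumnReaders<String> readers =
            new LazyColumnReaders<>(id -> "reader-for-column-" + id);
        readers.get(0); // the query projects a single column
        System.out.println("readers created: " + readers.createdCount());
    }
}
```

Under this model a split that projects one column out of 1000 builds one 
reader, so N splits build N readers rather than 1000*N.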






[jira] [Created] (HIVE-21460) ACID: Load data followed by a select * query results in incorrect results

2019-03-16 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21460:
---

 Summary: ACID: Load data followed by a select * query results in 
incorrect results
 Key: HIVE-21460
 URL: https://issues.apache.org/jira/browse/HIVE-21460
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


This affects current master as well. I created an ORC file such that it spans 
multiple stripes and ran a simple select *, and got incorrect row counts (when 
comparing with select count(*)). The problem seems to be that after split 
generation and creating min/max rowId for each row (note that since the loaded 
file is not written by Hive ACID, it does not have ROW__ID in the file; 
instead, ROW__ID is applied on read by discovering min/max bounds, which are 
used for calculating ROW__ID.rowId for each row of a split), Hive is only 
reading the last split.
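
The min/max bounds mentioned above can be sketched as follows. This is a 
simplified Java model under stated assumptions, not Hive's logic: it assumes 
row ids start at 0, are contiguous across the file, and that every split 
holds at least one row. The class SyntheticRowIds is a hypothetical name.

```java
// Hypothetical model of synthetic rowId bounds per split; not Hive's code.
final class SyntheticRowIds {
    private SyntheticRowIds() {}

    // Given the number of rows in each split (in file order), returns
    // bounds[i] = {minRowId, maxRowId} for split i.
    // Assumes ids start at 0 and every split has at least one row.
    static long[][] bounds(long[] rowsPerSplit) {
        long[][] out = new long[rowsPerSplit.length][2];
        long next = 0;
        for (int i = 0; i < rowsPerSplit.length; i++) {
            out[i][0] = next;
            out[i][1] = next + rowsPerSplit[i] - 1;
            next += rowsPerSplit[i];
        }
        return out;
    }

    // Total rows covered by a set of bounds; this should equal count(*)
    // when every split is read.
    static long totalRows(long[][] bounds) {
        long total = 0;
        for (long[] b : bounds) {
            total += b[1] - b[0] + 1;
        }
        return total;
    }

    public static void main(String[] args) {
        long[][] b = bounds(new long[] {3, 2, 4});
        System.out.println("rows covered: " + totalRows(b));
    }
}
```

In this model, reading only the last split would return just the rows in its 
range, which matches the symptom of select * returning fewer rows than 
select count(*).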





[jira] [Created] (HIVE-21458) ACID: Optimize AcidUtils$MetaDataFile.isRawFormat check by caching the split reader

2019-03-15 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21458:
---

 Summary: ACID: Optimize AcidUtils$MetaDataFile.isRawFormat check 
by caching the split reader
 Key: HIVE-21458
 URL: https://issues.apache.org/jira/browse/HIVE-21458
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


In the transactional subsystems, in several places we check to see if a data 
file has ROW__ID fields or not. Every time we do that (even within the context 
of the same query), we open a Reader for that file/split. We could optimize 
this by caching. Also, perhaps we don't need to do this for every split. An 
example call stack:
{code}
OrcFile.createReader(Path, OrcFile$ReaderOptions) line: 105
AcidUtils$MetaDataFile.isRawFormatFile(Path, FileSystem) line: 2026
AcidUtils$MetaDataFile.isRawFormat(Path, FileSystem) line: 2022
AcidUtils.parsedDelta(Path, String, FileSystem) line: 1007
OrcRawRecordMerger$TransactionMetaData.findWriteIDForSynthetcRowIDs(Path, Path, Configuration) line: 1231
OrcRawRecordMerger.discoverOriginalKeyBounds(Reader, int, Reader$Options, Configuration, OrcRawRecordMerger$Options) line: 722
OrcRawRecordMerger.<init>(Configuration, boolean, Reader, boolean, int, ValidWriteIdList, Reader$Options, Path[], OrcRawRecordMerger$Options) line: 1022
OrcInputFormat.getReader(InputSplit, Options) line: 2108
OrcInputFormat.getRecordReader(InputSplit, JobConf, Reporter) line: 2006
FetchOperator$FetchInputFormatSplit.getRecordReader(JobConf) line: 776
FetchOperator.getRecordReader() line: 344
FetchOperator.getNextRow() line: 540
FetchOperator.pushRow() line: 509
FetchTask.fetch(List) line: 146
{code} 

Here, for each split we'll make that check.
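
The caching idea could look roughly like the Java sketch below. It is only an 
illustration under assumptions: RawFormatCache is a hypothetical class, paths 
are plain strings, and the expensive check (opening a Reader) is passed in as 
a predicate rather than calling any real Hive or ORC API. It also ignores 
cache invalidation and scoping to a single query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Hypothetical memoizing wrapper; not part of Hive.
final class RawFormatCache {
    private final Map<String, Boolean> cache = new ConcurrentHashMap<>();
    private final Predicate<String> expensiveCheck;

    // expensiveCheck stands in for "open a Reader and inspect the file".
    RawFormatCache(Predicate<String> expensiveCheck) {
        this.expensiveCheck = expensiveCheck;
    }

    // Runs the expensive check at most once per path, then reuses the answer.
    boolean isRawFormat(String path) {
        return cache.computeIfAbsent(path, expensiveCheck::test);
    }

    public static void main(String[] args) {
        RawFormatCache cache = new RawFormatCache(path -> {
            System.out.println("opening reader for " + path);
            return true;
        });
        cache.isRawFormat("/warehouse/t/000000_0");
        cache.isRawFormat("/warehouse/t/000000_0"); // served from the cache
    }
}
```

Keying on the file path means several splits of the same file share one 
check, which addresses the "perhaps we don't need to do this for every split" 
point as well.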





[jira] [Created] (HIVE-21451) ACID: Avoid using hive.acid.key.index to determine if the file is original or not

2019-03-15 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21451:
---

 Summary: ACID: Avoid using hive.acid.key.index to determine if the 
file is original or not
 Key: HIVE-21451
 URL: https://issues.apache.org/jira/browse/HIVE-21451
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


The transactional files written by Hive have each row decorated with a ROW__ID 
column. However, when we bring files into transactional tables using the 
LOAD DATA... command, they do not have these metadata columns (in Hive ACID 
parlance, these are called original files). These original files are decorated 
with an inferred ROW__ID that is generated while reading them. However, after 
they are compacted, the ROW__ID metadata column becomes part of the file itself.

To determine whether a file is original or not, we currently check for the 
presence of hive.acid.key.index. For query-based compaction, we currently do 
not write hive.acid.key.index (HIVE-21165). This means there is a possibility 
that even after compaction, the files get treated as original files.

Irrespective of HIVE-21165, we should avoid using hive.acid.key.index to decide 
whether the file is original or not, and instead look for the presence of 
ROW__ID. hive.acid.key.index should be treated as a performance 
optimization, as it was seemingly meant to be.
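
The difference between the two checks can be sketched in Java as below. This 
is a hedged model, not Hive's implementation: OriginalFileCheck is a 
hypothetical class, file metadata and schema are passed in as plain 
collections, and the list of ACID field names is an assumption about how the 
ROW__ID information appears in a file's top-level schema.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical comparison of the two strategies; not part of Hive.
final class OriginalFileCheck {
    static final String KEY_INDEX = "hive.acid.key.index";

    // Assumed top-level field names of a file that carries ROW__ID data.
    static final List<String> ACID_FIELDS = Arrays.asList(
        "operation", "originalTransaction", "bucket",
        "rowId", "currentTransaction", "row");

    private OriginalFileCheck() {}

    // Current approach: a file without the key index is treated as original.
    static boolean isOriginalByKeyIndex(Set<String> userMetadataKeys) {
        return !userMetadataKeys.contains(KEY_INDEX);
    }

    // Proposed approach: a file is original only if the ROW__ID fields
    // are missing from its schema.
    static boolean isOriginalBySchema(List<String> topLevelFields) {
        return !topLevelFields.containsAll(ACID_FIELDS);
    }

    public static void main(String[] args) {
        // A query-compacted file: ROW__ID fields present, no key index.
        Set<String> metadata = java.util.Collections.emptySet();
        System.out.println("by key index: " + isOriginalByKeyIndex(metadata));
        System.out.println("by schema:    " + isOriginalBySchema(ACID_FIELDS));
    }
}
```

For a query-compacted file without the key index, the first check reports 
"original" while the schema check does not, which is the misclassification 
described above.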





Re: Null pointer exception on running compaction against an MM table

2019-02-18 Thread Vaibhav Gumashta
The approach is similar, but it is not identical. Let me go over the query 
based compaction codepath to see if I spot this bug there.

Thanks,
--Vaibhav

From: Aditya Shah
Date: Saturday, February 16, 2019 at 3:44 AM
To: Vaibhav Gumashta
Cc: "dev@hive.apache.org", Eugene Koifman, Gopal Vijayaraghavan
Subject: Re: Null pointer exception on running compaction against an MM table

Hi,

Thanks for the reply. I have opened a JIRA (HIVE-21280) for this and will 
upload a patch soon. I also have a question about the new query-based 
compactor for full CRUD tables that went into master in HIVE-20699. Does major 
compaction work there using a query-based compactor similar to the one for MM 
tables? If so, I expect the same problem to exist there as well.

Aditya


On Sat, Feb 16, 2019 at 2:34 AM Vaibhav Gumashta 
<vgumas...@hortonworks.com> wrote:
Aditya,

Thanks for reporting this. Would you like to create a jira for this 
(https://issues.apache.org/jira/projects/HIVE)? Additionally, if you would like 
to work on a fix, I’m happy to help in reviewing.

--Vaibhav

From: Aditya Shah <adityashah3...@gmail.com>
Date: Friday, February 15, 2019 at 2:05 AM
To: "dev@hive.apache.org" <dev@hive.apache.org>
Cc: Eugene Koifman <ekoif...@hortonworks.com>, 
Vaibhav Gumashta <vgumas...@hortonworks.com>, 
Gopal Vijayaraghavan <go...@hortonworks.com>
Subject: Null pointer exception on running compaction against an MM table

Hi,

I was trying to run compaction on an MM table but got a null pointer exception 
while getting the HDFS session path. It seems to me that the session state was 
not started for these queries. Am I missing something here? I do think session 
state needs to be started for each of the queries (insert into temp table etc.) 
running for compaction (I also have doubts about the statsupdater thread's 
queries) on HMS. Some details are as follows:

Env./Versions: Using Hive-3.1.1 (rel/release-3.1.1)

Steps to reproduce:
1) Using beeline with HS2 and HMS
2) create an MM table
3) Insert a few values in the table
4) alter table mm_table compact 'major' and wait;
Stack trace on HMS:

compactor.Worker: Caught exception while trying to compact 
id:8,dbname:default,tableName:acid_mm_orc,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestWriteId:0.
  Marking failed to avoid repeated failures, java.io.IOException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create 
temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` 
string)  ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH 
SERDEPROPERTIES (
  'serialization.format'='1')STORED AS INPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 
'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base'
 TBLPROPERTIES ('transactional'='false')
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:373)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:241)
at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run 
create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, 
`b` string)  ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH 
SERDEPROPERTIES (
  'serialization.format'='1')STORED AS INPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 
'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base'
 TBLPROPERTIES ('transactional'='false')
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:525)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:365)
... 2 more
Caused by: java.lang.NullPointerException: Non-local session path expected to be non-null
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:815)
at org.apache.hadoop.hive.ql.Context.<init>(Context.java:309)
at org.apache.hadoop.hive.ql.Context.<init>(Context.java:295)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:591)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1684)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1807)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1556)
at org.apache.hadoop.hive

Re: Null pointer exception on running compaction against an MM table

2019-02-15 Thread Vaibhav Gumashta
Aditya,

Thanks for reporting this. Would you like to create a jira for this 
(https://issues.apache.org/jira/projects/HIVE)? Additionally, if you would like 
to work on a fix, I’m happy to help in reviewing.

--Vaibhav

From: Aditya Shah
Date: Friday, February 15, 2019 at 2:05 AM
To: "dev@hive.apache.org"
Cc: Eugene Koifman, Vaibhav Gumashta, Gopal Vijayaraghavan
Subject: Null pointer exception on running compaction against an MM table

Hi,

I was trying to run compaction on an MM table but got a null pointer exception 
while getting the HDFS session path. It seems to me that the session state was 
not started for these queries. Am I missing something here? I do think session 
state needs to be started for each of the queries (insert into temp table etc.) 
running for compaction (I also have doubts about the statsupdater thread's 
queries) on HMS. Some details are as follows:

Env./Versions: Using Hive-3.1.1 (rel/release-3.1.1)

Steps to reproduce:
1) Using beeline with HS2 and HMS
2) create an MM table
3) Insert a few values in the table
4) alter table mm_table compact 'major' and wait;
Stack trace on HMS:

compactor.Worker: Caught exception while trying to compact 
id:8,dbname:default,tableName:acid_mm_orc,partName:null,state:^@,type:MAJOR,properties:null,runAs:null,tooManyAborts:false,highestWriteId:0.
  Marking failed to avoid repeated failures, java.io.IOException: 
org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run create 
temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, `b` 
string)  ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH 
SERDEPROPERTIES (
  'serialization.format'='1')STORED AS INPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 
'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base'
 TBLPROPERTIES ('transactional'='false')
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:373)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.run(CompactorMR.java:241)
at org.apache.hadoop.hive.ql.txn.compactor.Worker.run(Worker.java:174)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to run 
create temporary table default.tmp_compactor_acid_mm_orc_1550222367257(`a` int, 
`b` string)  ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'WITH 
SERDEPROPERTIES (
  'serialization.format'='1')STORED AS INPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat' LOCATION 
'hdfs://localhost:9000/user/hive/warehouse/acid_mm_orc/_tmp_2d8a096c-2db5-4ed8-921c-b3f6d31e079e/_base'
 TBLPROPERTIES ('transactional'='false')
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:525)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runMmCompaction(CompactorMR.java:365)
... 2 more
Caused by: java.lang.NullPointerException: Non-local session path expected to be non-null
at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:228)
at org.apache.hadoop.hive.ql.session.SessionState.getHDFSSessionPath(SessionState.java:815)
at org.apache.hadoop.hive.ql.Context.<init>(Context.java:309)
at org.apache.hadoop.hive.ql.Context.<init>(Context.java:295)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:591)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1684)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1807)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1567)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1556)
at org.apache.hadoop.hive.ql.txn.compactor.CompactorMR.runOnDriver(CompactorMR.java:522)
... 3 more

Observations:
  1) SessionState.start() initializes paths, hivehist etc.
  2) SessionState is not started in setupSessionState() in runMmCompaction(). 
(There is also a comment by Sergey in the code regarding the same.)
  3) Even after making it start the session state, it fails further on when 
running a TezTask for the insert overwrite on the temp table with the contents 
of the original table.
  4) The cause of 3) is that the Tez session state cannot initialize, because an 
IllegalArgumentException is thrown while setting up the caller context in the 
Tez task when the caller id is empty.
  5) The reason for 4) is that the query id is an empty string for such queries.
  6) A possible solution for 5): build the QueryState with a query id in 
runOnDriver() in DriverUtils.java.
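
The chain of failures in the observations above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToySession`, `set_caller_context` and `run_on_driver` are hypothetical Python stand-ins, not Hive or Tez APIs, and the path and context strings are made up. They only mimic the two checks described: a non-null session path first, then a non-empty caller id.

```python
class ToySession:
    """Hypothetical stand-in for a session object; not Hive's SessionState."""

    def __init__(self, query_id=""):
        self.hdfs_session_path = None   # stays None until start() is called
        self.query_id = query_id        # empty string models observation 5)

    def start(self):
        # Models observation 1): starting the session initializes its paths.
        self.hdfs_session_path = "/tmp/toy-session"  # made-up path

    def get_hdfs_session_path(self):
        # Models the checkNotNull() guard seen in the stack trace.
        if self.hdfs_session_path is None:
            raise ValueError("Non-local session path expected to be non-null")
        return self.hdfs_session_path


def set_caller_context(query_id):
    # Models observation 4): an empty caller id is rejected.
    if not query_id:
        raise ValueError("caller id must not be empty")
    return "caller:" + query_id  # made-up format


def run_on_driver(session, query):
    # Both checks must pass before the query could run.
    scratch = session.get_hdfs_session_path()
    context = set_caller_context(session.query_id)
    return scratch, context, query
```

In this model, calling `run_on_driver` on an unstarted session fails at the first check; starting the session only moves the failure to the second check, and both the start and a non-empty query id are needed for the call to succeed.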

Do let me know if you need some more information for the same.

Thanks and Regards,
Aditya Shah
5th Year
M.Sc.(Hons.)  Mathematics & B.E.(Hons.) Computer Science and Engineering

Birla Institute of Technology & Science, Pilani
Vidhya Vihar

Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-02-01 Thread Vaibhav Gumashta


> On Jan. 29, 2019, 2:04 a.m., Eugene Koifman wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
> > Lines 299 (patched)
> > <https://reviews.apache.org/r/69367/diff/7-9/?file=2121174#file2121174line328>
> >
> > testMoreBucketsThanReducers/testMoreBucketsThanReducers2 in 
> > TestTxnCommands force a specific number of reducers

I used conf.setIntVar(HiveConf.ConfVars.HADOOPNUMREDUCERS, 2) on an unbucketed 
table. However, the inserts created 2 different buckets. For example:

vgumashta:hive vgumashta$ 
../orc-git/build/_CPack_Packages/Darwin/TGZ/ORC-1.5.4-Darwin/bin/orc-contents  
/Users/vgumashta/Documents/workspace/hive/itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-1549010487358_1258013156/warehouse/testcompactionwithschemaevolutionnobucketsmultiplereducers/ds\=today/delta_003_003_/bucket_0
 
{"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 0, 
"currentTransaction": 3, "row": {"a": 3, "b": 3, "c": 1001}}
{"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 1, 
"currentTransaction": 3, "row": {"a": 4, "b": 2, "c": 1003}}
{"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 2, 
"currentTransaction": 3, "row": {"a": 4, "b": 4, "c": 1005}}

vgumashta:hive vgumashta$ 
../orc-git/build/_CPack_Packages/Darwin/TGZ/ORC-1.5.4-Darwin/bin/orc-contents  
/Users/vgumashta/Documents/workspace/hive/itests/hive-unit/target/tmp/org.apache.hadoop.hive.ql.txn.compactor.TestCrudCompactorOnTez-1549010487358_1258013156/warehouse/testcompactionwithschemaevolutionnobucketsmultiplereducers/ds\=yesterday/delta_003_003_/bucket_1
 
{"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 0, 
"currentTransaction": 3, "row": {"a": 3, "b": 2, "c": 1000}}
{"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 1, 
"currentTransaction": 3, "row": {"a": 3, "b": 4, "c": 1002}}
{"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 2, 
"currentTransaction": 3, "row": {"a": 4, "b": 3, "c": 1004}}


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/#review212399
---





Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-28 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Jan. 28, 2019, 7:49 p.m.)


Review request for hive and Eugene Koifman.


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b3a475478d 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
d6a41919bf 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java e7aa041c25 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
15c14c9be5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
fbb931cbcd 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 6d4578e7a0 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
db3b427adc 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
dc05e1990e 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java a0df82cb20 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out c9716e904c 


Diff: https://reviews.apache.org/r/69367/diff/9/

Changes: https://reviews.apache.org/r/69367/diff/8-9/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-25 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Jan. 26, 2019, 12:32 a.m.)


Review request for hive and Eugene Koifman.


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b3a475478d 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
d6a41919bf 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java e7aa041c25 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
15c14c9be5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
fbb931cbcd 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 6d4578e7a0 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
db3b427adc 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
dc05e1990e 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java a0df82cb20 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out c9716e904c 


Diff: https://reviews.apache.org/r/69367/diff/8/

Changes: https://reviews.apache.org/r/69367/diff/7-8/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-25 Thread Vaibhav Gumashta
.org/r/69367/diff/7/?file=2121179#file2121179line638>
> >
> > is there a followup Jira for this?

https://jira.apache.org/jira/browse/HIVE-21165


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 248 (patched)
> > <https://reviews.apache.org/r/69367/diff/7/?file=2121182#file2121182line248>
> >
> > What does this do for MM table?

That was a bug; fixed it.


> On Jan. 23, 2019, 1:23 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
> > Lines 91 (patched)
> > <https://reviews.apache.org/r/69367/diff/7/?file=2121184#file2121184line91>
> >
> > when is it ok for 2 consecutive ROW_IDs to be equal?

Now throwing an exception if the comparison returns 0.
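
A minimal sketch of the check described in this reply, assuming each ROW_ID is modelled as a (writeId, bucket, rowId) tuple compared lexicographically. `validate_acid_sort_order` is a hypothetical name, and this is not the code of GenericUDFValidateAcidSortOrder; it only shows why a comparison result of 0 (two equal consecutive ROW_IDs) is treated as an error alongside an out-of-order pair.

```python
def validate_acid_sort_order(row_ids):
    """Return the number of rows checked; raise if order is not strictly increasing.

    row_ids: iterable of (write_id, bucket, row_id) tuples (assumed model).
    """
    previous = None
    count = 0
    for current in row_ids:
        if previous is not None:
            if current == previous:
                # Comparison returned 0: a duplicate ROW_ID.
                raise ValueError("duplicate ROW_ID: %r" % (current,))
            if current < previous:
                # Comparison returned a negative value: rows are out of order.
                raise ValueError("out of order: %r after %r" % (current, previous))
        previous = current
        count += 1
    return count
```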


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/#review212198
---





[jira] [Created] (HIVE-21167) Bucketing: Bucketing version 1 is incorrectly partitioning data

2019-01-25 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21167:
---

 Summary: Bucketing: Bucketing version 1 is incorrectly 
partitioning data
 Key: HIVE-21167
 URL: https://issues.apache.org/jira/browse/HIVE-21167
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


Using murmur hash for bucketing columns was introduced in HIVE-18910, following 
which {{'bucketing_version'='1'}} stands for the old behaviour (where, for 
example, integer columns were partitioned based on mod values). It looks like we 
now have a bug in the old bucketing scheme. I could repro it by modifying the 
existing schema with an alter table add columns and then inserting new data. Repro:

{code}
0: jdbc:hive2://localhost:10010> create transactional table acid_ptn_bucket1 (a 
int, b int) partitioned by(ds string) clustered by (a) into 2 buckets stored as 
ORC TBLPROPERTIES('bucketing_version'='1', 'transactional'='true', 
'transactional_properties'='default');

No rows affected (0.418 seconds)

0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
values(1,2,'today'),(1,3,'today'),(1,4,'yesterday'),(2,2,'yesterday'),(2,3,'today'),(2,4,'today');
6 rows affected (3.695 seconds)
{code}

Data from ORC file (data as expected):
{code}
/apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_0
{"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 0, 
"currentTransaction": 1, "row": {"a": 2, "b": 4}}
{"operation": 0, "originalTransaction": 1, "bucket": 536870912, "rowId": 1, 
"currentTransaction": 1, "row": {"a": 2, "b": 3}}


/apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_001_001_/bucket_1
{"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 0, 
"currentTransaction": 1, "row": {"a": 1, "b": 3}}
{"operation": 0, "originalTransaction": 1, "bucket": 536936448, "rowId": 1, 
"currentTransaction": 1, "row": {"a": 1, "b": 2}}
{code}
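
The "bucket" values in the dump are encoded integers rather than plain bucket numbers. The sketch below decodes them, assuming the version-1 bucket property layout used by Hive ACID (top 3 bits: codec version, bits 16 to 27: bucket id, low 12 bits: statement id). The function name is hypothetical and the layout should be treated as an assumption to verify against the Hive source.

```python
def decode_bucket_property(value):
    """Split an encoded ACID 'bucket' value into (version, bucket_id, statement_id).

    Assumed layout: 3 version bits at the top of a 32-bit int, a 12-bit
    bucket id starting at bit 16, and a 12-bit statement id in the low bits.
    """
    version = (value >> 29) & 0x7
    bucket_id = (value >> 16) & 0xFFF
    statement_id = value & 0xFFF
    return version, bucket_id, statement_id
```

Under that assumption, 536870912 (the value in the bucket_0 files) decodes to bucket id 0 and 536936448 (the value in the bucket_1 files) to bucket id 1, which matches the file names.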

Modifying table schema and inserting new data:
{code}
0: jdbc:hive2://localhost:10010> alter table acid_ptn_bucket1 add columns(c 
int);

No rows affected (0.541 seconds)

0: jdbc:hive2://localhost:10010> insert into acid_ptn_bucket1 partition (ds) 
values(3,2,1000,'yesterday'),(3,3,1001,'today'),(3,4,1002,'yesterday'),(4,2,1003,'today'),
 (4,3,1004,'yesterday'),(4,4,1005,'today');
6 rows affected (3.699 seconds)
{code}

Data from ORC file (wrong partitioning):
{code}
/apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_0
{"operation": 0, "originalTransaction": 3, "bucket": 536870912, "rowId": 0, 
"currentTransaction": 3, "row": {"a": 3, "b": 3, "c": 1001}}

/apps/hive/warehouse/acid_ptn_bucket1/ds=today/delta_003_003_/bucket_1
{"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 0, 
"currentTransaction": 3, "row": {"a": 4, "b": 4, "c": 1005}}
{"operation": 0, "originalTransaction": 3, "bucket": 536936448, "rowId": 1, 
"currentTransaction": 3, "row": {"a": 4, "b": 2, "c": 1003}}
{code}

As seen above, the expected behaviour is that new rows with column 'a' = 3 
should go to bucket_1 and rows with 'a' = 4 should go to bucket_0 (mod 2), but 
in ds=today the row with 'a' = 3 landed in bucket_0 and the rows with 'a' = 4 
landed in bucket_1.
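
The expected placement can be checked with a few lines of arithmetic. This is a sketch, assuming the old scheme for an int column reduces to taking the non-negative value modulo the number of buckets, as the description above suggests. `legacy_int_bucket` and `misplaced` are hypothetical helper names, and the observed placements are copied from the ds=today dumps in this report.

```python
def legacy_int_bucket(value, num_buckets):
    # Assumed old behaviour: clear the sign bit of the 32-bit value, then mod.
    return (value & 0x7FFFFFFF) % num_buckets


# Value of column 'a' -> bucket file it was observed in (ds=today).
OBSERVED_BEFORE_ALTER = {2: 0, 1: 1}
OBSERVED_AFTER_ALTER = {3: 0, 4: 1}


def misplaced(observed, num_buckets=2):
    """Map each misplaced value to (expected_bucket, observed_bucket)."""
    result = {}
    for value, got in observed.items():
        expected = legacy_int_bucket(value, num_buckets)
        if expected != got:
            result[value] = (expected, got)
    return result
```

With these inputs, nothing is reported for the rows written before the alter table, while both values written afterwards are flagged.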





--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-21165) ACID: pass query hint to the writers to write hive.acid.key.index

2019-01-24 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21165:
---

 Summary: ACID: pass query hint to the writers to write 
hive.acid.key.index
 Key: HIVE-21165
 URL: https://issues.apache.org/jira/browse/HIVE-21165
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


For the query based compactor from HIVE-20699, the compaction runs as a SQL 
query. However, this mechanism skips writing hive.acid.key.index for each 
stripe, which readers use to skip over stripes that are not supposed to be read. 
We need a way to pass a query hint to the writer so that it can write this index 
data when invoked from a SQL query.
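
To show why the index matters, here is a sketch of stripe skipping. It assumes the index stores the last key of each stripe as "writeId,bucket,rowId;" entries and that keys are strictly increasing across the file; both are assumptions to verify against the writer code. `parse_key_index` and `stripes_to_read` are hypothetical names, not Hive APIs.

```python
def parse_key_index(text):
    """Parse 'w,b,r;w,b,r;' into a list of (write_id, bucket, row_id) tuples."""
    keys = []
    for part in text.split(";"):
        if part:
            write_id, bucket, row_id = part.split(",")
            keys.append((int(write_id), int(bucket), int(row_id)))
    return keys


def stripes_to_read(last_keys, min_key, max_key):
    """Indices of stripes that may hold a key k with min_key <= k <= max_key."""
    selected = []
    previous_last = None
    for index, last in enumerate(last_keys):
        # Stripe `index` holds keys in the range (previous_last, last].
        starts_after_range = previous_last is not None and previous_last >= max_key
        ends_before_range = last < min_key
        if not starts_after_range and not ends_before_range:
            selected.append(index)
        previous_last = last
    return selected
```

Without the index, a reader in this model has no per-stripe last keys and cannot rule any stripe out, so it would have to read all of them.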





[jira] [Created] (HIVE-21164) ACID: explore how we can avoid a move step during compaction

2019-01-24 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21164:
---

 Summary: ACID: explore how we can avoid a move step during 
compaction
 Key: HIVE-21164
 URL: https://issues.apache.org/jira/browse/HIVE-21164
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


Currently, we write compacted data to a temporary location and then move the 
files to a final location, which is an expensive operation on some cloud file 
systems. Since HIVE-20823 is already in, it can control the visibility of 
compacted data for the readers. Therefore, we can perhaps avoid writing data to 
a temporary location and directly write compacted data to the intended final 
path.





Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-21 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Jan. 22, 2019, 7:04 a.m.)


Review request for hive and Eugene Koifman.


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b213609f39 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
d6a41919bf 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java bbe7fb0697 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
15c14c9be5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
fbb931cbcd 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 6d4578e7a0 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
0e5b3e5473 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
dc05e1990e 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java a0df82cb20 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out 0fdcbda66f 


Diff: https://reviews.apache.org/r/69367/diff/7/

Changes: https://reviews.apache.org/r/69367/diff/6-7/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-21137) JDBC: HiveDatabaseMetaData.getTables does not adhere to jdbc spec

2019-01-18 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21137:
---

 Summary: JDBC: HiveDatabaseMetaData.getTables does not adhere to 
jdbc spec
 Key: HIVE-21137
 URL: https://issues.apache.org/jira/browse/HIVE-21137
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 2.3.4, 3.1.1
Reporter: Vaibhav Gumashta
 Attachments: HiveJdbcClient.java

The {{types}} parameter in {{HiveDatabaseMetaData.getTables(String catalog, 
String schemaPattern, String tableNamePattern, String[] types)}} is supposed to 
honor only the return values from {{HiveDatabaseMetaData.getTableTypes}}. 
However, the following is the output from the attached test JDBC program: 

{code}
*** Using dbMetadata.getTables ***


*** With only EXTERNAL TABLE ***
Table: test1
Table: test_2
Table: names_text
Table: names_text_1


*** With only TABLE ***
Table: test1
Table: test_2
Table: names_text
Table: names_text_1


*** With only EXTERNAL_TABLE ***
Table: test1
Table: test_2
Table: names_text
Table: names_text_1


*** With empty array ***
Table: test1
Table: test_2
Table: names_text
Table: names_text_1


*** With VIEW ***


*** With INDEX_TABLE ***
Table: test1
Table: test_2
Table: names_text
Table: names_text_1


*** With VIEW, INDEX_TABLE ***


*** With EXTERNAL_TABLE, VIEW, INDEX_TABLE ***


*** With TABLE, VIEW, INDEX_TABLE ***
Table: test1
Table: test_2
Table: names_text
Table: names_text_1


*** With a random string ***
Table: test1
Table: test_2
Table: names_text
Table: names_text_1


*** getTableTypes ***
Table: TABLE
Table: TABLE
Table: VIEW
Table: MATERIALIZED_VIEW
{code}

We should fix the API so that clients see the behaviour the JDBC spec describes.
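
For reference, the filtering a client would expect can be sketched as follows. The JDBC documentation for getTables says that a null {{types}} argument returns all types and that any other value lists the table types to include. The sketch models that with plain Python data; `filter_tables` and the sample table names are hypothetical, and treating an empty array or an unknown type string as matching nothing is an assumption of this sketch rather than something the spec states.

```python
def filter_tables(tables, types):
    """Return names of tables whose type is listed in `types`.

    tables: list of (table_name, table_type) pairs.
    types:  None for "all types", otherwise an iterable of type names.
    """
    if types is None:
        return [name for name, _ in tables]
    wanted = set(types)
    return [name for name, table_type in tables if table_type in wanted]


# Hypothetical catalog contents, for illustration only.
SAMPLE = [
    ("managed_t", "TABLE"),
    ("ext_t", "EXTERNAL_TABLE"),
    ("some_view", "VIEW"),
]
```

In the output above, every non-empty result set is the same four tables whether the request was for TABLE, EXTERNAL_TABLE, INDEX_TABLE or a random string, so the type names do not appear to select distinct sets the way this model does.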





Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-18 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Jan. 18, 2019, 10:41 p.m.)


Review request for hive and Eugene Koifman.


Changes
---

Rebased on master


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b213609f39 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
d6a41919bf 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java bbe7fb0697 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
15c14c9be5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
fbb931cbcd 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 6d4578e7a0 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
0e5b3e5473 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
dc05e1990e 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java a0df82cb20 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out 0fdcbda66f 


Diff: https://reviews.apache.org/r/69367/diff/6/

Changes: https://reviews.apache.org/r/69367/diff/5-6/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-18 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Jan. 18, 2019, 9:40 p.m.)


Review request for hive and Eugene Koifman.


Changes
---

Rebased on master


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b213609f39 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
d6a41919bf 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java bbe7fb0697 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
15c14c9be5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
fbb931cbcd 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 6d4578e7a0 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
0e5b3e5473 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
dc05e1990e 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java a0df82cb20 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 
  ql/src/test/results/clientpositive/show_functions.q.out 0fdcbda66f 


Diff: https://reviews.apache.org/r/69367/diff/5/

Changes: https://reviews.apache.org/r/69367/diff/4-5/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-10 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Jan. 10, 2019, 7:23 p.m.)


Review request for hive and Eugene Koifman.


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java b213609f39 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
d6a41919bf 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d7f069eaa7 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
15c14c9be5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
fbb931cbcd 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 6d4578e7a0 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
b477480e0a 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
7d5ee4a59f 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java a0df82cb20 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/69367/diff/4/

Changes: https://reviews.apache.org/r/69367/diff/3-4/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-21105) LLAP: provide an option to configure #retries and time between retries for LLAP-ZK SecretManager

2019-01-08 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21105:
---

 Summary: LLAP: provide an option to configure #retries and time 
between retries for LLAP-ZK SecretManager
 Key: HIVE-21105
 URL: https://issues.apache.org/jira/browse/HIVE-21105
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 2.3.4, 3.1.1
Reporter: Vaibhav Gumashta








Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2019-01-08 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Jan. 8, 2019, 7:18 p.m.)


Review request for hive and Eugene Koifman.


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65264f323f 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
40dd992455 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactorOnTez.java
 PRE-CREATION 
  pom.xml 26b662e4c3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 578b16cc7c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HiveSplitGenerator.java 
15c14c9be5 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
8cabf960db 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
6e7c78bd17 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
92c74e1d06 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java beb6902d7d 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/69367/diff/3/

Changes: https://reviews.apache.org/r/69367/diff/2-3/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-21080) Update Hive to use ORC-1.5.4

2019-01-02 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-21080:
---

 Summary: Update Hive to use ORC-1.5.4
 Key: HIVE-21080
 URL: https://issues.apache.org/jira/browse/HIVE-21080
 Project: Hive
  Issue Type: Bug
  Components: ORC
Reporter: Vaibhav Gumashta


Now that ORC-1.5.4 is released, we should update Hive's version of ORC so that 
HIVE-20699 can use it





Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2018-11-19 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

(Updated Nov. 19, 2018, 11:49 a.m.)


Review request for hive and Eugene Koifman.


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65264f323f 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
40dd992455 
  pom.xml 26b662e4c3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 578b16cc7c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
8cabf960db 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
6e7c78bd17 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
92c74e1d06 
  
ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFValidateAcidSortOrder.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/69367/diff/2/

Changes: https://reviews.apache.org/r/69367/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2018-11-18 Thread Vaibhav Gumashta


> On Nov. 16, 2018, 3:02 a.m., Eugene Koifman wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 2685 (patched)
> > <https://reviews.apache.org/r/69367/diff/1/?file=2108425#file2108425line2685>
> >
> > "And minor compaction will be disabled." - should make sure Initiator 
> > doesn't start minor and that Alter Table commands requesting Minor are 
> > no-op or throw so that these don't get into the compactor queue.  We should 
> > also, perhaps think about how Initiator triggers Major compactions - are 
> > current config params adequate?  Should do at least the 2nd part in a 
> > follow up jira, maybe both.
> 
> Vaibhav Gumashta wrote:
> Created https://jira.apache.org/jira/browse/HIVE-20933

Never mind, I'm addressing this as part of this jira itself: it's a small 
change, and leaving it out might let garbage data accumulate in the metastore 
table.


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/#review210589
-------


On Nov. 16, 2018, 12:59 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69367/
> ---
> 
> (Updated Nov. 16, 2018, 12:59 a.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Bugs: HIVE-20699
> https://issues.apache.org/jira/browse/HIVE-20699
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://jira.apache.org/jira/browse/HIVE-20699
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65264f323f 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
> 40dd992455 
>   pom.xml 26b662e4c3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 92c74e1d06 
> 
> 
> Diff: https://reviews.apache.org/r/69367/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Re: Review Request 69367: Query based compactor for full CRUD Acid tables

2018-11-18 Thread Vaibhav Gumashta


> On Nov. 16, 2018, 3:02 a.m., Eugene Koifman wrote:
> >

Addressed the comments; will upload changes in the patch.


> On Nov. 16, 2018, 3:02 a.m., Eugene Koifman wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
> > Lines 2685 (patched)
> > <https://reviews.apache.org/r/69367/diff/1/?file=2108425#file2108425line2685>
> >
> > "And minor compaction will be disabled." - should make sure Initiator 
> > doesn't start minor and that Alter Table commands requesting Minor are 
> > no-op or throw so that these don't get into the compactor queue.  We should 
> > also, perhaps think about how Initiator triggers Major compactions - are 
> > current config params adequate?  Should do at least the 2nd part in a 
> > follow up jira, maybe both.

Created https://jira.apache.org/jira/browse/HIVE-20933


> On Nov. 16, 2018, 3:02 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 245 (patched)
> > <https://reviews.apache.org/r/69367/diff/1/?file=2108430#file2108430line252>
> >
> > Ideally this should be prevented before it gets into the 
> > compaction_queue. Throwing here will cause failed compactions to accumulate 
> > in SHOW COMPACTIONS and prevent auto-scheduling of new ones.

Created https://jira.apache.org/jira/browse/HIVE-20933 for this.


> On Nov. 16, 2018, 3:02 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 399 (patched)
> > <https://reviews.apache.org/r/69367/diff/1/?file=2108430#file2108430line406>
> >
> > should this be in a finally{}?  SessionState is threadLocal so it may 
> > get reused... or do we shutdown the session each time?

Made the change. We do remove the threadlocal via SessionState.detachSession() 
in the finally block in DriverUtils.runOnDriver
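
For reference, the cleanup pattern being discussed can be sketched roughly as 
below. This is only an illustration: SessionHolder and runWithSession are 
hypothetical names, and this is not the actual SessionState or DriverUtils code.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a thread-local session holder (not Hive's SessionState).
final class SessionHolder {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    static void attach(String sessionId) { CURRENT.set(sessionId); }

    static void detach() { CURRENT.remove(); }

    static String current() { return CURRENT.get(); }

    // Runs the work with a session attached and always detaches afterwards,
    // so a reused thread never sees a stale session, even if the work throws.
    static <T> T runWithSession(String sessionId, Supplier<T> work) {
        attach(sessionId);
        try {
            return work.get();
        } finally {
            detach();
        }
    }
}
```

Because the detach sits in the finally block, a worker thread that gets reused 
starts the next run without a leftover session.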


> On Nov. 16, 2018, 3:02 a.m., Eugene Koifman wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 513 (patched)
> > <https://reviews.apache.org/r/69367/diff/1/?file=2108430#file2108430line521>
> >
> > why do you need partition key/values in the query? we are always 
> > reading a single partition.  This is achieved by getAcidState() which takes 
> > partition dir as input (i.e. all the files it returns are within a given 
> > partition)

As discussed, I'll throw an exception from SplitGrouper if we somehow end up 
getting splits across partitions in there (unlikely, since the SQL query we run 
adds a partition filter).
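
A rough sketch of the kind of check meant here, assuming each split can be 
mapped to the partition directory it lives in. SinglePartitionCheck and its 
method are hypothetical names, and plain strings stand in for the real path 
objects; this is not the SplitGrouper code from the patch.

```java
import java.util.List;

final class SinglePartitionCheck {
    // Returns the partition directory shared by all splits, or throws if the
    // splits come from more than one partition directory.
    static String requireSinglePartition(List<String> splitPartitionDirs) {
        if (splitPartitionDirs.isEmpty()) {
            throw new IllegalArgumentException("no splits to group");
        }
        String first = splitPartitionDirs.get(0);
        for (String dir : splitPartitionDirs) {
            if (!first.equals(dir)) {
                throw new IllegalStateException(
                    "splits span multiple partitions: " + first + " and " + dir);
            }
        }
        return first;
    }
}
```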


- Vaibhav


-------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/#review210589
---


On Nov. 16, 2018, 12:59 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69367/
> ---
> 
> (Updated Nov. 16, 2018, 12:59 a.m.)
> 
> 
> Review request for hive and Eugene Koifman.
> 
> 
> Bugs: HIVE-20699
> https://issues.apache.org/jira/browse/HIVE-20699
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://jira.apache.org/jira/browse/HIVE-20699
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65264f323f 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
> 40dd992455 
>   pom.xml 26b662e4c3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 92c74e1d06 
> 
> 
> Diff: https://reviews.apache.org/r/69367/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



[jira] [Created] (HIVE-20934) Query based compactor for full CRUD Acid tables

2018-11-16 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20934:
---

 Summary: Query based compactor for full CRUD Acid tables
 Key: HIVE-20934
 URL: https://issues.apache.org/jira/browse/HIVE-20934
 Project: Hive
  Issue Type: Bug
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


Follow up of HIVE-20699. This is to enable running minor compactions as a 
HiveQL query 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20933) Disable minor compactions when using query based compactor

2018-11-16 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20933:
---

 Summary: Disable minor compactions when using query based 
compactor
 Key: HIVE-20933
 URL: https://issues.apache.org/jira/browse/HIVE-20933
 Project: Hive
  Issue Type: Sub-task
  Components: Transactions
Affects Versions: 3.1.1
Reporter: Vaibhav Gumashta


HIVE-20699 introduces a new config, which can enable major compaction to be run 
as a HiveQL query, so that it can take advantage of the underlying execution 
engine and not run compaction as an MR job. This is particularly useful for 
cloud deployments. For now, we will disable minor compactions if this config 
flag is enabled. We will work on enabling minor compactions in a separate jira.
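
As an illustration only, the intended behaviour could look roughly like the 
guard below. CompactionGuard, CompactionType and shouldEnqueue are hypothetical 
names and do not come from the patch; the boolean stands in for the new config 
flag.

```java
final class CompactionGuard {
    enum CompactionType { MAJOR, MINOR }

    // Decides whether a compaction request may enter the queue. When the
    // query-based compactor is enabled, minor requests are rejected up front
    // so they never pile up as failed entries.
    static boolean shouldEnqueue(CompactionType type, boolean queryBasedCompactorEnabled) {
        return !(queryBasedCompactorEnabled && type == CompactionType.MINOR);
    }
}
```

The same check would apply both to requests scheduled automatically and to 
explicit ALTER TABLE ... COMPACT requests.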



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 69367: Query based compactor for full CRUD Acid tables

2018-11-15 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69367/
---

Review request for hive and Eugene Koifman.


Bugs: HIVE-20699
https://issues.apache.org/jira/browse/HIVE-20699


Repository: hive-git


Description
---

https://jira.apache.org/jira/browse/HIVE-20699


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 65264f323f 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
40dd992455 
  pom.xml 26b662e4c3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 7f8bd229a6 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 4d55592b63 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
92c74e1d06 


Diff: https://reviews.apache.org/r/69367/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 69143: CachedStore: Add more UT coverage (outside of .q files)

2018-11-01 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69143/
---

(Updated Nov. 2, 2018, 12:46 a.m.)


Review request for hive, Daniel Dai and Thejas Nair.


Bugs: HIVE-20613
https://issues.apache.org/jira/browse/HIVE-20613


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-20613


Diffs (updated)
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 944c81313a 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 70490f09e7 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 c24e7160ac 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 bb20d9f42a 
  standalone-metastore/metastore-server/src/test/resources/log4j2.properties 
365687e1c9 


Diff: https://reviews.apache.org/r/69143/diff/2/

Changes: https://reviews.apache.org/r/69143/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Review Request 69143: CachedStore: Add more UT coverage (outside of .q files)

2018-10-24 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69143/
---

Review request for hive, Daniel Dai and Thejas Nair.


Bugs: HIVE-20613
https://issues.apache.org/jira/browse/HIVE-20613


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-20613


Diffs
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 944c81313a 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 70490f09e7 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 c24e7160ac 
  
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 bb20d9f42a 
  standalone-metastore/metastore-server/src/test/resources/log4j2.properties 
365687e1c9 


Diff: https://reviews.apache.org/r/69143/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



Re: [ANNOUNCE] New committer: Nishant Bangarwa

2018-10-19 Thread Vaibhav Gumashta
Congrats Nishant.

On 10/19/18, 11:12 AM, "Prasanth Jayachandran"  
wrote:

Congratulations!

> On Oct 19, 2018, at 11:10 AM, Vihang Karajgaonkar 
 wrote:
> 
> Congrats Nishant!
> 
> On Fri, Oct 19, 2018 at 11:02 AM, Deepak Jaiswal 

> wrote:
> 
>> Congratulations!
>> 
>> On 10/19/18, 10:14 AM, "Vineet Garg"  wrote:
>> 
>>Congrats Nishant!
>> 
>>> On Oct 19, 2018, at 8:36 AM, Gunther Hagleitner <
>> ghagleit...@hortonworks.com> wrote:
>>> 
>>> Congrats Nishant!
>>> 
>>> Cheers,
>>> Gunther.
>>> 
>>> From: Andrew Sherman 
>>> Sent: Friday, October 19, 2018 8:34 AM
>>> To: dev@hive.apache.org
>>> Subject: Re: [ANNOUNCE] New committer: Nishant Bangarwa
>>> 
>>> Congratulations Nishant!
>>> 
>>> On Fri, Oct 19, 2018 at 4:29 AM Peter Vary
>> 
>>> wrote:
>>> 
 Congratulations Nishant!
 
> On Oct 19, 2018, at 07:42, Sankar Hariappan <
>> shariap...@hortonworks.com>
 wrote:
> 
> Congrats Nishant!
> 
> Best regards
> Sankar
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On 15/10/18, 12:45 PM, "Ashutosh Chauhan" 
>> wrote:
> 
>> Apache Hive's Project Management Committee (PMC) has invited Nishant
>> Bangarwa to become a committer, and we are pleased to announce that he
>> has accepted.
>>
>> Nishant, welcome, thank you for your contributions, and we look forward
>> to your further interactions with the community!
>> 
>> Ashutosh Chauhan (on behalf of the Apache Hive PMC)
 
 
>>> 
>>> 
>> 
>> 
>> 
>> 





[jira] [Created] (HIVE-20754) JDBC: Add some missing classes to jdbc standalone jar (follow up to HIVE-19801)

2018-10-16 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20754:
---

 Summary: JDBC: Add some missing classes to jdbc standalone jar 
(follow up to HIVE-19801)
 Key: HIVE-20754
 URL: https://issues.apache.org/jira/browse/HIVE-20754
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta


Some more classes are needed in a secure cluster



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20734) Beeline: When beeline-site.xml is present and hive CLI redirects to beeline, it should use the system username/dummy password instead of prompting for one

2018-10-12 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20734:
---

 Summary: Beeline: When beeline-site.xml is present and hive CLI 
redirects to beeline, it should use the system username/dummy password instead 
of prompting for one
 Key: HIVE-20734
 URL: https://issues.apache.org/jira/browse/HIVE-20734
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20676) HiveServer2: PrivilegeSynchronizer is not set to daemon status

2018-10-02 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20676:
---

 Summary: HiveServer2: PrivilegeSynchronizer is not set to daemon 
status
 Key: HIVE-20676
 URL: https://issues.apache.org/jira/browse/HIVE-20676
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 4.0.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20615) CachedStore: Background refresh thread bug fixes

2018-09-20 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20615:
---

 Summary: CachedStore: Background refresh thread bug fixes
 Key: HIVE-20615
 URL: https://issues.apache.org/jira/browse/HIVE-20615
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20614) CachedStore: Run a select q file tests with CachedStore enabled

2018-09-20 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20614:
---

 Summary: CachedStore: Run a select q file tests with CachedStore 
enabled
 Key: HIVE-20614
 URL: https://issues.apache.org/jira/browse/HIVE-20614
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20613) CachedStore: Add more UT coverage (outside of .q files)

2018-09-20 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20613:
---

 Summary: CachedStore: Add more UT coverage (outside of .q files)
 Key: HIVE-20613
 URL: https://issues.apache.org/jira/browse/HIVE-20613
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20578) Enable org.apache.hive.jdbc.miniHS2.TestHs2ConnectionMetricsHttp.testOpenConnectionMetrics

2018-09-17 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20578:
---

 Summary: Enable 
org.apache.hive.jdbc.miniHS2.TestHs2ConnectionMetricsHttp.testOpenConnectionMetrics
 Key: HIVE-20578
 URL: https://issues.apache.org/jira/browse/HIVE-20578
 Project: Hive
  Issue Type: Bug
  Components: Test
Affects Versions: 4.0.0
Reporter: Vaibhav Gumashta


Disabled in HIVE-20577



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20577) Disable org.apache.hive.jdbc.miniHS2.TestHs2ConnectionMetricsHttp.testOpenConnectionMetrics

2018-09-17 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20577:
---

 Summary: Disable 
org.apache.hive.jdbc.miniHS2.TestHs2ConnectionMetricsHttp.testOpenConnectionMetrics
 Key: HIVE-20577
 URL: https://issues.apache.org/jira/browse/HIVE-20577
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 4.0.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20555) HiveServer2: Preauthenticated subject for http transport is not retained for entire duration of http communication in some cases

2018-09-13 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20555:
---

 Summary: HiveServer2: Preauthenticated subject for http transport 
is not retained for entire duration of http communication in some cases
 Key: HIVE-20555
 URL: https://issues.apache.org/jira/browse/HIVE-20555
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2
Affects Versions: 3.1.0, 2.3.2
Reporter: Vaibhav Gumashta


As implemented in HIVE-8705, for http transport, we add the logged in subject's 
credentials in the http header via a request interceptor. The request 
interceptor doesn't seem to be getting used for some http traffic (e.g. knox 
ssl in the same rpc). It would also be better to cache the logged in subject 
for the duration of the whole session.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20507) Beeline: Add a utility command to retrieve all uris from beeline-site.xml

2018-09-05 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20507:
---

 Summary: Beeline: Add a utility command to retrieve all uris from 
beeline-site.xml
 Key: HIVE-20507
 URL: https://issues.apache.org/jira/browse/HIVE-20507
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta


It will be useful for some clients to get the url list when beeline-site is 
present. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCE] New committer: Andrew Sherman

2018-09-04 Thread Vaibhav Gumashta
Congratulations Andrew!

On 9/4/18, 10:36 AM, "Sahil Takiar"  wrote:

Congrats Andrew!

On Tue, Sep 4, 2018 at 12:02 AM Antal Sinkovits
 wrote:

> Congratulations Andrew!
>
> Deepak Jaiswal wrote (Tue, Sep 4, 2018, 7:34):
>
> > Congratulations Andrew.
> >
> > Deepak
> >
> > On 9/3/18, 10:17 PM, "Zoltan Haindrich"  wrote:
> >
> > Congratulations Andrew!
> >
> > On 2 September 2018 04:49:00 CEST, Lefty Leverenz <
> > leftylever...@gmail.com> wrote:
> > >Congratulations Andrew!
> > >
> > >-- Lefty
> > >
> > >
> > >On Tue, Aug 28, 2018 at 11:36 AM Ashutosh Chauhan
> > >
> > >wrote:
> > >
> > >> Apache Hive's Project Management Committee (PMC) has invited
> Andrew
> > >Sherman
> > >> to become a committer, and we are pleased to announce that he has
> > >accepted.
> > >>
> > >> Andrew, welcome, thank you for your contributions, and we look
> > >forward to
> > >> your
> > >> further interactions with the community!
> > >>
> > >> Ashutosh Chauhan (on behalf of the Apache Hive PMC)
> > >>
> >
> >
> >
>


-- 
Sahil Takiar
Software Engineer
takiar.sa...@gmail.com | (510) 673-0309




[jira] [Created] (HIVE-20478) Metastore: Null checks needed in DecimalColumnStatsAggregator

2018-08-28 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20478:
---

 Summary: Metastore: Null checks needed in 
DecimalColumnStatsAggregator
 Key: HIVE-20478
 URL: https://issues.apache.org/jira/browse/HIVE-20478
 Project: Hive
  Issue Type: Bug
  Components: Standalone Metastore
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20446) CachedStore: bug fixes for q file tests: TestMiniLlapCliDriver, TestMiniTezCliDriver, TestMinimrCliDriver when CachedStore is enabled

2018-08-23 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20446:
---

 Summary: CachedStore: bug fixes for q file tests: 
TestMiniLlapCliDriver, TestMiniTezCliDriver, TestMinimrCliDriver when 
CachedStore is enabled
 Key: HIVE-20446
 URL: https://issues.apache.org/jira/browse/HIVE-20446
 Project: Hive
  Issue Type: Sub-task
  Components: Standalone Metastore
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20430) CachedStore: getTableObjectsByName incorrectly adds a null object to the table list

2018-08-20 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20430:
---

 Summary: CachedStore: getTableObjectsByName incorrectly adds a 
null object to the table list
 Key: HIVE-20430
 URL: https://issues.apache.org/jira/browse/HIVE-20430
 Project: Hive
  Issue Type: Sub-task
  Components: Standalone Metastore
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-20337) CachedStore: getPartitionsByExpr is not populating the partition list correctly

2018-08-08 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-20337:
---

 Summary: CachedStore: getPartitionsByExpr is not populating the 
partition list correctly
 Key: HIVE-20337
 URL: https://issues.apache.org/jira/browse/HIVE-20337
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.1.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [ANNOUNCE] New PMC Member : Vihang Karajgaonkar

2018-08-01 Thread Vaibhav Gumashta
Congrats Vihang!

On 8/1/18, 11:33 AM, "Chaoyu Tang"  wrote:

Congratulations Vihang.

On Tue, Jul 31, 2018 at 7:39 AM, Rajesh Balamohan 
wrote:

> Congratulations Vihang!
>
> ~Rajesh.B
>
>
> On Tue, Jul 31, 2018 at 3:35 PM Marta Kuczora
> 
> wrote:
>
> > Congratulations Vihang!
> >
> > On Mon, Jul 30, 2018 at 9:44 AM Peter Vary 
> > wrote:
> >
> > > Congratulations Vihang!
> > >
> > > > On Jul 29, 2018, at 22:32, Vineet Garg 
> wrote:
> > > >
> > > > Congratulations Vihang!
> > > >
> > > >> On Jul 26, 2018, at 11:27 AM, Ashutosh Chauhan <
> hashut...@apache.org>
> > > wrote:
> > > >>
> > > >> On behalf of the Hive PMC I am delighted to announce Vihang
> > > >> Karajgaonkar is joining Hive PMC.
> > > >> Thanks Vihang for all your contributions till now. Looking forward
> > > >> to many more.
> > > >>
> > > >> Welcome, Vihang!
> > > >>
> > > >> Thanks,
> > > >> Ashutosh
> > > >
> > >
> > >
>




Re: [ANNOUNCE] New PMC Member : Vineet Garg

2018-08-01 Thread Vaibhav Gumashta
Congrats Vineet!

On 8/1/18, 11:33 AM, "Chaoyu Tang"  wrote:

Congratulations Vineet!

On Tue, Jul 31, 2018 at 7:39 AM, Rajesh Balamohan <
rajesh.balamo...@gmail.com> wrote:

> Congratulations Vineet!
>
> ~Rajesh.B
>
>
> On Tue, Jul 31, 2018 at 3:34 PM Marta Kuczora
> 
> wrote:
>
> > Congratulations Vineet!
> >
> > On Mon, Jul 30, 2018 at 9:45 AM Peter Vary 
> > wrote:
> >
> > > Congratulations Vineet!
> > >
> > > > On Jul 30, 2018, at 01:59, Ashutosh Chauhan 
> > > wrote:
> > > >
> > > > On behalf of the Hive PMC I am delighted to announce Vineet Garg is
> > > > joining Hive PMC.
> > > > Thanks Vineet for all your contributions till now. Looking forward to
> > > > many more.
> > > >
> > > > Welcome, Vineet!
> > > >
> > > > Thanks,
> > > > Ashutosh
> > >
> > >
> >
>
>
> --
> ~Rajesh.B
>




Re: [ANNOUNCE] New PMC Member : Sahil Takiar

2018-08-01 Thread Vaibhav Gumashta
Congrats Sahil!

On 8/1/18, 11:32 AM, "Chaoyu Tang"  wrote:

Congratulations Sahil!

On Tue, Jul 31, 2018 at 7:40 AM, Rajesh Balamohan 
wrote:

> Congratulations Sahil!
>
> ~Rajesh.B
>
>
> On Tue, Jul 31, 2018 at 3:57 PM Marta Kuczora
> 
> wrote:
>
> > Congratulations Sahil!
> >
> > On Mon, Jul 30, 2018 at 9:44 AM Peter Vary 
> > wrote:
> >
> > > Congratulations Sahil!
> > >
> > > > On Jul 29, 2018, at 22:32, Vineet Garg 
> wrote:
> > > >
> > > > Congratulations Sahil!
> > > >
> > > >> On Jul 26, 2018, at 11:28 AM, Ashutosh Chauhan <
> hashut...@apache.org>
> > > wrote:
> > > >>
> > > >> On behalf of the Hive PMC I am delighted to announce Sahil Takiar is
> > > >> joining Hive PMC.
> > > >> Thanks Sahil for all your contributions till now. Looking forward to
> > > >> many more.
> > > >>
> > > >> Welcome, Sahil!
> > > >>
> > > >> Thanks,
> > > >> Ashutosh
> > > >
> > >
> > >
>




Re: [ANNOUNCE] New PMC Member : Peter Vary

2018-08-01 Thread Vaibhav Gumashta
Congrats Peter!

On 8/1/18, 11:31 AM, "Chaoyu Tang"  wrote:

Congratulations, Peter.

On Wed, Aug 1, 2018 at 2:08 PM, Peter Vary 
wrote:

> Thanks everyone!
>
> Rajesh Balamohan wrote (Tue, Jul 31, 2018, 13:41):
>
> > Congratulations Peter!
> >
> > ~Rajesh.B
> >
> >
> > On Tue, Jul 31, 2018 at 3:58 PM Marta Kuczora
> > 
> > wrote:
> >
> > > Congratulations Peter!
> > >
> > > On Mon, Jul 30, 2018 at 7:53 PM Andrew Sherman
> > >  wrote:
> > >
> > > > Congratulations Peter!
> > > >
> > > > On Sun, Jul 29, 2018 at 1:32 PM Vineet Garg 
> > > wrote:
> > > >
> > > > > Congratulations Peter!
> > > > >
> > > > > > On Jul 26, 2018, at 11:25 AM, Ashutosh Chauhan <
> > hashut...@apache.org
> > > >
> > > > > wrote:
> > > > > >
> > > > > > On behalf of the Hive PMC I am delighted to announce Peter Vary is
> > > > > > joining Hive PMC.
> > > > > > Thanks Peter for all your contributions till now. Looking forward to
> > > > > > many more.
> > > > > >
> > > > > > Welcome, Peter!
> > > > > >
> > > > > > Thanks,
> > > > > > Ashutosh
> > > > >
> > > > >
> > > >
> >
>




[jira] [Created] (HIVE-19801) JDBC: Add some missing classes to jdbc standalone jar and remove hbase classes

2018-06-05 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19801:
---

 Summary: JDBC: Add some missing classes to jdbc standalone jar and 
remove hbase classes
 Key: HIVE-19801
 URL: https://issues.apache.org/jira/browse/HIVE-19801
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 3.0.0, 3.1.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19748) Add appropriate null checks to DecimalColumnStatsAggregator

2018-05-31 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19748:
---

 Summary: Add appropriate null checks to 
DecimalColumnStatsAggregator
 Key: HIVE-19748
 URL: https://issues.apache.org/jira/browse/HIVE-19748
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Vaibhav Gumashta


In some of our internal testing, we noticed that calls to 
MetaStoreUtils.decimalToDouble(Decimal decimal) from within 
DecimalColumnStatsAggregator end up passing null Decimal values to the method.
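
To illustrate the kind of null handling meant here, a minimal sketch follows. 
It uses java.math.BigDecimal as a stand-in for the metastore's Decimal type, 
and NullSafeDecimalStats with its methods are hypothetical names, not the 
aggregator's actual code.

```java
import java.math.BigDecimal;
import java.util.List;

final class NullSafeDecimalStats {
    // Converts a decimal to a double, returning null instead of failing
    // when the input value is missing.
    static Double toDoubleOrNull(BigDecimal value) {
        return value == null ? null : value.doubleValue();
    }

    // Returns the largest non-null value, or null if every input is null.
    static Double maxIgnoringNulls(List<BigDecimal> values) {
        Double max = null;
        for (BigDecimal v : values) {
            Double d = toDoubleOrNull(v);
            if (d != null && (max == null || d > max)) {
                max = d;
            }
        }
        return max;
    }
}
```

The point is that a missing high or low value in one partition's stats is 
skipped rather than dereferenced.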



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19734) Beeline: When beeline-site.xml is present, beeline does not honor -n (username) and -p (password) arguments

2018-05-29 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19734:
---

 Summary: Beeline: When beeline-site.xml is present, beeline does 
not honor -n (username) and -p (password) arguments
 Key: HIVE-19734
 URL: https://issues.apache.org/jira/browse/HIVE-19734
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.0.0
Reporter: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19528) Beeline: When beeline-site.xml is present and the default named url is incorrect, throw an exception instead of relying on resolution via hive-site.xml/beeline-hs2-connec

2018-05-14 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19528:
---

 Summary: Beeline: When beeline-site.xml is present and the default 
named url is incorrect, throw an exception instead of relying on resolution via 
hive-site.xml/beeline-hs2-connection.xml 
 Key: HIVE-19528
 URL: https://issues.apache.org/jira/browse/HIVE-19528
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Affects Versions: 3.0.0, 3.1.0
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19389) Schematool: For Hive's Information Schema, use embedded HS2 as default

2018-05-02 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19389:
---

 Summary: Schematool: For Hive's Information Schema, use embedded 
HS2 as default
 Key: HIVE-19389
 URL: https://issues.apache.org/jira/browse/HIVE-19389
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.0.0, 3.1.0
Reporter: Vaibhav Gumashta


Currently, for initializing/upgrading Hive's information schema, we require a 
full JDBC URL (for HS2). It would be good to have it connect using embedded HS2 
by default.
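
A small sketch of the proposed default, under the assumption that the 
host-less URL "jdbc:hive2://" is what selects embedded HS2 mode in the Hive 
JDBC driver. SchemaToolUrl and resolve are hypothetical names and not part of 
schematool.

```java
final class SchemaToolUrl {
    // Assumed embedded-mode URL: no host or port after the scheme.
    static final String EMBEDDED_HS2_URL = "jdbc:hive2://";

    // Uses the URL supplied by the user if there is one, otherwise falls
    // back to embedded HS2.
    static String resolve(String userSuppliedUrl) {
        if (userSuppliedUrl == null || userSuppliedUrl.trim().isEmpty()) {
            return EMBEDDED_HS2_URL;
        }
        return userSuppliedUrl.trim();
    }
}
```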



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19385) Optional hive env variable to redirect bin/hive to use Beeline

2018-05-02 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19385:
---

 Summary: Optional hive env variable to redirect bin/hive to use 
Beeline
 Key: HIVE-19385
 URL: https://issues.apache.org/jira/browse/HIVE-19385
 Project: Hive
  Issue Type: Bug
Affects Versions: 3.0.0, 3.1.0
Reporter: Vaibhav Gumashta


With beeline-site and beeline-user-site, the user can easily specify the 
default HS2 URLs to connect to. We can add an optional env variable which, when 
set, makes bin/hive use Beeline.
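
The decision itself is simple and could be sketched as below. The variable 
name HIVE_USE_BEELINE is made up for this example (the jira does not name the 
variable), and the real switch would live in the bin/hive launcher script 
rather than in Java.

```java
import java.util.Map;

final class HiveLauncher {
    // Hypothetical variable name, used only for illustration.
    static final String REDIRECT_VAR = "HIVE_USE_BEELINE";

    // Picks the client to start: Beeline when the variable is set to "true",
    // the classic CLI otherwise (including when the variable is unset).
    static String chooseClient(Map<String, String> env) {
        return "true".equalsIgnoreCase(env.get(REDIRECT_VAR)) ? "beeline" : "cli";
    }
}
```

Calling chooseClient(System.getenv()) would apply this to the current process 
environment; leaving the variable unset keeps today's behaviour.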



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 66857: Replication: The file uris being dumped should contain information about the uri of the source cluster's cm root

2018-04-27 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66857/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-19343
https://issues.apache.org/jira/browse/HIVE-19343


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-19343


Diffs
-

  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestReplChangeManager.java
 6ade76d0c2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java de270cfcdb 
  
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/load/message/CreateFunctionHandler.java
 f7c90409b7 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java
 7c1d5f5cca 


Diff: https://reviews.apache.org/r/66857/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-19343) Replication: The file uris being dumped should contain information about the uri of the source cluster's cm root

2018-04-27 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19343:
---

 Summary: Replication: The file uris being dumped should contain 
information about the uri of the source cluster's cm root
 Key: HIVE-19343
 URL: https://issues.apache.org/jira/browse/HIVE-19343
 Project: Hive
  Issue Type: Bug
  Components: repl
Affects Versions: 3.0.0, 3.1.0
Reporter: Vaibhav Gumashta


In replication v2, we use the change manager (its location is specified by 
cmroot: {{hive.repl.cmrootdir}}) to archive deleted files from the source 
cluster so that they can later be copied to the target cluster. When files are 
read from the cmroot, the target needs to know the appropriate file system. 
This patch adds the fs information of the cmroot on the source to the filenames 
that get written by the REPL DUMP command.
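
To make the idea concrete, here is a rough sketch of qualifying a cmroot path 
with the source file system's scheme and authority. It uses java.net.URI only; 
CmPathQualifier and qualify are hypothetical names, the host name in the 
example is invented, and the actual patch may encode the information 
differently.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class CmPathQualifier {
    // Prefixes a scheme-less path with the scheme and authority of the source
    // cluster's file system. Paths that already carry a scheme are returned
    // unchanged. Assumes the path contains no characters that need escaping.
    static String qualify(String sourceFsUri, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            return path;
        }
        URI fs = URI.create(sourceFsUri);
        try {
            return new URI(fs.getScheme(), fs.getAuthority(), p.getPath(), null, null)
                .toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("cannot qualify " + path, e);
        }
    }
}
```

For example, qualify("hdfs://source-nn:8020", "/cmroot/abc") yields 
"hdfs://source-nn:8020/cmroot/abc", which the target can resolve against the 
source cluster instead of its own default file system.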



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HIVE-19310) Metastore: MetaStoreDirectSql.ensureDbInit has some slow DN calls which might need to be run only in test env

2018-04-25 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19310:
---

 Summary: Metastore: MetaStoreDirectSql.ensureDbInit has some slow 
DN calls which might need to be run only in test env
 Key: HIVE-19310
 URL: https://issues.apache.org/jira/browse/HIVE-19310
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.0.0, 3.1.0
Reporter: Vaibhav Gumashta


MetaStoreDirectSql.ensureDbInit has the following 2 calls which we have 
observed taking a long time in our testing:
{code}
initQueries.add(pm.newQuery(MNotificationLog.class, "dbName == ''"));
initQueries.add(pm.newQuery(MNotificationNextId.class, "nextEventId < -1"));
{code}
In a production environment, these tables should be initialized using 
schematool; in a test environment, however, these calls might be needed. 






[jira] [Created] (HIVE-19249) Replication: The WITH clause is not passing the configuration to Task correctly in all cases

2018-04-19 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19249:
---

 Summary: Replication: The WITH clause is not passing the 
configuration to Task correctly in all cases
 Key: HIVE-19249
 URL: https://issues.apache.org/jira/browse/HIVE-19249
 Project: Hive
  Issue Type: Bug
  Components: repl
Affects Versions: 3.0.0, 3.1.0
Reporter: Vaibhav Gumashta


When running REPL LOAD as follows:
{code}
REPL LOAD `repldb_kms207` FROM 
'hdfs://url:8020/apps/hive/repl/f8b057a7-c3f2-43bd-8baa-f7408a9008fc' WITH 
('hive.exec.parallel'='true','hive.distcp.privileged.doAs'='beacon','hive.metastore.uris'='thrift://metastore-url:9083','hive.metastore.warehouse.dir'='s3a://s3-warehouse','hive.warehouse.subdir.inherit.perms'='false','hive.repl.replica.functions.root.dir'='s3a://s3-warehouse','fs.s3a.bucket.ss-datasets.endpoint'='s3-bucket-endpoint','fs.s3a.impl.disable.cache'='true','fs.s3a.server-side-encryption-algorithm'='SSE-KMS','fs.s3a.server-side-encryption.key'='encr-key','distcp.options.pp'='','distcp.options.pg'='','distcp.options.pu'='');
{code}

The tasks that get created need to use the configs that are passed in the WITH 
clause. However, in some cases the wrong config object gets used.
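
The intended behaviour can be sketched as follows, using plain maps as a stand-in for the Hive config object. The class and method names are hypothetical, and this is only an assumption about the desired semantics: each task should see the base settings overlaid with the WITH clause values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; plain maps stand in for the real config object.
final class WithClauseConfSketch {

    /** Returns a new map: base settings overlaid with the WITH clause overrides. */
    static Map<String, String> taskConf(Map<String, String> base, Map<String, String> withClause) {
        Map<String, String> conf = new HashMap<>(base); // copy, so the session config is not mutated
        conf.putAll(withClause);                        // WITH clause values win on conflict
        return conf;
    }

    public static void main(String[] args) {
        Map<String, String> base = new HashMap<>();
        base.put("hive.exec.parallel", "false");
        Map<String, String> withClause = new HashMap<>();
        withClause.put("hive.exec.parallel", "true");
        System.out.println(taskConf(base, withClause));
    }
}
```

The bug described above corresponds to a task being handed `base` rather than the overlaid copy.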





Re: Review Request 66503: HIVE-19126: CachedStore: Use memory estimation to limit cache size during prewarm

2018-04-16 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66503/
---

(Updated April 16, 2018, 9:42 p.m.)


Review request for hive and Thejas Nair.


Bugs: HIVE-19126
https://issues.apache.org/jira/browse/HIVE-19126


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-19126


Diffs (updated)
-

  
llap-server/src/java/org/apache/hadoop/hive/llap/IncrementalObjectSizeEstimator.java
 6f4ec6f1ea 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
 2f7fa24558 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java
 0bbaf7e459 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 1ce86bbdba 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 89b400697b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 f007261daf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/SizeValidator.java
 PRE-CREATION 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 d451f966b0 


Diff: https://reviews.apache.org/r/66503/diff/6/

Changes: https://reviews.apache.org/r/66503/diff/5-6/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 66503: HIVE-19126: CachedStore: Use memory estimation to limit cache size during prewarm

2018-04-11 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66503/
---

(Updated April 11, 2018, 10:53 p.m.)


Review request for hive and Thejas Nair.


Changes
---

Changed some logging to trace level


Bugs: HIVE-19126
https://issues.apache.org/jira/browse/HIVE-19126


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-19126


Diffs (updated)
-

  
llap-server/src/java/org/apache/hadoop/hive/llap/IncrementalObjectSizeEstimator.java
 6f4ec6f1ea 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
 2f7fa24558 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java
 0bbaf7e459 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 c47856de87 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 89b400697b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 f007261daf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/SizeValidator.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/66503/diff/5/

Changes: https://reviews.apache.org/r/66503/diff/4-5/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 66503: HIVE-19126: CachedStore: Use memory estimation to limit cache size during prewarm

2018-04-11 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66503/
---

(Updated April 11, 2018, 10:38 p.m.)


Review request for hive and Thejas Nair.


Bugs: HIVE-19126
https://issues.apache.org/jira/browse/HIVE-19126


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-19126


Diffs (updated)
-

  
llap-server/src/java/org/apache/hadoop/hive/llap/IncrementalObjectSizeEstimator.java
 6f4ec6f1ea 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
 2f7fa24558 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java
 0bbaf7e459 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 c47856de87 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 89b400697b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 f007261daf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/SizeValidator.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/66503/diff/4/

Changes: https://reviews.apache.org/r/66503/diff/3-4/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 66503: HIVE-19126: CachedStore: Use memory estimation to limit cache size during prewarm

2018-04-09 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66503/
---

(Updated April 10, 2018, 12:25 a.m.)


Review request for hive and Thejas Nair.


Bugs: HIVE-19126
https://issues.apache.org/jira/browse/HIVE-19126


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-19126


Diffs (updated)
-

  
llap-server/src/java/org/apache/hadoop/hive/llap/IncrementalObjectSizeEstimator.java
 6f4ec6f1ea 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
 2f7fa24558 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java
 0bbaf7e459 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 c47856de87 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 89b400697b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 940a1bf276 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/SizeValidator.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/66503/diff/3/

Changes: https://reviews.apache.org/r/66503/diff/2-3/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 66503: HIVE-19126: CachedStore: Use memory estimation to limit cache size during prewarm

2018-04-09 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66503/
---

(Updated April 9, 2018, 10:42 p.m.)


Review request for hive and Thejas Nair.


Bugs: HIVE-19126
https://issues.apache.org/jira/browse/HIVE-19126


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-19126


Diffs (updated)
-

  
llap-server/src/java/org/apache/hadoop/hive/llap/IncrementalObjectSizeEstimator.java
 6f4ec6f1ea 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
 2f7fa24558 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java
 0bbaf7e459 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 c47856de87 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 89b400697b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 940a1bf276 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/SizeValidator.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/66503/diff/2/

Changes: https://reviews.apache.org/r/66503/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Review Request 66503: HIVE-19126: CachedStore: Use memory estimation to limit cache size during prewarm

2018-04-09 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66503/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-19126
https://issues.apache.org/jira/browse/HIVE-19126


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-19126


Diffs
-

  
llap-server/src/java/org/apache/hadoop/hive/llap/IncrementalObjectSizeEstimator.java
 6f4ec6f1ea 
  
llap-server/src/java/org/apache/hadoop/hive/llap/io/metadata/OrcFileEstimateErrors.java
 2f7fa24558 
  
llap-server/src/test/org/apache/hadoop/hive/llap/cache/TestIncrementalObjectSizeEstimator.java
 0bbaf7e459 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 c47856de87 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 89b400697b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
 995137f967 


Diff: https://reviews.apache.org/r/66503/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-19126) CachedStore: Use memory estimation to limit cache size during prewarm

2018-04-06 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-19126:
---

 Summary: CachedStore: Use memory estimation to limit cache size 
during prewarm
 Key: HIVE-19126
 URL: https://issues.apache.org/jira/browse/HIVE-19126
 Project: Hive
  Issue Type: Sub-task
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Vaibhav Gumashta


We can rely on 
https://github.com/apache/hive/blob/master/llap-server/src/java/org/apache/hadoop/hive/llap/IncrementalObjectSizeEstimator.java
 to estimate memory of SharedCache. This jira addresses the size estimation 
during prewarm, so that we can stop when we hit the memory limit. In a 
follow-up jira, we will work on estimation/eviction after prewarm is complete, 
so that we can keep the frequently used tables and their partitions in cache.
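
As a rough sketch of the "stop when we hit the memory limit" idea: the code below assumes the per-table size estimates are already available as a map (in the real patch they would come from IncrementalObjectSizeEstimator, whose API is not shown here). The class name, `prewarm` signature, and byte values are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; sizes are supplied directly rather than estimated.
final class PrewarmLimitSketch {

    /** Caches tables in iteration order until the next one would exceed maxBytes. */
    static List<String> prewarm(Map<String, Long> estimatedSizes, long maxBytes) {
        List<String> cached = new ArrayList<>();
        long used = 0;
        for (Map.Entry<String, Long> e : estimatedSizes.entrySet()) {
            if (used + e.getValue() > maxBytes) {
                break; // memory limit reached: stop prewarming
            }
            used += e.getValue();
            cached.add(e.getKey());
        }
        return cached;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("t1", 40L);
        sizes.put("t2", 50L);
        sizes.put("t3", 30L);
        System.out.println(prewarm(sizes, 100L));
    }
}
```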





Re: Review Request 66185: JDBC: Provide an option to simplify beeline usage by supporting default and named URL for beeline

2018-04-01 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66185/
---

(Updated April 2, 2018, 5:04 a.m.)


Review request for hive, Thejas Nair and Vihang Karajgaonkar.


Bugs: HIVE-18963
https://issues.apache.org/jira/browse/HIVE-18963


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18963


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 4928761565 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineConfFileParseException.java
 PRE-CREATION 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineHS2ConnectionFileParseException.java
 acddf82a67 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineSiteParseException.java
 PRE-CREATION 
  beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineSiteParser.java 
PRE-CREATION 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/HS2ConnectionFileParser.java
 b769e8581f 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/HS2ConnectionFileUtils.java
 f635b40633 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/UserHS2ConnectionFileParser.java
 2801ebee09 
  beeline/src/main/resources/BeeLine.properties 6fca953836 
  
beeline/src/test/org/apache/hive/beeline/hs2connection/TestUserHS2ConnectionFileParser.java
 1d17887417 
  beeline/src/test/resources/test-hs2-named-connection-config.xml PRE-CREATION 
  
itests/hive-unit/src/test/java/org/apache/hive/beeline/hs2connection/BeelineWithHS2ConnectionFileTestBase.java
 3da31ad8a9 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 6d7787da7d 


Diff: https://reviews.apache.org/r/66185/diff/3/

Changes: https://reviews.apache.org/r/66185/diff/2-3/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 66185: JDBC: Provide an option to simplify beeline usage by supporting default and named URL for beeline

2018-03-28 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66185/
---

(Updated March 28, 2018, 8:27 p.m.)


Review request for hive, Thejas Nair and Vihang Karajgaonkar.


Bugs: HIVE-18963
https://issues.apache.org/jira/browse/HIVE-18963


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18963


Diffs (updated)
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 402fae 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineConfFileParseException.java
 PRE-CREATION 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineHS2ConnectionFileParseException.java
 acddf82a67 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineSiteParseException.java
 PRE-CREATION 
  beeline/src/java/org/apache/hive/beeline/hs2connection/BeelineSiteParser.java 
PRE-CREATION 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/HS2ConnectionFileParser.java
 b769e8581f 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/HS2ConnectionFileUtils.java
 f635b40633 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/UserHS2ConnectionFileParser.java
 2801ebee09 
  beeline/src/main/resources/BeeLine.properties 6fca953836 
  
beeline/src/test/org/apache/hive/beeline/hs2connection/TestUserHS2ConnectionFileParser.java
 1d17887417 
  beeline/src/test/resources/test-hs2-named-connection-config.xml PRE-CREATION 
  
itests/hive-unit/src/test/java/org/apache/hive/beeline/hs2connection/BeelineWithHS2ConnectionFileTestBase.java
 3da31ad8a9 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 6d7787da7d 


Diff: https://reviews.apache.org/r/66185/diff/2/

Changes: https://reviews.apache.org/r/66185/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 66185: JDBC: Provide an option to simplify beeline usage by supporting default and named URL for beeline

2018-03-22 Thread Vaibhav Gumashta


> On March 22, 2018, 4:22 p.m., Vihang Karajgaonkar wrote:
> > I am a bit confused here. If the full url can be provided in the config 
> > file by the user, how is it better than just creating a environment 
> > variable like BEELINE_URL_ and use it instead of adding it in the 
> > config file? I think the objective of this config file was to automatically 
> > figure out the connection url based on hive-site.xml and the additional 
> > beeline-hs2-connection.xml to override/augment the information from 
> > hive-site.xml
> > 
> > The current code is structured such that all keys start with 
> > beeline.hs2.connection. and components of the url are parsed automatically 
> > using the values of those keys. If we want to add full support of named 
> > urls which can have completely different url components like session vars 
> > etc, what do you think of adding a new prefix key of the form 
> > beeline.hs2.connection. and then the existing code will work exactly 
> > like it does currently but instead will parse the keys starting with 
> > beeline.hs2.connection.. For example, a named url called "blue" will 
> > be constructed using all the keys from beeline.hs2.connection.blue. That 
> > way we reuse existing logic. The beeline will be invoked like beeline -c 
> > blue. Do you see any problems with this approach? This way the user doesn't 
> > have to provide all the url components which can be reused from 
> > hive-site.xml (like the nasty ssl, kerberos settings)

Thanks for your feedback. I discussed this a bit offline with Thejas as well. Let me 
add more details on the use case:
Suppose you have 2 different sets of HS2 instances running on the cluster. A 
beeline shell will only be able to parse one hive-site.xml (set 1, for example). 
To be able to connect to set 2, it would be nice to have a beeline-site.xml managed 
by an installer (something like Apache Ambari), which can publish the named urls 
(and regenerate them if the admin makes any change in the cluster manager) for 
use by the beeline shell. Once the base connection url is figured out, 
beeline-hs2-connection.xml can then be used to overlay user-specific driver 
configs, as it does right now. I hope that clarifies the use case. I'll post an 
updated patch based on your feedback above.


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66185/#review199770
---


On March 20, 2018, 10:54 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66185/
> ---
> 
> (Updated March 20, 2018, 10:54 p.m.)
> 
> 
> Review request for hive, Thejas Nair and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-18963
> https://issues.apache.org/jira/browse/HIVE-18963
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-18963
> 
> 
> Diffs
> -
> 
>   beeline/src/java/org/apache/hive/beeline/BeeLine.java 402fae 
>   
> beeline/src/java/org/apache/hive/beeline/hs2connection/HS2ConnectionFileParser.java
>  b769e8581f 
>   
> beeline/src/java/org/apache/hive/beeline/hs2connection/HS2ConnectionFileUtils.java
>  f635b40633 
>   
> beeline/src/java/org/apache/hive/beeline/hs2connection/UserHS2ConnectionFileParser.java
>  2801ebee09 
>   beeline/src/main/resources/BeeLine.properties 6fca953836 
>   
> beeline/src/test/org/apache/hive/beeline/hs2connection/TestUserHS2ConnectionFileParser.java
>  1d17887417 
>   beeline/src/test/resources/test-hs2-named-connection-config.xml 
> PRE-CREATION 
>   
> itests/hive-unit/src/test/java/org/apache/hive/beeline/hs2connection/BeelineWithHS2ConnectionFileTestBase.java
>  3da31ad8a9 
> 
> 
> Diff: https://reviews.apache.org/r/66185/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Review Request 66185: JDBC: Provide an option to simplify beeline usage by supporting default and named URL for beeline

2018-03-20 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66185/
---

Review request for hive, Thejas Nair and Vihang Karajgaonkar.


Bugs: HIVE-18963
https://issues.apache.org/jira/browse/HIVE-18963


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18963


Diffs
-

  beeline/src/java/org/apache/hive/beeline/BeeLine.java 402fae 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/HS2ConnectionFileParser.java
 b769e8581f 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/HS2ConnectionFileUtils.java
 f635b40633 
  
beeline/src/java/org/apache/hive/beeline/hs2connection/UserHS2ConnectionFileParser.java
 2801ebee09 
  beeline/src/main/resources/BeeLine.properties 6fca953836 
  
beeline/src/test/org/apache/hive/beeline/hs2connection/TestUserHS2ConnectionFileParser.java
 1d17887417 
  beeline/src/test/resources/test-hs2-named-connection-config.xml PRE-CREATION 
  
itests/hive-unit/src/test/java/org/apache/hive/beeline/hs2connection/BeelineWithHS2ConnectionFileTestBase.java
 3da31ad8a9 


Diff: https://reviews.apache.org/r/66185/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-18963) JDBC: Provide an option to simplify beeline usage

2018-03-14 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-18963:
---

 Summary: JDBC: Provide an option to simplify beeline usage
 Key: HIVE-18963
 URL: https://issues.apache.org/jira/browse/HIVE-18963
 Project: Hive
  Issue Type: Bug
  Components: Beeline
Reporter: Vaibhav Gumashta


Currently, after opening Beeline CLI, the user needs to supply a connection 
string to use the HS2 instance and set up the jdbc driver. Since we plan to 
replace Hive CLI with Beeline in the future (HIVE-10511), it will help 
usability if the user can simply type {{beeline}} and start the hive 
session. The jdbc url can be specified in a beeline-site.xml (which can contain 
other named jdbc urls as well, and they can be accessed by something like: 
{{beeline -c namedUrl}}). The use of beeline-site.xml can also be potentially 
expanded later if needed.
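
A minimal sketch of how a named url lookup could behave, assuming the entries of beeline-site.xml have already been parsed into a map from name to jdbc url. The class, the `resolve` method, the `"default"` entry name, and the host names are all hypothetical; the actual property names and parsing are defined by the patch, not by this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of default/named url resolution.
final class NamedUrlSketch {
    static final String DEFAULT_NAME = "default"; // assumed name of the fallback entry

    /** Returns the url for the requested name, or the default entry when no name is given. */
    static String resolve(Map<String, String> namedUrls, String requested) {
        String name = (requested == null) ? DEFAULT_NAME : requested;
        String url = namedUrls.get(name);
        if (url == null) {
            throw new IllegalArgumentException("No jdbc url named '" + name + "'");
        }
        return url;
    }

    public static void main(String[] args) {
        Map<String, String> urls = new HashMap<>();
        urls.put("default", "jdbc:hive2://hs2-set1:10000/default");
        urls.put("namedUrl", "jdbc:hive2://hs2-set2:10000/default");
        System.out.println(resolve(urls, null));       // plain `beeline`
        System.out.println(resolve(urls, "namedUrl")); // `beeline -c namedUrl`
    }
}
```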





Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-03-09 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/
---

(Updated March 9, 2018, 9 p.m.)


Review request for hive, Daniel Dai and Thejas Nair.


Bugs: HIVE-18264
https://issues.apache.org/jira/browse/HIVE-18264


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18264


Diffs (updated)
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 a3725c5395 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 86c9c2b33c 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 ac71d0882f 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 7b44df4128 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
 f500d63725 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 f0f650ddcf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 0d132f2074 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 32ea17495f 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 75ea8c4a77 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 207d842f94 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 ab6feb6f0b 
  standalone-metastore/src/test/resources/log4j2.properties 365687e1c9 


Diff: https://reviews.apache.org/r/65634/diff/5/

Changes: https://reviews.apache.org/r/65634/diff/4-5/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-03-09 Thread Vaibhav Gumashta
634/diff/4/?file=1968310#file1968310line292>
> >
> > Please document this method. Among other things - can prewarm() be 
> > called multiple times? If not, should it be somehow enforced?

Have enforced that. Currently it was being called just once from the background 
thread.
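
One common way to enforce a single run is an atomic flag, sketched below. This is an assumption about the technique, not a copy of the patch; the class and its counter are hypothetical stand-ins for the real prewarm work.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a run-at-most-once guard.
final class PrewarmOnceSketch {
    private final AtomicBoolean started = new AtomicBoolean(false);
    private final AtomicInteger runs = new AtomicInteger();

    /** Runs the prewarm body at most once; returns true only for the call that ran it. */
    boolean prewarm() {
        if (!started.compareAndSet(false, true)) {
            return false; // an earlier call already started the prewarm
        }
        runs.incrementAndGet(); // stand-in for loading databases and tables
        return true;
    }

    int runs() {
        return runs.get();
    }

    public static void main(String[] args) {
        PrewarmOnceSketch cache = new PrewarmOnceSketch();
        System.out.println(cache.prewarm());
        System.out.println(cache.prewarm());
    }
}
```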


> On March 2, 2018, 7:54 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 271 (original), 207 (patched)
> > <https://reviews.apache.org/r/65634/diff/4/?file=1968310#file1968310line297>
> >
> > Please remove  and add explicit size:
> > 
> > `List databases = new ArrayList<>(dbNames.size());`

I had kept it for Java 6 compatibility, but it looks like Hive 2+ doesn't 
support it. Removed.


> On March 2, 2018, 7:54 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 285 (original), 217 (patched)
> > <https://reviews.apache.org/r/65634/diff/4/?file=1968310#file1968310line318>
> >
> > It is quite possible that there is another metastore instance running 
> > and someone removes this database, so this call will fail due to missing 
> > database. I think this code should continue prewarm for other databases in 
> > such cases.

rawStore.getAllTables(dbName) should return an empty list, so processing will 
continue.


> On March 2, 2018, 7:54 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 368 (original), 305 (patched)
> > <https://reviews.apache.org/r/65634/diff/4/?file=1968310#file1968310line419>
> >
> > Any reason not to use lambda here?

Not really, just a matter of preference here.


> On March 2, 2018, 7:54 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 746 (original), 595 (patched)
> > <https://reviews.apache.org/r/65634/diff/4/?file=1968310#file1968310line853>
> >
> > Wouldn't dropDatabase also throw exception if it fails?

Yes, we're throwing it if we get an exception from rawStore.dropDatabase


> On March 2, 2018, 7:54 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 758 (original), 599 (patched)
> > <https://reviews.apache.org/r/65634/diff/4/?file=1968310#file1968310line865>
> >
> > This is rather useless since in case of failure it'll throw 
> > MetaException anyway.

Here we are working with the RawStore API, which returns a boolean that can 
potentially be false. In that case we don't want to work on SharedCache. The 
underlying ObjectStore implementation may change in the future.


> On March 2, 2018, 7:54 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 902 (original), 701 (patched)
> > <https://reviews.apache.org/r/65634/diff/4/?file=1968310#file1968310line1013>
> >
> > tbl would never be null, there will be an exception if the call above 
> > fails.

If the table is not yet in the cache (cache not prewarmed yet), 
sharedCache.getTableFromCache will return null. In that case we would like to 
serve the call from the metastore db.
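
The fallback described here can be sketched as below. The map and function are hypothetical stand-ins for the shared cache and the raw store; only the null check mirrors the behaviour described in this reply.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of "serve from cache, else from the metastore db".
final class CacheFallbackSketch {

    /** Returns the cached table if present; otherwise delegates to the backing store. */
    static String getTable(Map<String, String> cache, Function<String, String> rawStore, String name) {
        String tbl = cache.get(name);
        if (tbl == null) {
            // Not in the cache (for example, prewarm has not reached it yet).
            return rawStore.apply(name);
        }
        return tbl;
    }

    public static void main(String[] args) {
        Map<String, String> cache = new HashMap<>();
        cache.put("t1", "t1-from-cache");
        Function<String, String> rawStore = n -> n + "-from-db";
        System.out.println(getTable(cache, rawStore, "t1"));
        System.out.println(getTable(cache, rawStore, "t2"));
    }
}
```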


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/#review198507
---


On March 1, 2018, 11:09 a.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65634/
> ---
> 
> (Updated March 1, 2018, 11:09 a.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-18264
> https://issues.apache.org/jira/browse/HIVE-18264
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-18264
> 
> 
> Diffs
> -
> 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  a3725c5395 
>   service/src/java/org/apache/hive/service/server/HiveServer2.java 86c9c2b33c 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  ac71d0882f 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  7b44df4128 
>   

[jira] [Created] (HIVE-18847) CachedStore: Investigate TestCachedStore#testTableColStatsOps

2018-03-01 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-18847:
---

 Summary: CachedStore: Investigate 
TestCachedStore#testTableColStatsOps 
 Key: HIVE-18847
 URL: https://issues.apache.org/jira/browse/HIVE-18847
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Vaibhav Gumashta


The test is currently commented out because the 
ObjectStore.updateTableColumnStatistics call is unable to persist stats to 
Derby. Needs investigation.





[jira] [Created] (HIVE-18840) CachedStore: Prioritize loading of recently accessed tables during prewarm

2018-03-01 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-18840:
---

 Summary: CachedStore: Prioritize loading of recently accessed 
tables during prewarm
 Key: HIVE-18840
 URL: https://issues.apache.org/jira/browse/HIVE-18840
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Vaibhav Gumashta


On clusters with large metadata, prewarming the cache can take several hours. 
Now that CachedStore does not block on prewarm anymore (after HIVE-18264), we 
should prioritize loading of recently accessed tables during prewarm.
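
One possible ordering, sketched under the assumption that a last-access timestamp is available per table (the jira does not say where that information would come from). The class, method, and timestamps are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: order tables so the most recently accessed load first.
final class PrewarmOrderSketch {

    /** Returns table names sorted by last access time, newest first. */
    static List<String> loadOrder(Map<String, Long> lastAccess) {
        List<String> names = new ArrayList<>(lastAccess.keySet());
        names.sort(Comparator.comparing((String n) -> lastAccess.get(n)).reversed());
        return names;
    }

    public static void main(String[] args) {
        Map<String, Long> lastAccess = new HashMap<>();
        lastAccess.put("old_table", 100L);
        lastAccess.put("hot_table", 300L);
        lastAccess.put("warm_table", 200L);
        System.out.println(loadOrder(lastAccess));
    }
}
```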





Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-03-01 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/
---

(Updated March 1, 2018, 11:09 a.m.)


Review request for hive, Daniel Dai and Thejas Nair.


Bugs: HIVE-18264
https://issues.apache.org/jira/browse/HIVE-18264


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18264


Diffs (updated)
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 a3725c5395 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 86c9c2b33c 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 ac71d0882f 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 7b44df4128 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
 f500d63725 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 f0f650ddcf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 0d132f2074 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 32ea17495f 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 75ea8c4a77 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 207d842f94 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 ab6feb6f0b 
  standalone-metastore/src/test/resources/log4j2.properties 365687e1c9 


Diff: https://reviews.apache.org/r/65634/diff/4/

Changes: https://reviews.apache.org/r/65634/diff/3-4/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-02-26 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/
---

(Updated Feb. 26, 2018, 9:47 p.m.)


Review request for hive, Daniel Dai and Thejas Nair.


Bugs: HIVE-18264
https://issues.apache.org/jira/browse/HIVE-18264


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18264


Diffs (updated)
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 a3725c5395 
  service/src/java/org/apache/hive/service/server/HiveServer2.java 6c1a0b98cc 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 c6e34a8a22 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 7b44df4128 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
 f500d63725 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 f0f650ddcf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 0d132f2074 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 32ea17495f 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 75ea8c4a77 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 207d842f94 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 ab6feb6f0b 
  standalone-metastore/src/test/resources/log4j2.properties 365687e1c9 


Diff: https://reviews.apache.org/r/65634/diff/3/

Changes: https://reviews.apache.org/r/65634/diff/2-3/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-02-23 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/
---

(Updated Feb. 23, 2018, 8:14 p.m.)


Review request for hive, Daniel Dai and Thejas Nair.


Bugs: HIVE-18264
https://issues.apache.org/jira/browse/HIVE-18264


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18264


Diffs (updated)
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 a3725c5395 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 7b44df4128 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
 f500d63725 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 f0f650ddcf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 0d132f2074 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 32ea17495f 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 50f873a013 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 75ea8c4a77 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 207d842f94 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 ab6feb6f0b 


Diff: https://reviews.apache.org/r/65634/diff/2/

Changes: https://reviews.apache.org/r/65634/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-02-23 Thread Vaibhav Gumashta


> On Feb. 21, 2018, 10:07 p.m., Daniel Dai wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 311 (original), 215 (patched)
> > <https://reviews.apache.org/r/65634/diff/1/?file=1958990#file1958990line316>
> >
> > This is not introduced in this patch, but getting the columns for the table and 
> > applying them to each partition will not work with schema evolution. We should get 
> > the columns for every individual partition.

I agree, but I'm not sure whether the current stats work with schema evolution. Let me take 
this up in a follow-up JIRA, as it might need a little more thought.


> On Feb. 21, 2018, 10:07 p.m., Daniel Dai wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
> > Line 800 (original), 632 (patched)
> > <https://reviews.apache.org/r/65634/diff/1/?file=1958990#file1958990line881>
> >
> > I don't remember, but why is this get() and not getUnsafe()? It looks the 
> > same as getAllTables etc. This also applies to getDatabases, alterDatabase, 
> > dropDatabase, getDatabase and createDatabase.

We're using get() here so that this call blocks till the database cache is 
populated. We're letting reads go through the cache while the tables are 
getting populated, but not for databases. Let me know if you think otherwise.


- Vaibhav


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/#review197794
---


On Feb. 13, 2018, 12:08 p.m., Vaibhav Gumashta wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65634/
> ---
> 
> (Updated Feb. 13, 2018, 12:08 p.m.)
> 
> 
> Review request for hive, Daniel Dai and Thejas Nair.
> 
> 
> Bugs: HIVE-18264
> https://issues.apache.org/jira/browse/HIVE-18264
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> https://issues.apache.org/jira/browse/HIVE-18264
> 
> 
> Diffs
> -
> 
>   
> itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
>  78b26374f2 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  d58ed677f3 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
>  e4e7d4239d 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
>  f0f650ddcf 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
>  80aa3bcdb4 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
>  32ea17495f 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
>  9100c73beb 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
>  86e72d8d76 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
>  bd61df654a 
> 
> 
> Diff: https://reviews.apache.org/r/65634/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Vaibhav Gumashta
> 
>



Review Request 65634: HIVE-18264: CachedStore: Store cached partitions/col stats within the table cache

2018-02-13 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65634/
---

Review request for hive, Daniel Dai and Thejas Nair.


Bugs: HIVE-18264
https://issues.apache.org/jira/browse/HIVE-18264


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18264


Diffs
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 78b26374f2 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 d58ed677f3 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
 e4e7d4239d 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 f0f650ddcf 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 80aa3bcdb4 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 32ea17495f 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 9100c73beb 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 86e72d8d76 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/cache/TestCachedStore.java
 bd61df654a 


Diff: https://reviews.apache.org/r/65634/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



Re: Question on CachedStore cache update

2018-02-08 Thread Vaibhav Gumashta
Hi Alan,

To add to Daniel’s response, as part of 
https://issues.apache.org/jira/browse/HIVE-18264 and 
https://issues.apache.org/jira/browse/HIVE-18661 (I’m actively working on 
these), we plan to remove the current mechanism of updating the cache (which is 
very inefficient anyway) and instead use the NOTIFICATION_LOG table to update 
the cache incrementally. The code that you pointed to was meant to not let the 
background update thread block the metastore client calls for a long time, but 
with the plan to update the cache incrementally we may not need to worry about 
that, as applying the notification incrementally will not be a long blocking 
execution.

Thanks,
--Vaibhav

On 2/8/18, 11:41 AM, "Daniel Dai"  wrote:

Hi, Alan,

If the database cache is changed locally, we don’t want to bring in the remote copy and 
overwrite it, as the remote copy doesn’t carry the local changes (ideally, we should 
also apply local changes to the remote copy we bring in from the db, but we 
are not there yet). That’s why we skip the update if there are local changes and 
wait for the next iteration to sync with remote. isDatabaseCacheDirty is 
initially false, is set to true only on a local update, and is reset during 
the cache swap, which gives the next iteration a chance to update the cache if 
there are no local changes.

Thanks,
Daniel
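
The skip-then-retry behaviour described above can be sketched as follows. This is a minimal illustration, not the actual CachedStore code: the class DirtyFlagCacheSketch, its methods, and the list-of-names cache are hypothetical. Only the compareAndSet(true, false) check mirrors the snippet quoted in this thread. Note that in this sketch the flag is cleared by the compareAndSet call itself, so the pass after a skipped one can proceed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the database cache; not the real CachedStore.
class DirtyFlagCacheSketch {
  // True when a local change happened since the last background pass.
  private final AtomicBoolean dirty = new AtomicBoolean(false);
  private List<String> databases = new ArrayList<>();

  // A local write updates the cache and marks it dirty.
  synchronized void createDatabaseLocally(String name) {
    databases.add(name);
    dirty.set(true);
  }

  // Background refresh. Returns true if the remote snapshot was swapped in.
  synchronized boolean backgroundUpdate(List<String> remoteSnapshot) {
    // If the cache is dirty, skip this pass. compareAndSet also resets the
    // flag to false, so the next pass runs unless another local write occurs.
    if (dirty.compareAndSet(true, false)) {
      return false;
    }
    databases = new ArrayList<>(remoteSnapshot);
    return true;
  }

  synchronized List<String> snapshot() {
    return new ArrayList<>(databases);
  }

  public static void main(String[] args) {
    DirtyFlagCacheSketch cache = new DirtyFlagCacheSketch();
    cache.createDatabaseLocally("db1");
    // First pass is skipped because of the local write.
    System.out.println(cache.backgroundUpdate(Arrays.asList("db0")));
    // Second pass swaps in the remote snapshot.
    System.out.println(cache.backgroundUpdate(Arrays.asList("db0", "db1")));
  }
}
```

The trade-off in this sketch is that a locally modified cache stays unsynchronized with the remote copy for one extra iteration.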

On 2/6/18, 11:57 AM, "Alan Gates"  wrote:

I’m confused by the following code in the CachedStore.  This is in the
CacheUpdateMasterWork thread, in the updateDatabases method (which is
called by update()):

// Skip background updates if we detect change
if (isDatabaseCacheDirty.compareAndSet(true, false)) {
  LOG.debug("Skipping database cache update; the database list we have is dirty.");
  return;
}

Why are we not updating the cache if we’ve dirtied it?  Also, AFAICT no one
ever sets isDatabaseCacheDirty to false, meaning once one database is
created the cache will never be updated.  Am I missing something?

Alan.






[jira] [Created] (HIVE-18661) CachedStore: Use metastore notification log events to update cache

2018-02-08 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-18661:
---

 Summary: CachedStore: Use metastore notification log events to 
update cache
 Key: HIVE-18661
 URL: https://issues.apache.org/jira/browse/HIVE-18661
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: Vaibhav Gumashta


Currently, a background thread refreshes the entire cache, which is pretty 
inefficient. We already capture metadata updates in the NOTIFICATION_LOG table, which 
is used by the replication work. We should have the background thread 
apply these notifications to update the cache incrementally.
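
A minimal sketch of what incremental application could look like, under stated assumptions: the Event class, its fields, the event type strings, and the set-of-table-names cache are all hypothetical simplifications, and events are assumed to arrive ordered by id. The real NOTIFICATION_LOG entries carry more information than this.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class IncrementalCacheUpdateSketch {
  // Hypothetical, simplified notification event.
  static final class Event {
    final long id;
    final String type;
    final String tableName;

    Event(long id, String type, String tableName) {
      this.id = id;
      this.type = type;
      this.tableName = tableName;
    }
  }

  private final Set<String> tables = new TreeSet<>();
  private long lastAppliedEventId = 0L;

  // Applies only events newer than the last applied one and returns how many
  // were applied. Assumes the list is sorted by ascending event id.
  int applyEvents(List<Event> events) {
    int applied = 0;
    for (Event e : events) {
      if (e.id <= lastAppliedEventId) {
        continue; // already reflected in the cache
      }
      switch (e.type) {
        case "CREATE_TABLE":
          tables.add(e.tableName);
          break;
        case "DROP_TABLE":
          tables.remove(e.tableName);
          break;
        default:
          break; // event types not modelled in this sketch are ignored
      }
      lastAppliedEventId = e.id;
      applied++;
    }
    return applied;
  }

  Set<String> tables() {
    return new TreeSet<>(tables);
  }

  long lastAppliedEventId() {
    return lastAppliedEventId;
  }

  public static void main(String[] args) {
    IncrementalCacheUpdateSketch cache = new IncrementalCacheUpdateSketch();
    cache.applyEvents(Arrays.asList(
        new Event(1, "CREATE_TABLE", "t1"),
        new Event(2, "CREATE_TABLE", "t2"),
        new Event(3, "DROP_TABLE", "t1")));
    System.out.println(cache.tables());
  }
}
```

Each pass does work proportional to the number of new events rather than to the size of the whole cache, which is the efficiency gain the description is after.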



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: Review Request 65271: JDBC: Provide a way for JDBC users to pass cookie info via connection string

2018-01-31 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65271/
---

(Updated Jan. 31, 2018, 10:50 p.m.)


Review request for hive and Thejas Nair.


Bugs: HIVE-18447
https://issues.apache.org/jira/browse/HIVE-18447


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18447


Diffs (updated)
-

  
itests/hive-unit/src/test/java/org/apache/hive/service/cli/thrift/TestThriftHttpCLIServiceFeatures.java
 93b10fb4b4 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java cb2f09cbf2 
  jdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java 5d2ddb5c21 
  jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java 
37862be804 
  jdbc/src/java/org/apache/hive/jdbc/HttpRequestInterceptorBase.java cf1a11ecb6 
  jdbc/src/java/org/apache/hive/jdbc/HttpTokenAuthInterceptor.java 59a91dd14c 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java f7f3854b86 


Diff: https://reviews.apache.org/r/65271/diff/2/

Changes: https://reviews.apache.org/r/65271/diff/1-2/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-18528) Stats: In the bitvector codepath, when extrapolating column stats for String type column, StringColumnStatsAggregator uses the min value instead of max

2018-01-24 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-18528:
---

 Summary: Stats: In the bitvector codepath, when extrapolating 
column stats for String type column, StringColumnStatsAggregator uses the min 
value instead of max
 Key: HIVE-18528
 URL: https://issues.apache.org/jira/browse/HIVE-18528
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 3.0.0
Reporter: Vaibhav Gumashta


This line: 
[https://github.com/apache/hive/blob/456a65180dcb84f69f26b4c9b9265165ad16dfe4/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java#L181]

Should be: 
aggregateData.setAvgColLen(Math.max(aggregateData.getAvgColLen(), 
newData.getAvgColLen()));
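
To illustrate the intended behaviour, here is a small standalone sketch that folds per-partition average column lengths with Math.max, as the corrected line does. The class AvgColLenAggregationSketch and its methods are hypothetical and do not reproduce StringColumnStatsAggregator. The min-based variant is included only to show how the two results differ.

```java
class AvgColLenAggregationSketch {
  // Folds per-partition avgColLen values with max, as the fix proposes.
  static double aggregateWithMax(double[] partitionAvgColLens) {
    double result = 0.0;
    for (double v : partitionAvgColLens) {
      result = Math.max(result, v);
    }
    return result;
  }

  // The min-based fold, shown only for comparison with the max-based one.
  static double aggregateWithMin(double[] partitionAvgColLens) {
    double result = Double.MAX_VALUE;
    for (double v : partitionAvgColLens) {
      result = Math.min(result, v);
    }
    return result;
  }

  public static void main(String[] args) {
    double[] lens = {4.0, 12.5, 7.0};
    System.out.println(aggregateWithMax(lens));
    System.out.println(aggregateWithMin(lens));
  }
}
```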



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Review Request 65271: JDBC: Provide a way for JDBC users to pass cookie info via connection string

2018-01-22 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65271/
---

Review request for hive and Thejas Nair.


Bugs: HIVE-18447
https://issues.apache.org/jira/browse/HIVE-18447


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-18447


Diffs
-

  
itests/hive-unit/src/test/java/org/apache/hive/service/cli/thrift/TestThriftHttpCLIServiceFeatures.java
 1911d2ce17 
  jdbc/src/java/org/apache/hive/jdbc/HiveConnection.java b5d289e023 
  jdbc/src/java/org/apache/hive/jdbc/HttpBasicAuthInterceptor.java 8cb7a69df7 
  jdbc/src/java/org/apache/hive/jdbc/HttpKerberosRequestInterceptor.java 
3509cab775 
  jdbc/src/java/org/apache/hive/jdbc/HttpRequestInterceptorBase.java 68b0ff19b8 
  jdbc/src/java/org/apache/hive/jdbc/HttpTokenAuthInterceptor.java 207ed9e663 
  jdbc/src/java/org/apache/hive/jdbc/Utils.java 1c1f644c92 


Diff: https://reviews.apache.org/r/65271/diff/1/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-18447) JDBC: Provide a way for JDBC users to pass cookie info via connection string

2018-01-12 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-18447:
---

 Summary: JDBC: Provide a way for JDBC users to pass cookie info 
via connection string
 Key: HIVE-18447
 URL: https://issues.apache.org/jira/browse/HIVE-18447
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Reporter: Vaibhav Gumashta


Some authentication mechanisms, like Single Sign-On, need the ability to pass a 
cookie to an intermediate authentication service such as Knox via the JDBC 
driver. We need to add this mechanism to Hive's JDBC driver (when used in HTTP 
transport mode).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Review Request 62228: HIVE-17495: CachedStore: prewarm improvements, refactoring and caching some aggregate stats

2017-12-22 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62228/
---

(Updated Dec. 22, 2017, 11:13 p.m.)


Review request for hive, Ashutosh Chauhan, Daniel Dai, and Thejas Nair.


Bugs: HIVE-17495
https://issues.apache.org/jira/browse/HIVE-17495


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-17495


Diffs (updated)
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 6dc052db45 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 0aa1d4e16a 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 14653b4043 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 b708fae7ec 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
 8af96db0bc 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 ab6b90fb6b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 9856f8a195 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 b606779709 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/BinaryColumnStatsAggregator.java
 45d5d8c984 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/BooleanColumnStatsAggregator.java
 8aac0fe33d 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/ColumnStatsAggregator.java
 cd0392d6c0 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java
 7f2956152c 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DecimalColumnStatsAggregator.java
 05c0280262 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DoubleColumnStatsAggregator.java
 faf22dcd7c 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/LongColumnStatsAggregator.java
 d12cdc08ea 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java
 4539e6b026 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 8bc4ce752e 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 e59e3496bf 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 8be099cbcb 


Diff: https://reviews.apache.org/r/62228/diff/6/

Changes: https://reviews.apache.org/r/62228/diff/5-6/


Testing
---


Thanks,

Vaibhav Gumashta



[jira] [Created] (HIVE-18264) CachedStore: Store cached partitions within the table cache

2017-12-11 Thread Vaibhav Gumashta (JIRA)
Vaibhav Gumashta created HIVE-18264:
---

 Summary: CachedStore: Store cached partitions within the table 
cache  
 Key: HIVE-18264
 URL: https://issues.apache.org/jira/browse/HIVE-18264
 Project: Hive
  Issue Type: Bug
Reporter: Vaibhav Gumashta
Assignee: Vaibhav Gumashta


Currently we have separate caches for partitions and partition col stats, which 
results in some calls iterating through each of them when retrieving or updating. 
We can get better performance by organizing them hierarchically under the table cache. 
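
A rough sketch of the hierarchical layout, assuming a simple nested map: each table entry owns the map of its partitions, so per-table operations touch only that table's entries instead of scanning one global partition cache. The class HierarchicalCacheSketch, its methods, and the String payloads are hypothetical and are not the SharedCache implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class HierarchicalCacheSketch {
  // table key -> (partition name -> partition payload)
  private final Map<String, Map<String, String>> tableToPartitions = new HashMap<>();

  void addPartition(String table, String partName, String payload) {
    tableToPartitions
        .computeIfAbsent(table, t -> new HashMap<>())
        .put(partName, payload);
  }

  // Looks up one table's partitions directly; other tables are not scanned.
  List<String> listPartitionNames(String table) {
    Map<String, String> parts = tableToPartitions.get(table);
    if (parts == null) {
      return new ArrayList<>();
    }
    return new ArrayList<>(new TreeSet<>(parts.keySet())); // sorted for stable output
  }

  // Dropping a table removes all of its cached partitions in one step.
  void dropTable(String table) {
    tableToPartitions.remove(table);
  }

  public static void main(String[] args) {
    HierarchicalCacheSketch cache = new HierarchicalCacheSketch();
    cache.addPartition("db.t1", "ds=2018-01-01", "p1");
    cache.addPartition("db.t1", "ds=2018-01-02", "p2");
    cache.addPartition("db.t2", "ds=2018-01-01", "p3");
    System.out.println(cache.listPartitionNames("db.t1"));
  }
}
```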



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Review Request 62228: HIVE-17495: CachedStore: prewarm improvements, refactoring and caching some aggregate stats

2017-12-07 Thread Vaibhav Gumashta

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62228/
---

(Updated Dec. 8, 2017, 12:06 a.m.)


Review request for hive, Ashutosh Chauhan, Daniel Dai, and Thejas Nair.


Changes
---

Rebased on master


Bugs: HIVE-17495
https://issues.apache.org/jira/browse/HIVE-17495


Repository: hive-git


Description
---

https://issues.apache.org/jira/browse/HIVE-17495


Diffs (updated)
-

  
itests/hcatalog-unit/src/test/java/org/apache/hive/hcatalog/listener/DummyRawStoreFailEvent.java
 62c9172ef5 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 f344c47443 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
 14653b4043 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 2e80c9d3b1 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/RawStore.java
 75fbfa23d2 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CacheUtils.java
 ab6b90fb6b 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/CachedStore.java
 da518ab6e3 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/cache/SharedCache.java
 b606779709 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/BinaryColumnStatsAggregator.java
 45d5d8c984 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/BooleanColumnStatsAggregator.java
 8aac0fe33d 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/ColumnStatsAggregator.java
 cd0392d6c0 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DateColumnStatsAggregator.java
 7f2956152c 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DecimalColumnStatsAggregator.java
 05c0280262 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/DoubleColumnStatsAggregator.java
 faf22dcd7c 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/LongColumnStatsAggregator.java
 d12cdc08ea 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/columnstats/aggr/StringColumnStatsAggregator.java
 4539e6b026 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreUtils.java
 cde34bcf42 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreControlledCommit.java
 24c59f2f1b 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/DummyRawStoreForJdoConnection.java
 1e4fe5d973 


Diff: https://reviews.apache.org/r/62228/diff/5/

Changes: https://reviews.apache.org/r/62228/diff/4-5/


Testing
---


Thanks,

Vaibhav Gumashta


