Re: Review Request 72532: HIVE-23495 AcidUtils.getAcidState cleanup

2020-06-15 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72532/#review221005
---


Ship it!




Ship It!

- Marta Kuczora


On June 8, 2020, 10:58 a.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72532/
> ---
> 
> (Updated June 8, 2020, 10:58 a.m.)
> 
> 
> Review request for hive, Karen Coppage, Marta Kuczora, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> since HIVE-21225 there are two redundant implementation of the 
> AcidUtils.getAcidState.
> 
> The previous implementation (without the recursive listing) can be removed.
> 
> Also the performance can be improved, by removing unnecessary fileStatus 
> calls.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 635ed3149c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java ca234cfb37 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 1059cb227f 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> 16c915959c 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
>  598220b0c4 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 2a15913f9f 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 4e5d5b003b 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 7913295380 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 
> d83a50f555 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  5e11d8d2d8 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java
>  1bdec7df2d 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 75941b3f33 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 337f469d1a 
>   ql/src/test/org/apache/hadoop/hive/ql/io/TestAcidUtils.java f351f04b08 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestInputOutputFormat.java 
> e4440e9136 
>   ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcRawRecordMerger.java 
> f63c40a7b5 
>   streaming/src/test/org/apache/hive/streaming/TestStreaming.java 3a3b267927 
> 
> 
> Diff: https://reviews.apache.org/r/72532/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



Re: Review Request 72336: HIVE-23114: Insert overwrite with dynamic partitioning is not working correctly with direct insert

2020-04-08 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72336/
---

(Updated April 8, 2020, 1:47 p.m.)


Review request for hive and Peter Vary.


Changes
---

Fixing whitespaces.


Bugs: HIVE-23114
https://issues.apache.org/jira/browse/HIVE-23114


Repository: hive-git


Description
---

The idea behind the patch is the following:
When doing a multi-statement insert overwrite with dynamic partitioning, the 
partition information will be written to the manifest file. With this 
information, each FileSinkOperator can clean-up only the partition directories 
written by the same FileSinkOperator and do not clean-up the partition 
directories written by the other FileSinkOperators.
If a statement from the insert overwrite query, doesn't produce any data, a 
manifest file will still be written, otherwise the missing manifest file would 
result a clean-up on table level which could delete the data written by the 
other FileSinkOperators.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties e99ce7babb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
d68d8f9409 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 04166a23ee 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java e25dc54e7d 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162 
  ql/src/test/queries/clientpositive/acid_direct_insert_insert_overwrite.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/acid_multiinsert_dyn_part.q PRE-CREATION 
  ql/src/test/results/clientpositive/acid_direct_insert_insert_overwrite.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/acid_multiinsert_dyn_part.q.out 
PRE-CREATION 
  
ql/src/test/results/clientpositive/llap/acid_direct_insert_insert_overwrite.q.out
 PRE-CREATION 
  ql/src/test/results/clientpositive/llap/acid_multiinsert_dyn_part.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/72336/diff/2/

Changes: https://reviews.apache.org/r/72336/diff/1-2/


Testing
---

Added specific q tests for different insert overwrite scenarios.


Thanks,

Marta Kuczora



Review Request 72336: HIVE-23114: Insert overwrite with dynamic partitioning is not working correctly with direct insert

2020-04-08 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72336/
---

Review request for hive and Peter Vary.


Bugs: HIVE-23114
https://issues.apache.org/jira/browse/HIVE-23114


Repository: hive-git


Description
---

The idea behind the patch is the following:
When doing a multi-statement insert overwrite with dynamic partitioning, the 
partition information will be written to the manifest file. With this 
information, each FileSinkOperator can clean-up only the partition directories 
written by the same FileSinkOperator and do not clean-up the partition 
directories written by the other FileSinkOperators.
If a statement from the insert overwrite query, doesn't produce any data, a 
manifest file will still be written, otherwise the missing manifest file would 
result a clean-up on table level which could delete the data written by the 
other FileSinkOperators.


Diffs
-

  itests/src/test/resources/testconfiguration.properties e99ce7babb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
d68d8f9409 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 04166a23ee 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java e25dc54e7d 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162 
  ql/src/test/queries/clientpositive/acid_direct_insert_insert_overwrite.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/acid_multiinsert_dyn_part.q PRE-CREATION 
  ql/src/test/results/clientpositive/acid_direct_insert_insert_overwrite.q.out 
PRE-CREATION 
  ql/src/test/results/clientpositive/acid_multiinsert_dyn_part.q.out 
PRE-CREATION 
  
ql/src/test/results/clientpositive/llap/acid_direct_insert_insert_overwrite.q.out
 PRE-CREATION 
  ql/src/test/results/clientpositive/llap/acid_multiinsert_dyn_part.q.out 
PRE-CREATION 


Diff: https://reviews.apache.org/r/72336/diff/1/


Testing
---

Added specific q tests for different insert overwrite scenarios.


Thanks,

Marta Kuczora



Re: Review Request 72181: HIVE-22832: Parallelise direct insert directory cleaning process

2020-03-04 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72181/#review219763
---


Ship it!




Ship It!

- Marta Kuczora


On March 2, 2020, 9:22 a.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72181/
> ---
> 
> (Updated March 2, 2020, 9:22 a.m.)
> 
> 
> Review request for hive, Marta Kuczora and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22832: Parallelise direct insert directory cleaning process
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java e9966e6364 
> 
> 
> Diff: https://reviews.apache.org/r/72181/diff/1/
> 
> 
> Testing
> ---
> 
> pre-commit build success: 
> https://builds.apache.org/job/PreCommit-HIVE-Build/20874/
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-18 Thread Marta Kuczora via Review Board


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
> > Lines 1732-1737 (patched)
> > 
> >
> > What about using lambda here?
> 
> Marta Kuczora wrote:
> Fixed it.

At the end this code part got removed.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/#review219487
---


On Feb. 18, 2020, 12:21 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71904/
> ---
> 
> (Updated Feb. 18, 2020, 12:21 p.m.)
> 
> 
> Review request for hive, Gopal V and Peter Vary.
> 
> 
> Bugs: HIVE-21164
> https://issues.apache.org/jira/browse/HIVE-21164
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the original patch with saving the task attempt ids in the file 
> names and also fixed some bugs in the original patch.
> With this fix, inserting into an ACID table would not use move task to place 
> the generated files into the final directory. It will inserts every files to 
> the final directory and then clean up the files which are not needed (like 
> written by failed task attempts).
> Also fixed the replication tests which failed for the original patch as well.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d3cb60b790 
>   
> hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
>  da677c7977 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
> 056cd27496 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
>  31d15fdef9 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  c2aa73b5f1 
>   itests/src/test/resources/testconfiguration.properties 1b1bf1147a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> 9a3258115b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 
> 8980a6292a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java 
> c4c56f8477 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> b8a0f0465c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 
> 398698ec06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
>  2543dc6fc4 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1eb9c12cc8 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 73ca658d9c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 33d3beba46 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> c102a69f8f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 739f2b654b 
>   ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java e56d83158f 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9fd79 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9bec 
>   ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 
> af14e628b3 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 83db48e758 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java 
> 2c4b69b2fe 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 48e9afc496 
>   

Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-18 Thread Marta Kuczora via Review Board


> On Feb. 4, 2020, 10:16 p.m., Rajesh Balamohan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
> > Line 4382 (original), 4397 (patched)
> > 
> >
> > Is this needed for direct insert?. In objectstores, we could have calls 
> > getting throttled.

That's a really good question, I was thinking about it a lot. I think it is not 
needed. This method does two things: removes the temporarily and duplicated 
files and returns the emptyBuckets list. This list contains elements if the 
number of buckets are bigger than the number of files. In this case, for MM 
tables,  empty files will be created. But this is not the case for ACID tables, 
there won't be any empty files created for ACID tables. I want to revisit this 
topic whether or not we need these empty files, but for now, I would go with 
the same behaviour as for ACID tables. 
About the temp file removal, when the direct insert is finished all files which 
are not committed (meaning not in the manifest files) will be deleted prior to 
this call. So there shouldn't be any unnecessary files left at this point. 
I remove this call, and upload a patch to see the result of the pre-commit 
tests. If everything passes, I think it is safe to remove this call in case of 
direct insert.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/#review219494
---


On Feb. 18, 2020, 12:21 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71904/
> ---
> 
> (Updated Feb. 18, 2020, 12:21 p.m.)
> 
> 
> Review request for hive, Gopal V and Peter Vary.
> 
> 
> Bugs: HIVE-21164
> https://issues.apache.org/jira/browse/HIVE-21164
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the original patch with saving the task attempt ids in the file 
> names and also fixed some bugs in the original patch.
> With this fix, inserting into an ACID table would not use move task to place 
> the generated files into the final directory. It will inserts every files to 
> the final directory and then clean up the files which are not needed (like 
> written by failed task attempts).
> Also fixed the replication tests which failed for the original patch as well.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d3cb60b790 
>   
> hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
>  da677c7977 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
> 056cd27496 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
>  31d15fdef9 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  c2aa73b5f1 
>   itests/src/test/resources/testconfiguration.properties 1b1bf1147a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> 9a3258115b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 
> 8980a6292a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java 
> c4c56f8477 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> b8a0f0465c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 
> 398698ec06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
>  2543dc6fc4 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1eb9c12cc8 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 73ca658d9c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 33d3beba46 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> c102a69f8f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 739f2b654b 
>   ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 
> 

Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-18 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/
---

(Updated Feb. 18, 2020, 12:21 p.m.)


Review request for hive, Gopal V and Peter Vary.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

Extended the original patch with saving the task attempt ids in the file names 
and also fixed some bugs in the original patch.
With this fix, inserting into an ACID table would not use move task to place 
the generated files into the final directory. It will inserts every files to 
the final directory and then clean up the files which are not needed (like 
written by failed task attempts).
Also fixed the replication tests which failed for the original patch as well.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d3cb60b790 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
 da677c7977 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
056cd27496 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
 31d15fdef9 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
 c2aa73b5f1 
  itests/src/test/resources/testconfiguration.properties 1b1bf1147a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
9a3258115b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 8980a6292a 
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
b8a0f0465c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 398698ec06 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 2543dc6fc4 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1eb9c12cc8 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
73ca658d9c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 33d3beba46 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
c102a69f8f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
739f2b654b 
  ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java e56d83158f 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9fd79 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9bec 
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java af14e628b3 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 83db48e758 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java 
2c4b69b2fe 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
48e9afc496 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java 
cfd7290762 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestWorker.java 
70ae85c458 
  ql/src/test/queries/clientpositive/tez_acid_union_dynamic_partition.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/tez_acid_union_dynamic_partition_2.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/tez_acid_union_multiinsert.q PRE-CREATION 
  ql/src/test/results/clientpositive/acid_subquery.q.out 1dc1775557 
  ql/src/test/results/clientpositive/create_transactional_full_acid.q.out 
e324d5ec43 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out
 61b0057adb 
  ql/src/test/results/clientpositive/llap/acid_no_buckets.q.out fbf4e481f1 
  ql/src/test/results/clientpositive/llap/insert_overwrite.q.out fbc3326b39 
  

Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-14 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/
---

(Updated Feb. 14, 2020, 4:10 p.m.)


Review request for hive, Gopal V and Peter Vary.


Changes
---

Addressed the review findings


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

Extended the original patch with saving the task attempt ids in the file names 
and also fixed some bugs in the original patch.
With this fix, inserting into an ACID table would not use move task to place 
the generated files into the final directory. It will inserts every files to 
the final directory and then clean up the files which are not needed (like 
written by failed task attempts).
Also fixed the replication tests which failed for the original patch as well.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 2f695d4acc 
  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
 da677c7977 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
056cd27496 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
 31d15fdef9 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
 c2aa73b5f1 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 4c0137 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
9a3258115b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 8980a6292a 
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
b8a0f0465c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 398698ec06 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 2543dc6fc4 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 945eafc034 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
73ca658d9c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 33d3beba46 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
c102a69f8f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
bb70db4524 
  ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 88ca683173 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9fd79 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9bec 
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java af14e628b3 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 83db48e758 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java 
2c4b69b2fe 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
48e9afc496 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java 
cfd7290762 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestWorker.java 
70ae85c458 
  ql/src/test/queries/clientpositive/tez_acid_union_dynamic_partition.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/tez_acid_union_dynamic_partition_2.q 
PRE-CREATION 
  ql/src/test/queries/clientpositive/tez_acid_union_multiinsert.q PRE-CREATION 
  ql/src/test/results/clientpositive/acid_subquery.q.out 1dc1775557 
  ql/src/test/results/clientpositive/create_transactional_full_acid.q.out 
e324d5ec43 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out
 61b0057adb 
  ql/src/test/results/clientpositive/llap/acid_no_buckets.q.out fbf4e481f1 
  

Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-14 Thread Marta Kuczora via Review Board


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
> > Lines 1444 (patched)
> > 
> >
> > Why is this null?

It is null, because if the union all optimization is on, the different union 
statements will be translated into different FileSinkOperators and they will 
write to their own separate directories. They are normally writing to the 
staging directory and under folders with specific 'HIVE_UNION_SUBDIR_' prefix. 
Then the move tasks will move these files to the final table directory. In ACID 
tables these FileSinkOperators would write to different delta directories 
anyway, so the tasks could write directly to the final table location instead 
of the 'HIVE_UNION_SUBDIR_' folders. That's why the unionSuffix is null here. 
In other cases, they have the 'HIVE_UNION_SUBDIR_' value.
Btw, I locally modified many union q tests to run with ACID tables and ran them 
with MR and Tez. I found one bug, which I fixed and I also added some union q 
tests to run with ACID table and direct insert.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java
> > Lines 77 (patched)
> > 
> >
> > We created this variable - we should use it? Maybe set it even as a 
> > constant?

You're right. I move this as a constant and changed the tests.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/#review219487
---


On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71904/
> ---
> 
> (Updated Jan. 31, 2020, 4:12 p.m.)
> 
> 
> Review request for hive, Gopal V and Peter Vary.
> 
> 
> Bugs: HIVE-21164
> https://issues.apache.org/jira/browse/HIVE-21164
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the original patch with saving the task attempt ids in the file 
> names and also fixed some bugs in the original patch.
> With this fix, inserting into an ACID table would not use move task to place 
> the generated files into the final directory. It will inserts every files to 
> the final directory and then clean up the files which are not needed (like 
> written by failed task attempts).
> Also fixed the replication tests which failed for the original patch as well.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
>  da677c7977 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
> 056cd27496 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
>  31d15fdef9 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  c2aa73b5f1 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  4c0137 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> 9a3258115b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 
> 8980a6292a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java 
> c4c56f8477 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> b8a0f0465c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 
> 398698ec06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
>  2543dc6fc4 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 7f061d4a6b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 73ca658d9c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 5fcc367cc9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> c102a69f8f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
>   

Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-04 Thread Marta Kuczora via Review Board


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
> > Lines 1732-1737 (patched)
> > 
> >
> > What about using lambda here?

Fixed it.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> > Lines 7442-7443 (original), 7456-7460 (patched)
> > 
> >
> > nit: Maybe if/else

Fixed it.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> > Lines 7562-7563 (original), 7600-7604 (patched)
> > 
> >
> > nit: Maybe if/else?

Fixed it.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/#review219487
---


On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71904/
> ---
> 
> (Updated Jan. 31, 2020, 4:12 p.m.)
> 
> 
> Review request for hive, Gopal V and Peter Vary.
> 
> 
> Bugs: HIVE-21164
> https://issues.apache.org/jira/browse/HIVE-21164
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the original patch with saving the task attempt ids in the file 
> names and also fixed some bugs in the original patch.
> With this fix, inserting into an ACID table would not use move task to place 
> the generated files into the final directory. It will inserts every files to 
> the final directory and then clean up the files which are not needed (like 
> written by failed task attempts).
> Also fixed the replication tests which failed for the original patch as well.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
>  da677c7977 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
> 056cd27496 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
>  31d15fdef9 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  c2aa73b5f1 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  4c0137 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> 9a3258115b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 
> 8980a6292a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java 
> c4c56f8477 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> b8a0f0465c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 
> 398698ec06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
>  2543dc6fc4 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 7f061d4a6b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 73ca658d9c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 5fcc367cc9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> c102a69f8f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> bb70db4524 
>   ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 88ca683173 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b 
>   

Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-04 Thread Marta Kuczora via Review Board


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> > Lines 7526-7543 (patched)
> > 
> >
> > Is this duplicated code?

Yeah, however I cannot move this whole part to a separate method, because the 
acidOp and the isDirectInsert variables both have to be set. I can create a 
separate method for getting the value of isDirectInsert and a separate method 
for getting the tmp dir.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/#review219487
---


On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71904/
> ---
> 
> (Updated Jan. 31, 2020, 4:12 p.m.)
> 
> 
> Review request for hive, Gopal V and Peter Vary.
> 
> 
> Bugs: HIVE-21164
> https://issues.apache.org/jira/browse/HIVE-21164
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the original patch with saving the task attempt ids in the file 
> names and also fixed some bugs in the original patch.
> With this fix, inserting into an ACID table would not use move task to place 
> the generated files into the final directory. It will inserts every files to 
> the final directory and then clean up the files which are not needed (like 
> written by failed task attempts).
> Also fixed the replication tests which failed for the original patch as well.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
>  da677c7977 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
> 056cd27496 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
>  31d15fdef9 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  c2aa73b5f1 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  4c0137 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> 9a3258115b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 
> 8980a6292a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java 
> c4c56f8477 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> b8a0f0465c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 
> 398698ec06 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
>  2543dc6fc4 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 7f061d4a6b 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 73ca658d9c 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> 5fcc367cc9 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
> c102a69f8f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> bb70db4524 
>   ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 88ca683173 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9fd79 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9bec 
>   ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 
> af14e628b3 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 83db48e758 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java 
> 

Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-04 Thread Marta Kuczora via Review Board


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > Thanks for the patch! This will be very-very usefull.
> > Some minor comments, questions...

Thanks a lot for the review!!


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
> > Lines 55 (patched)
> > 
> >
> > Is this import used?

You're right, it is not used. Removed it.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
> > Lines 843 (patched)
> > 
> >
> > Is inheritPerms still a working stuff? I kinda remember that it was 
> > removed from Hive some time ago...

No, I think this log message was just a copy-paste error. Fixed it.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
> > Lines 1799 (patched)
> > 
> >
> > Maybe slightly different log message, so we can easily ditinguish 
> > between this and the line below

Fixed it.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
> > Lines 7379 (patched)
> > 
> >
> > We might want to make this feature configurable, to turn it on/off in 
> > case we missed some edge cases

You are absolutely right. I introduced a config parameter so we can turn on/off 
this feature.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 493-494 (patched)
> > 
> >
> > nit: Formatting? Really not important, just for the completensess shake 
> > :D

Fixed it.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 690-691 (patched)
> > 
> >
> > nit: Formatting?

Fixed it.


> On Feb. 4, 2020, 3:49 p.m., Peter Vary wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java
> > Lines 1246 (patched)
> > 
> >
> > Is this table always exists? Shall we use "drop table if exists" 
> > instead?

Fixed it.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/#review219487
---


On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71904/
> ---
> 
> (Updated Jan. 31, 2020, 4:12 p.m.)
> 
> 
> Review request for hive, Gopal V and Peter Vary.
> 
> 
> Bugs: HIVE-21164
> https://issues.apache.org/jira/browse/HIVE-21164
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the original patch with saving the task attempt ids in the file 
> names and also fixed some bugs in the original patch.
> With this fix, inserting into an ACID table would not use move task to place 
> the generated files into the final directory. It will inserts every files to 
> the final directory and then clean up the files which are not needed (like 
> written by failed task attempts).
> Also fixed the replication tests which failed for the original patch as well.
> 
> 
> Diffs
> -
> 
>   
> hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
>  da677c7977 
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
> 056cd27496 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
>  31d15fdef9 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  c2aa73b5f1 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  4c0137 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> 9a3258115b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 

Re: Review Request 72074: HIVE-21215: Read Parquet INT64 timestamp

2020-02-03 Thread Marta Kuczora via Review Board


> On Feb. 3, 2020, 9:12 a.m., Karen Coppage wrote:
> > Thanks for the patch, looks good! Two ideas:
> > 1. It would be nice to have a unit test that reads a date before October 
> > 1582, so it's clear that we're using the Proleptic Gregorian calendar.
> > 2. ParquetTimestampUtils would be more readable if the big multipliers were 
> > declared as constants and/or in this format: e.g. 1_000_000.
> > 
> > Thanks!

Thanks a lot Karen for the review. These are really good point. I fixed the 
numbers in the ParquetTimestampUtils and also added some test cases for dates 
before 1582.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72074/#review219466
---


On Feb. 3, 2020, 12:31 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72074/
> ---
> 
> (Updated Feb. 3, 2020, 12:31 p.m.)
> 
> 
> Review request for hive, Karen Coppage and Peter Vary.
> 
> 
> Bugs: HIVE-21215
> https://issues.apache.org/jira/browse/HIVE-21215
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Implemented the read path for Parquet INT64 timestamp.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java 
> f2c1493f56 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> d67b030648 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/ParquetTimestampUtils.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java
>  519bd813e9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java
>  2803baf90c 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java
>  f6ee57140c 
> 
> 
> Diff: https://reviews.apache.org/r/72074/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 72074: HIVE-21215: Read Parquet INT64 timestamp

2020-02-03 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72074/
---

(Updated Feb. 3, 2020, 12:31 p.m.)


Review request for hive, Karen Coppage and Peter Vary.


Bugs: HIVE-21215
https://issues.apache.org/jira/browse/HIVE-21215


Repository: hive-git


Description
---

Implemented the read path for Parquet INT64 timestamp.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java f2c1493f56 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
d67b030648 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/ParquetTimestampUtils.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java
 519bd813e9 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java
 2803baf90c 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java
 f6ee57140c 


Diff: https://reviews.apache.org/r/72074/diff/2/

Changes: https://reviews.apache.org/r/72074/diff/1-2/


Testing
---


Thanks,

Marta Kuczora



Review Request 72074: HIVE-21215: Read Parquet INT64 timestamp

2020-01-31 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72074/
---

Review request for hive, Karen Coppage and Peter Vary.


Bugs: HIVE-21215
https://issues.apache.org/jira/browse/HIVE-21215


Repository: hive-git


Description
---

Implemented the read path for Parquet INT64 timestamp.


Diffs
-

  common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java f2c1493f56 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
d67b030648 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/timestamp/ParquetTimestampUtils.java
 PRE-CREATION 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java
 519bd813e9 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/VectorizedPrimitiveColumnReader.java
 2803baf90c 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/convert/TestETypeConverter.java
 f6ee57140c 


Diff: https://reviews.apache.org/r/72074/diff/1/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-01-31 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/
---

(Updated Jan. 31, 2020, 4:12 p.m.)


Review request for hive, Gopal V and Peter Vary.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

Extended the original patch with saving the task attempt ids in the file names 
and also fixed some bugs in the original patch.
With this fix, inserting into an ACID table would not use move task to place 
the generated files into the final directory. It will inserts every files to 
the final directory and then clean up the files which are not needed (like 
written by failed task attempts).
Also fixed the replication tests which failed for the original patch as well.


Diffs (updated)
-

  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
 da677c7977 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
056cd27496 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
 31d15fdef9 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
 c2aa73b5f1 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 4c0137 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
9a3258115b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 8980a6292a 
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
b8a0f0465c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 398698ec06 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 2543dc6fc4 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 7f061d4a6b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
73ca658d9c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 5fcc367cc9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
c102a69f8f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
bb70db4524 
  ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 88ca683173 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9fd79 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9bec 
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java af14e628b3 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 83db48e758 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java 
2c4b69b2fe 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
48e9afc496 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java 
cfd7290762 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestWorker.java 
70ae85c458 
  ql/src/test/results/clientpositive/acid_subquery.q.out 1dc1775557 
  ql/src/test/results/clientpositive/create_transactional_full_acid.q.out 
e324d5ec43 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out
 61b0057adb 
  ql/src/test/results/clientpositive/llap/acid_no_buckets.q.out fbf4e481f1 
  ql/src/test/results/clientpositive/llap/insert_overwrite.q.out fbc3326b39 
  ql/src/test/results/clientpositive/llap/mm_all.q.out 226f2a9374 
  ql/src/test/results/clientpositive/mm_all.q.out 143ebd69f9 
  streaming/src/test/org/apache/hive/streaming/TestStreaming.java 35a220facd 


Diff: https://reviews.apache.org/r/71904/diff/3/

Changes: https://reviews.apache.org/r/71904/diff/2-3/


Testing

Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-01-31 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/
---

(Updated Jan. 31, 2020, 10:17 a.m.)


Review request for hive, Gopal V and Peter Vary.


Changes
---

Rebased the patch


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

Extended the original patch with saving the task attempt ids in the file names 
and also fixed some bugs in the original patch.
With this fix, inserting into an ACID table would not use move task to place 
the generated files into the final directory. It will inserts every files to 
the final directory and then clean up the files which are not needed (like 
written by failed task attempts).
Also fixed the replication tests which failed for the original patch as well.


Diffs (updated)
-

  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
 da677c7977 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
056cd27496 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
 31d15fdef9 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 4c0137 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
9a3258115b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb223f2 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 8980a6292a 
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e6774b7 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984abd0a 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8477 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
b8a0f0465c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 398698ec06 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 2543dc6fc4 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 7f061d4a6b 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
73ca658d9c 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 5fcc367cc9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
c102a69f8f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4d 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed05819b5 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
bb70db4524 
  ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289583 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb6692df 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 842140815d 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 88ca683173 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb43fc 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0db11 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b2768b 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9fd79 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9bec 
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java af14e628b3 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 83db48e758 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java 
2c4b69b2fe 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
48e9afc496 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java 
cfd7290762 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestWorker.java 
70ae85c458 
  ql/src/test/results/clientpositive/acid_subquery.q.out 1dc1775557 
  ql/src/test/results/clientpositive/create_transactional_full_acid.q.out 
e324d5ec43 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out
 61b0057adb 
  ql/src/test/results/clientpositive/llap/acid_no_buckets.q.out fbf4e481f1 
  ql/src/test/results/clientpositive/llap/insert_overwrite.q.out fbc3326b39 
  ql/src/test/results/clientpositive/llap/mm_all.q.out 226f2a9374 
  ql/src/test/results/clientpositive/mm_all.q.out 143ebd69f9 
  streaming/src/test/org/apache/hive/streaming/TestStreaming.java 35a220facd 


Diff: https://reviews.apache.org/r/71904/diff/2/

Changes: https://reviews.apache.org/r/71904/diff/1-2/


Testing
---

Had to modify some tests because of the file name changes. Also added 

Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2019-12-12 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/
---

Review request for hive, Gopal V and Peter Vary.


Bugs: HIVE-21164
https://issues.apache.org/jira/browse/HIVE-21164


Repository: hive-git


Description
---

Extended the original patch with saving the task attempt ids in the file names 
and also fixed some bugs in the original patch.
With this fix, inserting into an ACID table would not use move task to place 
the generated files into the final directory. It will inserts every files to 
the final directory and then clean up the files which are not needed (like 
written by failed task attempts).
Also fixed the replication tests which failed for the original patch as well.


Diffs
-

  
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java
 da677c7 
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 
2868427 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java
 31d15fd 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
 445e39c 
  
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
 b7245e2 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
9a32581 
  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebe 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 3d30d09 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidOutputFormat.java 1e8bb22 
  ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 3c508ec 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveFileFormatUtils.java 8980a62 
  ql/src/java/org/apache/hadoop/hive/ql/io/RecordUpdater.java 737e677 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.java 76984ab 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c4c56f8 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 2ac6232 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRecordUpdater.java 3fa61d3 
  
ql/src/java/org/apache/hadoop/hive/ql/io/orc/VectorizedOrcAcidRowBatchReader.java
 2543dc6 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java f4bd0f9 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 73ca658 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 90549f9 
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java c102a69 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bde 
  ql/src/java/org/apache/hadoop/hive/ql/plan/LoadTableDesc.java bed0581 
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 2b2cc1a 
  ql/src/java/org/apache/hadoop/hive/ql/util/UpgradeTool.java 58e6289 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnAddPartition.java c9cb669 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands.java 8421408 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 88ca683 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 908ceb4 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnConcatenate.java 8676e0d 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnExIm.java 66b2b27 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnLoadData.java bb55d9f 
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java ea6b1d9 
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java af14e62 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java dd70524 
  ql/src/test/org/apache/hadoop/hive/ql/exec/TestFileSinkOperator.java 2c4b69b 
  ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java c033a94 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java 
cfd7290 
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestWorker.java 70ae85c 
  ql/src/test/results/clientpositive/acid_subquery.q.out 1dc1775 
  ql/src/test/results/clientpositive/create_transactional_full_acid.q.out 
e324d5e 
  
ql/src/test/results/clientpositive/encrypted/encryption_insert_partition_dynamic.q.out
 61b0057 
  ql/src/test/results/clientpositive/llap/acid_no_buckets.q.out 5571c53 
  ql/src/test/results/clientpositive/llap/insert_overwrite.q.out fbc3326 
  ql/src/test/results/clientpositive/llap/mm_all.q.out 7542a6a 
  ql/src/test/results/clientpositive/mm_all.q.out 1377856 
  streaming/src/test/org/apache/hive/streaming/TestStreaming.java 58b3ae2 


Diff: https://reviews.apache.org/r/71904/diff/1/


Testing
---

Had to modify some tests because of the file name changes. Also added some 
specific tests.
In the pre-commit run all tests passed successfully.


Thanks,

Marta Kuczora



Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-10 Thread Marta Kuczora via Review Board


> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > Thanks for chasing this down!
> > Really appreciate it!

Thanks a lot for the review!


> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
> > Lines 157 (patched)
> > 
> >
> > This is the best way to check this?
> > Is this always starts with char? CHAR? or anything else is not possible?

It always start with "char", but you are right that it is not the best way to 
check it. I changed it to use at least the name of the CHAR serde constant.


> On Oct. 10, 2019, 9 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
> > Lines 181 (patched)
> > 
> >
> > I do not like this.
> > Either we only aim for space, or we aim for whitespace characters, but 
> > the check and the replace is different.

You are right, thanks for pointing this out. Since the regex will always 
replace the whitespaces at the end of the string, the check if the string ends 
with space is not event necessary. If it doesn't end with space, the regex 
replace will do nothing.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/#review218175
---


On Oct. 10, 2019, 11:39 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71606/
> ---
> 
> (Updated Oct. 10, 2019, 11:39 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21407
> https://issues.apache.org/jira/browse/HIVE-21407
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The previous approach didn't solve all use cases. In this new approach the 
> hive type is sent to the Parquet PPD part and trim the value which is pushed 
> to the predicate in case of CHAR hive type.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
>  5b051dd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
> fc9188f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
> 033e26a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
>  ca5e085 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
>  0210a0a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
>  7c7c657 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
> 4c40908 
>   ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
>   ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71606/diff/2/
> 
> 
> Testing
> ---
> 
> Added new q test for testing the PPD for char and varchar types. Also 
> extended the unit tests for the 
> ParquetFilterPredicateConverter.toFilterPredicate method.
> 
> 
> The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are 
> both testing the same thing, the behavior of the 
> ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make 
> sense to have tests for the same use case in different test classes, so moved 
> the test cases from the TestParquetRecordReaderWrapper to 
> TestParquetFilterPredicate.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-10 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/
---

(Updated Oct. 10, 2019, 11:39 a.m.)


Review request for hive and Peter Vary.


Changes
---

Fix the issues from the review.


Bugs: HIVE-21407
https://issues.apache.org/jira/browse/HIVE-21407


Repository: hive-git


Description
---

The previous approach didn't solve all use cases. In this new approach the hive 
type is sent to the Parquet PPD part and trim the value which is pushed to the 
predicate in case of CHAR hive type.


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
 5b051dd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
fc9188f 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
033e26a 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 ca5e085 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
 0210a0a 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
 7c7c657 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
4c40908 
  ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
  ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/71606/diff/2/

Changes: https://reviews.apache.org/r/71606/diff/1-2/


Testing
---

Added new q test for testing the PPD for char and varchar types. Also extended 
the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method.


The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both 
testing the same thing, the behavior of the 
ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense 
to have tests for the same use case in different test classes, so moved the 
test cases from the TestParquetRecordReaderWrapper to 
TestParquetFilterPredicate.


Thanks,

Marta Kuczora



Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-10 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/
---

Review request for hive and Peter Vary.


Bugs: HIVE-21407
https://issues.apache.org/jira/browse/HIVE-21407


Repository: hive-git


Description
---

The previous approach didn't solve all use cases. In this new approach the hive 
type is sent to the Parquet PPD part and trim the value which is pushed to the 
predicate in case of CHAR hive type.


Diffs
-

  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
 5b051dd 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
fc9188f 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
033e26a 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
 ca5e085 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
 0210a0a 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
 7c7c657 
  ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
4c40908 
  ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
  ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/71606/diff/1/


Testing
---

Added new q test for testing the PPD for char and varchar types. Also extended 
the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method.


The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both 
testing the same thing, the behavior of the 
ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense 
to have tests for the same use case in different test classes, so moved the 
test cases from the TestParquetRecordReaderWrapper to 
TestParquetFilterPredicate.


Thanks,

Marta Kuczora



Review Request 71558: HIVE-21987: Hive is unable to read Parquet int32 annotated with decimal

2019-09-30 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71558/
---

Review request for hive and Peter Vary.


Bugs: HIVE-21987
https://issues.apache.org/jira/browse/HIVE-21987


Repository: hive-git


Description
---

Added support to read INT32 Parquet decimals.


Diffs
-

  data/files/parquet_int_decimal_1.parquet PRE-CREATION 
  data/files/parquet_int_decimal_2.parquet PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
350ae2d 
  
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java
 320ce52 
  ql/src/test/queries/clientpositive/parquet_int_decimal.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_int_decimal.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/type_change_test_fraction.q.out 07cf8fa 


Diff: https://reviews.apache.org/r/71558/diff/1/


Testing
---

Added new q tests for the use-case.


Thanks,

Marta Kuczora



Re: Review Request 71271: HIVE-21580: Introduce ISO 8601 week numbering SQL:2016 formats

2019-08-21 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71271/#review217349
---


Fix it, then Ship it!




Thanks a lot for the patch. I have only one comment, otherwise it look good.


common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
Lines 1025 (patched)


As we discussed, it would be enough to update the variable in an if 
statement when the temporalField is "IsoFields.WEEK_BASED_YEAR". In that case, 
there is no need for the updateVar method which is a bit confusing.


- Marta Kuczora


On Aug. 12, 2019, 10:59 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71271/
> ---
> 
> (Updated Aug. 12, 2019, 10:59 a.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21580
> https://issues.apache.org/jira/browse/HIVE-21580
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Enable Hive to parse the following datetime formats when any 
> combination/subset of these or previously implemented patterns is provided in 
> one string. Also catch combinations that conflict.
> 
> IYYY
> IYY
> IY
> I
> IW
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  9443e8ec78 
>   
> common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java
>  ff41534fce 
> 
> 
> Diff: https://reviews.apache.org/r/71271/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 71016: HIVE-21578: Introduce SQL:2016 formats FM, FX, and nested strings

2019-07-26 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71016/#review216889
---


Ship it!




Ship It!

- Marta Kuczora


On July 26, 2019, 10:01 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71016/
> ---
> 
> (Updated July 26, 2019, 10:01 a.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21578
> https://issues.apache.org/jira/browse/HIVE-21578
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
> 
> "text" (nested strings)
> FM
> FX
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  998e5a2f6a 
>   
> common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java
>  ac57842148 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 
> 5a2a6d7894 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 
> e1fd341050 
> 
> 
> Diff: https://reviews.apache.org/r/71016/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 71016: HIVE-21578: Introduce SQL:2016 formats FM, FX, and nested strings

2019-07-26 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71016/#review216885
---



Thanks a lot for the patch! I had two comments, but otherwise it looks good. 
Nice testing btw!! :)

- Marta Kuczora


On July 5, 2019, 7:51 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71016/
> ---
> 
> (Updated July 5, 2019, 7:51 a.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21578
> https://issues.apache.org/jira/browse/HIVE-21578
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
> 
> "text" (nested strings)
> FM
> FX
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  998e5a2f6a 
>   
> common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java
>  4e822d53f9 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 
> 5a2a6d7894 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 
> e1fd341050 
> 
> 
> Diff: https://reviews.apache.org/r/71016/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 71016: HIVE-21578: Introduce SQL:2016 formats FM, FX, and nested strings

2019-07-26 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71016/#review216884
---




common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
Lines 546 (patched)


I think it would be better to split this method to two: one for checking 
only fm and one for checking only fx. Returning a boolean and setting an other 
one in the background can be a bit confusing for the caller.



common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java
Lines 252 (patched)


I think this test could be split up to have separate tests for fm, fx and 
fm-fx cases. It is just a nit, but I think it is a good idea to focus on one 
use-case per test.


- Marta Kuczora


On July 5, 2019, 7:51 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71016/
> ---
> 
> (Updated July 5, 2019, 7:51 a.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21578
> https://issues.apache.org/jira/browse/HIVE-21578
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Enable Hive to parse the following datetime formats when any combination or 
> subset of these or previously implemented formats is provided in one string. 
> 
> "text" (nested strings)
> FM
> FX
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  998e5a2f6a 
>   
> common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java
>  4e822d53f9 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 
> 5a2a6d7894 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 
> e1fd341050 
> 
> 
> Diff: https://reviews.apache.org/r/71016/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 71011: HIVE:21957: Create temporary table like should omit transactional properties.

2019-07-25 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71011/#review216850
---


Ship it!




Ship It!

- Marta Kuczora


On July 4, 2019, 1:52 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71011/
> ---
> 
> (Updated July 4, 2019, 1:52 p.m.)
> 
> 
> Review request for hive, Marta Kuczora and Thejas Nair.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE:21957: Create temporary table like should omit transactional properties.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> e09fc379f5e0127367e73ed4c4556522de9838a8 
> 
> 
> Diff: https://reviews.apache.org/r/71011/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71011: HIVE:21957: Create temporary table like should omit transactional properties.

2019-07-25 Thread Marta Kuczora via Review Board


> On July 18, 2019, noon, Marta Kuczora wrote:
> > Thanks a lot for the patch!
> > Just one question: could you add a test about the fixed use-case?
> 
> Laszlo Pinter wrote:
> I will add test in a separate patch.

Ok, thanks!


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71011/#review216720
---


On July 4, 2019, 1:52 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71011/
> ---
> 
> (Updated July 4, 2019, 1:52 p.m.)
> 
> 
> Review request for hive, Marta Kuczora and Thejas Nair.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE:21957: Create temporary table like should omit transactional properties.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> e09fc379f5e0127367e73ed4c4556522de9838a8 
> 
> 
> Diff: https://reviews.apache.org/r/71011/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71011: HIVE:21957: Create temporary table like should omit transactional properties.

2019-07-18 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71011/#review216720
---



Thanks a lot for the patch!
Just one question: could you add a test about the fixed use-case?

- Marta Kuczora


On July 4, 2019, 1:52 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71011/
> ---
> 
> (Updated July 4, 2019, 1:52 p.m.)
> 
> 
> Review request for hive, Marta Kuczora and Thejas Nair.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE:21957: Create temporary table like should omit transactional properties.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> e09fc379f5e0127367e73ed4c4556522de9838a8 
> 
> 
> Diff: https://reviews.apache.org/r/71011/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70920: HIVE-21868: Vectorize CAST...FORMAT

2019-07-09 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70920/#review216444
---


Ship it!




Ship It!

- Marta Kuczora


On July 4, 2019, 3:04 p.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70920/
> ---
> 
> (Updated July 4, 2019, 3:04 p.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21868
> https://issues.apache.org/jira/browse/HIVE-21868
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Vectorize UDFs for CAST ( AS STRING/CHAR/VARCHAR FORMAT 
> ) and CAST ( AS TIMESTAMP/DATE FORMAT ).
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  4e024a357b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
> fa9d1e9783 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
>  dfa9f8a00d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToStringWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToVarCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDate.java
>  a6dff12e1a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDateWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestamp.java
>  b48b0136eb 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestampWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToString.java
>  adc3a9d7b9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToStringWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToVarCharWithFormat.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java 
> 16742eee9b 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java
>  663237739e 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java
>  58fd7b030e 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCastsWithFormat.java
>  PRE-CREATION 
>   ql/src/test/queries/clientnegative/udf_cast_format_bad_pattern.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 
> 269edf6da6 
>   ql/src/test/results/clientnegative/udf_cast_format_bad_pattern.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 
> 4a502b9700 
> 
> 
> Diff: https://reviews.apache.org/r/70920/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table

2019-07-03 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70963/#review216337
---


Ship it!




Ship It!

- Marta Kuczora


On July 1, 2019, 9:20 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70963/
> ---
> 
> (Updated July 1, 2019, 9:20 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21874: Implement add partitions related methods on temporary table
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  957ebb12725e9deac7e7644709521a998df4dbb4 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  a15f5ea0453c7459217d229fa373cc1fec2f4d7a 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  25643495b53e1ede473c48a90b208b43070ee6aa 
> 
> 
> Diff: https://reviews.apache.org/r/70963/diff/2/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientAddPartitionsTempTable, 
> TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table

2019-07-03 Thread Marta Kuczora via Review Board


> On June 28, 2019, 2:42 p.m., Marta Kuczora wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
> > Line 1046 (original), 1049-1050 (patched)
> > 
> >
> > Why do you need to make the DB and Table name lower case?
> 
> Laszlo Pinter wrote:
> Partition properties like table and db name must be stored in lower case. 
> This is the same in HiveMestarore as well. 
> Other properties are case sensitive.

Ah, I see, thanks for the explanation.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70963/#review216227
---


On July 1, 2019, 9:20 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70963/
> ---
> 
> (Updated July 1, 2019, 9:20 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21874: Implement add partitions related methods on temporary table
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  957ebb12725e9deac7e7644709521a998df4dbb4 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  a15f5ea0453c7459217d229fa373cc1fec2f4d7a 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  25643495b53e1ede473c48a90b208b43070ee6aa 
> 
> 
> Diff: https://reviews.apache.org/r/70963/diff/2/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientAddPartitionsTempTable, 
> TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70920: HIVE-21868: Vectorize CAST...FORMAT

2019-07-03 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70920/#review216334
---



Thanks a lot Karen for the patch!
I have some questions, but otherwise the change looks good to me.


common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
Line 223 (original), 224 (patched)


Why did you change the type of this variable to ArrayList from List?



ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
Lines 59 (patched)


Do the CastDateToString, CastDateToChar and CastDateToVarchar udfs use this 
method, or is this just a typo and the CastDateToStringWithFormat, ... udfs use 
this?



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
Line 200 (original), 202 (patched)


Is the formattedOutput variable never going to be null after this change? 
If there is a scenario where it can be null, it will cause problems when trying 
to cast it.



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java
Line 217 (original), 220 (patched)


The same question about being null (previous comment) applies to the t and 
d variable as well.


- Marta Kuczora


On June 26, 2019, 8:44 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70920/
> ---
> 
> (Updated June 26, 2019, 8:44 a.m.)
> 
> 
> Review request for hive and Marta Kuczora.
> 
> 
> Bugs: HIVE-21868
> https://issues.apache.org/jira/browse/HIVE-21868
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Vectorize UDFs for CAST ( AS STRING/CHAR/VARCHAR FORMAT 
> ) and CAST ( AS TIMESTAMP/DATE FORMAT ).
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  4e024a357b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 
> fa9d1e9783 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToString.java
>  dfa9f8a00d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToStringWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastDateToVarCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDate.java
>  a6dff12e1a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToDateWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestamp.java
>  b48b0136eb 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastStringToTimestampWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToCharWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToString.java
>  adc3a9d7b9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToStringWithFormat.java
>  PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/CastTimestampToVarCharWithFormat.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java 
> 16742eee9b 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorMathFunctions.java
>  663237739e 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCasts.java
>  58fd7b030e 
>   
> ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorTypeCastsWithFormat.java
>  PRE-CREATION 
>   ql/src/test/queries/clientnegative/udf_cast_format_bad_pattern.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 
> 269edf6da6 
>   ql/src/test/results/clientnegative/udf_cast_format_bad_pattern.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 
> 4a502b9700 
> 
> 
> Diff: https://reviews.apache.org/r/70920/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table

2019-06-28 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70963/#review216227
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Line 1046 (original), 1049-1050 (patched)


Why do you need to make the DB and Table name lower case?



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 1100 (patched)


Why is it needed to get the newly added partition from the "parts" list as 
the addPartition method returns the newly added Partition?


- Marta Kuczora


On June 27, 2019, 9:07 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70963/
> ---
> 
> (Updated June 27, 2019, 9:07 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21874: Implement add partitions related methods on temporary table
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  957ebb12725e9deac7e7644709521a998df4dbb4 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  a15f5ea0453c7459217d229fa373cc1fec2f4d7a 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  25643495b53e1ede473c48a90b208b43070ee6aa 
> 
> 
> Diff: https://reviews.apache.org/r/70963/diff/1/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientAddPartitionsTempTable, 
> TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70963: HIVE-21874: Implement add partitions related methods on temporary table

2019-06-28 Thread Marta Kuczora via Review Board


> On June 28, 2019, 2:42 p.m., Marta Kuczora wrote:
> >

Thanks a lot for the patch!


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70963/#review216227
---


On June 27, 2019, 9:07 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70963/
> ---
> 
> (Updated June 27, 2019, 9:07 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21874: Implement add partitions related methods on temporary table
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  957ebb12725e9deac7e7644709521a998df4dbb4 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.java
>  PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAddPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  a15f5ea0453c7459217d229fa373cc1fec2f4d7a 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  25643495b53e1ede473c48a90b208b43070ee6aa 
> 
> 
> Diff: https://reviews.apache.org/r/70963/diff/1/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientAddPartitionsTempTable, 
> TestSessionHiveMetastoreClientAddPartitionsFromSpecTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70934: HIVE-18735: Create table like loses transactional attribute.

2019-06-26 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70934/#review216150
---


Ship it!




Thanks a lot for the patch. It looks good to me.

- Marta Kuczora


On June 25, 2019, 12:47 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70934/
> ---
> 
> (Updated June 25, 2019, 12:47 p.m.)
> 
> 
> Review request for hive, Eugene Koifman, Marta Kuczora, Peter Vary, and Adam 
> Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-18735: Create table like loses transactional attribute.
> 
> 
> Diffs
> -
> 
>   hbase-handler/src/test/results/positive/hbase_queries.q.out 
> 0c21d6d74882788d5748639ea2675579893791af 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
> d395db1b59d021789b1bb47c7f09ff337cba2dd0 
>   ql/src/test/results/clientpositive/alter_rename_table.q.out 
> dd656954a1877f7f808de81f6952d7cf8ebfda2f 
>   ql/src/test/results/clientpositive/alter_table_stats_status.q.out 
> efa2834e0d6dbd77181473c214b77d09fcc1fe69 
>   ql/src/test/results/clientpositive/autoColumnStats_1.q.out 
> 1f594ddb6816805d22a1152c261dda75490cd5d0 
>   ql/src/test/results/clientpositive/autoColumnStats_2.q.out 
> 121a10384bca03942c297dd0488aceaf0d3bed68 
>   ql/src/test/results/clientpositive/autoColumnStats_3.q.out 
> 777d165dc26fb11a6fd863fe1f375c6ae3d55b2a 
>   ql/src/test/results/clientpositive/autoColumnStats_8.q.out 
> 0e1868bd52d717a6103f1456a1d4e525e85d8622 
>   ql/src/test/results/clientpositive/create_alter_list_bucketing_table1.q.out 
> 593ae8389971449ad0f8704d911f6f7c6bcc 
>   ql/src/test/results/clientpositive/create_like.q.out 
> f4a5ed55a568b0160a6c87cb2fe8c7cd9b20c7c8 
>   ql/src/test/results/clientpositive/create_like2.q.out 
> 7152f52fcf82d5052a67be6e27bda532f2b521bd 
>   ql/src/test/results/clientpositive/create_like_tbl_props.q.out 
> 4d11fc3c9e39c18dd18fdb585ad1831a0a068768 
>   ql/src/test/results/clientpositive/create_table_like_stats.q.out 
> 4aa1b4f167a99ffc97d97bb62e0f5313fd83314e 
>   ql/src/test/results/clientpositive/describe_table.q.out 
> 8c7a16c4b65d3f3951e6c230c42325056a7eab0b 
>   ql/src/test/results/clientpositive/erasurecoding/erasure_simple.q.out 
> 3ceb3d03c2614f3256a822c9f105ed6e9f2bada8 
>   ql/src/test/results/clientpositive/explain_ddl.q.out 
> c53ffae8003bdcc320d4910f021c821c0777bdeb 
>   ql/src/test/results/clientpositive/llap/autoColumnStats_1.q.out 
> 7272a9c925a4115ee3f1d3a4e6576057d75ac994 
>   ql/src/test/results/clientpositive/llap/autoColumnStats_2.q.out 
> 1a4b164b0925860543dd74215e0820fe84c5f3f1 
>   
> ql/src/test/results/clientpositive/llap/insert_values_orig_table_use_metadata.q.out
>  6c892cc5b87960b086d90c43516526056bdf221f 
>   ql/src/test/results/clientpositive/llap/stats_noscan_1.q.out 
> af55d23484ddb74a2c5b7f06c4e91a6063ae11dc 
>   ql/src/test/results/clientpositive/llap/whroot_external1.q.out 
> cac158c92669f1ad532ada3d6620adebeb909eae 
>   ql/src/test/results/clientpositive/load_dyn_part8.q.out 
> 7b1b5c1f862a581af3b2c4cabe21b6d186601652 
>   ql/src/test/results/clientpositive/merge3.q.out 
> 4e670558808894b0dd5f7b8815987e03de1dc6d3 
>   ql/src/test/results/clientpositive/mm_default.q.out 
> 70519b7da8346ddc2de74e46010183d2c9ab11ee 
>   ql/src/test/results/clientpositive/partition_discovery.q.out 
> cddb6e56ba8db9162c491125e3efd3acd2ed29b2 
>   ql/src/test/results/clientpositive/spark/load_dyn_part8.q.out 
> aebf4382cd78b02d9b7bab7285254431f04e29c0 
>   ql/src/test/results/clientpositive/spark/stats12.q.out 
> 9db43ef112d0898c08429c839e196b3e48067383 
>   ql/src/test/results/clientpositive/spark/stats13.q.out 
> 4922d717a0074146d6da91aae859f09aa5a2b623 
>   ql/src/test/results/clientpositive/spark/stats14.q.out 
> eb8a995e298d77098c5d7a01086943dc08307c19 
>   ql/src/test/results/clientpositive/spark/stats15.q.out 
> 3874e6de249428404946f461eec3575d6dcb50a5 
>   ql/src/test/results/clientpositive/spark/stats2.q.out 
> 30339caeb2cff5cc96101d8cbf5f3ed8b5b01667 
>   ql/src/test/results/clientpositive/spark/stats6.q.out 
> 77be16cb13558e6b2af2e772ff0505ea4dba8125 
>   ql/src/test/results/clientpositive/spark/stats7.q.out 
> fe942ad94b35288b0cc74d1434429378835ce9c2 
>   ql/src/test/results/clientpositive/spark/stats8.q.out 
> edfbd57f72b55d040233328330562703286627d3 
>   ql/src/test/results/clientpositive/spark/stats9.q.out 
> ed226b68d21733c7d371472d99f714b759e380e2 
>   ql/src/test/results/clientpositive/spark/stats_noscan_1.q.out 
> 32001f6a6812ea7768933a0b479d2107265ef799 
>   ql/src/test/results/clientpositive/stats12.q.out 
> 50cdace04384f9408298de50a426a22149605563 
>   ql/src/test/results/clientpositive/stats13.q.out 
> c6364ebf1d8cabe4818c9788d2819e8a4d7a0726 
>   

Re: Review Request 70867: HIVE-21814: Implement list partitions related methods on temporary tables

2019-06-20 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70867/#review215995
---


Ship it!




Ship It!

- Marta Kuczora


On June 20, 2019, 9:53 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70867/
> ---
> 
> (Updated June 20, 2019, 9:53 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21814: Implement list partitions related methods on temporary tables
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  b71ef5a725d610cda402717f501f6c6a0f653216 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientListPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/ConditionalIgnoreOnSessionHiveMetastoreClient.java
>  99039b08d014cddc9de12e70801267eba7331266 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java
>  34ceb34de646cc2e501564e9b3a0cb8cc8a034e1 
> 
> 
> Diff: https://reviews.apache.org/r/70867/diff/2/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientListPartitionsTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70841: HIVE-21576: Introduce CAST...FORMAT and limited list of SQL:2016 datetime formats

2019-06-19 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70841/#review215966
---


Ship it!




Ship It!

- Marta Kuczora


On June 14, 2019, 8:30 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70841/
> ---
> 
> (Updated June 14, 2019, 8:30 a.m.)
> 
> 
> Review request for hive, Gabor Kaszab and Marta Kuczora.
> 
> 
> Bugs: HIVE-21576
> https://issues.apache.org/jira/browse/HIVE-21576
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Timestamp and date handling and formatting are currently implemented in Hive 
> using (sometimes very specific) Java SimpleDateFormat patterns with both 
> SimpleDateFormat and java.time.DateTimeFormatter, however, these patterns are 
> not what most standard SQL systems use. For example see Vertica, Netezza, 
> Oracle, and PostgreSQL.
> 
> **Cast...Format**
> 
> SQL:2016 introduced the FORMAT clause for CAST which is the standard way to 
> do string <-> datetime conversions
> 
> For example:
> 
> CAST( AS  [FORMAT ])
> CAST( AS  [FORMAT ])
> cast(dt as string format 'DD-MM-')
> cast('01-05-2017' as date format 'DD-MM-')
> Stuff like this wouldn't need to happen.
> 
> **New SQL:2016 Patterns**
> 
> Some conflicting examples:
> 
> SimpleDateTime: 'MMM dd,  HH:mm:ss'
> SQL:2016: 'mon dd,  hh24:mi:ss'
> 
> SimpleDateTime: '-MM-dd HH:mm:ss'
> SQL:2016: '-mm-dd hh24:mi:ss'
> 
> For the full list of patterns, see subsection "Proposal for Impala’s datetime 
> patterns" in this doc: 
> https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit
> 
> **Continued usage of SimpleDateFormat patterns**
> 
> [Update] This feature will NOT be behind a flag in order to keep things 
> simple for users. Existing Hive functions that accept SimpleDateFormat 
> patterns as input will continue to do so. Please let me know if you disagree 
> with this decision. These are the functions (afaik) affected:
> 
> from_unixtime(bigint unixtime[, string format])
> unix_timestamp(string date, string pattern)
> to_unix_timestamp(date[, pattern])
> add_months(string start_date, int num_months, output_date_format)
> date_format(date/timestamp/string ts, string fmt)
> This description is a heavily edited description of IMPALA-4018.
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/package-info.java
>  PRE-CREATION 
>   
> common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java
>  PRE-CREATION 
>   
> common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d08b05fb68 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 58fe0cd32e 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java 
> PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFCastFormat.java
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/show_functions.q.out 374e9c4fce 
> 
> 
> Diff: https://reviews.apache.org/r/70841/diff/8/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 70841: HIVE-21576: Introduce CAST...FORMAT and limited list of SQL:2016 datetime formats

2019-06-19 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70841/#review215964
---



Thanks a lot for the patch.
I have some minor hints/questions, but otherwise the change looks good to me.


common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
Lines 631-651 (patched)


Does this method still needed?



common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
Lines 756 (patched)


Does this issue still exist?



common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java
Lines 235 (patched)


You could add a message to the assertEquals to make it easier to identify 
which test case is failing.


- Marta Kuczora


On June 14, 2019, 8:30 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70841/
> ---
> 
> (Updated June 14, 2019, 8:30 a.m.)
> 
> 
> Review request for hive, Gabor Kaszab and Marta Kuczora.
> 
> 
> Bugs: HIVE-21576
> https://issues.apache.org/jira/browse/HIVE-21576
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Timestamp and date handling and formatting are currently implemented in Hive 
> using (sometimes very specific) Java SimpleDateFormat patterns with both 
> SimpleDateFormat and java.time.DateTimeFormatter, however, these patterns are 
> not what most standard SQL systems use. For example see Vertica, Netezza, 
> Oracle, and PostgreSQL.
> 
> **Cast...Format**
> 
> SQL:2016 introduced the FORMAT clause for CAST which is the standard way to 
> do string <-> datetime conversions
> 
> For example:
> 
> CAST( AS  [FORMAT ])
> CAST( AS  [FORMAT ])
> cast(dt as string format 'DD-MM-')
> cast('01-05-2017' as date format 'DD-MM-')
> Stuff like this wouldn't need to happen.
> 
> **New SQL:2016 Patterns**
> 
> Some conflicting examples:
> 
> SimpleDateTime: 'MMM dd,  HH:mm:ss'
> SQL:2016: 'mon dd,  hh24:mi:ss'
> 
> SimpleDateTime: '-MM-dd HH:mm:ss'
> SQL:2016: '-mm-dd hh24:mi:ss'
> 
> For the full list of patterns, see subsection "Proposal for Impala’s datetime 
> patterns" in this doc: 
> https://docs.google.com/document/d/1V7k6-lrPGW7_uhqM-FhKl3QsxwCRy69v2KIxPsGjc1k/edit
> 
> **Continued usage of SimpleDateFormat patterns**
> 
> [Update] This feature will NOT be behind a flag in order to keep things 
> simple for users. Existing Hive functions that accept SimpleDateFormat 
> patterns as input will continue to do so. Please let me know if you disagree 
> with this decision. These are the functions (afaik) affected:
> 
> from_unixtime(bigint unixtime[, string format])
> unix_timestamp(string date, string pattern)
> to_unix_timestamp(date[, pattern])
> add_months(string start_date, int num_months, output_date_format)
> date_format(date/timestamp/string ts, string fmt)
> This description is a heavily edited description of IMPALA-4018.
> 
> 
> Diffs
> -
> 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/HiveSqlDateTimeFormatter.java
>  PRE-CREATION 
>   
> common/src/java/org/apache/hadoop/hive/common/format/datetime/package-info.java
>  PRE-CREATION 
>   
> common/src/test/org/apache/hadoop/hive/common/format/datetime/TestHiveSqlDateTimeFormatter.java
>  PRE-CREATION 
>   
> common/src/test/org/apache/hadoop/hive/common/format/datetime/package-info.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java d08b05fb68 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/IdentifiersParser.g 58fe0cd32e 
>   ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFCastFormat.java 
> PRE-CREATION 
>   
> ql/src/test/org/apache/hadoop/hive/ql/udf/generic/TestGenericUDFCastFormat.java
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/cast_datetime_with_sql_2016_format.q 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/cast_datetime_with_sql_2016_format.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/show_functions.q.out 374e9c4fce 
> 
> 
> Diff: https://reviews.apache.org/r/70841/diff/8/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 70867: HIVE-21814: Implement list partitions related methods on temporary tables

2019-06-19 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70867/#review215963
---


Ship it!




Thanks for the patch.
I have two minor hints, otherwise the change looks good to me. Just please 
consider them before the new batch of partitioned temp table changes.

- Marta Kuczora


On June 17, 2019, 3:36 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70867/
> ---
> 
> (Updated June 17, 2019, 3:36 p.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21814: Implement list partitions related methods on temporary tables
> 
> This change is the next step to support partitions on temporary tables. 
> HIVE-18739 and HIVE-20661 added partial support for partition columns on 
> temporary tables, but it was not complete and it was available only for 
> internal usage. This change addresses the missing functionality related to 
> listing partitions from temporary tables, although is still remains unexposed 
> until all the partition related functionalities (get, list, add, alter etc.) 
> are implemented.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  b71ef5a725d610cda402717f501f6c6a0f653216 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientListPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/ConditionalIgnoreOnSessionHiveMetastoreClient.java
>  99039b08d014cddc9de12e70801267eba7331266 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java
>  34ceb34de646cc2e501564e9b3a0cb8cc8a034e1 
> 
> 
> Diff: https://reviews.apache.org/r/70867/diff/1/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientListPartitionsTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70867: HIVE-21814: Implement list partitions related methods on temporary tables

2019-06-19 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70867/#review215960
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 1250-1255 (patched)


This code piece is used in multiple methods. Maybe it would make sense to 
extract it to a separate method.
But since you have some more patches to go around the temp table partition 
handling, it is ok if you consider fixing this in a next patch.



ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientListPartitionsTempTable.java
Lines 133 (patched)


Would it make sense to add test with low max parts number to see if the 
method returns the correct number of partitions?


- Marta Kuczora


On June 17, 2019, 3:36 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70867/
> ---
> 
> (Updated June 17, 2019, 3:36 p.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21814: Implement list partitions related methods on temporary tables
> 
> This change is the next step to support partitions on temporary tables. 
> HIVE-18739 and HIVE-20661 added partial support for partition columns on 
> temporary tables, but it was not complete and it was available only for 
> internal usage. This change addresses the missing functionality related to 
> listing partitions from temporary tables, although is still remains unexposed 
> until all the partition related functionalities (get, list, add, alter etc.) 
> are implemented.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  b71ef5a725d610cda402717f501f6c6a0f653216 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientListPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/ConditionalIgnoreOnSessionHiveMetastoreClient.java
>  99039b08d014cddc9de12e70801267eba7331266 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java
>  34ceb34de646cc2e501564e9b3a0cb8cc8a034e1 
> 
> 
> Diff: https://reviews.apache.org/r/70867/diff/1/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientListPartitionsTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70850: HIVE-21812: Implement get partition related methods on temporary tables

2019-06-13 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70850/#review215869
---


Ship it!




Ship It!

- Marta Kuczora


On June 13, 2019, 8:01 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70850/
> ---
> 
> (Updated June 13, 2019, 8:01 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21812: Implement get partition related methods on temporary tables
> 
> HIVE-18739 and HIVE-20661 added partial support for partition columns on 
> temporary tables, but it was not complete and it was available only for 
> internal usage. This change addresses the missing functionality related to 
> getting partitions from temporary tables, although is still remains unexposed 
> until all the partition related functionalities (get, list, add, alter etc.) 
> are implemented.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  410868cacfe53e8898d4e08572d7a01e05b7eb49 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientGetPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/ConditionalIgnoreOnSessionHiveMetastoreClient.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/CustomIgnoreRule.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/MetaStoreClientTest.java
>  dc48fa8308a07f68c5e21a2d95f40127d3ff41df 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java
>  4d7f7c12203a9a90568f4aae644ff5cabaafa18c 
> 
> 
> Diff: https://reviews.apache.org/r/70850/diff/1/
> 
> 
> Testing
> ---
> 
> Unit testing is done via TestSessionHiveMetastoreClientGetPartitionsTempTable.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70474: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-05-09 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70474/
---

(Updated May 9, 2019, 7:51 a.m.)


Review request for hive and Peter Vary.


Changes
---

Fixed the whitespace issue.


Bugs: HIVE-21407
https://issues.apache.org/jira/browse/HIVE-21407


Repository: hive-git


Description
---

The idea behind the patch is that for CHAR columns extend the predicate which 
is pushed to Parquet with an “or” clause which contains the same expression 
with a padded and a stripped value.
Example:
column c is a CHAR(10) type and the search expression is c='apple'
The predicate which is pushed to Parquet looked like c='apple ' before the 
patch and it would look like (c='apple ' or c='apple') after the patch.
Since the value 'apple' is stored in Parquet without padding, the predicate 
before the patch didn’t return any rows. With the patch it will return the 
correct row. 
Since on predicate level, there is no distinction between CHAR or VARCHAR, the 
predicates for VARCHARs will be changed as well, so the result set returned 
from Parquet will be wider than before.
Example:
A table contains a c VARCHAR(10) column and there is a row where c='apple' and 
there is an other row where c='apple '. If the search expression is c='apple ', 
both rows will be returned from Parquet after the patch. But since Hive is 
doing an additional filtering on the rows returned from Parquet, it won’t be a 
problem, the result set returned by Hive will contain only the row with the 
value 'apple '.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
be4c0d5 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
 0210a0a 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
 d464046 
  ql/src/test/queries/clientpositive/parquet_ppd_char.q 4230d8c 
  ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/70474/diff/2/

Changes: https://reviews.apache.org/r/70474/diff/1-2/


Testing
---

Added new q test for testing the PPD for char and varchar types. Also extended 
the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method.

The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both 
testing the same thing, the behavior of the 
ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense 
to have tests for the same use case in different test classes, so moved the 
test cases from the TestParquetRecordReaderWrapper to 
TestParquetFilterPredicate.


Thanks,

Marta Kuczora



Review Request 70474: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-04-14 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70474/
---

Review request for hive and Peter Vary.


Bugs: HIVE-21407
https://issues.apache.org/jira/browse/HIVE-21407


Repository: hive-git


Description
---

The idea behind the patch is that for CHAR columns extend the predicate which 
is pushed to Parquet with an “or” clause which contains the same expression 
with a padded and a stripped value.
Example:
column c is a CHAR(10) type and the search expression is c='apple'
The predicate which is pushed to Parquet looked like c='apple ' before the 
patch and it would look like (c='apple ' or c='apple') after the patch.
Since the value 'apple' is stored in Parquet without padding, the predicate 
before the patch didn’t return any rows. With the patch it will return the 
correct row. 
Since on predicate level, there is no distinction between CHAR or VARCHAR, the 
predicates for VARCHARs will be changed as well, so the result set returned 
from Parquet will be wider than before.
Example:
A table contains a c VARCHAR(10) column and there is a row where c='apple' and 
there is an other row where c='apple '. If the search expression is c='apple ', 
both rows will be returned from Parquet after the patch. But since Hive is 
doing an additional filtering on the rows returned from Parquet, it won’t be a 
problem, the result set returned by Hive will contain only the row with the 
value 'apple '.


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
be4c0d5 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
 0210a0a 
  
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
 d464046 
  ql/src/test/queries/clientpositive/parquet_ppd_char.q 4230d8c 
  ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
  ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 


Diff: https://reviews.apache.org/r/70474/diff/1/


Testing
---

Added new q test for testing the PPD for char and varchar types. Also extended 
the unit tests for the ParquetFilterPredicateConverter.toFilterPredicate method.

The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are both 
testing the same thing, the behavior of the 
ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make sense 
to have tests for the same use case in different test classes, so moved the 
test cases from the TestParquetRecordReaderWrapper to 
TestParquetFilterPredicate.


Thanks,

Marta Kuczora



Re: Review Request 69432: HIVE-20964 Create a test that checks the level of the parallel compilation

2018-11-23 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69432/#review210821
---


Ship it!




Ship It!

- Marta Kuczora


On Nov. 22, 2018, 3:19 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69432/
> ---
> 
> (Updated Nov. 22, 2018, 3:19 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Marta Kuczora, and Adam Szita.
> 
> 
> Bugs: HIVE-20964
> https://issues.apache.org/jira/browse/HIVE-20964
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * Created 2 query types in the TestCompileLock mock driver. The original 
> SHORT_QUERY is finishing in 0.5s as before, but the new LONG_QUERY will 
> finish only after 5s.
> * With using the new 5s query I have created a new test where the compile 
> quota is 4 and the parallel request number is 10. So the test expects that 6 
> query will fail with timeout.
> * Added a new verifyThatTimedOutCompileOpsCount method to validate the number 
> of the timed out queries.
> * The other changes are just pushing down the query string so the 
> compileAndRespond method can decide which query to run.
> 
> 
> Diffs
> -
> 
>   ql/src/test/org/apache/hadoop/hive/ql/TestCompileLock.java 8dc05ff480 
> 
> 
> Diff: https://reviews.apache.org/r/69432/diff/1/
> 
> 
> Testing
> ---
> 
> Run the new test, and all the old tests in TestCompileLock
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-06-25 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

(Updated June 25, 2018, 12:01 p.m.)


Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.


Changes
---

Rebased the patch


Bugs: HIVE-19046
https://issues.apache.org/jira/browse/HIVE-19046


Repository: hive-git


Description
---

The biggest part of these methods use the same code. Refactored these code 
parts to common methods.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 e9d7e7c 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 bf559b4 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 4f11a55 


Diff: https://reviews.apache.org/r/7/diff/7/

Changes: https://reviews.apache.org/r/7/diff/6-7/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-06-11 Thread Marta Kuczora via Review Board


> On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote:
> > Looks good, a few nits below.

Thanks for looking into this review. I fixed/answered the issues. 
Please let me know if the patch looks ok, then I will upload it to the Jira to 
run the pre-commit tests.


> On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3227 (original), 3322 (patched)
> > 
> >
> > Is it possible to do it once in constructor instead? I suspect that 
> > this is a no-trivial operation.

To be honest, I don't see clearly if it would be worth to move this part to the 
constructor. I am not sure what side effect it would have. In HIVE-15137, where 
this part was added to the code, the problem was that if two HiveCli were 
started with different users and both users added a partition, the owner of the 
partition directories was always the first user. Would moving this code to the 
constructor not affect this use-case? Would it work correctly? I think, this 
should be investigated. I am just not sure of the benefit of moving this code. 
The current user is fetched only once when creating a batch of partitions, and 
I don't see this as a very expensive call. If we want to move this, I would 
suggest to investigate and do it in a seperate Jira. What do you think?


> On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3253 (patched)
> > 
> >
> > Can you clarify that "clean up" means removing associated directory.

I fixed it accordingly.


> On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3268 (patched)
> > 
> >
> > Please add a Javadoc here explaining what is checked by validation. 
> > Also it isn't obvious that validation has side effects (updating partsToAdd)

Added Javadoc


> On June 6, 2018, 10:34 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3247 (original), 3343 (patched)
> > 
> >
> > addedPartitions is not defined here so it isn't obvious that it should 
> > be thread-safe. Is it possible to allocate and return addedPartitions here 
> > so that you guarantee using of thread-safe map? 
> > 
> > Another way you can do it is to collect added partitions in thread-safe 
> > local map and then copy it to the resulting map once you are done with 
> > concurrent part.

The createPartitionFolders method is called with a ConcurrentHashMap, I thought 
it would do the trick. 
Returning with the addedPartitions map would be complicated as we have to 
return the newParts list as well. So I fixed this issue by introducing a local 
map and then copy the result to the addedPartitions map.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review203099
---


On June 11, 2018, 11:27 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated June 11, 2018, 11:27 a.m.)
> 
> 
> Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.
> 
> 
> Bugs: HIVE-19046
> https://issues.apache.org/jira/browse/HIVE-19046
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The biggest part of these methods use the same code. Refactored these code 
> parts to common methods.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  b9f5fb8 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  bf559b4 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  4f11a55 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/6/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-06-11 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

(Updated June 11, 2018, 11:27 a.m.)


Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.


Changes
---

Address review findings.


Bugs: HIVE-19046
https://issues.apache.org/jira/browse/HIVE-19046


Repository: hive-git


Description
---

The biggest part of these methods use the same code. Refactored these code 
parts to common methods.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 b9f5fb8 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 bf559b4 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 4f11a55 


Diff: https://reviews.apache.org/r/7/diff/6/

Changes: https://reviews.apache.org/r/7/diff/5-6/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-06-01 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

(Updated June 1, 2018, 12:31 p.m.)


Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.


Bugs: HIVE-19046
https://issues.apache.org/jira/browse/HIVE-19046


Repository: hive-git


Description
---

The biggest part of these methods use the same code. Refactored these code 
parts to common methods.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 d8b8414 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 88064d9 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 debcd0e 


Diff: https://reviews.apache.org/r/7/diff/5/

Changes: https://reviews.apache.org/r/7/diff/4-5/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-05-29 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

(Updated May 29, 2018, 4:24 p.m.)


Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.


Changes
---

Rebased the patch.


Bugs: HIVE-19046
https://issues.apache.org/jira/browse/HIVE-19046


Repository: hive-git


Description
---

The biggest part of these methods use the same code. Refactored these code 
parts to common methods.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 c1d25db 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 88064d9 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 debcd0e 


Diff: https://reviews.apache.org/r/7/diff/4/

Changes: https://reviews.apache.org/r/7/diff/3-4/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-05-23 Thread Marta Kuczora via Review Board


> On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3323 (original), 3396 (patched)
> > 
> >
> > Should we set interrupted flag on the thread if we get 
> > InterruptedException?
> 
> Marta Kuczora wrote:
> Could you please give me some details about why you think it is needed? I 
> don't know actually if it is needed or not. My idea here was to go through on 
> all FutureTasks and if one of them didn't finish successfully (there was 
> either an error or the task was interrupted), throw an exception, cause it 
> would mean that not all partition folders were created successfully. For this 
> I don't think that I should set anything on the thread, but I might miss 
> something. So could you please explain me your thoughts on this?

I just uploaded a new patch with this change.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/#review201465
---


On May 23, 2018, 4:24 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/7/
> ---
> 
> (Updated May 23, 2018, 4:24 p.m.)
> 
> 
> Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.
> 
> 
> Bugs: HIVE-19046
> https://issues.apache.org/jira/browse/HIVE-19046
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The biggest part of these methods use the same code. Refactored these code 
> parts to common methods.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  92d2e3f 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  88064d9 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  debcd0e 
> 
> 
> Diff: https://reviews.apache.org/r/7/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-05-23 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

(Updated May 23, 2018, 4:24 p.m.)


Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.


Changes
---

Address review finding.


Bugs: HIVE-19046
https://issues.apache.org/jira/browse/HIVE-19046


Repository: hive-git


Description
---

The biggest part of these methods use the same code. Refactored these code 
parts to common methods.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 92d2e3f 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 88064d9 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 debcd0e 


Diff: https://reviews.apache.org/r/7/diff/3/

Changes: https://reviews.apache.org/r/7/diff/2-3/


Testing
---


Thanks,

Marta Kuczora



Review Request 66935: HIVE-18977: Listing partitions returns different results with JDO and direct SQL

2018-05-03 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66935/
---

Review request for hive, Alan Gates and Peter Vary.


Bugs: HIVE-18977
https://issues.apache.org/jira/browse/HIVE-18977


Repository: hive-git


Description
---

Some of the test cases in TestListPartitions fail when directSQL is disabled.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 4601e09 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 6645e55 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/parser/ExpressionTree.java
 d608e50 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java
 a8b6e31 


Diff: https://reviews.apache.org/r/66935/diff/1/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-04-26 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

(Updated April 26, 2018, 11:18 a.m.)


Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.


Changes
---

Fixed review findings


Bugs: HIVE-19046
https://issues.apache.org/jira/browse/HIVE-19046


Repository: hive-git


Description
---

The biggest part of these methods use the same code. Refactored these code 
parts to common methods.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 397a081 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 88064d9 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 debcd0e 


Diff: https://reviews.apache.org/r/7/diff/2/

Changes: https://reviews.apache.org/r/7/diff/1-2/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-04-26 Thread Marta Kuczora via Review Board


> On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote:
> >

Thanks a lot Sasha for the review!


> On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3303 (patched)
> > 
> >
> > Since you are doing the refactoring, would you mind adding good Javadoc 
> > to this method?

You are right, I added Javadoc to the method.


> On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3308 (patched)
> > 
> >
> > Here we are trying to nuke a  bunch of values. If a single one fails, 
> > we do not attempt to delete others. Since you are just doing refactoring it 
> > is out of scope but I think the proper behavior is to continue nuking for 
> > others as well.

I agree that the others should be still cleaned up. But on the other hand we 
still should throw an exception because we should notify the user that 
something went wrong with the folder clean up, so the user could check and make 
sure that all folders are deleted properly. Would you mind creating a separate 
Jira about this issue?


> On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3340 (patched)
> > 
> >
> > Please add Javadoc for this method.
> > Also I don't see the result used anywhere - can it be changed to void?

Sure, I added Javadoc to the method.

The result is used in the add_partitions_core method:
   newParts.addAll(createPartitionFolders(partitionsToAdd, tbl, 
addedPartitions));
   
The add_partitions_core and add_partitions_pspec_core methods were a bit 
different here. The add_partitions_core method has a list (newParts) which was 
populated like this
for (Future partFuture : partFutures) {
Partition part = partFuture.get();
if (part != null) {
newParts.add(part);
}
}
And then this list was used when creating the notification events.
In the add_partitions_pspec_core method however there is no such list. It uses 
the PartitionSpecProxy instead.
I didn't want to change the behavior of the methods in the scope of this Jira.


> On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3293 (original), 3365 (patched)
> > 
> >
> > This is outside of your fix, but there is something strange going on. 
> > Looking at initializeAddedPartition() I see that it does:
> > 
> > if (MetastoreConf.getBoolVar(conf, ConfVars.STATS_AUTO_GATHER) &&
> >   !MetaStoreUtils.isView(tbl)) {
> > MetaStoreUtils.updatePartitionStatsFast(part, tbl, wh, madeDir, 
> > false, null, true);
> >   }
> >   
> > I am pretty sure that earlier there was a check for a setting from 
> > environment context that could disable stats gathering and now it is gone. 
> > This stats gathering is very expensive and Impala disables it using 
> > environmental context settings.

I am not sure what you mean. I've only seen this method like this so far.
As far as I see, in the MetaStoreUtils.updateBasicState the properties from 
EnvironmentContext are considered:

  public static void updateBasicState(EnvironmentContext environmentContext, 
Map
  params) {
if (params == null) {
  return;
}
if (environmentContext != null
&& environmentContext.isSetProperties()
&& StatsSetupConst.TASK.equals(environmentContext.getProperties().get(
StatsSetupConst.STATS_GENERATED))) {
  StatsSetupConst.setBasicStatsState(params, StatsSetupConst.TRUE);
} else {
  StatsSetupConst.setBasicStatsState(params, StatsSetupConst.FALSE);
}
  }
  
But I am not familiar with stats computing, so you might look for something 
else. Could you please open a separate Jira about this?


> On April 18, 2018, 9:52 p.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3323 (original), 3396 (patched)
> > 
> >
> > Should we set interrupted flag on the thread if we get 
> > InterruptedException?

Could you please give me some details about why you think it is needed? I don't 
know actually if it is needed or not. My idea here was to go through on all 
FutureTasks and if one of them didn't finish successfully (there was either an 
error or the task was 

Re: Review Request 66774: HIVE-19285: Add logs to the subclasses of MetaDataOperation

2018-04-26 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66774/
---

(Updated April 26, 2018, 8:16 a.m.)


Review request for hive and Peter Vary.


Changes
---

Fixed stylecheck issues.


Bugs: HIVE-19285
https://issues.apache.org/jira/browse/HIVE-19285


Repository: hive-git


Description
---

Subclasses of MetaDataOperation are not writing anything to the logs. It would 
be useful to have some INFO and DEBUG level logging in these classes.


Diffs (updated)
-

  
service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java
 7944467 
  
service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 
d67ea90 
  
service/src/java/org/apache/hive/service/cli/operation/GetCrossReferenceOperation.java
 99ccd4e 
  
service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java
 091bf50 
  
service/src/java/org/apache/hive/service/cli/operation/GetPrimaryKeysOperation.java
 e603fdd 
  
service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java 
de09ec9 
  
service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java
 59cfbb2 
  
service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 
c9233d0 
  
service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java
 ac078b4 
  service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java 
bf7c021 


Diff: https://reviews.apache.org/r/66774/diff/3/

Changes: https://reviews.apache.org/r/66774/diff/2-3/


Testing
---

Just adding some additional log messages. Tested locally by checking the log 
messages for different use cases


Thanks,

Marta Kuczora



Re: Review Request 66774: HIVE-19285: Add logs to the subclasses of MetaDataOperation

2018-04-24 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66774/
---

(Updated April 24, 2018, 6:35 p.m.)


Review request for hive and Peter Vary.


Changes
---

Fixed review findings.


Bugs: HIVE-19285
https://issues.apache.org/jira/browse/HIVE-19285


Repository: hive-git


Description
---

Subclasses of MetaDataOperation are not writing anything to the logs. It would 
be useful to have some INFO and DEBUG level logging in these classes.


Diffs (updated)
-

  
service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java
 7944467 
  
service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 
d67ea90 
  
service/src/java/org/apache/hive/service/cli/operation/GetCrossReferenceOperation.java
 99ccd4e 
  
service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java
 091bf50 
  
service/src/java/org/apache/hive/service/cli/operation/GetPrimaryKeysOperation.java
 e603fdd 
  
service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java 
de09ec9 
  
service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java
 59cfbb2 
  
service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 
c9233d0 
  
service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java
 ac078b4 
  service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java 
bf7c021 


Diff: https://reviews.apache.org/r/66774/diff/2/

Changes: https://reviews.apache.org/r/66774/diff/1-2/


Testing
---

Just adding some additional log messages. Tested locally by checking the log 
messages for different use cases


Thanks,

Marta Kuczora



Re: Review Request 66774: HIVE-19285: Add logs to the subclasses of MetaDataOperation

2018-04-24 Thread Marta Kuczora via Review Board


> On April 24, 2018, 11:55 a.m., Peter Vary wrote:
> > Thanks Marta for the patch! These messages will be usefull.
> > Two small things / questions below.
> > 
> > Peter

Thanks a lot Peter for the review!


> On April 24, 2018, 11:55 a.m., Peter Vary wrote:
> > service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java
> > Lines 66 (patched)
> > 
> >
> > Does this result a duplication in the log for the exception messages?

You are right, the exception gets logged anyway, so these lines are not 
necessary. I removed them.


> On April 24, 2018, 11:55 a.m., Peter Vary wrote:
> > service/src/java/org/apache/hive/service/cli/operation/GetPrimaryKeysOperation.java
> > Lines 113 (patched)
> > 
> >
> > nit: formatting?

Something went wrong here. Fixed and few other formatting issues.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66774/#review201826
---


On April 24, 2018, 10:34 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66774/
> ---
> 
> (Updated April 24, 2018, 10:34 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-19285
> https://issues.apache.org/jira/browse/HIVE-19285
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Subclasses of MetaDataOperation are not writing anything to the logs. It 
> would be useful to have some INFO and DEBUG level logging in these classes.
> 
> 
> Diffs
> -
> 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java
>  7944467 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java
>  d67ea90 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetCrossReferenceOperation.java
>  99ccd4e 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java
>  091bf50 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetPrimaryKeysOperation.java
>  e603fdd 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java
>  de09ec9 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java
>  59cfbb2 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java
>  c9233d0 
>   
> service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java
>  ac078b4 
>   
> service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java 
> bf7c021 
> 
> 
> Diff: https://reviews.apache.org/r/66774/diff/1/
> 
> 
> Testing
> ---
> 
> Just adding some additional log messages. Tested locally by checking the log 
> messages for different use cases
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Review Request 66774: HIVE-19285: Add logs to the subclasses of MetaDataOperation

2018-04-24 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66774/
---

Review request for hive and Peter Vary.


Bugs: HIVE-19285
https://issues.apache.org/jira/browse/HIVE-19285


Repository: hive-git


Description
---

Subclasses of MetaDataOperation are not writing anything to the logs. It would 
be useful to have some INFO and DEBUG level logging in these classes.


Diffs
-

  
service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java
 7944467 
  
service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 
d67ea90 
  
service/src/java/org/apache/hive/service/cli/operation/GetCrossReferenceOperation.java
 99ccd4e 
  
service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java
 091bf50 
  
service/src/java/org/apache/hive/service/cli/operation/GetPrimaryKeysOperation.java
 e603fdd 
  
service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java 
de09ec9 
  
service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java
 59cfbb2 
  
service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 
c9233d0 
  
service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java
 ac078b4 
  service/src/java/org/apache/hive/service/cli/operation/MetadataOperation.java 
bf7c021 


Diff: https://reviews.apache.org/r/66774/diff/1/


Testing
---

Just adding some additional log messages. Tested locally by checking the log 
messages for different use cases


Thanks,

Marta Kuczora



Review Request 66667: HIVE-19046: Refactor the common parts of the HiveMetastore add_partition_core and add_partitions_pspec_core methods

2018-04-17 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7/
---

Review request for hive, Peter Vary, Sahil Takiar, and Adam Szita.


Bugs: HIVE-19046
https://issues.apache.org/jira/browse/HIVE-19046


Repository: hive-git


Description
---

The biggest part of these methods use the same code. Refactored these code 
parts to common methods.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 ae9ec5c 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 f8497c7 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 fc0c60f 


Diff: https://reviews.apache.org/r/7/diff/1/


Testing
---


Thanks,

Marta Kuczora



Review Request 66604: HIVE-19158: Fix NPE in the HiveMetastore add partition tests

2018-04-13 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66604/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-19158
https://issues.apache.org/jira/browse/HIVE-19158


Repository: hive-git


Description
---

The TestAddPartitions and TestAddPartitionsFromPartSpec tests revealed that NPE 
is thrown in some cases. These NPEs could be prevented with a simple null check 
and a MetaException with a proper error message should be thrown instead.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 565549a 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 9a43b2c 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/partition/spec/CompositePartitionSpecProxy.java
 92813b9 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/partition/spec/PartitionListComposingSpecProxy.java
 6bd29d0 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/partition/spec/PartitionSpecProxy.java
 ff2dea1 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/partition/spec/PartitionSpecWithSharedSDProxy.java
 61e00ea 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 f8497c7 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 fc0c60f 


Diff: https://reviews.apache.org/r/66604/diff/1/


Testing
---


Thanks,

Marta Kuczora



Review Request 66477: HIVE-19119: Fix the TestAppendPartitions tests which are failing in the pre-commit runs

2018-04-05 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66477/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-19119
https://issues.apache.org/jira/browse/HIVE-19119


Repository: hive-git


Description
---

The test got fixed in HIVE-19060, but the fix got overwritten by an other 
commit. Fixing the testAppendPartitionNullPartValues and 
testAppendPartitionEmptyPartValues test cases again.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 8539fea 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
 75b26f2 


Diff: https://reviews.apache.org/r/66477/diff/1/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-04-05 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/
---

(Updated April 5, 2018, 3:40 p.m.)


Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.


Changes
---

Rebasing the patch


Bugs: HIVE-18696
https://issues.apache.org/jira/browse/HIVE-18696


Repository: hive-git


Description
---

The idea behind the patch is

1) Separate the partition validation from starting the tasks which create the 
partition folders. 
Instead of doing the checks on the partitions and submit the tasks in one loop, 
separated the validation into a different loop. So first iterate through the 
partitions, validate the table/db names, and check for duplicates. Then if all 
partitions were correct, in the second loop submit the tasks to create the 
partition folders. This way if one of the partitions is incorrect, the 
exception will be thrown in the first loop, before the tasks are submitted. So 
we can be sure that no partition folder will be created if the list contains an 
invalid partition.

2) Handle the exceptions which occur during the execution of the tasks 
differently.
Previously if an exception occured in one task, the remaining tasks were 
canceled, and the newly created partition folders were cleaned up in the 
finally part. The problem was that it could happen that some tasks were still 
not finished with the folder creation when cleaning up the others, so there 
could have been leftover folders. After doing some testing it turned out that 
this use case cannot be avoided completely when canceling the tasks.
The idea of this patch is to set a flag if an exception is thrown in one of the 
tasks. This flag is visible in the tasks and if its value is true, the 
partition folders won't be created. Then iterate through the remaining tasks 
and wait for them to finish. The tasks which are started before the flag got 
set will then finish creating the partition folders. The tasks which are 
started after the flag got set, won't create the partition folders, to avoid 
unnecessary work. This way it is sure that all tasks are finished, when 
entering the finally part where the partition folders are cleaned up.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 8539fea 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 8555eee 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 b32954f 


Diff: https://reviews.apache.org/r/65716/diff/6/

Changes: https://reviews.apache.org/r/65716/diff/5-6/


Testing
---

Added some new tests cases to the TestAddPartitions and 
TestAddPartitionsFromPartSpec tests.


Thanks,

Marta Kuczora



Re: Review Request 66359: HIVE-19076: Fix NPE and TApplicationException in function related HiveMetastore methods

2018-04-05 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66359/
---

(Updated April 5, 2018, 1:43 p.m.)


Review request for hive, Peter Vary and Adam Szita.


Changes
---

Rebased the patch and fixed the issues which came up.


Bugs: HIVE-19076
https://issues.apache.org/jira/browse/HIVE-19076


Repository: hive-git


Description
---

The TestFunctions tests revealed that NPE is thrown in some cases. These NPEs 
could be prevented with a simple null check and a MetaException with a proper 
error message should be thrown instead.

Also there are some alter function tests where InvalidObjectException is thrown 
with Embedded MetaStore, but TApplicationException it thrown with Remote 
MetaStore. The reason is that the InvalidObjectException is not defined for the 
alter_function method in the thrift interface, so we got the 
TApplicationException when the InvalidObjectException was thrown. In these 
cases the InvalidObjectException could be handled on the server side and 
re-throw it as a MetaException


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 8539fea 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 ebbf465 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestFunctions.java
 9857c4e 


Diff: https://reviews.apache.org/r/66359/diff/2/

Changes: https://reviews.apache.org/r/66359/diff/1-2/


Testing
---


Thanks,

Marta Kuczora



Review Request 66359: HIVE-19076: Fix NPE and TApplicationException in function related HiveMetastore methods

2018-03-29 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66359/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-19076
https://issues.apache.org/jira/browse/HIVE-19076


Repository: hive-git


Description
---

The TestFunctions tests revealed that NPE is thrown in some cases. These NPEs 
could be prevented with a simple null check and a MetaException with a proper 
error message should be thrown instead.

Also there are some alter function tests where InvalidObjectException is thrown 
with Embedded MetaStore, but TApplicationException it thrown with Remote 
MetaStore. The reason is that the InvalidObjectException is not defined for the 
alter_function method in the thrift interface, so we got the 
TApplicationException when the InvalidObjectException was thrown. In these 
cases the InvalidObjectException could be handled on the server side and 
re-throw it as a MetaException


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 8a5de09 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestFunctions.java
 d504f34 


Diff: https://reviews.apache.org/r/66359/diff/1/


Testing
---


Thanks,

Marta Kuczora



Review Request 66356: HIVE-19075: Fix NPE when trying to drop or get DB with null name

2018-03-29 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66356/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-19075
https://issues.apache.org/jira/browse/HIVE-19075


Repository: hive-git


Description
---

The TestDatabases tests revealed that NPE is thrown if the get_database_core 
and drop_database_core methods are called with null DB name. These NPEs could 
be prevented with a simple null check and a MetaException with a proper error 
message should be thrown instead.

Example: NPE is thrown in the following test cases
- TestDatabases.testGetDatabaseNullName
- TestDatabases.testDropDatabaseNullName


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 8a5de09 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDatabases.java
 f2d745e 


Diff: https://reviews.apache.org/r/66356/diff/1/


Testing
---


Thanks,

Marta Kuczora



Review Request 66312: HIVE-19060: Fix the TestAppendPartitions.testAppendPartitionNullPartValues

2018-03-27 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66312/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-19060
https://issues.apache.org/jira/browse/HIVE-19060


Repository: hive-git


Description
---

In the pre-commit jobs the 
TestAppendPartitions.testAppendPartitionNullPartValues test is failing.

It can be reproduced locally and the test is failing because it expected a NPE 
to be thrown, but actually a NPE wrapped to a MetaException is thrown.

Added a null check for the partition values and throw a MetaException if the 
partition values list is null.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 519e8fe 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
 b67f33d 


Diff: https://reviews.apache.org/r/66312/diff/1/


Testing
---


Thanks,

Marta Kuczora



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-27 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/
---

(Updated March 27, 2018, 1:50 p.m.)


Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.


Changes
---

Made some minor fix on the patch because the 
TestMetaStoreEventListener.testListener test was failing.


Bugs: HIVE-18696
https://issues.apache.org/jira/browse/HIVE-18696


Repository: hive-git


Description
---

The idea behind the patch is

1) Separate the partition validation from starting the tasks which create the 
partition folders. 
Instead of doing the checks on the partitions and submit the tasks in one loop, 
separated the validation into a different loop. So first iterate through the 
partitions, validate the table/db names, and check for duplicates. Then if all 
partitions were correct, in the second loop submit the tasks to create the 
partition folders. This way if one of the partitions is incorrect, the 
exception will be thrown in the first loop, before the tasks are submitted. So 
we can be sure that no partition folder will be created if the list contains an 
invalid partition.

2) Handle the exceptions which occur during the execution of the tasks 
differently.
Previously if an exception occured in one task, the remaining tasks were 
canceled, and the newly created partition folders were cleaned up in the 
finally part. The problem was that it could happen that some tasks were still 
not finished with the folder creation when cleaning up the others, so there 
could have been leftover folders. After doing some testing it turned out that 
this use case cannot be avoided completely when canceling the tasks.
The idea of this patch is to set a flag if an exception is thrown in one of the 
tasks. This flag is visible in the tasks and if its value is true, the 
partition folders won't be created. Then iterate through the remaining tasks 
and wait for them to finish. The tasks which are started before the flag got 
set will then finish creating the partition folders. The tasks which are 
started after the flag got set, won't create the partition folders, to avoid 
unnecessary work. This way it is sure that all tasks are finished, when 
entering the finally part where the partition folders are cleaned up.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 519e8fe 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 4d9cb1b 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 1122057 


Diff: https://reviews.apache.org/r/65716/diff/5/

Changes: https://reviews.apache.org/r/65716/diff/4-5/


Testing
---

Added some new tests cases to the TestAddPartitions and 
TestAddPartitionsFromPartSpec tests.


Thanks,

Marta Kuczora



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-26 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/
---

(Updated March 26, 2018, 10:10 a.m.)


Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.


Changes
---

Rebase patch


Bugs: HIVE-18696
https://issues.apache.org/jira/browse/HIVE-18696


Repository: hive-git


Description
---

The idea behind the patch is

1) Separate the partition validation from starting the tasks which create the 
partition folders. 
Instead of doing the checks on the partitions and submit the tasks in one loop, 
separated the validation into a different loop. So first iterate through the 
partitions, validate the table/db names, and check for duplicates. Then if all 
partitions were correct, in the second loop submit the tasks to create the 
partition folders. This way if one of the partitions is incorrect, the 
exception will be thrown in the first loop, before the tasks are submitted. So 
we can be sure that no partition folder will be created if the list contains an 
invalid partition.

2) Handle the exceptions which occur during the execution of the tasks 
differently.
Previously if an exception occured in one task, the remaining tasks were 
canceled, and the newly created partition folders were cleaned up in the 
finally part. The problem was that it could happen that some tasks were still 
not finished with the folder creation when cleaning up the others, so there 
could have been leftover folders. After doing some testing it turned out that 
this use case cannot be avoided completely when canceling the tasks.
The idea of this patch is to set a flag if an exception is thrown in one of the 
tasks. This flag is visible in the tasks and if its value is true, the 
partition folders won't be created. Then iterate through the remaining tasks 
and wait for them to finish. The tasks which are started before the flag got 
set will then finish creating the partition folders. The tasks which are 
started after the flag got set, won't create the partition folders, to avoid 
unnecessary work. This way it is sure that all tasks are finished, when 
entering the finally part where the partition folders are cleaned up.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 519e8fe 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 4d9cb1b 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 1122057 


Diff: https://reviews.apache.org/r/65716/diff/4/

Changes: https://reviews.apache.org/r/65716/diff/3-4/


Testing
---

Added some new tests cases to the TestAddPartitions and 
TestAddPartitionsFromPartSpec tests.


Thanks,

Marta Kuczora



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-26 Thread Marta Kuczora via Review Board


> On March 13, 2018, 5:07 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3162 (original), 3169 (patched)
> > 
> >
> > is it possible that getSd() is null here?

No, it is not possible. If the SD is null, an exception will occur earlier when 
trying to create the partition folder. However this use case is not handled  
there either (NPE will occur), so I will create an other Jira to fix this.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/#review199066
---


On March 8, 2018, 4:52 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65716/
> ---
> 
> (Updated March 8, 2018, 4:52 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.
> 
> 
> Bugs: HIVE-18696
> https://issues.apache.org/jira/browse/HIVE-18696
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The idea behind the patch is
> 
> 1) Separate the partition validation from starting the tasks which create the 
> partition folders. 
> Instead of doing the checks on the partitions and submit the tasks in one 
> loop, separated the validation into a different loop. So first iterate 
> through the partitions, validate the table/db names, and check for 
> duplicates. Then if all partitions were correct, in the second loop submit 
> the tasks to create the partition folders. This way if one of the partitions 
> is incorrect, the exception will be thrown in the first loop, before the 
> tasks are submitted. So we can be sure that no partition folder will be 
> created if the list contains an invalid partition.
> 
> 2) Handle the exceptions which occur during the execution of the tasks 
> differently.
> Previously if an exception occured in one task, the remaining tasks were 
> canceled, and the newly created partition folders were cleaned up in the 
> finally part. The problem was that it could happen that some tasks were still 
> not finished with the folder creation when cleaning up the others, so there 
> could have been leftover folders. After doing some testing it turned out that 
> this use case cannot be avoided completely when canceling the tasks.
> The idea of this patch is to set a flag if an exception is thrown in one of 
> the tasks. This flag is visible in the tasks and if its value is true, the 
> partition folders won't be created. Then iterate through the remaining tasks 
> and wait for them to finish. The tasks which are started before the flag got 
> set will then finish creating the partition folders. The tasks which are 
> started after the flag got set, won't create the partition folders, to avoid 
> unnecessary work. This way it is sure that all tasks are finished, when 
> entering the finally part where the partition folders are cleaned up.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  662de9a 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  4d9cb1b 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  1122057 
> 
> 
> Diff: https://reviews.apache.org/r/65716/diff/3/
> 
> 
> Testing
> ---
> 
> Added some new tests cases to the TestAddPartitions and 
> TestAddPartitionsFromPartSpec tests.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-26 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > Would be good to know which Hive queries invoke this method.
> 
> Marta Kuczora wrote:
> Thanks a lot Sahil for the review. I will check where these methods are 
> used and come back to you with the answer a bit later.

I checked how these methods are used.

The 'add_partition_core' method is used when adding multiple partitions to the 
table. A simple query, like the following, can trigger it.

ALTER TABLE bubu ADD PARTITION (year='2017',month='march') PARTITION 
(year='2017',month='april') PARTITION (year='2018',month='march') PARTITION 
(year='2018',month='may') PARTITION (year='2017',month='march', day="3");

It is also used by the DDLTask.createPartitionsInBatches method which is used 
by the 'msck repair' command.


I didn't find much about the 'add_partitions_pspec_core' method. I only found 
that it is used by the HCatClientHMSImpl.addPartitionSpec method, but I don't 
know if the HCatClientHMSImpl is used or how it is used.


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3032 (original), 3065 (patched)
> > 
> >
> > this code looks very similar to the block above. I know its was never 
> > the intention of this JIRA to do any re-factoring, but how difficult would 
> > it be to move all this code into a common method so that we don't have to 
> > fix the bug in two places? not a blocking issue though
> 
> Marta Kuczora wrote:
> Yeah, I absolutely agree. This code duplication annoys me as well, just I 
> wasn't sure that it is acceptable doing the refactoring in the scope of this 
> Jira. But it is not so difficult, so I will upload a patch where I moved the 
> common parts to a separate method and we can decide if it is ok like that or 
> rather do it in a different Jira.
> 
> Marta Kuczora wrote:
> I checked how this could be refactored and there are some differences 
> between the methods which makes it not that straightforward. It is not that 
> difficult and basically I have the patch, but I would do it in the scope of 
> an other Jira, so we can discuss some details there. Would this be ok for you 
> Sahil?

Created a follow-up Jira for the refactoring
https://issues.apache.org/jira/browse/HIVE-19046


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/#review197829
---


On March 8, 2018, 4:52 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65716/
> ---
> 
> (Updated March 8, 2018, 4:52 p.m.)
> 
> 
> Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.
> 
> 
> Bugs: HIVE-18696
> https://issues.apache.org/jira/browse/HIVE-18696
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The idea behind the patch is
> 
> 1) Separate the partition validation from starting the tasks which create the 
> partition folders. 
> Instead of doing the checks on the partitions and submit the tasks in one 
> loop, separated the validation into a different loop. So first iterate 
> through the partitions, validate the table/db names, and check for 
> duplicates. Then if all partitions were correct, in the second loop submit 
> the tasks to create the partition folders. This way if one of the partitions 
> is incorrect, the exception will be thrown in the first loop, before the 
> tasks are submitted. So we can be sure that no partition folder will be 
> created if the list contains an invalid partition.
> 
> 2) Handle the exceptions which occur during the execution of the tasks 
> differently.
> Previously if an exception occured in one task, the remaining tasks were 
> canceled, and the newly created partition folders were cleaned up in the 
> finally part. The problem was that it could happen that some tasks were still 
> not finished with the folder creation when cleaning up the others, so there 
> could have been leftover folders. After doing some testing it turned out that 
> this use case cannot be avoided completely when canceling the tasks.
> The idea of this patch is to set a flag if an exception is thrown in one of 
> the tasks. This flag is visible in the tasks and if its value is true, the 
> partition folders won't be created. Then iterate through the remaining tasks 
> and wait for them to finish. The tasks which are started before the flag got 
> set will then finish creating the partition folders. The tasks which are 
> started after the flag got set, won't create the partition folders, to avoid 
> unnecessary work. This way it is sure that all tasks are 

Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-08 Thread Marta Kuczora via Review Board


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> >

Thanks a lot Sasha for the review!


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 2887 (original), 2889 (patched)
> > 
> >
> > Here you can specify the length:
> > 
> > `List partitionsToAdd = new ArrayList<>(parts.size());
> > List partValWrappers = new ArrayList<>(parts.size());
> > `
> > 
> > But also why do we needd this list at all? We can just use 
> > partValWrappers as a collection of partitions we care about.

You are right, we don't need this list. I fixed the code to use only the 
wrappers.


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 2888 (original), 2890 (patched)
> > 
> >
> > You use this to check for duplicates and list is pretty bad structure 
> > for this - please use set instead.

Fixed it.


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 2891 (original), 2897 (patched)
> > 
> >
> > In cases like this it is also quite useful to know actual table and 
> > dbname that was supplied - it could help to figure out what wrong partition 
> > ended up here.

Fixed the error message.


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 2898 (original), 2904 (patched)
> > 
> >
> > Can you fix this as well to use
> > 
> > `LOG.info("Not adding partition {} as it already exists", part)`

Fixed it.


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 2922 (patched)
> > 
> >
> > Please use new ArrayList<>(partitionsToAdd.size())

Fixed it.


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 2902 (original), 2925 (patched)
> > 
> >
> > ugi doesn't change in the loop, so it can be moved outside. Same goes 
> > for currentUser - it can be done just once outside the loop.

You are right, I moved it outside the loop.


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 2906 (original), 2929 (patched)
> > 
> >
> > Why are we converting IO exception to RuntimeException? This doesn't 
> > look right.

I don't know why it was implemented like this. I didn't change this part. It 
got introduced like this in https://issues.apache.org/jira/browse/HIVE-15137


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 2909 (original), 2932 (patched)
> > 
> >
> > I am wondering what is the motivation for doing this concurrently. I 
> > guess that if the list of partitions is huge, it may be useful, but for 
> > smaller lists it is probably just an overhead. This is putside your scope 
> > but definitely worth investigating.

I only know that the threads got introduced in 
https://issues.apache.org/jira/browse/HIVE-13901 because the folder creation 
was slow on some file systems.


> On March 7, 2018, 5:42 a.m., Alexander Kolbasov wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 2913 (original), 2936 (patched)
> > 
> >
> > Since we are using Java 8 we can use lambdas now, so this can become:
> > 
> >   partFutures.add(threadPool.submit(() -> {
> > if (failureOccurred.get()) {
> >   return null;
> > }
> > ugi.doAs((PrivilegedExceptionAction) () -> {
> >   try {
> > boolean madeDir = 
> > createLocationForAddedPartition(table, part);
> > addedPartitions.put(partWrapper, madeDir);
> > initializeAddedPartition(table, part, madeDir);
> >   } 

Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-08 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/
---

(Updated March 8, 2018, 4:52 p.m.)


Review request for hive, Alexander Kolbasov, Peter Vary, and Adam Szita.


Changes
---

Fixed review findings.


Bugs: HIVE-18696
https://issues.apache.org/jira/browse/HIVE-18696


Repository: hive-git


Description
---

The idea behind the patch is

1) Separate the partition validation from starting the tasks which create the 
partition folders. 
Instead of doing the checks on the partitions and submit the tasks in one loop, 
separated the validation into a different loop. So first iterate through the 
partitions, validate the table/db names, and check for duplicates. Then if all 
partitions were correct, in the second loop submit the tasks to create the 
partition folders. This way if one of the partitions is incorrect, the 
exception will be thrown in the first loop, before the tasks are submitted. So 
we can be sure that no partition folder will be created if the list contains an 
invalid partition.

2) Handle the exceptions which occur during the execution of the tasks 
differently.
Previously if an exception occured in one task, the remaining tasks were 
canceled, and the newly created partition folders were cleaned up in the 
finally part. The problem was that it could happen that some tasks were still 
not finished with the folder creation when cleaning up the others, so there 
could have been leftover folders. After doing some testing it turned out that 
this use case cannot be avoided completely when canceling the tasks.
The idea of this patch is to set a flag if an exception is thrown in one of the 
tasks. This flag is visible in the tasks and if its value is true, the 
partition folders won't be created. Then iterate through the remaining tasks 
and wait for them to finish. The tasks which are started before the flag got 
set will then finish creating the partition folders. The tasks which are 
started after the flag got set, won't create the partition folders, to avoid 
unnecessary work. This way it is sure that all tasks are finished, when 
entering the finally part where the partition folders are cleaned up.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 662de9a 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 4d9cb1b 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 1122057 


Diff: https://reviews.apache.org/r/65716/diff/3/

Changes: https://reviews.apache.org/r/65716/diff/2-3/


Testing
---

Added some new tests cases to the TestAddPartitions and 
TestAddPartitionsFromPartSpec tests.


Thanks,

Marta Kuczora



Re: Review Request 65952: HIVE-18898: Fix NPEs in HiveMetastore.dropPartition method

2018-03-08 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65952/
---

(Updated March 8, 2018, 9:40 a.m.)


Review request for hive, Peter Vary and Adam Szita.


Changes
---

Fixed stlyecheck issue.


Bugs: HIVE-18898
https://issues.apache.org/jira/browse/HIVE-18898


Repository: hive-git


Description
---

The TestDropPartitions tests revealed that NPE is thrown if the 
dropPartition(String db_name, String tbl_name, List part_vals, 
PartitionDropOptions options) method is called with null options and with a 
part_vals list which contains null elements.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 662de9a 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 3128089 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDropPartitions.java
 4d94ebf 


Diff: https://reviews.apache.org/r/65952/diff/3/

Changes: https://reviews.apache.org/r/65952/diff/2-3/


Testing
---

Run the TestDropPartitions tests.


Thanks,

Marta Kuczora



Re: Review Request 65952: HIVE-18898: Fix NPEs in HiveMetastore.dropPartition method

2018-03-08 Thread Marta Kuczora via Review Board


> On March 7, 2018, 5:06 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> > Lines 1006 (patched)
> > 
> >
> > Do the other parameters need to be checked?
> 
> Marta Kuczora wrote:
> You are right, we can check the other parameters too.

Added checks to HiveMetaStore for the db and table names and for the part_vals 
list.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65952/#review198796
---


On March 8, 2018, 9:35 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65952/
> ---
> 
> (Updated March 8, 2018, 9:35 a.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18898
> https://issues.apache.org/jira/browse/HIVE-18898
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The TestDropPartitions tests revealed that NPE is thrown if the 
> dropPartition(String db_name, String tbl_name, List part_vals, 
> PartitionDropOptions options) method is called with null options and with a 
> part_vals list which contains null elements.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  662de9a 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  3128089 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDropPartitions.java
>  4d94ebf 
> 
> 
> Diff: https://reviews.apache.org/r/65952/diff/2/
> 
> 
> Testing
> ---
> 
> Run the TestDropPartitions tests.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65952: HIVE-18898: Fix NPEs in HiveMetastore.dropPartition method

2018-03-08 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65952/
---

(Updated March 8, 2018, 9:35 a.m.)


Review request for hive, Peter Vary and Adam Szita.


Changes
---

Added checks for the db and table names and for the part value list.


Bugs: HIVE-18898
https://issues.apache.org/jira/browse/HIVE-18898


Repository: hive-git


Description
---

The TestDropPartitions tests revealed that NPE is thrown if the 
dropPartition(String db_name, String tbl_name, List part_vals, 
PartitionDropOptions options) method is called with null options and with a 
part_vals list which contains null elements.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 662de9a 
  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 3128089 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDropPartitions.java
 4d94ebf 


Diff: https://reviews.apache.org/r/65952/diff/2/

Changes: https://reviews.apache.org/r/65952/diff/1-2/


Testing
---

Run the TestDropPartitions tests.


Thanks,

Marta Kuczora



Re: Review Request 65952: HIVE-18898: Fix NPEs in HiveMetastore.dropPartition method

2018-03-08 Thread Marta Kuczora via Review Board


> On March 8, 2018, 2:15 a.m., Alexander Kolbasov wrote:
> > Afrew with Sahil - the checks should be on the server side.

Thanks a lot Sasha for the review!

I absolutely agree that it would be better to have the checks on the server 
side, however these checks have to be done in the client, because the NPE 
occurs before calling HiveMetaStore.

As I wrote above, the PartitionDropOptions parameter is not sent to 
HiveMetaStore. In HiveMetaStore we only have the boolean values which are set 
in PartitionDropOptions and the NPE occurs when trying to get these variables.

With a part_vals list which has null values, the NPE occurs when thrift is 
trying to serialize it. So we should do the check before calling the thrift 
method.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65952/#review198847
---


On March 7, 2018, 3:48 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65952/
> ---
> 
> (Updated March 7, 2018, 3:48 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18898
> https://issues.apache.org/jira/browse/HIVE-18898
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The TestDropPartitions tests revealed that NPE is thrown if the 
> dropPartition(String db_name, String tbl_name, List part_vals, 
> PartitionDropOptions options) method is called with null options and with a 
> part_vals list which contains null elements.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  3128089 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDropPartitions.java
>  4d94ebf 
> 
> 
> Diff: https://reviews.apache.org/r/65952/diff/1/
> 
> 
> Testing
> ---
> 
> Run the TestDropPartitions tests.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65952: HIVE-18898: Fix NPEs in HiveMetastore.dropPartition method

2018-03-08 Thread Marta Kuczora via Review Board


> On March 7, 2018, 5:06 p.m., Sahil Takiar wrote:
> >

Thanks a lot Sahil for the review.


> On March 7, 2018, 5:06 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> > Lines 996 (patched)
> > 
> >
> > Why do this validation in `HiveMetaStoreClient` rather than 
> > `HiveMetaStore`?

I did it there, because the PartitionDropOptions parameter is not sent to 
HiveMetaStore, only the boolean values from it.
The dropPartition method in the HiveMetaStoreClient looked like this:

  public boolean dropPartition(String db_name, String tbl_name,
  List part_vals, PartitionDropOptions options) throws TException {
  
return dropPartition(db_name, tbl_name, part_vals, options.deleteData,
 options.purgeData? 
getEnvironmentContextWithIfPurgeSet() : null);
  }
  
And the NPE occurred when called options.deleteData in the dropPartition method 
call.


> On March 7, 2018, 5:06 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
> > Lines 1006 (patched)
> > 
> >
> > Do the other parameters need to be checked?

You are right, we can check the other parameters too.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65952/#review198796
---


On March 7, 2018, 3:48 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65952/
> ---
> 
> (Updated March 7, 2018, 3:48 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18898
> https://issues.apache.org/jira/browse/HIVE-18898
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The TestDropPartitions tests revealed that NPE is thrown if the 
> dropPartition(String db_name, String tbl_name, List part_vals, 
> PartitionDropOptions options) method is called with null options and with a 
> part_vals list which contains null elements.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  3128089 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDropPartitions.java
>  4d94ebf 
> 
> 
> Diff: https://reviews.apache.org/r/65952/diff/1/
> 
> 
> Testing
> ---
> 
> Run the TestDropPartitions tests.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Review Request 65952: HIVE-18898: Fix NPEs in HiveMetastore.dropPartition method

2018-03-07 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65952/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-18898
https://issues.apache.org/jira/browse/HIVE-18898


Repository: hive-git


Description
---

The TestDropPartitions tests revealed that NPE is thrown if the 
dropPartition(String db_name, String tbl_name, List part_vals, 
PartitionDropOptions options) method is called with null options and with a 
part_vals list which contains null elements.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 3128089 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestDropPartitions.java
 4d94ebf 


Diff: https://reviews.apache.org/r/65952/diff/1/


Testing
---

Run the TestDropPartitions tests.


Thanks,

Marta Kuczora



Review Request 65947: HIVE-18892: Fix NPEs in HiveMetastore.exchange_partitions method

2018-03-07 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65947/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-18892
https://issues.apache.org/jira/browse/HIVE-18892


Repository: hive-git


Description
---

The TestExchangePartitions tests revealed that NPE is thrown if the 
exchange_partitions method is called with null, empty or non-existing DB and 
table names. These NPEs could be prevented with a simple null check and a 
MetaException with a proper error message should be thrown instead.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 662de9a 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
 5b21491 


Diff: https://reviews.apache.org/r/65947/diff/1/


Testing
---

Run the TestExchangePartitions tests.


Thanks,

Marta Kuczora



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-06 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3032 (original), 3065 (patched)
> > 
> >
> > this code looks very similar to the block above. I know its was never 
> > the intention of this JIRA to do any re-factoring, but how difficult would 
> > it be to move all this code into a common method so that we don't have to 
> > fix the bug in two places? not a blocking issue though
> 
> Marta Kuczora wrote:
> Yeah, I absolutely agree. This code duplication annoys me as well, just I 
> wasn't sure that it is acceptable doing the refactoring in the scope of this 
> Jira. But it is not so difficult, so I will upload a patch where I moved the 
> common parts to a separate method and we can decide if it is ok like that or 
> rather do it in a different Jira.

I checked how this could be refactored and there are some differences between 
the methods which makes it not that straightforward. It is not that difficult 
and basically I have the patch, but I would do it in the scope of an other 
Jira, so we can discuss some details there. Would this be ok for you Sahil?


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/#review197829
---


On March 6, 2018, 5:24 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65716/
> ---
> 
> (Updated March 6, 2018, 5:24 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18696
> https://issues.apache.org/jira/browse/HIVE-18696
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The idea behind the patch is
> 
> 1) Separate the partition validation from starting the tasks which create the 
> partition folders. 
> Instead of doing the checks on the partitions and submit the tasks in one 
> loop, separated the validation into a different loop. So first iterate 
> through the partitions, validate the table/db names, and check for 
> duplicates. Then if all partitions were correct, in the second loop submit 
> the tasks to create the partition folders. This way if one of the partitions 
> is incorrect, the exception will be thrown in the first loop, before the 
> tasks are submitted. So we can be sure that no partition folder will be 
> created if the list contains an invalid partition.
> 
> 2) Handle the exceptions which occur during the execution of the tasks 
> differently.
> Previously if an exception occured in one task, the remaining tasks were 
> canceled, and the newly created partition folders were cleaned up in the 
> finally part. The problem was that it could happen that some tasks were still 
> not finished with the folder creation when cleaning up the others, so there 
> could have been leftover folders. After doing some testing it turned out that 
> this use case cannot be avoided completely when canceling the tasks.
> The idea of this patch is to set a flag if an exception is thrown in one of 
> the tasks. This flag is visible in the tasks and if its value is true, the 
> partition folders won't be created. Then iterate through the remaining tasks 
> and wait for them to finish. The tasks which are started before the flag got 
> set will then finish creating the partition folders. The tasks which are 
> started after the flag got set, won't create the partition folders, to avoid 
> unnecessary work. This way it is sure that all tasks are finished, when 
> entering the finally part where the partition folders are cleaned up.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  2be018b 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  4d9cb1b 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  1122057 
> 
> 
> Diff: https://reviews.apache.org/r/65716/diff/2/
> 
> 
> Testing
> ---
> 
> Added some new tests cases to the TestAddPartitions and 
> TestAddPartitionsFromPartSpec tests.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-03-06 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/
---

(Updated March 6, 2018, 5:24 p.m.)


Review request for hive, Peter Vary and Adam Szita.


Changes
---

Fix some review findings and also added some test cases for adding partitions 
to view.


Bugs: HIVE-18696
https://issues.apache.org/jira/browse/HIVE-18696


Repository: hive-git


Description
---

The idea behind the patch is

1) Separate the partition validation from starting the tasks which create the 
partition folders. 
Instead of doing the checks on the partitions and submit the tasks in one loop, 
separated the validation into a different loop. So first iterate through the 
partitions, validate the table/db names, and check for duplicates. Then if all 
partitions were correct, in the second loop submit the tasks to create the 
partition folders. This way if one of the partitions is incorrect, the 
exception will be thrown in the first loop, before the tasks are submitted. So 
we can be sure that no partition folder will be created if the list contains an 
invalid partition.

2) Handle the exceptions which occur during the execution of the tasks 
differently.
Previously if an exception occured in one task, the remaining tasks were 
canceled, and the newly created partition folders were cleaned up in the 
finally part. The problem was that it could happen that some tasks were still 
not finished with the folder creation when cleaning up the others, so there 
could have been leftover folders. After doing some testing it turned out that 
this use case cannot be avoided completely when canceling the tasks.
The idea of this patch is to set a flag if an exception is thrown in one of the 
tasks. This flag is visible in the tasks and if its value is true, the 
partition folders won't be created. Then iterate through the remaining tasks 
and wait for them to finish. The tasks which are started before the flag got 
set will then finish creating the partition folders. The tasks which are 
started after the flag got set, won't create the partition folders, to avoid 
unnecessary work. This way it is sure that all tasks are finished, when 
entering the finally part where the partition folders are cleaned up.


Diffs (updated)
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 2be018b 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 4d9cb1b 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 1122057 


Diff: https://reviews.apache.org/r/65716/diff/2/

Changes: https://reviews.apache.org/r/65716/diff/1-2/


Testing
---

Added some new tests cases to the TestAddPartitions and 
TestAddPartitionsFromPartSpec tests.


Thanks,

Marta Kuczora



Re: Review Request 65730: HIVE-18697: The HiveMetastore.exchange_partitions method throws FileNotFoundException if the given partition doesn't exist in the source table

2018-03-06 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 1:32 p.m., Peter Vary wrote:
> > Ship It!

Thanks a lot Peter for the review!


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65730/#review197847
---


On Feb. 21, 2018, 10:36 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65730/
> ---
> 
> (Updated Feb. 21, 2018, 10:36 a.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18697
> https://issues.apache.org/jira/browse/HIVE-18697
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the HiveMetastore.exchange_partitions method to check if the 
> partitionsToExchange list is empty and if it is throw a MetaException with a 
> proper error message that no partition exists with the given values for the 
> source table. Previously a FileNotFoundException was thrown (wrapped in a 
> MetaException) when tried to move the partition folder to the dest table. So 
> the type of the exception was not changed, only the error message.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  47de215 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
>  3a06aec 
> 
> 
> Diff: https://reviews.apache.org/r/65730/diff/1/
> 
> 
> Testing
> ---
> 
> There are tests for this use case in TestExchangePartitions:
> - testExchangePartitionsNoPartExists
> - testExchangePartitionNoPartExists
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65731: HIVE-18699: Check for duplicate partitions in HiveMetastore.exchange_partitions

2018-03-06 Thread Marta Kuczora via Review Board


> On Feb. 22, 2018, 10:35 a.m., Adam Szita wrote:
> > Ship It!

Thanks a lot Adam for the review!


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65731/#review198106
---


On Feb. 21, 2018, 11:37 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65731/
> ---
> 
> (Updated Feb. 21, 2018, 11:37 a.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18699
> https://issues.apache.org/jira/browse/HIVE-18699
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the HiveMetastore.exchange_partitions method to check if the 
> partitions to be exchanged don't exist in the dest table. If one of the 
> partitions already exists, throw a MetaException with a proper error message.
> 
> Previously an exception like this (wrapped in a MetaException) was thrown:
> Insert of object
> "org.apache.hadoop.hive.metastore.model.MPartition@4e78fff5" using statement 
> "INSERT INTO PARTITIONS
> (PART_ID,CREATE_TIME,LAST_ACCESS_TIME,PART_NAME,SD_ID,TBL_ID) VALUES 
> (?,?,?,?,?,?)" failed : The statement was
> aborted because it would have caused a duplicate key value in a unique or 
> primary key constraint or unique index
> identified by 'UNIQUEPARTITION' defined on 'PARTITIONS'.
> 
> From user point of view, the type of the exception is not changed 
> (MetaException), just the error message is changed to a more understandable 
> one.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  47de215 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
>  3a06aec 
> 
> 
> Diff: https://reviews.apache.org/r/65731/diff/1/
> 
> 
> Testing
> ---
> 
> Tests already exist for this use case in TestExchangePartitions:
> - testExchangePartitionsPartAlreadyExists
> - testExchangePartitionPartAlreadyExists
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65731: HIVE-18699: Check for duplicate partitions in HiveMetastore.exchange_partitions

2018-03-06 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 1:37 p.m., Peter Vary wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3370 (patched)
> > 
> >
> > How "expensive" is this call? Is this a simple query? What happens if 
> > the destintaion table has 1m partitions? :)
> 
> Adam Szita wrote:
> I don't see any other way (currently available) that is more lightweight 
> than this is. Under the hood this calls getPartitionNamesNoTxn which executes 
> an actual "select parititionName from .." statement.

Yeah, as Adam says it is the least expensive way to do this check. It would 
mean one extra "select partitionName" per exchangePartition calls.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65731/#review197848
---


On Feb. 21, 2018, 11:37 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65731/
> ---
> 
> (Updated Feb. 21, 2018, 11:37 a.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18699
> https://issues.apache.org/jira/browse/HIVE-18699
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the HiveMetastore.exchange_partitions method to check if the 
> partitions to be exchanged don't exist in the dest table. If one of the 
> partitions already exists, throw a MetaException with a proper error message.
> 
> Previously an exception like this (wrapped in a MetaException) was thrown:
> Insert of object
> "org.apache.hadoop.hive.metastore.model.MPartition@4e78fff5" using statement 
> "INSERT INTO PARTITIONS
> (PART_ID,CREATE_TIME,LAST_ACCESS_TIME,PART_NAME,SD_ID,TBL_ID) VALUES 
> (?,?,?,?,?,?)" failed : The statement was
> aborted because it would have caused a duplicate key value in a unique or 
> primary key constraint or unique index
> identified by 'UNIQUEPARTITION' defined on 'PARTITIONS'.
> 
> From user point of view, the type of the exception is not changed 
> (MetaException), just the error message is changed to a more understandable 
> one.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  47de215 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
>  3a06aec 
> 
> 
> Diff: https://reviews.apache.org/r/65731/diff/1/
> 
> 
> Testing
> ---
> 
> Tests already exist for this use case in TestExchangePartitions:
> - testExchangePartitionsPartAlreadyExists
> - testExchangePartitionPartAlreadyExists
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-23 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3149-3150 (patched)
> > 
> >
> > curious what behavior you were seeing, wondering why cancelling or 
> > interrupting the `Future`s doesn't work
> 
> Marta Kuczora wrote:
> The idea was to iterate over the Future tasks and call their get method 
> to wait until they finish. If an exception occurred in one task, iterated 
> over the tasks again and canceled them with mayInterruptIfRunning=false flag. 
> According to the Java doc, if this flag is false, the in-progress tasks are 
> allowed to complete. So the idea behind this was to avoid interrupting a task 
> when it already started creating the folder. So the situation when the folder 
> is created, but it is not yet put to the addedPartitions map doesn't happen.
> Then wait for the already running tasks to complete and I assumed we are 
> good at that point because all tasks are either finished or didn't even 
> started.
> 
> Imagine a flow something like this in code:
> 
> boolean failureOccurred = false;
> try {
>   for (Future partFuture : partFutures) {
> partFuture.get();
>   }
> } catch (InterruptedException | ExecutionException e) {
>   failureOccurred = true;  
> }
> 
> for (Future task : partFutures) {
>   task.cancel(false);
> } 
> 
> for (Future partFuture : partFutures) {
>   if (!partFuture.isCanceled()) {
> partFuture.get();
>   }
> }
> 
> 
> Then I created a test to see if it works as I expected. I tried to create 
> 40 partitions and had a counter which were visible to the tasks and threw an 
> exception when it reached 20. What I noticed is that almost every time I got 
> a ConcurrentModificationException on this line
> 
>for (Map.Entry e : 
> addedPartitions.entrySet()) {
> 
> So there must be some tasks still writing the addedPartitions map at that 
> point. By the way, changing the type of the map to ConcurrentMap proved this 
> right, as no exception occurred in this case, but there were leftover folders.
> 
> So I started to debug it, mainly with logging when the call method of a 
> task is called, when a task get canceled and what was the result when the get 
> method was called. What I found that there were tasks which were started, 
> their call method was called, so they started to create the folder, but then 
> there was a successful cancel on them. For these tasks the get method simply 
> would throw a CancellationException as it sees the task is not running any 
> more (or the isCanceled method would return true). But actually these tasks 
> created the folder, but it could happen that they didn't finish until the 
> clean up.
> 
> I checked the FutureTask code and the run method checks if the state of 
> the task is NEW and if it is, calls the Callable's call method. But doesn't 
> change the state at that point. My theory is that if a cancel is called on 
> the same task at this point, it will also see that the state is NEW, so it 
> will change it to CANCELLED. So I believe a task can go into a weird state 
> like this.
> Calling the cancel with mayInterruptIfRunning=true also resulted the same.
> So I didn't find a bullet proof solution with canceling the tasks, but it 
> can be that I missed something and there is a good way to solve this.
> 
> If you have any idea, please share it with me, any idea is welcome. :)
> 
> Sahil Takiar wrote:
> hmm this is odd. did you try checking the return value of the `cancel` 
> method, the javadocs say it returns `true` if the cancel was successful and 
> `false` otherwise; is it returning `true`?
> 
> I can understand that `mayInterruptIfRunning=true` won't help, because it 
> just causes the thread to throw an `InterruptedException` but any code 
> running in that thread can just catch the exception and drop it
> 
> could it be possible the second `partFuture.get()` threw an exception, in 
> which case the `finally` block will be trigerred before all threads complete?
> 
> I think the solution you have proposed works fine, but the 
> `Future#cancel` methodology should work too and is slightly cleaner, but if 
> you can't get it to work no worries, don't spend too much time on it

Yep, I checked it and for these "leftover" tasks, the cancel returned true. 
There were tasks for which the cancel returned false, but those were ok, 
because the get just waited for them.

No, I didn't get exception during the second partFuture.get(). Also when I 
logged what happens with the tasks, all tasks appeared in this second get loop. 
For the ones which were canceled successfully (even though they are running), 
just showed that they were canceled.

Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-22 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > Would be good to know which Hive queries invoke this method.

Thanks a lot Sahil for the review. I will check where these methods are used 
and come back to you with the answer a bit later.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/#review197829
---


On Feb. 20, 2018, 5:03 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65716/
> ---
> 
> (Updated Feb. 20, 2018, 5:03 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18696
> https://issues.apache.org/jira/browse/HIVE-18696
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The idea behind the patch is
> 
> 1) Separate the partition validation from starting the tasks which create the 
> partition folders. 
> Instead of doing the checks on the partitions and submit the tasks in one 
> loop, separated the validation into a different loop. So first iterate 
> through the partitions, validate the table/db names, and check for 
> duplicates. Then if all partitions were correct, in the second loop submit 
> the tasks to create the partition folders. This way if one of the partitions 
> is incorrect, the exception will be thrown in the first loop, before the 
> tasks are submitted. So we can be sure that no partition folder will be 
> created if the list contains an invalid partition.
> 
> 2) Handle the exceptions which occur during the execution of the tasks 
> differently.
> Previously if an exception occured in one task, the remaining tasks were 
> canceled, and the newly created partition folders were cleaned up in the 
> finally part. The problem was that it could happen that some tasks were still 
> not finished with the folder creation when cleaning up the others, so there 
> could have been leftover folders. After doing some testing it turned out that 
> this use case cannot be avoided completely when canceling the tasks.
> The idea of this patch is to set a flag if an exception is thrown in one of 
> the tasks. This flag is visible in the tasks and if its value is true, the 
> partition folders won't be created. Then iterate through the remaining tasks 
> and wait for them to finish. The tasks which are started before the flag got 
> set will then finish creating the partition folders. The tasks which are 
> started after the flag got set, won't create the partition folders, to avoid 
> unnecessary work. This way it is sure that all tasks are finished, when 
> entering the finally part where the partition folders are cleaned up.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  47de215 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
>  f483ca8 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  919ba78 
> 
> 
> Diff: https://reviews.apache.org/r/65716/diff/1/
> 
> 
> Testing
> ---
> 
> Added some new tests cases to the TestAddPartitions and 
> TestAddPartitionsFromPartSpec tests.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-22 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Line 3032 (original), 3065 (patched)
> > 
> >
> > this code looks very similar to the block above. I know its was never 
> > the intention of this JIRA to do any re-factoring, but how difficult would 
> > it be to move all this code into a common method so that we don't have to 
> > fix the bug in two places? not a blocking issue though

Yeah, I absolutely agree. This code duplication annoys me as well, just I 
wasn't sure that it is acceptable doing the refactoring in the scope of this 
Jira. But it is not so difficult, so I will upload a patch where I moved the 
common parts to a separate method and we can decide if it is ok like that or 
rather do it in a different Jira.


> On Feb. 21, 2018, 3:32 p.m., Sahil Takiar wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3149-3150 (patched)
> > 
> >
> > curious what behavior you were seeing, wondering why cancelling or 
> > interrupting the `Future`s doesn't work

The idea was to iterate over the Future tasks and call their get method to wait 
until they finish. If an exception occurred in one task, iterated over the 
tasks again and canceled them with mayInterruptIfRunning=false flag. According 
to the Java doc, if this flag is false, the in-progress tasks are allowed to 
complete. So the idea behind this was to avoid interrupting a task when it 
already started creating the folder. So the situation when the folder is 
created, but it is not yet put to the addedPartitions map doesn't happen.
Then wait for the already running tasks to complete and I assumed we are good 
at that point because all tasks are either finished or didn't even started.

Imagine a flow something like this in code:

boolean failureOccurred = false;
try {
  for (Future partFuture : partFutures) {
partFuture.get();
  }
} catch (InterruptedException | ExecutionException e) {
  failureOccurred = true;  
}

for (Future task : partFutures) {
  task.cancel(false);
} 

for (Future partFuture : partFutures) {
  if (!partFuture.isCanceled()) {
partFuture.get();
  }
}


Then I created a test to see if it works as I expected. I tried to create 40 
partitions and had a counter which were visible to the tasks and threw an 
exception when it reached 20. What I noticed is that almost every time I got a 
ConcurrentModificationException on this line

   for (Map.Entry e : 
addedPartitions.entrySet()) {

So there must be some tasks still writing the addedPartitions map at that 
point. By the way, changing the type of the map to ConcurrentMap proved this 
right, as no exception occurred in this case, but there were leftover folders.

So I started to debug it, mainly with logging when the call method of a task is 
called, when a task get canceled and what was the result when the get method 
was called. What I found that there were tasks which were started, their call 
method was called, so they started to create the folder, but then there was a 
successful cancel on them. For these tasks the get method simply would throw a 
CancellationException as it sees the task is not running any more (or the 
isCanceled method would return true). But actually these tasks created the 
folder, but it could happen that they didn't finish until the clean up.

I checked the FutureTask code and the run method checks if the state of the 
task is NEW and if it is, calls the Callable's call method. But doesn't change 
the state at that point. My theory is that if a cancel is called on the same 
task at this point, it will also see that the state is NEW, so it will change 
it to CANCELLED. So I believe a task can go into a weird state like this.
Calling the cancel with mayInterruptIfRunning=true also resulted the same.
So I didn't find a bullet proof solution with canceling the tasks, but it can 
be that I missed something and there is a good way to solve this.

If you have any idea, please share it with me, any idea is welcome. :)


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/#review197829
---


On Feb. 20, 2018, 5:03 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65716/
> ---
> 
> (Updated Feb. 20, 2018, 5:03 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18696
> https://issues.apache.org/jira/browse/HIVE-18696
> 
> 
> 

Re: Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-22 Thread Marta Kuczora via Review Board


> On Feb. 21, 2018, 1:38 p.m., Peter Vary wrote:
> > Thanks for the patch Marta!
> > Mostly just questions.
> > Peter

Thanks a lot Peter for the review.


> On Feb. 21, 2018, 1:38 p.m., Peter Vary wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 2830 (patched)
> > 
> >
> > nit: formatting

Thanks, I'll fix it. It seems the autoformat is not perfect either. :)


> On Feb. 21, 2018, 1:38 p.m., Peter Vary wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 2856 (patched)
> > 
> >
> > Question: Do this has to be AtomicBoolean instead of simple boolean?

It cannot be a simple boolean, cause it leads to compile error: Local variable 
defined in an enclosing scope must be final or effectively final
But the boolean flag cannot be final since it has to be set in case of error. 
So we need some kind of wrapper here and AtomicBoolean seemed a good choice to 
me. Please tell me if you have concerns with AtomicBoolean.


> On Feb. 21, 2018, 1:38 p.m., Peter Vary wrote:
> > standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
> > Lines 3098 (patched)
> > 
> >
> > Same as above

See above.


> On Feb. 21, 2018, 1:38 p.m., Peter Vary wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
> > Line 840 (original), 858 (patched)
> > 
> >
> > Is this line only changed in formatting?

Yes, just for some reason the diff looks weird.


> On Feb. 21, 2018, 1:38 p.m., Peter Vary wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
> > Lines 626 (patched)
> > 
> >
> > Why is this change? Is this an incompatible change?

No it is not an incompatible change. What happened is that we got a NPE in this 
case, but it was inside the task execution, so it got wrapped into a 
MetaException. Since some code was moved outside the task execution, the NPE 
which occurred here was not wrapped any more. But I will modify some parts so a 
MetaException will occur just as before.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/#review197843
---


On Feb. 20, 2018, 5:03 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65716/
> ---
> 
> (Updated Feb. 20, 2018, 5:03 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18696
> https://issues.apache.org/jira/browse/HIVE-18696
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The idea behind the patch is
> 
> 1) Separate the partition validation from starting the tasks which create the 
> partition folders. 
> Instead of doing the checks on the partitions and submit the tasks in one 
> loop, separated the validation into a different loop. So first iterate 
> through the partitions, validate the table/db names, and check for 
> duplicates. Then if all partitions were correct, in the second loop submit 
> the tasks to create the partition folders. This way if one of the partitions 
> is incorrect, the exception will be thrown in the first loop, before the 
> tasks are submitted. So we can be sure that no partition folder will be 
> created if the list contains an invalid partition.
> 
> 2) Handle the exceptions which occur during the execution of the tasks 
> differently.
> Previously if an exception occured in one task, the remaining tasks were 
> canceled, and the newly created partition folders were cleaned up in the 
> finally part. The problem was that it could happen that some tasks were still 
> not finished with the folder creation when cleaning up the others, so there 
> could have been leftover folders. After doing some testing it turned out that 
> this use case cannot be avoided completely when canceling the tasks.
> The idea of this patch is to set a flag if an exception is thrown in one of 
> the tasks. This flag is visible in the tasks and if its value is true, the 
> partition folders won't be created. Then iterate through the remaining tasks 
> and wait for them to finish. The tasks which are started before the flag got 
> set will then finish creating the partition folders. The tasks which are 
> started after the 

Review Request 65731: HIVE-18699: Check for duplicate partitions in HiveMetastore.exchange_partitions

2018-02-21 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65731/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-18699
https://issues.apache.org/jira/browse/HIVE-18699


Repository: hive-git


Description
---

Extended the HiveMetastore.exchange_partitions method to check if the 
partitions to be exchanged don't exist in the dest table. If one of the 
partitions already exists, throw a MetaException with a proper error message.

Previously an exception like this (wrapped in a MetaException) was thrown:
Insert of object
"org.apache.hadoop.hive.metastore.model.MPartition@4e78fff5" using statement 
"INSERT INTO PARTITIONS
(PART_ID,CREATE_TIME,LAST_ACCESS_TIME,PART_NAME,SD_ID,TBL_ID) VALUES 
(?,?,?,?,?,?)" failed : The statement was
aborted because it would have caused a duplicate key value in a unique or 
primary key constraint or unique index
identified by 'UNIQUEPARTITION' defined on 'PARTITIONS'.

>From user point of view, the type of the exception is not changed 
>(MetaException), just the error message is changed to a more understandable 
>one.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 47de215 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
 3a06aec 


Diff: https://reviews.apache.org/r/65731/diff/1/


Testing
---

Tests already exist for this use case in TestExchangePartitions:
- testExchangePartitionsPartAlreadyExists
- testExchangePartitionPartAlreadyExists


Thanks,

Marta Kuczora



Review Request 65730: HIVE-18697: The HiveMetastore.exchange_partitions method throws FileNotFoundException if the given partition doesn't exist in the source table

2018-02-21 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65730/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-18697
https://issues.apache.org/jira/browse/HIVE-18697


Repository: hive-git


Description
---

Extended the HiveMetastore.exchange_partitions method to check if the 
partitionsToExchange list is empty and if it is throw a MetaException with a 
proper error message that no partition exists with the given values for the 
source table. Previously a FileNotFoundException was thrown (wrapped in a 
MetaException) when tried to move the partition folder to the dest table. So 
the type of the exception was not changed, only the error message.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 47de215 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
 3a06aec 


Diff: https://reviews.apache.org/r/65730/diff/1/


Testing
---

There are tests for this use case in TestExchangePartitions:
- testExchangePartitionsNoPartExists
- testExchangePartitionNoPartExists


Thanks,

Marta Kuczora



Review Request 65716: HIVE-18696: The partition folders might not get cleaned up properly in the HiveMetaStore.add_partitions_core method if an exception occurs

2018-02-20 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65716/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-18696
https://issues.apache.org/jira/browse/HIVE-18696


Repository: hive-git


Description
---

The idea behind the patch is

1) Separate the partition validation from starting the tasks which create the 
partition folders. 
Instead of doing the checks on the partitions and submit the tasks in one loop, 
separated the validation into a different loop. So first iterate through the 
partitions, validate the table/db names, and check for duplicates. Then if all 
partitions were correct, in the second loop submit the tasks to create the 
partition folders. This way if one of the partitions is incorrect, the 
exception will be thrown in the first loop, before the tasks are submitted. So 
we can be sure that no partition folder will be created if the list contains an 
invalid partition.

2) Handle the exceptions which occur during the execution of the tasks 
differently.
Previously if an exception occured in one task, the remaining tasks were 
canceled, and the newly created partition folders were cleaned up in the 
finally part. The problem was that it could happen that some tasks were still 
not finished with the folder creation when cleaning up the others, so there 
could have been leftover folders. After doing some testing it turned out that 
this use case cannot be avoided completely when canceling the tasks.
The idea of this patch is to set a flag if an exception is thrown in one of the 
tasks. This flag is visible in the tasks and if its value is true, the 
partition folders won't be created. Then iterate through the remaining tasks 
and wait for them to finish. The tasks which are started before the flag got 
set will then finish creating the partition folders. The tasks which are 
started after the flag got set, won't create the partition folders, to avoid 
unnecessary work. This way it is sure that all tasks are finished, when 
entering the finally part where the partition folders are cleaned up.


Diffs
-

  
standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 47de215 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
 f483ca8 
  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 919ba78 


Diff: https://reviews.apache.org/r/65716/diff/1/


Testing
---

Added some new tests cases to the TestAddPartitions and 
TestAddPartitionsFromPartSpec tests.


Thanks,

Marta Kuczora



Re: Review Request 65507: HIVE-18580: Create tests to cover exchange partitions

2018-02-07 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65507/
---

(Updated Feb. 7, 2018, 3:35 p.m.)


Review request for hive, Peter Vary and Adam Szita.


Changes
---

Fixed the review findings.


Bugs: HIVE-18580
https://issues.apache.org/jira/browse/HIVE-18580


Repository: hive-git


Description
---

The following methods of IMetaStoreClient are covered by this test.
- int Partition exchange_partition(Map, String, String, String, 
String)
- List Partition exchange_partition(Map, String, 
String, String, String)

The test covers not just the happy pathes, but the edge cases as well.


Diffs (updated)
-

  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/65507/diff/2/

Changes: https://reviews.apache.org/r/65507/diff/1-2/


Testing
---

Ran the tests


Thanks,

Marta Kuczora



Re: Review Request 65507: HIVE-18580: Create tests to cover exchange partitions

2018-02-07 Thread Marta Kuczora via Review Board


> On Feb. 7, 2018, 10:16 a.m., Adam Szita wrote:
> > Thanks for the patch Marta, this is incredibly thourough!

Thanks a lot Adam for the review.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65507/#review196979
---


On Feb. 5, 2018, 6:01 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65507/
> ---
> 
> (Updated Feb. 5, 2018, 6:01 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18580
> https://issues.apache.org/jira/browse/HIVE-18580
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - int Partition exchange_partition(Map, String, String, 
> String, String)
> - List Partition exchange_partition(Map, String, 
> String, String, String)
> 
> The test covers not just the happy pathes, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65507/diff/1/
> 
> 
> Testing
> ---
> 
> Ran the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65507: HIVE-18580: Create tests to cover exchange partitions

2018-02-07 Thread Marta Kuczora via Review Board


> On Feb. 7, 2018, 11:25 a.m., Peter Vary wrote:
> > Ship It!

Thanks a lot Peter for the review.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65507/#review196982
---


On Feb. 5, 2018, 6:01 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65507/
> ---
> 
> (Updated Feb. 5, 2018, 6:01 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18580
> https://issues.apache.org/jira/browse/HIVE-18580
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - int Partition exchange_partition(Map, String, String, 
> String, String)
> - List Partition exchange_partition(Map, String, 
> String, String, String)
> 
> The test covers not just the happy pathes, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65507/diff/1/
> 
> 
> Testing
> ---
> 
> Ran the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Review Request 65507: HIVE-18580: Create tests to cover exchange partitions

2018-02-05 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65507/
---

Review request for hive, Peter Vary and Adam Szita.


Bugs: HIVE-18580
https://issues.apache.org/jira/browse/HIVE-18580


Repository: hive-git


Description
---

The following methods of IMetaStoreClient are covered by this test.
- int Partition exchange_partition(Map, String, String, String, 
String)
- List Partition exchange_partition(Map, String, 
String, String, String)

The test covers not just the happy pathes, but the edge cases as well.


Diffs
-

  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/65507/diff/1/


Testing
---

Ran the tests


Thanks,

Marta Kuczora



Re: Review Request 65380: HIVE-18566: Create tests to cover adding partitions from PartitionSpec

2018-01-30 Thread Marta Kuczora via Review Board


> On Jan. 29, 2018, 2:34 p.m., Peter Vary wrote:
> > Nice throughout test cases.
> > Thanks Marta!
> > 
> > Fix it and ship it!

Thanks a lot Peter for the review.


> On Jan. 29, 2018, 2:34 p.m., Peter Vary wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitions.java
> > Lines 1312 (patched)
> > 
> >
> > Agree with Adam, please move it to another test class.

Moved the tests to a separate class.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65380/#review196431
---


On Jan. 30, 2018, 8:15 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65380/
> ---
> 
> (Updated Jan. 30, 2018, 8:15 a.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18566
> https://issues.apache.org/jira/browse/HIVE-18566
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> he following methods of IMetaStoreClient are covered by this test.
> - int add_partitions_pspec(PartitionSpecProxy)
> 
> The test covers not just the happy pathes, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65380/diff/2/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65380: HIVE-18566: Create tests to cover adding partitions from PartitionSpec

2018-01-30 Thread Marta Kuczora via Review Board


> On Jan. 29, 2018, 2:24 p.m., Adam Szita wrote:
> > Nice patch with well covered cases.
> > Since this is a lot of test code, wouldn't it make sense to move the 
> > PartitionSpec-related code into a separate class (rather than having a 2k+ 
> > LOC class)?

Thanks a lot Adam for the review.
Good point, I moved the tests to a different class.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65380/#review196430
---


On Jan. 30, 2018, 8:15 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65380/
> ---
> 
> (Updated Jan. 30, 2018, 8:15 a.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18566
> https://issues.apache.org/jira/browse/HIVE-18566
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> he following methods of IMetaStoreClient are covered by this test.
> - int add_partitions_pspec(PartitionSpecProxy)
> 
> The test covers not just the happy pathes, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65380/diff/2/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 65380: HIVE-18566: Create tests to cover adding partitions from PartitionSpec

2018-01-30 Thread Marta Kuczora via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65380/
---

(Updated Jan. 30, 2018, 8:15 a.m.)


Review request for hive, Peter Vary and Adam Szita.


Changes
---

Addressed the review findings. Moved the tests to a different class.


Bugs: HIVE-18566
https://issues.apache.org/jira/browse/HIVE-18566


Repository: hive-git


Description
---

he following methods of IMetaStoreClient are covered by this test.
- int add_partitions_pspec(PartitionSpecProxy)

The test covers not just the happy pathes, but the edge cases as well.


Diffs (updated)
-

  
standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAddPartitionsFromPartSpec.java
 PRE-CREATION 


Diff: https://reviews.apache.org/r/65380/diff/2/

Changes: https://reviews.apache.org/r/65380/diff/1-2/


Testing
---

Run the tests


Thanks,

Marta Kuczora



Re: Review Request 65349: HIVE-18544: Create tests to cover methods for appending Partitions

2018-01-29 Thread Marta Kuczora via Review Board


> On Jan. 29, 2018, 11:15 a.m., Adam Szita wrote:
> > Thanks for the patch Marta, it looks very thourough! I've added my 2 cents: 
> > small observations regarding helper methods only.

Thanks a lot Adam for the review.


> On Jan. 29, 2018, 11:15 a.m., Adam Szita wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
> > Lines 489-495 (patched)
> > 
> >
> > When defining table types we should rely on the enum values IMHO: 
> > TableType.EXTERNAL_TABLE.name()

You are right, thanks for pointing this out.


> On Jan. 29, 2018, 11:15 a.m., Adam Szita wrote:
> > standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
> > Lines 536-539 (patched)
> > 
> >
> > Another way of doing this is:
> > 
> > 
> > Arrays.stream(input.split("/")).map(v->v.split("=")[1]).collect(toList())

Thanks for the hint, I fixed this.


- Marta


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65349/#review196418
---


On Jan. 29, 2018, 12:49 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/65349/
> ---
> 
> (Updated Jan. 29, 2018, 12:49 p.m.)
> 
> 
> Review request for hive, Peter Vary and Adam Szita.
> 
> 
> Bugs: HIVE-18544
> https://issues.apache.org/jira/browse/HIVE-18544
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The following methods of IMetaStoreClient are covered by this test.
> - Partition appendPartition(String, String, List)
> - Partition appendPartition(String, String, String)
> 
> The test covers not just the happy pathes, but the edge cases as well.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/65349/diff/2/
> 
> 
> Testing
> ---
> 
> Run the tests
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



  1   2   >