Re: Review Request 72528: ValidTxnManager doesn't consider txns opened and committed between snapshot generation and locking when evaluating ValidTxnListState

2020-06-19 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72528/#review221036
---


Ship it!




+1 since the remaining issue will be fixed in HIVE-23725

- Peter Vary


On June 9, 2020, 8:52 a.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72528/
> ---
> 
> (Updated June 9, 2020, 8:52 a.m.)
> 
> 
> Review request for hive, Jesús Camacho Rodríguez, Peter Varga, and Peter Vary.
> 
> 
> Bugs: HIVE-23503
> https://issues.apache.org/jira/browse/HIVE-23503
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> ValidTxnManager doesn't consider txns opened and committed between snapshot 
> generation and locking when evaluating ValidTxnListState. This causes issues 
> like duplicate inserts in case of a concurrent merge insert & insert.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java e70c92eef4
>   ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java a8c83fc504
>   ql/src/java/org/apache/hadoop/hive/ql/ValidTxnManager.java 7d49c57dda
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 71afcbdc68
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 0383881acc
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 600289f837
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 8a15b7cc5d
>   standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 65df9c2ba9
>   standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 887d4303f4
>   standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClientPreCatalog.java 312936efa8
>   storage-api/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java b8ff03f9c4
>   storage-api/src/java/org/apache/hadoop/hive/common/ValidTxnList.java d4c3b09730
> 
> 
> Diff: https://reviews.apache.org/r/72528/diff/1/
> 
> 
> Testing
> ---
> 
> DbTxnManager tests.
> 
> Faulty scenario:
> 1. Open and generate a snapshot for t1 that merge inserts data from a source 
> table into the target one.
> 2. Open, run and commit t2 that inserts source table data into the target 
> table.
> 3. Run t1 - duplicate data would be inserted into the target table as t2's 
> changes won't be visible to t1.
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>
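
For readers following the faulty scenario above, here is a minimal in-memory sketch of why a stale snapshot leads to duplicates. It is not Hive code: the class StaleSnapshotMergeDemo, the mergeInsert helper and the integer "rows" are all hypothetical stand-ins, and the merge is reduced to "insert every source row that is not visible in the target".

```java
import java.util.ArrayList;
import java.util.List;

class StaleSnapshotMergeDemo {
    // Hypothetical stand-in for MERGE ... WHEN NOT MATCHED THEN INSERT:
    // every source row absent from the *visible* target rows is appended to the real target.
    static void mergeInsert(List<Integer> source, List<Integer> visibleTarget, List<Integer> target) {
        for (Integer row : source) {
            if (!visibleTarget.contains(row)) {
                target.add(row);
            }
        }
    }

    // Replays steps 1-3; the flag decides whether t1 re-reads the table before running.
    static List<Integer> run(boolean refreshSnapshotAfterLocking) {
        List<Integer> source = List.of(5, 7);
        List<Integer> target = new ArrayList<>(List.of(1, 3));

        // 1. t1 opens and records its snapshot of the target table.
        List<Integer> t1Snapshot = new ArrayList<>(target);

        // 2. t2 opens, inserts the source rows into the target and commits.
        target.addAll(source);

        // 3. t1 runs. With the stale snapshot it cannot see t2's rows.
        List<Integer> visible = refreshSnapshotAfterLocking ? new ArrayList<>(target) : t1Snapshot;
        mergeInsert(source, visible, target);
        return target;
    }

    public static void main(String[] args) {
        System.out.println("stale snapshot:     " + run(false));
        System.out.println("refreshed snapshot: " + run(true));
    }
}
```

With the stale snapshot the source rows end up in the target twice; when the snapshot is refreshed after t2's commit, the merge finds them and inserts nothing.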



Re: Review Request 72553: HIVE-23555 Cancel compaction jobs when hive.compactor.worker.timeout is reached

2020-05-28 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72553/
---

(Updated May 28, 2020, 8:58 a.m.)


Review request for hive, Karen Coppage and Laszlo Pinter.


Changes
---

Reworked the tests.
Removed the unused looped parameter from MetastoreThreads.
Made sure that the Worker is stopped at the end of the new tests.


Bugs: HIVE-23555
https://issues.apache.org/jira/browse/HIVE-23555


Repository: hive-git


Description
---

Run the actual execution in a new thread, and use Future.get with timeout
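
As an illustration of the approach described above, here is a minimal sketch using only java.util.concurrent. It is not the patch itself: the class TimedJobRunner and the method runWithTimeout are hypothetical names, and the real Worker would presumably take its timeout from hive.compactor.worker.timeout rather than from a parameter.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class TimedJobRunner {
    // Runs the job on a separate thread. Returns true if it finished in time,
    // false if the timeout was reached and the job was cancelled.
    static boolean runWithTimeout(Callable<Void> job, long timeoutMs) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Void> future = executor.submit(job);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the thread running the job
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            throw new IllegalStateException("Interrupted while waiting for the job", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Job failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean fast = runWithTimeout(() -> null, 1000);
        boolean slow = runWithTimeout(() -> { Thread.sleep(5000); return null; }, 100);
        System.out.println("fast job finished in time: " + fast);
        System.out.println("slow job finished in time: " + slow);
    }
}
```

Note that Future.cancel(true) only interrupts the thread, so the job has to react to interruption (as Thread.sleep does here) for the cancellation to take effect.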


Diffs (updated)
-

  
  hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java 569de706df
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java e70d8783bc
  itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java 32fe535b2b
  ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUpdaterThread.java ecaad509ed
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 5fa3d9ad42
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorThread.java b378d40964
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java fa2ede3738
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MetaStoreCompactorThread.java aa258b331f
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/RemoteCompactorThread.java 4235184fec
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 8180adcd66
  ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 366282a30f
  ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 3ff68a3c7e
  ql/src/test/org/apache/hadoop/hive/ql/stats/TestStatsUpdaterThread.java 84827d1604
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java 9a9ab53fcc
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestWorker.java 443f982d66
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java e20fdaf03d
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreThread.java ea6155200c
  streaming/src/test/org/apache/hive/streaming/TestStreaming.java 6101caac66


Diff: https://reviews.apache.org/r/72553/diff/2/

Changes: https://reviews.apache.org/r/72553/diff/1-2/


Testing
---

Created unit tests to check the timeout functionality.


Thanks,

Peter Vary



Review Request 72553: HIVE-23555 Cancel compaction jobs when hive.compactor.worker.timeout is reached

2020-05-27 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72553/
---

Review request for hive, Karen Coppage and Laszlo Pinter.


Bugs: HIVE-23555
https://issues.apache.org/jira/browse/HIVE-23555


Repository: hive-git


Description
---

Run the actual execution in a new thread, and use Future.get with timeout


Diffs
-

  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorThread.java b378d40964
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/RemoteCompactorThread.java 4235184fec
  ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 8180adcd66
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/CompactorTest.java 9a9ab53fcc
  ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestWorker.java 443f982d66


Diff: https://reviews.apache.org/r/72553/diff/1/


Testing
---

Created unit tests to check the timeout functionality.


Thanks,

Peter Vary



Re: Review Request 72480: HIVE-23242 Fix flaky tests testHouseKeepingThreadExistence

2020-05-20 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72480/#review220839
---


Ship it!




Ship It!

- Peter Vary


On May 20, 2020, 2:03 p.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72480/
> ---
> 
> (Updated May 20, 2020, 2:03 p.m.)
> 
> 
> Review request for hive, Miklos Gergely and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Fix the timing to avoid flakiness.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/MetastoreHousekeepingLeaderTestBase.java
>  a39a9c8e04 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/MetastoreTaskThreadAlwaysTestImpl.java
>  4cd2c58896 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/RemoteMetastoreTaskThreadTestImpl1.java
>  c590b6aad5 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/RemoteMetastoreTaskThreadTestImpl2.java
>  5b50f66c51 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingLeader.java
>  03a8161ea4 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingLeaderEmptyConfig.java
>  75ea637503 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingNonLeader.java
>  0341d3c03b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  57c006b872 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/MetaStoreTestUtils.java
>  0e2c35acaa 
> 
> 
> Diff: https://reviews.apache.org/r/72480/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



Re: Review Request 72480: HIVE-23242 Fix flaky tests testHouseKeepingThreadExistence

2020-05-19 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72480/#review220824
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Line 10130 (original), 10130 (patched)


Could you please add a Javadoc comment here? I think it is especially 
important as startedBackGroundThreads is a test-only parameter (maybe rename 
it to startedBackgroundThread).



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
Lines 10468 (patched)


nit: extra space



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/MetaStoreTestUtils.java
Lines 234 (patched)


Question: Is this log line printed when the HMS is started, but the threads 
are not yet started? Maybe extend the log line with this info?



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/MetaStoreTestUtils.java
Lines 242 (patched)


Question: Is this again the case when the HMS is started, but the HK 
threads are not started? Maybe extend the log line to say that the HMS is 
started?


- Peter Vary


On May 8, 2020, 9:46 a.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72480/
> ---
> 
> (Updated May 8, 2020, 9:46 a.m.)
> 
> 
> Review request for hive, Miklos Gergely and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Fix the timing to avoid flakiness.
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/MetastoreHousekeepingLeaderTestBase.java
>  a39a9c8e04 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/MetastoreTaskThreadAlwaysTestImpl.java
>  4cd2c58896 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/RemoteMetastoreTaskThreadTestImpl1.java
>  c590b6aad5 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/RemoteMetastoreTaskThreadTestImpl2.java
>  5b50f66c51 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingLeader.java
>  03a8161ea4 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingLeaderEmptyConfig.java
>  75ea637503 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/metastore/TestMetastoreHousekeepingNonLeader.java
>  0341d3c03b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  7bba8d6ee6 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/MetaStoreTestUtils.java
>  2702e69f86 
> 
> 
> Diff: https://reviews.apache.org/r/72480/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



Re: Review Request 72281: HIVE-22971: Eliminate file rename in insert-only compactor

2020-05-19 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72281/#review220822
---


Ship it!




Ship It!

- Peter Vary


On May 19, 2020, 5:58 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72281/
> ---
> 
> (Updated May 19, 2020, 5:58 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-22971
> https://issues.apache.org/jira/browse/HIVE-22971
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> File rename is expensive for object stores, so MM (insert-only) compaction 
> should skip that step when committing and write directly to base_x_cZ or 
> delta_x_y_cZ.
> 
> This also fixes the issue that for MM QB compaction the temp tables were 
> stored under the table directory, and these temp dirs were never cleaned up.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java f5ad3a882b7 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  b9db1d1bb98 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  89920ccebf4 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 9410a963518 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> c70d4f33a80 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 
> 4d0e5f703e7 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  724a4375b75 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java
>  1cd95f80155 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> 7f3ccfa04ed 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  6542eef58af 
> 
> 
> Diff: https://reviews.apache.org/r/72281/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 72281: HIVE-22971: Eliminate file rename in insert-only compactor

2020-05-19 Thread Peter Vary via Review Board


> On May 18, 2020, 12:51 p.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
> > Lines 305 (patched)
> > 
> >
> > Might want to add asserts here to check for a non-null argument
> 
> Karen Coppage wrote:
> I think if the StorageDescriptor is null, an NPE *should* be thrown 
> because that would be a huge problem. "Non-null" is in the JavaDoc contract, 
> this method is used once, and I will make it a private method in the next 
> patch.

Since it becomes private, it is even better! :)
Normally for public methods I prefer using asserts, so if someone tries to 
misuse the method it is easier to identify what went wrong. Like this:
```
assert sd != null : "Non-null sd is required";
```
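
One caveat worth spelling out for readers of this thread: Java assert statements are only evaluated when the JVM is started with assertions enabled (-ea), whereas Objects.requireNonNull is always checked. The sketch below contrasts the two; NonNullCheckDemo and its methods are hypothetical, and a plain Object stands in for the StorageDescriptor argument to keep it self-contained.

```java
import java.util.Objects;

class NonNullCheckDemo {
    // `sd` stands in for the StorageDescriptor argument discussed above.
    static String describeWithAssert(Object sd) {
        assert sd != null : "Non-null sd is required"; // only evaluated when the JVM runs with -ea
        return "sd=" + sd;
    }

    static String describeWithRequireNonNull(Object sd) {
        Objects.requireNonNull(sd, "Non-null sd is required"); // always evaluated
        return "sd=" + sd;
    }

    public static void main(String[] args) {
        System.out.println(describeWithRequireNonNull("location"));
        try {
            describeWithRequireNonNull(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Either way the caller gets a message naming the offending argument instead of a bare NPE from deeper in the method.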


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72281/#review220805
---


On May 19, 2020, 5:58 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72281/
> ---
> 
> (Updated May 19, 2020, 5:58 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-22971
> https://issues.apache.org/jira/browse/HIVE-22971
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> File rename is expensive for object stores, so MM (insert-only) compaction 
> should skip that step when committing and write directly to base_x_cZ or 
> delta_x_y_cZ.
> 
> This also fixes the issue that for MM QB compaction the temp tables were 
> stored under the table directory, and these temp dirs were never cleaned up.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java f5ad3a882b7 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  b9db1d1bb98 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  89920ccebf4 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 9410a963518 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> c70d4f33a80 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 
> 4d0e5f703e7 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  724a4375b75 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java
>  1cd95f80155 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> 7f3ccfa04ed 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  6542eef58af 
> 
> 
> Diff: https://reviews.apache.org/r/72281/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 72281: HIVE-22971: Eliminate file rename in insert-only compactor

2020-05-18 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72281/#review220805
---



Minor comments only.
Thanks for the patch!


common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 2635-2636 (original), 2635-2636 (patched)


nit: spaces at the end of the lines (seems to me usually we do not do that)



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
Lines 305 (patched)


Might want to add asserts here to check for a non-null argument


- Peter Vary


On March 30, 2020, 6:18 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72281/
> ---
> 
> (Updated March 30, 2020, 6:18 a.m.)
> 
> 
> Review request for hive and Laszlo Pinter.
> 
> 
> Bugs: HIVE-22971
> https://issues.apache.org/jira/browse/HIVE-22971
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> File rename is expensive for object stores, so MM (insert-only) compaction 
> should skip that step when committing and write directly to base_x_cZ or 
> delta_x_y_cZ.
> 
> This also fixes the issue that for MM QB compaction the temp tables were 
> stored under the table directory, and these temp dirs were never cleaned up.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 34df01e60e 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  95fa6641f2 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  9659a3f048 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 543ec0b991 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> f47c23a6de 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 
> 1bf0beea40 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  114b6f7a74 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMinorQueryCompactor.java
>  383891bfad 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> 7f3ccfa04e 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  6542eef58a 
> 
> 
> Diff: https://reviews.apache.org/r/72281/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 72481: HIVE-23234: Optimize TxnHandler::allocateTableWriteIds

2020-05-08 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72481/#review220689
---



Thanks Marci,
A few questions below - probably I just do not understand this part of the 
code well enough.

Another question for the perf test: How many threads are you using?

Thanks,
Peter


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
Line 1102 (original)


Why did we remove this?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
Line 1137 (original)


Why did we remove this?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 1762-1765 (original), 1759-1762 (patched)


Is this a functionality or performance change?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 2021 (original), 2013 (patched)


Why is this change required?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 2067 (original), 2057 (patched)


Why is this change needed?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 2079 (original), 2070 (patched)


Why is this change needed?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 2090 (original), 2081 (patched)


Why is this change needed?


- Peter Vary


On May 7, 2020, 3:55 p.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72481/
> ---
> 
> (Updated May 7, 2020, 3:55 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Removed global mutex on writeId allocation, which means write ids can now be 
> allocated concurrently for different tables without blocking each other, 
> speeding up execution (perf test results below). Concurrent 
> allocateTableWriteIds() operations targeting the same table are still mutexed 
> by an S4U if the table is already present in next_write_id, otherwise a race 
> condition to insert the table into next_write_id is solved by retrying after 
> catching the duplicate key exception (the thread which commits later will be 
> the one to retry).
> 
> The situation is similar when allocateTableWriteIds() and 
> replTableWriteIdState() are running concurrently - if they target different 
> tables, they won't block each other anymore. If they target the same table, 
> and the table is already inserted into next_write_id, replTableWriteIdState() 
> returns early and allocateTableWriteIds() updates the next id. If the table 
> is not yet in next_write_id, they might attempt to insert the same row 
> concurrently, in which case whichever commits later will get a duplicate key 
> exception and retry the operation, just as above.
> 
> 
> Diffs
> -
> 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 868da0c7a0 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  d59f863b11 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  cf41ef8aaf 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  1e177f4a7b 
> 
> 
> Diff: https://reviews.apache.org/r/72481/diff/1/
> 
> 
> Testing
> ---
> 
> Unit test in TestTxnHandler
> + Perf tests:
> dbType     sameTable  variant   ms/op   error
> MYSQL      FALSE      original  46.93   3.041
> MYSQL      FALSE      patched   19.283  1.311
> MYSQL      TRUE       original  50.185  3.595
> MYSQL      TRUE       patched   32.254  2.164
> ORACLE     FALSE      original  57.609  4.461
> ORACLE     FALSE      patched   25.721  2.551
> ORACLE     TRUE       original  59.668  3.172
> ORACLE     TRUE       patched   39.061  2.548
> POSTGRES   FALSE      original  39.364  2.94
> POSTGRES   FALSE      patched   18.518  1.038
> POSTGRES   TRUE       original  39.868  2.679
> POSTGRES   TRUE       patched   28.874  1.768
> SQLSERVER  FALSE      original  45.252  1.643
> SQLSERVER  FALSE      patched   24.583  1.529
> SQLSERVER  TRUE       original  49.149  3.45
> SQLSERVER  TRUE       patched   32.918  1.654
> 
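
The retry-on-duplicate-key idea from the description above can be sketched with an in-memory analogy. This is only an illustration under stated assumptions: WriteIdAllocatorSketch is a hypothetical class, a ConcurrentHashMap stands in for the next_write_id table, an atomic add stands in for the per-row S4U, a non-null return from putIfAbsent stands in for the duplicate key exception, and write ids are assumed to start at 1.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class WriteIdAllocatorSketch {
    // Stand-in for the next_write_id table: table name -> next write id to hand out.
    private final Map<String, AtomicLong> nextWriteId = new ConcurrentHashMap<>();

    // Allocates `count` consecutive write ids for `table` and returns the first one.
    long allocate(String table, long count) {
        while (true) {
            AtomicLong row = nextWriteId.get(table);
            if (row != null) {
                // Row exists: only allocations for this same table contend on it.
                return row.getAndAdd(count);
            }
            // Row missing: try to insert it. A non-null result means another thread
            // inserted the row first (the "duplicate key" case), so the loop retries.
            if (nextWriteId.putIfAbsent(table, new AtomicLong(1 + count)) == null) {
                return 1;
            }
        }
    }

    public static void main(String[] args) {
        WriteIdAllocatorSketch allocator = new WriteIdAllocatorSketch();
        // Many concurrent single-id allocations against the same, initially missing, row.
        Set<Long> ids = IntStream.range(0, 1000).parallel()
                .mapToObj(i -> allocator.allocate("db.target", 1))
                .collect(Collectors.toSet());
        System.out.println("distinct ids allocated: " + ids.size());
    }
}
```

There is no global lock in the sketch, so allocations for different tables never block each other, which is the property the patch description is after.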

Re: Review Request 72470: ACID: Concurrent MERGE INSERT operations produce duplicates

2020-05-07 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72470/#review220673
---


Ship it!




Ship It!

- Peter Vary


On May 7, 2020, 1:21 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72470/
> ---
> 
> (Updated May 7, 2020, 1:21 p.m.)
> 
> 
> Review request for hive, Marton Bod, Peter Varga, and Peter Vary.
> 
> 
> Bugs: HIVE-23349
> https://issues.apache.org/jira/browse/HIVE-23349
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> 2 concurrent MERGE INSERT operations generate duplicates due to lack of 
> locking.
> MERGE INSERT is treated as regular INSERT, it acquires SHARED_READ lock that 
> doesn't prevent other INSERTs. We should use EXCLUSIVE lock here or 
> EXCL_WRITE if hive.txn.write.xlock=false;
> 
> create table target (a int, b int) stored as orc TBLPROPERTIES 
> ('transactional'='true');
> insert into target values (1,2), (3,4)
> create table source (a int, b int)
> execute in parallel:
> 
> insert into source values (5,6), (7,8)
> 
> PS:
> 
> The current patch doesn't cover the following scenario:
> 1) T1 (merge-insert) opens txns & records snapshot;
> 2) T2 (insert/merge-insert) opens txns, records snapshot & locks it;
> 3) T2 commits its changes and unlocks T1; 
> 4) T1 locks snapshot and validates txn list, however only txns with txnId 
> lower than T1's are being considered, T2 changes are ignored -> duplicates 
> are generated.
> 
> 
> merge-insert/merge-insert scenario could be fixed by leveraging write-write 
> conflict check mechanism. We just need to set operation type for merge-insert 
> to update.
> However it won't solve issue with merge-insert/insert. 
> 
> We should consider moving locking before snapshot generation, current 
> snapshot re-check mechanism doesn't handle described scenarios.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Context.java 9f59d4cea3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ExplainTask.java c1f94d165b 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 998c05e37d 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java deaab89c1f 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/MergeSemanticAnalyzer.java 
> 3ffdcec528 
>   
> ql/src/test/org/apache/hadoop/hive/ql/lockmgr/DbTxnManagerEndToEndTestBase.java
>  b435e79c3c 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 1687425bcb 
>   ql/src/test/results/clientpositive/llap/explain_locks.q.out d62f6ccafd 
> 
> 
> Diff: https://reviews.apache.org/r/72470/diff/4/
> 
> 
> Testing
> ---
> 
> Manual; added a number of merge-related test scenarios to TestDbTxnManager2, 
> modified explain_locks.q
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>
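
To make step 4 of the uncovered scenario in the PS concrete, here is a small sketch of a snapshot re-check that only looks at txns below the checker's own txnId. It is an illustration, not the ValidTxnManager logic: SnapshotRecheckSketch, snapshotStillValid and the onlyBelowOwnTxnId flag are hypothetical, and the list is assumed to hold the ids of txns that committed writes to the same table after T1 recorded its snapshot.

```java
import java.util.List;

class SnapshotRecheckSketch {
    // Returns true when T1's snapshot is still considered valid after locking.
    // `onlyBelowOwnTxnId` mimics the narrower check described in the scenario.
    static boolean snapshotStillValid(long t1TxnId, List<Long> committedSinceSnapshot,
                                      boolean onlyBelowOwnTxnId) {
        for (long txnId : committedSinceSnapshot) {
            if (onlyBelowOwnTxnId && txnId > t1TxnId) {
                continue; // txn opened after T1: skipped by the narrower check
            }
            return false; // a relevant commit happened, so the snapshot is outdated
        }
        return true;
    }

    public static void main(String[] args) {
        long t1 = 10L;                   // T1 opened first
        List<Long> t2 = List.of(11L);    // T2 opened and committed after T1's snapshot
        System.out.println("narrow check keeps snapshot: " + snapshotStillValid(t1, t2, true));
        System.out.println("full check keeps snapshot:   " + snapshotStillValid(t1, t2, false));
    }
}
```

With the narrow check T2's commit goes unnoticed and T1 proceeds on its outdated snapshot, which is where the duplicates come from.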



Re: Review Request 72436: Locks: Implement zero-wait readers

2020-04-28 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72436/#review220526
---


Ship it!




Ship It!

- Peter Vary


On April 28, 2020, 4:23 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72436/
> ---
> 
> (Updated April 28, 2020, 4:23 p.m.)
> 
> 
> Review request for hive, Marton Bod and Peter Vary.
> 
> 
> Bugs: HIVE-23293
> https://issues.apache.org/jira/browse/HIVE-23293
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With a new lock type (EXCL_WRITE) for INSERT_OVERWRITE, SHARED_READ does not 
> have to wait for any lock - it can fail fast for a pending EXCLUSIVE, 
> because even if there is an EXCL_WRITE or SHARED_WRITE pending, there's no 
> semantic reason to wait for them.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java 8e643fe844 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java b4dac4346e 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> f90396b2a3 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockRequest.java
>  7402fb30eb 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockResponse.java
>  fdaab4b394 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  9fb7ff011a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  4f317b3453 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  e64ae0ead2 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockRequestBuilder.java
>  93da0f60ec 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 1e3d6e9b8b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  fe39b0b36e 
> 
> 
> Diff: https://reviews.apache.org/r/72436/diff/4/
> 
> 
> Testing
> ---
> 
> DbTxnManager tests
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>
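
The zero-wait rule from the description above can be summarised in a few lines. This is a sketch under assumptions, not the TxnHandler implementation: ZeroWaitReaderSketch, the Decision enum and requestSharedRead are hypothetical, and the LockType enum only reuses the lock names mentioned in the description.

```java
import java.util.List;

class ZeroWaitReaderSketch {
    enum LockType { SHARED_READ, SHARED_WRITE, EXCL_WRITE, EXCLUSIVE }
    enum Decision { ACQUIRED, FAIL_FAST }

    // Decides what a zero-wait SHARED_READ request does, given the locks already
    // held or pending on the same resource. It never waits.
    static Decision requestSharedRead(List<LockType> existing) {
        return existing.contains(LockType.EXCLUSIVE) ? Decision.FAIL_FAST : Decision.ACQUIRED;
    }

    public static void main(String[] args) {
        System.out.println(requestSharedRead(List.of(LockType.EXCL_WRITE, LockType.SHARED_WRITE)));
        System.out.println(requestSharedRead(List.of(LockType.EXCLUSIVE)));
    }
}
```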



Re: Review Request 72444: HIVE-23280: Trigger compaction with old aborted txns

2020-04-28 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72444/#review220523
---


Ship it!




Ship It!

- Peter Vary


On April 28, 2020, 10:37 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72444/
> ---
> 
> (Updated April 28, 2020, 10:37 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-23280
> https://issues.apache.org/jira/browse/HIVE-23280
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When a txn is aborted and the compaction threshold for number of aborted txns 
> is not reached then the aborted transaction can remain forever in the RDBMS 
> database. This could result in several serious performance degradations:
> 
> - getOpenTxns has to list this aborted txn forever
> - TXN_TO_WRITE_ID table is not cleaned
> 
> We should add a threshold, so after a given time the compaction is started 
> anyway.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e3ddbf197b 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 37a5862791 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  15fcfc0e35 
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
> 1151466f8c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CompactionInfoStruct.java
>  31b6ed450b 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  9fb7ff011a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  4f317b3453 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  e64ae0ead2 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 1e3d6e9b8b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionInfo.java
>  70d63ab18b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  2344c2d5f6 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  87130a519d 
> 
> 
> Diff: https://reviews.apache.org/r/72444/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>
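
The proposed threshold can be pictured as a second, time-based trigger next to the existing count-based one. The sketch below is only an illustration: AbortedTxnCompactionSketch and shouldCompact are hypothetical, the parameter names do not correspond to actual HiveConf settings, and the strict ">" comparisons are an assumption.

```java
import java.util.concurrent.TimeUnit;

class AbortedTxnCompactionSketch {
    // Returns true when compaction should be initiated for a table or partition.
    static boolean shouldCompact(int abortedTxnCount, int countThreshold,
                                 long oldestAbortedAgeMs, long ageThresholdMs) {
        // Existing trigger: too many aborted txns.
        boolean tooManyAborts = abortedTxnCount > countThreshold;
        // Proposed trigger: at least one aborted txn that is older than the time limit.
        boolean abortTooOld = abortedTxnCount > 0 && oldestAbortedAgeMs > ageThresholdMs;
        return tooManyAborts || abortTooOld;
    }

    public static void main(String[] args) {
        long limit = TimeUnit.HOURS.toMillis(12);
        // A single aborted txn, far below the count threshold, but older than the limit.
        System.out.println(shouldCompact(1, 1000, TimeUnit.HOURS.toMillis(13), limit));
        // The same txn while it is still young: nothing is triggered yet.
        System.out.println(shouldCompact(1, 1000, TimeUnit.HOURS.toMillis(1), limit));
    }
}
```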



Re: Review Request 72444: HIVE-23280: Trigger compaction with old aborted txns

2020-04-28 Thread Peter Vary via Review Board


> On April 28, 2020, 8:54 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
> > Lines 113 (patched)
> > 
> >
> > Do we know for sure that this query works for a null partition on all 
> > of the supported databases?
> 
> Karen Coppage wrote:
> Group by partition was already present, I just removed a conditional 
> clause and added 2 columns to the projection without touching partition 
> handling. Do you think it's worth testing in each supported db even for these 
> changes?

Ok. I missed that, and thought that it was a more thorough change.
No need for extra tests then.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72444/#review220514
---


On April 28, 2020, 10:37 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72444/
> ---
> 
> (Updated April 28, 2020, 10:37 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-23280
> https://issues.apache.org/jira/browse/HIVE-23280
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When a txn is aborted and the compaction threshold for number of aborted txns 
> is not reached then the aborted transaction can remain forever in the RDBMS 
> database. This could result in several serious performance degradations:
> 
> getOpenTxns has to list this aborted txn forever
> TXN_TO_WRITE_ID table is not cleaned
> We should add a threshold, so after a given time the compaction is started 
> anyway.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e3ddbf197b 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 37a5862791 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  15fcfc0e35 
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
> 1151466f8c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CompactionInfoStruct.java
>  31b6ed450b 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  9fb7ff011a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  4f317b3453 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  e64ae0ead2 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 1e3d6e9b8b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionInfo.java
>  70d63ab18b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  2344c2d5f6 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  87130a519d 
> 
> 
> Diff: https://reviews.apache.org/r/72444/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 72436: Locks: Implement zero-wait readers

2020-04-28 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72436/#review220517
---



Quick questions.
Thanks for the patch,
Peter


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 4353 (patched)


Why was this moved?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 4389-4391 (patched)


This is strange to me.
We do not abort the transaction, yet we throw a TxnAbortedException and 
remove the locks?


- Peter Vary


On Apr. 27, 2020, 11:24 a.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72436/
> ---
> 
> (Updated Apr. 27, 2020, 11:24 a.m.)
> 
> 
> Review request for hive, Marton Bod and Peter Vary.
> 
> 
> Bugs: HIVE-23293
> https://issues.apache.org/jira/browse/HIVE-23293
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With a new lock type (EXCL_WRITE) for INSERT_OVERWRITE, SHARED_READ does not 
> have to wait for any lock - it can fail fast for a pending EXCLUSIVE, 
> because even if there is an EXCL_WRITE or SHARED_WRITE pending, there's no 
> semantic reason to wait for them.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java b4dac4346e 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 497cedd61f 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/LockRequest.java
>  7402fb30eb 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  9fb7ff011a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  4f317b3453 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  e64ae0ead2 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockRequestBuilder.java
>  93da0f60ec 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 1e3d6e9b8b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  580786832e 
> 
> 
> Diff: https://reviews.apache.org/r/72436/diff/2/
> 
> 
> Testing
> ---
> 
> DbTxnManager tests
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 72444: HIVE-23280: Trigger compaction with old aborted txns

2020-04-28 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72444/#review220514
---



Thanks for the patch Karen!
Few questions below.

Thanks,
Peter


common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 2852 (patched)


Can we add a comment about valid values, and how to turn this off?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
Lines 92 (patched)


Can we check the valid values, and can we turn this function off?



ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java
Lines 262 (patched)


Could we add a check that we do not start compaction before the 
threshold?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
Lines 113 (patched)


Do we know for sure that this query works with a null partition on all of 
the supported databases?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
Lines 151 (patched)


Why Instant? I usually use System.currentTimeMillis(). Also, it might be 
worth getting it only once, and not for every compactionInfo.



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
Lines 153 (patched)


Maybe single statement return, like:
return firstAbortedTxnTime + abortedTimeThreshold < 
System.currentTimeMillis()
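
A minimal sketch of that suggestion, assuming the threshold and the first aborted txn time are both in epoch milliseconds. The class name, the method name, and the explicit `now` parameter are hypothetical and not taken from the patch; passing `now` in is one way to read the clock once and reuse it for every compactionInfo.

```java
final class AbortedTxnThresholdSketch {
  /**
   * Returns true when the oldest aborted txn is older than the threshold.
   * All three arguments are assumed to be epoch milliseconds.
   */
  static boolean thresholdExceeded(long firstAbortedTxnTime,
                                   long abortedTimeThreshold,
                                   long now) {
    // Single-statement return, as proposed in the review comment.
    return firstAbortedTxnTime + abortedTimeThreshold < now;
  }
}
```

A caller would read `System.currentTimeMillis()` once before the loop and pass the same value for each candidate.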


- Peter Vary


On Apr. 28, 2020, 8:39 a.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72444/
> ---
> 
> (Updated Apr. 28, 2020, 8:39 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-23280
> https://issues.apache.org/jira/browse/HIVE-23280
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When a txn is aborted and the compaction threshold for number of aborted txns 
> is not reached then the aborted transaction can remain forever in the RDBMS 
> database. This could result in several serious performance degradations:
> 
> getOpenTxns has to list this aborted txn forever
> TXN_TO_WRITE_ID table is not cleaned
> We should add a threshold, so after a given time the compaction is started 
> anyway.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java e3ddbf197b 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 37a5862791 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  15fcfc0e35 
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
> 1151466f8c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CompactionInfoStruct.java
>  31b6ed450b 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  9fb7ff011a 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  4f317b3453 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  e64ae0ead2 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 1e3d6e9b8b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionInfo.java
>  70d63ab18b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  2344c2d5f6 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  87130a519d 
> 
> 
> Diff: https://reviews.apache.org/r/72444/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 72388: HIVE-23048 Use sequences for TXN_ID generation

2020-04-27 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72388/#review220500
---


Fix it, then Ship it!




Fix it and ship it


standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
Lines 72 (patched)


The TXN_ID should be 10-1, otherwise we will try to generate a 
quite big list of openTxns :)



standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
Lines 76 (patched)


TXN_ID should be 10-1



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Oracle.java
Line 27 (original), 27 (patched)


Is this public?


- Peter Vary


On Apr. 24, 2020, 3:40 p.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72388/
> ---
> 
> (Updated Apr. 24, 2020, 3:40 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * Use sequences for TXN_ID generation.
>   * Make it possible to open transactions in parallel
>   * drop Oracle 11g support
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 37a5862791 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  15fcfc0e35 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> f512c1df19 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 2c13e8dd03 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 51b0fa336f 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForMmTable.java 
> 6525ffc00a 
>   ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 
> 1435269ed3 
>   
> ql/src/test/org/apache/hadoop/hive/ql/lockmgr/DbTxnManagerEndToEndTestBase.java
>  PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 497cedd61f 
>   
> ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManagerIsolationProperties.java
>  PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
> 1151466f8c 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  a874121e12 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/SQLGenerator.java
>  49b737ecf9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  2344c2d5f6 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  385f9d72cd 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  4a6fa6f620 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  87130a519d 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  1ace9d3ef0 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  8a3cd56658 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  2e0177723d 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  9f3951575b 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  0512a45cad 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
>  4b82e36ab4 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
>  db398e5f66 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
>  1be83fc4a9 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
>  e6e30160af 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
>  b90cecb173 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/DbInstallBase.java
>  c1a1629548 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/DatabaseRule.java
>  a6d22d19ef 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Oracle.java
>  0b070e19ac 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestOpenTxn.java
>  PRE-CREATION 
> 
> 
> Diff: 

Re: Review Request 72388: HIVE-23048 Use sequences for TXN_ID generation

2020-04-24 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72388/#review220487
---



Overall looks good.
There are some changes which might not be needed.
Please check the access modifiers for functions and reduce them wherever 
possible.

Would be good to see the perf numbers for all of the db-s

Thanks for the patch


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/SQLGenerator.java
Lines 76 (patched)


Do we need this change?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 428 (patched)


nit: extra line missing
Why is this needed?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 847 (patched)


private?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 910 (patched)


Maybe private?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 1453 (patched)


Why is this change needed?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
Line 47 (original), 47 (patched)


What is the reason behind this change?


- Peter Vary


On Apr. 23, 2020, 2:48 p.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72388/
> ---
> 
> (Updated Apr. 23, 2020, 2:48 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> * Use sequences for TXN_ID generation.
>   * Make it possible to open transactions in parallel
>   * drop Oracle 11g support
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 37a5862791 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  15fcfc0e35 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 1d211857bf 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands2.java 2c13e8dd03 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommands3.java 51b0fa336f 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForMmTable.java 
> 6525ffc00a 
>   ql/src/test/org/apache/hadoop/hive/ql/TxnCommandsBaseForTests.java 
> 1435269ed3 
>   
> ql/src/test/org/apache/hadoop/hive/ql/lockmgr/DbTxnManagerEndToEndTestBase.java
>  PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 73d3b91585 
>   
> ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManagerIsolationProperties.java
>  PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
> 1151466f8c 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  a874121e12 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/tools/SQLGenerator.java
>  49b737ecf9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  2344c2d5f6 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  385f9d72cd 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  d0d0320584 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  87130a519d 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  1ace9d3ef0 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  8a3cd56658 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  2e0177723d 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  9f3951575b 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  0512a45cad 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
>  4b82e36ab4 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
>  db398e5f66 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
>  1be83fc4a9 
>   
> 

Re: Review Request 72387: Locks: Add new lock implementations for always zero-wait readers

2020-04-22 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72387/#review220426
---


Ship it!





ql/src/java/org/apache/hadoop/hive/ql/ValidTxnManager.java
Lines 96 (patched)


nit: new line?



ql/src/java/org/apache/hadoop/hive/ql/ValidTxnManager.java
Line 113 (original), 112 (patched)


Isn't a reference-level check risky here?
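
To illustrate the concern in general terms: in Java, `==` on objects compares references, so two equal values held in distinct objects compare as different. The sketch below uses plain strings as a stand-in; which objects are actually compared in ValidTxnManager is not shown in this thread, so the class and method names here are hypothetical.

```java
import java.util.Objects;

final class ReferenceCheckSketch {
  /** Reference-level check: true only if both variables point to the same object. */
  static boolean sameReference(String a, String b) {
    return a == b;
  }

  /** Value-level check: true if both are null or equals() says they match. */
  static boolean sameValue(String a, String b) {
    return Objects.equals(a, b);
  }
}
```

A reference check is only safe when the code guarantees that the same instance is reused, which is why a value comparison is usually the more defensive choice.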


- Peter Vary


On Apr. 22, 2020, 2:29 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72387/
> ---
> 
> (Updated Apr. 22, 2020, 2:29 p.m.)
> 
> 
> Review request for hive, Marton Bod and Peter Vary.
> 
> 
> Bugs: HIVE-19369
> https://issues.apache.org/jira/browse/HIVE-19369
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Hive Locking with Micro-managed and full-ACID tables needs a better locking 
> implementation which allows for no-wait readers always.
> 
> EXCL_DROP
> EXCL_WRITE
> SHARED_WRITE
> SHARED_READ
> 
> Short write-up
> 
> EXCL_DROP is a "drop partition" or "drop table" and waits for all others to 
> exit
> EXCL_WRITE excludes all writes and will wait for all existing SHARED_WRITE to 
> exit.
> SHARED_WRITE allows all SHARED_WRITES to go through, but will wait for an 
> EXCL_WRITE & EXCL_DROP (waiting so that you can do drop + insert in different 
> threads).
> 
> SHARED_READ does not wait for any lock - it fails fast for a pending 
> EXCL_DROP, because even if there is an EXCL_WRITE or SHARED_WRITE pending, 
> there's no semantic reason to wait for them to succeed before going ahead 
> with a SHARED_WRITE.
> 
> a select * => SHARED_READ
> an insert into => SHARED_WRITE
> an insert overwrite or MERGE => EXCL_WRITE
> a drop table => EXCL_DROP
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16bae920df 
>   ql/src/java/org/apache/hadoop/hive/ql/ValidTxnManager.java 4885e437aa 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 77878ca40b 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 1d211857bf 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 73d3b91585 
>   ql/src/test/queries/clientpositive/explain_locks.q 3c11560c5f 
>   ql/src/test/results/clientpositive/llap/explain_locks.q.out 3183533478 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockRequestBuilder.java
>  22902a9c20 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockTypeComparator.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  d080df417b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
>  f928bf781b 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
>  3d69a6e7dc 
> 
> 
> Diff: https://reviews.apache.org/r/72387/diff/5/
> 
> 
> Testing
> ---
> 
> Added a number of tests under TestDbTxnManager2 and TestTxnHandler, and 
> extended the explain_locks.q test.
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 72392: HIVE-23103 Oracle statement batching

2020-04-21 Thread Peter Vary via Review Board


> On Apr. 21, 2020, 7:51 a.m., Marton Bod wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
> > Lines 662 (patched)
> > 
> >
> > Once this part executes, we wouldn't have 'begin' for the next batch, 
> > no? Also, the sb would need to be cleared I think

good find!
Fixed it


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72392/#review220387
---


On Apr. 21, 2020, 12:41 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72392/
> ---
> 
> (Updated Apr. 21, 2020, 12:41 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Marton Bod.
> 
> 
> Bugs: HIVE-23103
> https://issues.apache.org/jira/browse/HIVE-23103
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Examine how to really get better performance for oracle statement batches.
> 
> Oracle JDBC doc describes:
> 
> The Oracle implementation of standard update batching does not implement true 
> batching for generic statements and callable statements. Even though Oracle 
> JDBC supports the use of standard batching for Statement and 
> CallableStatement objects, you are unlikely to see performance improvement.
> 
> I would look for connection properties to set, so it is handled anyway, or if 
> not, then use:
> 
> begin
>   query1;
>   query2;
>   query3;
> end;
> so that we will have only a single roundtrip to the db.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  bb29410e7d 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  d080df417b 
> 
> 
> Diff: https://reviews.apache.org/r/72392/diff/2/
> 
> 
> Testing
> ---
> 
> Baseline:
> Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
> TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  42.988 ± 4.569  ms/op
> TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  45.029 ± 4.686  ms/op
> 
> After patch:
> Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
> TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  36.208 ± 3.869  ms/op
> TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  37.038 ± 3.746  ms/op
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



Re: Review Request 72392: HIVE-23103 Oracle statement batching

2020-04-21 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72392/
---

(Updated Apr. 21, 2020, 12:41 p.m.)


Review request for hive, Denys Kuzmenko and Marton Bod.


Changes
---

Addressed Marton's comment


Bugs: HIVE-23103
https://issues.apache.org/jira/browse/HIVE-23103


Repository: hive-git


Description
---

Examine how to really get better performance for oracle statement batches.

Oracle JDBC doc describes:

The Oracle implementation of standard update batching does not implement true 
batching for generic statements and callable statements. Even though Oracle 
JDBC supports the use of standard batching for Statement and CallableStatement 
objects, you are unlikely to see performance improvement.

I would look for connection properties to set, so it is handled anyway, or if 
not, then use:

begin
  query1;
  query2;
  query3;
end;
so that we will have only a single roundtrip to the db.
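
A minimal sketch of the begin/end wrapping described above, under the assumption that the statements are plain SQL strings without trailing semicolons. The class name, method name, and `maxPerBlock` parameter are hypothetical and are not the TxnDbUtil implementation. Each block starts with its own `begin`, and the builder is reset after a block is emitted so that the next batch does not inherit earlier statements.

```java
import java.util.ArrayList;
import java.util.List;

final class OracleBlockBatchSketch {
  /**
   * Groups queries into anonymous PL/SQL blocks of at most maxPerBlock
   * statements, so each block can be sent in one roundtrip.
   */
  static List<String> toBlocks(List<String> queries, int maxPerBlock) {
    if (maxPerBlock < 1) {
      throw new IllegalArgumentException("maxPerBlock must be positive");
    }
    List<String> blocks = new ArrayList<>();
    StringBuilder sb = new StringBuilder();
    int inBlock = 0;
    for (String query : queries) {
      if (inBlock == 0) {
        sb.append("begin ");          // every block needs its own 'begin'
      }
      sb.append(query).append("; ");
      inBlock++;
      if (inBlock == maxPerBlock) {
        sb.append("end;");
        blocks.add(sb.toString());
        sb.setLength(0);              // clear the builder for the next block
        inBlock = 0;
      }
    }
    if (inBlock > 0) {                // flush the last, partially filled block
      sb.append("end;");
      blocks.add(sb.toString());
    }
    return blocks;
  }
}
```

Each returned string would then be executed as a single statement; how it is executed and how errors inside a block are reported is outside this sketch.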


Diffs (updated)
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
 bb29410e7d 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 d080df417b 


Diff: https://reviews.apache.org/r/72392/diff/2/

Changes: https://reviews.apache.org/r/72392/diff/1-2/


Testing
---

Baseline:
Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  42.988 ± 4.569  ms/op
TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  45.029 ± 4.686  ms/op

After patch:
Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  36.208 ± 3.869  ms/op
TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  37.038 ± 3.746  ms/op


Thanks,

Peter Vary



Re: Review Request 72387: Locks: Add new lock implementations for always zero-wait readers

2020-04-20 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72387/#review220374
---



Some nits; I did not check all the unit test changes. Is there anything I 
should check in particular?


common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 2727-2730 (original), 2727-2730 (patched)


Is this config still needed in this case?



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 2732 (patched)


Maybe rephrase?
Manages concurrency levels for ACID resources. Enables users to run 
parallel queries by letting write-write conflict resolution happen only at 
the commit phase.
- If true - no commit phase conflict resolution: 
   - INSERT OVERWRITE requests EXCLUSIVE locks
   - UPDATE/DELETE requests EXCL_WRITE lock
   - INSERT requests SHARED_READ lock
- If false - write might fail when committed on conflict check: 
   - INSERT OVERWRITE requests EXCL_WRITE locks
   - UPDATE/DELETE/INSERT requests SHARED_READ lock
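
The mapping in the proposed wording can be sketched as a small lookup. This mirrors the list above exactly as written and is not taken from AcidUtils; the enum names, the method name, and the boolean parameter standing in for the config value are all hypothetical.

```java
final class LockTypeMappingSketch {
  /** Hypothetical operation kinds, named after the statements in the list above. */
  enum Op { INSERT_OVERWRITE, UPDATE, DELETE, INSERT }

  /** Lock names as they appear in the proposed config description. */
  enum Lock { EXCLUSIVE, EXCL_WRITE, SHARED_READ }

  /**
   * configValue == true:  no commit-phase conflict resolution.
   * configValue == false: a write might fail on the conflict check at commit.
   */
  static Lock lockFor(Op op, boolean configValue) {
    switch (op) {
      case INSERT_OVERWRITE:
        return configValue ? Lock.EXCLUSIVE : Lock.EXCL_WRITE;
      case UPDATE:
      case DELETE:
        return configValue ? Lock.EXCL_WRITE : Lock.SHARED_READ;
      case INSERT:
        return Lock.SHARED_READ;   // same lock in both modes per the list above
      default:
        throw new IllegalArgumentException("Unknown operation: " + op);
    }
  }
}
```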



ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
Line 3027 (original), 3027-3028 (patched)


Maybe get once, and store?



ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java
Lines 758 (patched)


nit: remove spaces



ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java
Line 1142 (original), 1265 (patched)


nit: space too


- Peter Vary


On Apr. 20, 2020, 7:23 a.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72387/
> ---
> 
> (Updated Apr. 20, 2020, 7:23 a.m.)
> 
> 
> Review request for hive, Marton Bod and Peter Vary.
> 
> 
> Bugs: HIVE-19369
> https://issues.apache.org/jira/browse/HIVE-19369
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Hive Locking with Micro-managed and full-ACID tables needs a better locking 
> implementation which allows for no-wait readers always.
> 
> EXCL_DROP
> EXCL_WRITE
> SHARED_WRITE
> SHARED_READ
> 
> Short write-up
> 
> EXCL_DROP is a "drop partition" or "drop table" and waits for all others to 
> exit
> EXCL_WRITE excludes all writes and will wait for all existing SHARED_WRITE to 
> exit.
> SHARED_WRITE allows all SHARED_WRITES to go through, but will wait for an 
> EXCL_WRITE & EXCL_DROP (waiting so that you can do drop + insert in different 
> threads).
> 
> SHARED_READ does not wait for any lock - it fails fast for a pending 
> EXCL_DROP, because even if there is an EXCL_WRITE or SHARED_WRITE pending, 
> there's no semantic reason to wait for them to succeed before going ahead 
> with a SHARED_WRITE.
> 
> a select * => SHARED_READ
> an insert into => SHARED_WRITE
> an insert overwrite or MERGE => EXCL_WRITE
> a drop table => EXCL_DROP
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 7b3acad511 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 77878ca40b 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 1d211857bf 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 73d3b91585 
>   ql/src/test/queries/clientpositive/explain_locks.q 3c11560c5f 
>   ql/src/test/results/clientpositive/explain_locks.q.out 3183533478 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockRequestBuilder.java
>  22902a9c20 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/LockTypeComparator.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  331fd4cc8d 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtil.java
>  f928bf781b 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/LockTypeUtilTest.java
>  3d69a6e7dc 
> 
> 
> Diff: https://reviews.apache.org/r/72387/diff/1/
> 
> 
> Testing
> ---
> 
> Added a number of tests under TestDbTxnManager2 and TestTxnHandler, and 
> extended the explain_locks.q test.
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Review Request 72392: HIVE-23103 Oracle statement batching

2020-04-20 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72392/
---

Review request for hive, Denys Kuzmenko and Marton Bod.


Bugs: HIVE-23103
https://issues.apache.org/jira/browse/HIVE-23103


Repository: hive-git


Description
---

Examine how to really get better performance for oracle statement batches.

Oracle JDBC doc describes:

The Oracle implementation of standard update batching does not implement true 
batching for generic statements and callable statements. Even though Oracle 
JDBC supports the use of standard batching for Statement and CallableStatement 
objects, you are unlikely to see performance improvement.

I would look for connection properties to set, so it is handled anyway, or if 
not, then use:

begin
  query1;
  query2;
  query3;
end;
so that we will have only a single roundtrip to the db.


Diffs
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
 bb29410e7d 
  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 d080df417b 


Diff: https://reviews.apache.org/r/72392/diff/1/


Testing
---

Baseline:
Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  42.988 ± 4.569  ms/op
TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  45.029 ± 4.686  ms/op

After patch:
Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  36.208 ± 3.869  ms/op
TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  37.038 ± 3.746  ms/op


Thanks,

Peter Vary



Re: Review Request 72388: HIVE-23048 Use sequences for TXN_ID generation

2020-04-20 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72388/#review220369
---



This is a really big/scary change. I am really interested in the performance 
results! :)
Thanks for all the effort! Some questions below


ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManagerAcid.java
Lines 119 (patched)


nit: space after //



ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManagerAcid.java
Lines 154 (patched)


Could we create a test for the multi-gap case (a gap of more than 1)?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
Lines 563 (patched)


Maybe do the insert with the query directly, instead of getting the id and 
using it again. Not very important, just asking...



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 424 (patched)


nit: missing line break



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 448 (original), 528 (patched)


Just a question: This is so similar to getOpenTxnsInfo... Any way to use 
the same code with different query and different response handling?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 699 (patched)


openTxns(dbConn, stmt, rqst) is used in replTableWriteIdState as well. Do 
we have to check there for timeout as well?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 747 (patched)


nit: typo? "support every"?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 780 (patched)


TXN_META_INFO? What was it used for before? Or is it new? Could we use a 
specific "state" for example?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 641 (original), 783 (patched)


Could we use a correct PreparedStatement, with the values set without the 
"hacky" quoting?

Like: pstmt.setXX for state/user/host/type/metainfo?

Or does this have a noticeable performance gain?
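
To illustrate the suggestion, here is a minimal sketch of binding the values through a PreparedStatement. The table and column names, the parameter types and the class name are assumptions made for the example and are not taken from the patch:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class TxnInsertSketch {
    // Illustrative statement; table and column names are assumptions.
    static final String INSERT_TXN =
        "INSERT INTO TXNS (TXN_ID, TXN_STATE, TXN_USER, TXN_HOST, TXN_TYPE, TXN_META_INFO) "
            + "VALUES (?, ?, ?, ?, ?, ?)";

    // Binds every value with a typed setter, so no manual quoting is needed.
    static void insertTxn(Connection conn, long txnId, String state, String user,
                          String host, int type, String metaInfo) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_TXN)) {
            ps.setLong(1, txnId);
            ps.setString(2, state);
            ps.setString(3, user);
            ps.setString(4, host);
            ps.setInt(5, type);
            ps.setString(6, metaInfo);
            ps.executeUpdate();
        }
    }
}
```

Whether this is slower than string concatenation for this code path would have to be measured.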



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 891 (patched)


If we implement the stuff in a single query in both cases when it is used, 
we can get rid of this method



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 1168 (original), 1418 (patched)


If we do not make such a fuss about the checking and just use a simple assert 
instead, we might be able to inline the method here, since it is not used 
anywhere else. What do you think?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 1995 (original), 2245 (patched)


nit: space "MetaException{"



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 2262 (patched)


This is again a single query implemented with 3 SQL queries and some Java code. 
Am I right?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 4539 (original), 4775 (patched)


nit: I do not see what is changed, but if we change please remove double 
spaces too :)



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 5078 (original), 5321 (patched)


nit: Missing space: "MetaException{"?



standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
Lines 117 (patched)


nit: extra space? "TXN_COMPONENTS  WITH"



standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/txn/TestOpenTxn.java
Lines 78 (patched)


Please test for multiple gaps too


- Peter Vary


On Apr. 20, 2020, 7:25 a.m., 

Re: Review Request 72380: HIVE-23207 Create integration tests for TxnManager for different rdbms metastores

2020-04-20 Thread Peter Vary via Review Board


> On Apr. 17, 2020, 7:03 p.m., Peter Vary wrote:
> > Thanks Peter for the patch!
> > This fix is long overdue!
> > 
> > I do not understand one thing, see below.
> > 
> > Also I would like to ask Denys to confirm, that running the init sqls again 
> > and again will not cause too much overhead in test runtime as he mentioned 
> > in our discussion. (5 min tests are fine, 1 hour tests are not fine :))
> > 
> > Thanks!
> 
> Peter Varga wrote:
> In this version we do not run the init sql again and again. We just run 
> it once, and we just run the cleanDb between tests. There might be some test 
> classes where that is not true, because there are multiple prepDb calls (I solved 
> it in Qtest, TestDbTxnManager2), but in those places it will be just one call 
> to check if the txns table exists, and then we return

Got it. Thanks!


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72380/#review220352
---


On Apr. 17, 2020, 3:24 p.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72380/
> ---
> 
> (Updated Apr. 17, 2020, 3:24 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> In the final version the prepDb creates the transactional tables with the 
> init schema (and ignores the others if they exist). The cleanDb resets the 
> database to the starting point. So between the test cases the cleanDb call is 
> enough. If the prepDb is called unnecessarily it will just check if the txns 
> table exists and then return, so it will be fast
> 
> 
> Diffs
> -
> 
>   itests/hive-blobstore/pom.xml 09955c55f3 
>   itests/qtest-accumulo/pom.xml a35d2a8a10 
>   itests/qtest-druid/pom.xml cc0cceff68 
>   itests/qtest-kudu/pom.xml f23399fa37 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java
>  b86d736a89 
>   pom.xml 90e39702a1 
>   ql/pom.xml d1846c9245 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  15fcfc0e35 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 1d211857bf 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandlerNoConnectionPool.java
>  ebe4880e3a 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/ITestDbTxnManager.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 73d3b91585 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java
>  3e56ad513c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  a66e16973f 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  962a63d418 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  1ace9d3ef0 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
>  5f3db52c2f 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Mysql.java
>  c537d95470 
> 
> 
> Diff: https://reviews.apache.org/r/72380/diff/1/
> 
> 
> Testing
> ---
> 
> On my machine the 50 tests in TestDbTxnManager2 on postgres run in under 5 
> minutes.
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



Re: Review Request 72380: HIVE-23207 Create integration tests for TxnManager for different rdbms metastores

2020-04-17 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72380/#review220352
---



Thanks Peter for the patch!
This fix is long overdue!

I do not understand one thing, see below.

Also I would like to ask Denys to confirm, that running the init sqls again and 
again will not cause too much overhead in test runtime as he mentioned in our 
discussion. (5 min tests are fine, 1 hour tests are not fine :))

Thanks!


itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java
Lines 67 (patched)


Why is this needed?



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java
Lines 119 (patched)


Why is this needed? Again?


- Peter Vary


On Apr. 17, 2020, 3:24 p.m., Peter Varga wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72380/
> ---
> 
> (Updated Apr. 17, 2020, 3:24 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Zoltan Chovan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> In the final version the prepDb creates the transactional tables with the 
> init schema (and ignores the others if they exist). The cleanDb resets the 
> database to the starting point. So between the test cases the cleanDb call is 
> enough. If the prepDb is called unnecessarily it will just check if the txns 
> table exists and then return, so it will be fast
> 
> 
> Diffs
> -
> 
>   itests/hive-blobstore/pom.xml 09955c55f3 
>   itests/qtest-accumulo/pom.xml a35d2a8a10 
>   itests/qtest-druid/pom.xml cc0cceff68 
>   itests/qtest-kudu/pom.xml f23399fa37 
>   
> itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestMetaStoreHandler.java
>  b86d736a89 
>   pom.xml 90e39702a1 
>   ql/pom.xml d1846c9245 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  15fcfc0e35 
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandler.java 
> 1d211857bf 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestTxnHandlerNoConnectionPool.java
>  ebe4880e3a 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/ITestDbTxnManager.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java 
> 73d3b91585 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/DatabaseProduct.java
>  3e56ad513c 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  a66e16973f 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  962a63d418 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  1ace9d3ef0 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStoreTxns.java
>  5f3db52c2f 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/dbinstall/rules/Mysql.java
>  c537d95470 
> 
> 
> Diff: https://reviews.apache.org/r/72380/diff/1/
> 
> 
> Testing
> ---
> 
> On my machine the 50 tests in TestDbTxnManager2 on postgres run in under 5 
> minutes.
> 
> 
> Thanks,
> 
> Peter Varga
> 
>



Re: Review Request 72378: HIVE-23201: Improve logging in locking

2020-04-17 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72378/#review220347
---



Mostly agree, few comments.
I would like to ask you to go through them, and if there are any places where 
the problem should not happen use at least info level.

Thanks,
Peter


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 4492 (patched)


nit: Maybe remove the last ','?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 4810 (original), 4800 (patched)


I prefer the original info level for these logs.


- Peter Vary


On Apr. 17, 2020, 11:30 a.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72378/
> ---
> 
> (Updated Apr. 17, 2020, 11:30 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23201: Improve logging in locking
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbLockManager.java 4b6bc3e1e3 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  962a63d418 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java
>  b3a1f826bb 
> 
> 
> Diff: https://reviews.apache.org/r/72378/diff/1/
> 
> 
> Testing
> ---
> 
> Green build: https://builds.apache.org/job/PreCommit-HIVE-Build/21717/
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72360: HIVE-23093: Create new metastore config value for jdbc max batch size

2020-04-16 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72360/#review220331
---




standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
Lines 650-651 (patched)


The text describing the config value is for administrators and not for 
developers. So this part should be a java comment instead of being part of the 
description



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 2520-2525 (patched)


I would do it in setConf, like we did with the numOpenTxns and friends
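
A rough sketch of what is meant, using java.util.Properties as a stand-in for the real configuration object. The key, the default value and the class name are hypothetical and only serve the illustration:

```java
import java.util.Properties;

class BatchSizeHolderSketch {
    // Hypothetical key and default, for illustration only.
    static final String MAX_BATCH_SIZE_KEY = "example.jdbc.max.batch.size";
    static final int DEFAULT_MAX_BATCH_SIZE = 1000;

    private int maxBatchSize = DEFAULT_MAX_BATCH_SIZE;

    // Read and parse the value once, when the configuration is set...
    void setConf(Properties conf) {
        maxBatchSize = Integer.parseInt(
            conf.getProperty(MAX_BATCH_SIZE_KEY, String.valueOf(DEFAULT_MAX_BATCH_SIZE)));
    }

    // ...so the batching code only reads a field on every call.
    int getMaxBatchSize() {
        return maxBatchSize;
    }
}
```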


- Peter Vary


On Apr. 14, 2020, 2:25 p.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72360/
> ---
> 
> (Updated Apr. 14, 2020, 2:25 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23093: Create new metastore config value for jdbc max batch size
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
>  3bfb0e69cb 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  620c77e589 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  7d0db0c3a0 
> 
> 
> Diff: https://reviews.apache.org/r/72360/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72359: HIVE-23104: Minimize critical paths of TxnHandler::commitTxn and abortTxn

2020-04-14 Thread Peter Vary via Review Board


> On Apr. 14, 2020, 11:22 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 1325 (patched)
> > 
> >
> > Not really important since this is a private method, but why not return 
> > boolean instead?
> 
> Marton Bod wrote:
> Would be nice, but we need the ResultSet to log out the details of what 
> exact conflict has been found.

Valid point :)


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72359/#review220308
---


On Apr. 14, 2020, 8:57 a.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72359/
> ---
> 
> (Updated Apr. 14, 2020, 8:57 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23104: Minimize critical paths of TxnHandler::commitTxn and abortTxn
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  7d0db0c3a0 
> 
> 
> Diff: https://reviews.apache.org/r/72359/diff/2/
> 
> 
> Testing
> ---
> 
> Green build: 
> https://builds.apache.org/job/PreCommit-HIVE-Build/21539/testReport
> Benchmark results attached to ticket: 
> https://issues.apache.org/jira/browse/HIVE-23104
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72359: HIVE-23104: Minimize critical paths of TxnHandler::commitTxn and abortTxn

2020-04-14 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72359/#review220308
---



Thanks for the patch Marton!
Some questions, ideas.

Thanks,
Peter


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Line 1248 (original), 1228 (patched)


Is this 'rs' not reused later? Maybe use a local scoped rs here?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 1325 (patched)


Not really important since this is a private method, but why not return 
boolean instead?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 1351 (patched)


Maybe log the query in debug level?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 1375 (patched)


Might be worth initializing with a size


- Peter Vary


On Apr. 14, 2020, 8:57 a.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72359/
> ---
> 
> (Updated Apr. 14, 2020, 8:57 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23104: Minimize critical paths of TxnHandler::commitTxn and abortTxn
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  7d0db0c3a0 
> 
> 
> Diff: https://reviews.apache.org/r/72359/diff/1/
> 
> 
> Testing
> ---
> 
> Green build: 
> https://builds.apache.org/job/PreCommit-HIVE-Build/21539/testReport
> Benchmark results attached to ticket: 
> https://issues.apache.org/jira/browse/HIVE-23104
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72112: HIVE-22869 - Add locking benchmark to metastore-tools/metastore-benchmarks

2020-04-14 Thread Peter Vary via Review Board


> On Apr. 3, 2020, 9:59 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java
> > Lines 341 (patched)
> > 
> >
> > I put this in the code with HIVE-23042:
> >   boolean openTxn(int numTxns) throws TException {
> >     client.open_txns(new OpenTxnRequest(numTxns, "Test", "Host"));
> >     return true;
> >   }
> >   
> > Maybe merge those?
> 
> Zoltan Chovan wrote:
> The main difference between our two implementations of openTxn is that 
> mine automatically returns the opened txn's id, while in your version an 
> additional getOpenTxns() call has to be made to get the id. 
> I am not sure whether getOpenTxns would return some other ids that belong to 
> another client when multiple threads are used, so I might be misunderstanding 
> the getOpenTxns() call.
> What do you think?

I still think it would be worth keeping only one version: in this specific case 
your new version, which should then be used in the other place as well.
What do you think?


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72112/#review220212
---


On Apr. 9, 2020, 2:58 p.m., Zoltan Chovan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72112/
> ---
> 
> (Updated Apr. 9, 2020, 2:58 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Aron Hamvas, Marton Bod, and Peter 
> Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add the possibility to run benchmarks on opening lock in the HMS. Currently 
> this change only introduces single-threaded/single client testing. I'm 
> planning to add multi-client support in a separate change.
> 
> Example parametrisation is as follows:
> hbench -M "lock" -N 10 -d hive_test -W 0 -L 100
> hbench -M ".*Lock.*" -N 10 -d hive_test -W 0 -L 100 -T 8 --params 100
> 
> This will create N number (10) of tables to lock and it'll execute the lock() 
> for L number (100) of times on T (8) threads where each thread will start an 
> HMS client
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkTool.java
>  041cd76234 
>   
> standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
>  f53f2ef43b 
>   
> standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java
>  7cc1e42a8b 
>   
> standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
>  101d6759c5 
> 
> 
> Diff: https://reviews.apache.org/r/72112/diff/1/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-22869.2.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/04/02/5e35e835-f383-495f-9964-e66773fd6a90__HIVE-22869.2.patch
> HIVE-22869.3.patch
>   
> https://reviews.apache.org/media/uploaded/files/2020/04/09/458beaa7-4743-40fb-a213-1ae4527be823__HIVE-22869.3.patch
> 
> 
> Thanks,
> 
> Zoltan Chovan
> 
>



Re: Review Request 72336: HIVE-23114: Insert overwrite with dynamic partitioning is not working correctly with direct insert

2020-04-08 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72336/#review220255
---


Fix it, then Ship it!




Single very important comment! :)


ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
Lines 2834 (patched)


nit: extra space


- Peter Vary


On Apr. 8, 2020, 12:20 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72336/
> ---
> 
> (Updated Apr. 8, 2020, 12:20 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-23114
> https://issues.apache.org/jira/browse/HIVE-23114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The idea behind the patch is the following:
> When doing a multi-statement insert overwrite with dynamic partitioning, the 
> partition information will be written to the manifest file. With this 
> information, each FileSinkOperator can clean up only the partition 
> directories written by the same FileSinkOperator and does not clean up the 
> partition directories written by the other FileSinkOperators.
> If a statement from the insert overwrite query doesn't produce any data, a 
> manifest file will still be written; otherwise the missing manifest file 
> would result in a clean-up on table level, which could delete the data written by 
> the other FileSinkOperators.
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties e99ce7babb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> d68d8f9409 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 04166a23ee 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java e25dc54e7d 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 17e6cdf162 
>   ql/src/test/queries/clientpositive/acid_direct_insert_insert_overwrite.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/acid_multiinsert_dyn_part.q PRE-CREATION 
>   
> ql/src/test/results/clientpositive/acid_direct_insert_insert_overwrite.q.out 
> PRE-CREATION 
>   ql/src/test/results/clientpositive/acid_multiinsert_dyn_part.q.out 
> PRE-CREATION 
>   
> ql/src/test/results/clientpositive/llap/acid_direct_insert_insert_overwrite.q.out
>  PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/acid_multiinsert_dyn_part.q.out 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/72336/diff/1/
> 
> 
> Testing
> ---
> 
> Added specific q tests for different insert overwrite scenarios.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 72112: HIVE-22869 - Add locking benchmark to metastore-tools/metastore-benchmarks

2020-04-03 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72112/#review220212
---



Thanks for the patch @Zoltan!

Some comments below.

Peter


standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
Line 23 (original), 23 (patched)


Please avoid wildcard imports



standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
Line 31 (original), 29 (patched)


Please avoid wildcard imports



standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
Line 42 (original), 32 (patched)


Please avoid wildcard imports



standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
Lines 423-425 (patched)


Maybe not here, maybe in another jira, but check, measure and execute all 
of the HMS calls for a typical query (not sure that the names and the list are 
entirely correct :) ):
- openTxn
- getOpenTxns
- getValidWriteIds
- enqueueLock
- commit/abort
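
A minimal sketch of how such a per-call measurement could be structured, with each step passed in as a Runnable. The class name is hypothetical, and the real HMS client calls would have to be plugged in by the caller:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CallTimingSketch {
    // Runs each named step once, in insertion order, and records the
    // elapsed wall-clock time of the step in nanoseconds.
    static Map<String, Long> timeSteps(Map<String, Runnable> steps) {
        Map<String, Long> elapsed = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> step : steps.entrySet()) {
            long start = System.nanoTime();
            step.getValue().run();
            elapsed.put(step.getKey(), System.nanoTime() - start);
        }
        return elapsed;
    }
}
```

A single run per step is only a starting point; a real benchmark would repeat the sequence and aggregate the timings, as the existing benchmarks already do.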



standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
Lines 430 (patched)


Do we need this? Shouldn't the lock be unlocked by commitTxn?



standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java
Line 23 (original), 23 (patched)


wildcard import



standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java
Line 51 (original), 44 (patched)


wildcard import



standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java
Lines 341 (patched)


I put this in the code with HIVE-23042:
  boolean openTxn(int numTxns) throws TException {
    client.open_txns(new OpenTxnRequest(numTxns, "Test", "Host"));
    return true;
  }
  
Maybe merge those?



standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
Line 24 (original), 24 (patched)


wildcard import



standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
Line 40 (original), 35 (patched)


wildcard import


- Peter Vary


On Apr. 2, 2020, 2:13 p.m., Zoltan Chovan wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72112/
> ---
> 
> (Updated Apr. 2, 2020, 2:13 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Aron Hamvas, Marton Bod, and Peter 
> Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add the possibility to run benchmarks on opening lock in the HMS. Currently 
> this change only introduces single-threaded/single client testing. I'm 
> planning to add multi-client support in a separate change.
> 
> Example parametrisation is as follows:
> hbench -M "lock" -N 10 -d hive_test -W 0 -L 100
> hbench -M ".*Lock.*" -N 10 -d hive_test -W 0 -L 100 -T 8 --params 100
> 
> This will create N number (10) of tables to lock and it'll execute the lock() 
> for L number (100) of times on T (8) threads where each thread will start an 
> HMS client
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/BenchmarkTool.java
>  041cd76234 
>   
> standalone-metastore/metastore-tools/metastore-benchmarks/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSBenchmarks.java
>  f53f2ef43b 
>   
> standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/HMSClient.java
>  7cc1e42a8b 
>   
> standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Util.java
>  101d6759c5 
> 
> 
> Diff: https://reviews.apache.org/r/72112/diff/1/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-22869.2.patch
>   
> 

Re: Review Request 72290: HIVE-23067: Use batch DB calls in TxnHandler for commitTxn and abortTxns

2020-04-02 Thread Peter Vary via Review Board


> On Apr. 1, 2020, 7:38 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 4295 (patched)
> > 
> >
> > Please extract this into 
> > org.apache.hadoop.hive.metastore.tools.SQLGenerator
> 
> Marton Bod wrote:
> as discussed, let's move it to TxnDbUtil

Do we want to repurpose TxnDbUtil? What will be the method to decide if 
something is in SQLGenerator, or TxnDbUtil?
Also, the TxnDbUtil class comment is the following; we might want to fix it if 
we change the usage pattern :)
/**
 * Utility methods for creating and destroying txn database/schema, plus methods for
 * querying against metastore tables.
 * Placed here in a separate class so it can be shared across unit tests.
 */


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72290/#review220164
---


On Apr. 1, 2020, 6:53 a.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72290/
> ---
> 
> (Updated Apr. 1, 2020, 6:53 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23067: Use batch DB calls in TxnHandler for commitTxn and abortTxns
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  ef88240a79 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  74ef88545e 
> 
> 
> Diff: https://reviews.apache.org/r/72290/diff/3/
> 
> 
> Testing
> ---
> 
> Green build: https://builds.apache.org/job/PreCommit-HIVE-Build/21347/
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72283: HIVE-23076 Add batching for openTxn

2020-04-02 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72283/
---

(Updated Apr. 2, 2020, 10:23 a.m.)


Review request for hive, Denys Kuzmenko and Marton Bod.


Changes
---

Updated with comments


Bugs: HIVE-23076
https://issues.apache.org/jira/browse/HIVE-23076


Repository: hive-git


Description
---

Add batching for openTxn request for better performance


Diffs (updated)
-

  
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
 74ef88545e 


Diff: https://reviews.apache.org/r/72283/diff/3/

Changes: https://reviews.apache.org/r/72283/diff/2-3/


Testing
---

Tested it locally against all of the supported RDBMS types:
mysql no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       2.094  1.821  1.462  4.786  31.06
openTxn0-2       2.419  2.161  1.720  5.867  32.43
openTxn0-10      2.578  2.289  1.973  7.204  28.74
openTxn0-100     6.948  6.835  5.254  11.03  15.91
openTxn0-1000    51.31  50.49  33.56  93.10  16.27
openTxn115k-1    26.94  23.69  22.24  169.6  56.13
openTxn115k-2    25.26  23.81  22.42  50.68  16.90
openTxn115k-10   26.20  24.29  23.01  60.73  21.94
openTxn125k-100  29.14  28.18  25.81  43.63  11.16

mysql patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       2.264  1.964  1.652  6.023  35.59
openTxn0-2       2.538  2.289  1.932  6.013  29.41
openTxn0-10      2.982  2.641  2.177  8.829  32.54
openTxn0-100     6.775  6.386  5.012  21.73  27.10
openTxn0-1000    42.96  42.93  30.89  61.92  14.46
openTxn115k-1    24.29  23.27  22.40  73.62  21.64
openTxn115k-2    24.05  23.58  22.46  28.60  5.651
openTxn115k-10   24.48  24.02  22.94  29.97  6.075
openTxn125k-100  27.91  27.51  25.78  42.50  6.905

postgres no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       3.734  2.883  2.506  11.46  55.16
openTxn0-2       3.834  3.111  2.633  15.50  53.22
openTxn0-10      5.005  4.178  3.449  16.80  47.56
openTxn0-100     9.823  7.755  6.833  79.34  79.96
openTxn0-1000    75.51  72.03  58.62  207.9  23.98
openTxn115k-1    21.79  19.45  18.43  66.76  29.10
openTxn115k-2    21.91  20.14  18.88  51.42  20.92
openTxn115k-10   22.43  20.85  19.38  45.18  18.58
openTxn125k-100  27.71  25.36  23.19  54.99  21.46

postgres patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       1.688  1.423  1.130  7.814  55.91
openTxn0-2       1.982  1.662  1.306  7.786  47.13
openTxn0-10      2.680  2.564  1.761  5.069  26.93
openTxn0-100     8.340  7.535  5.351  30.00  37.97
openTxn0-1000    41.73  37.55  24.38  107.8  33.87
openTxn115k-1    12.24  11.65  10.21  26.23  19.75
openTxn115k-2    13.07  11.86  10.76  68.95  47.37
openTxn115k-10   13.03  12.23  11.06  54.88  34.23
openTxn125k-100  15.62  14.03  12.46  102.9  58.21

Oracle no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       14.85  13.91  11.50  27.26  19.49
openTxn0-2       17.89  17.13  14.56  27.00  13.53
openTxn0-10      23.12  21.38  17.91  67.37  25.46
openTxn0-100     114.1  99.03  82.62  214.0  35.61
openTxn0-1000    4123   3952   3593   5790   15.96
openTxn115k-1    16.74  16.88  14.01  21.75  14.52
openTxn115k-2    20.28  18.34  16.51  30.34  23.09
openTxn115k-10   22.42  21.07  19.87  31.39  15.74
openTxn125k-100  88.13  87.88  78.95  100.4  7.990

Oracle patch
Operation  Mean Med  Min  Max  Err%
openTxn0-1 15.8714.0712.2180.4448.32   
openTxn0-2 17.0616.1412.8033.5219.47   
openTxn0-10 

Re: Review Request 72283: HIVE-23076 Add batching for openTxn

2020-04-01 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72283/
---

(Updated Apr. 1, 2020, 1:12 p.m.)


Review request for hive, Denys Kuzmenko and Marton Bod.


Changes
---

addressed review comments


Bugs: HIVE-23076
https://issues.apache.org/jira/browse/HIVE-23076


Repository: hive-git


Description
---

Add batching for openTxn request for better performance


Diffs (updated)
-

  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e


Diff: https://reviews.apache.org/r/72283/diff/2/

Changes: https://reviews.apache.org/r/72283/diff/1-2/


Testing
---

Tested it locally against all of the supported RDBMS types:
mysql no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       2.094  1.821  1.462  4.786  31.06
openTxn0-2       2.419  2.161  1.720  5.867  32.43
openTxn0-10      2.578  2.289  1.973  7.204  28.74
openTxn0-100     6.948  6.835  5.254  11.03  15.91
openTxn0-1000    51.31  50.49  33.56  93.10  16.27
openTxn115k-1    26.94  23.69  22.24  169.6  56.13
openTxn115k-2    25.26  23.81  22.42  50.68  16.90
openTxn115k-10   26.20  24.29  23.01  60.73  21.94
openTxn125k-100  29.14  28.18  25.81  43.63  11.16

mysql patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       2.264  1.964  1.652  6.023  35.59
openTxn0-2       2.538  2.289  1.932  6.013  29.41
openTxn0-10      2.982  2.641  2.177  8.829  32.54
openTxn0-100     6.775  6.386  5.012  21.73  27.10
openTxn0-1000    42.96  42.93  30.89  61.92  14.46
openTxn115k-1    24.29  23.27  22.40  73.62  21.64
openTxn115k-2    24.05  23.58  22.46  28.60  5.651
openTxn115k-10   24.48  24.02  22.94  29.97  6.075
openTxn125k-100  27.91  27.51  25.78  42.50  6.905

postgres no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       3.734  2.883  2.506  11.46  55.16
openTxn0-2       3.834  3.111  2.633  15.50  53.22
openTxn0-10      5.005  4.178  3.449  16.80  47.56
openTxn0-100     9.823  7.755  6.833  79.34  79.96
openTxn0-1000    75.51  72.03  58.62  207.9  23.98
openTxn115k-1    21.79  19.45  18.43  66.76  29.10
openTxn115k-2    21.91  20.14  18.88  51.42  20.92
openTxn115k-10   22.43  20.85  19.38  45.18  18.58
openTxn125k-100  27.71  25.36  23.19  54.99  21.46

postgres patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       1.688  1.423  1.130  7.814  55.91
openTxn0-2       1.982  1.662  1.306  7.786  47.13
openTxn0-10      2.680  2.564  1.761  5.069  26.93
openTxn0-100     8.340  7.535  5.351  30.00  37.97
openTxn0-1000    41.73  37.55  24.38  107.8  33.87
openTxn115k-1    12.24  11.65  10.21  26.23  19.75
openTxn115k-2    13.07  11.86  10.76  68.95  47.37
openTxn115k-10   13.03  12.23  11.06  54.88  34.23
openTxn125k-100  15.62  14.03  12.46  102.9  58.21

Oracle no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       14.85  13.91  11.50  27.26  19.49
openTxn0-2       17.89  17.13  14.56  27.00  13.53
openTxn0-10      23.12  21.38  17.91  67.37  25.46
openTxn0-100     114.1  99.03  82.62  214.0  35.61
openTxn0-1000    4123   3952   3593   5790   15.96
openTxn115k-1    16.74  16.88  14.01  21.75  14.52
openTxn115k-2    20.28  18.34  16.51  30.34  23.09
openTxn115k-10   22.42  21.07  19.87  31.39  15.74
openTxn125k-100  88.13  87.88  78.95  100.4  7.990

Oracle patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       15.87  14.07  12.21  80.44  48.32
openTxn0-2       17.06  16.14  12.80  33.52  19.47
openTxn0-10  

Re: Review Request 72290: HIVE-23067: Use batch DB calls in TxnHandler for commitTxn and abortTxns

2020-04-01 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72290/#review220167
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 1379 (patched)


Will this issue unnecessary queries for read-only queries? For Oracle this 
could increase execution time.

Also, why not use executeQueriesInBatch for this?
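
For context, the batching suggestion above can be sketched with the plain JDBC batch API. This is only an illustration under assumptions: `BatchSketch`, `partition` and `executeInBatches` are hypothetical names, not the TxnHandler code, and whether a driver really sends a batch in a single round trip depends on the RDBMS and driver.

```java
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical helper, not the TxnHandler implementation. */
final class BatchSketch {

  /** Splits the queries into consecutive chunks of at most batchSize elements. */
  static List<List<String>> partition(List<String> queries, int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    List<List<String>> chunks = new ArrayList<>();
    for (int from = 0; from < queries.size(); from += batchSize) {
      int to = Math.min(from + batchSize, queries.size());
      chunks.add(new ArrayList<>(queries.subList(from, to)));
    }
    return chunks;
  }

  /** Submits each chunk through addBatch/executeBatch; returns the number of batches sent. */
  static int executeInBatches(Statement stmt, List<String> queries, int batchSize)
      throws SQLException {
    int batches = 0;
    for (List<String> chunk : partition(queries, batchSize)) {
      for (String query : chunk) {
        stmt.addBatch(query);
      }
      stmt.executeBatch();
      batches++;
    }
    return batches;
  }
}
```

With an empty query list no batch is sent at all, which is one way to avoid extra statements on a read-only path.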


- Peter Vary


On Apr. 1, 2020, 6:53 a.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72290/
> ---
> 
> (Updated Apr. 1, 2020, 6:53 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23067: Use batch DB calls in TxnHandler for commitTxn and abortTxns
> 
> 
> Diffs
> -
> 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e
> 
> 
> Diff: https://reviews.apache.org/r/72290/diff/1/
> 
> 
> Testing
> ---
> 
> Green build: https://builds.apache.org/job/PreCommit-HIVE-Build/21347/
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72291: HIVE-23107: Remove MIN_HISTORY_LEVEL table

2020-03-31 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72291/#review220153
---


Ship it!




Just a question, maybe not relevant.

Thanks for the patch, LGTM +1 (pending tests :))


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
Lines 545-560 (original), 496-512 (patched)


How confident are we in this query?
Maybe delete immediately, like:
DELETE FROM TXN_TO_WRITE_ID WHERE T2W_TXNID < (SELECT MIN...)
The downside is that it would be hard to log what happened.


- Peter Vary


On Mar. 31, 2020, 1:51 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72291/
> ---
> 
> (Updated Mar. 31, 2020, 1:51 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23107: Remove MIN_HISTORY_LEVEL table
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 54b616e60c73fa1005c6d679ea76d65e01a0749d
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionInfo.java bf91ae704c83722502acaf445061bf297fde6a6f
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java 19a95b64db64b7f8bf8e82fbdedf5b54cd30aed3
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java ef88240a791cbc553b6150b54701c7df8daf3b49
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e6fd680b311ca5c5e4d87f053af1026
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java 41d2e7924b1286f82d212e051714107505fe9661
>   standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql 48ad67623143fcba2bb2507cc36ad46a58c13b75
>   standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql 7a230bde3a4dcd2504538bf8122db0bb2a59932f
>   standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql 9ed7f4f8192e1613f8942af3e9496c3ef7f1f04f
>   standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql 12d24e9a569310c57d86b1229deaa2b1d080e0b8
>   standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql bc34b511a978af2a6704e47d8f73c9604c6cebf9
>   standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql 13f03bce6d579c18ff9a14e09133063fdcdbf7af
>   standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql 8482b5942c6284eda63ec7d2aa7c1202abb3db49
>   standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql cbfdd861fd545ce786f5e267b284196f5fd9af03
>   standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql aa35a7a7b392d4d46480c03ed95ffdec12b03b22
>   standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql 9462328a5f0c6dda8cffb683860153b8eb3aacec
> 
> 
> Diff: https://reviews.apache.org/r/72291/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 72264: HIVE-23052: Optimize lock enqueueing in TxnHandler

2020-03-31 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72264/#review220146
---



Thanks for the patch Marci!
A few questions below, feel free to correct me if I am wrong - this was just a 
cursory review.

Thanks,
Peter


standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 199 (patched)


%d could be problematic with different locales
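
To illustrate the concern, here is a minimal sketch, assuming the patched line builds the query text with `String.format` (the patch itself is not shown here). `java.util.Formatter` localizes the digits of `%d` for the locale in use, so a locale with non-ASCII digits could change the generated SQL. `LocaleFormatSketch` and the Thai locale tag are only examples chosen for the illustration.

```java
import java.util.Locale;

/** Illustrative only: shows why %d depends on the locale in use. */
final class LocaleFormatSketch {

  /** Formats the value with %d under an explicit locale. */
  static String withLocale(Locale locale, long value) {
    return String.format(locale, "%d", value);
  }

  /** Locale-independent alternative: always produces ASCII digits. */
  static String localeIndependent(long value) {
    return Long.toString(value);
  }

  public static void main(String[] args) {
    // A locale that requests Thai digits through the Unicode "nu" extension.
    Locale thai = Locale.forLanguageTag("th-TH-u-nu-thai");
    System.out.println(withLocale(Locale.ROOT, 42));
    System.out.println(withLocale(thai, 42));
    System.out.println(localeIndependent(42));
  }
}
```

Passing `Locale.ROOT` explicitly, or avoiding `%d` for values that end up in SQL, keeps the output independent of the JVM default locale.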



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 2412 (patched)


nit: unnecessary move, might make backporting harder



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 2545 (patched)


Is it possible to collect all of the data before starting the RDBMS 
query, and get all of the data for an enqueueLockRequest in one go?



standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
Lines 2599 (patched)


Do we need this? We do not commit the temp lock id, so we could use just a 
single value that will definitely not appear in the table.


- Peter Vary


On Mar. 24, 2020, 2:24 p.m., Marton Bod wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72264/
> ---
> 
> (Updated Mar. 24, 2020, 2:24 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-23052: Optimize lock enqueueing in TxnHandler
> 
> 
> Diffs
> -
> 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/BoneCPDataSourceProvider.java f92ce7325e
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/DbCPDataSourceProvider.java 85719fdf84
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/datasource/HikariCPDataSourceProvider.java 76bbf3bc1e
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e
> 
> 
> Diff: https://reviews.apache.org/r/72264/diff/2/
> 
> 
> Testing
> ---
> 
> Green build: https://builds.apache.org/job/PreCommit-HIVE-Build/21249/
> Custom benchmark results attached to ticket: 
> https://issues.apache.org/jira/browse/HIVE-23052
> 
> 
> Thanks,
> 
> Marton Bod
> 
>



Re: Review Request 72283: HIVE-23076 Add batching for openTxn

2020-03-30 Thread Peter Vary via Review Board


> On Mar. 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 603 (patched)
> > 
> >
> > Could we rename it to nextTxnId/firstTxnId? It is not clear what "first" refers to.

Sure, agreed; will be done.


> On Mar. 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 615 (original), 617 (patched)
> > 
> >
> > I wasn't sure, maybe you know: can we modify the meta-prop at runtime? If 
> > not, maybe we should move batchSize to the constructor?

There is a way. We should decide whether we want to do it or not.
Maybe we should handle it as part of HIVE-23093?

Your thoughts?


> On Mar. 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Lines 634 (patched)
> > 
> >
> > why (i-1) ? txnId starts from 1, right?

Updated as discussed


> On Mar. 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 641 (original), 648 (patched)
> > 
> >
> > Would be great if we could extract query construction to query constants

Let's talk about this.
Part of me prefers this in the place where we call the query, so I can see the 
whole picture in one place; another part of me prefers it collected in a 
constant.
Moved the big one to the top because of the formatting/parsing, and kept this 
one because it is used only here, and only once.


> On Mar. 30, 2020, 10:36 a.m., Denys Kuzmenko wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
> > Line 666 (original), 673 (patched)
> > 
> >
> > Could we move queryStr to constants?

Let's talk about this.
Part of me prefers this in the place where we call the query, so I can see the 
whole picture in one place; another part of me prefers it collected in a 
constant.
Moved the big one to the top because of the formatting/parsing, and kept this 
one because it is used only here, and only once.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72283/#review220103
---


On Mar. 30, 2020, 9:51 a.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72283/
> ---
> 
> (Updated Mar. 30, 2020, 9:51 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Marton Bod.
> 
> 
> Bugs: HIVE-23076
> https://issues.apache.org/jira/browse/HIVE-23076
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Add batching for openTxn request for better performance
> 
> 
> Diffs
> -
> 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e
> 
> 
> Diff: https://reviews.apache.org/r/72283/diff/1/
> 
> 
> Testing
> ---
> 
> Tested it locally against all of the supported RDBMS types:
> mysql no patch
> Operation        Mean   Med    Min    Max    Err%
> openTxn0-1       2.094  1.821  1.462  4.786  31.06
> openTxn0-2       2.419  2.161  1.720  5.867  32.43
> openTxn0-10      2.578  2.289  1.973  7.204  28.74
> openTxn0-100     6.948  6.835  5.254  11.03  15.91
> openTxn0-1000    51.31  50.49  33.56  93.10  16.27
> openTxn115k-1    26.94  23.69  22.24  169.6  56.13
> openTxn115k-2    25.26  23.81  22.42  50.68  16.90
> openTxn115k-10   26.20  24.29  23.01  60.73  21.94
> openTxn125k-100  29.14  28.18  25.81  43.63  11.16
> 
> mysql patch
> Operation        Mean   Med    Min    Max    Err%
> openTxn0-1       2.264  1.964  1.652  6.023  35.59
> openTxn0-2       2.538  2.289  1.932  6.013  29.41
> openTxn0-10      2.982  2.641  2.177  8.829  32.54
> openTxn0-100     6.775  6.386  5.012  21.73  27.10
> openTxn0-1000    42.96  42.93  30.89  61.92  14.46
> 

Review Request 72283: HIVE-23076 Add batching for openTxn

2020-03-30 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72283/
---

Review request for hive, Denys Kuzmenko and Marton Bod.


Bugs: HIVE-23076
https://issues.apache.org/jira/browse/HIVE-23076


Repository: hive-git


Description
---

Add batching for openTxn request for better performance


Diffs
-

  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java 74ef88545e


Diff: https://reviews.apache.org/r/72283/diff/1/


Testing
---

Tested it locally against all of the supported RDBMS types:
mysql no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       2.094  1.821  1.462  4.786  31.06
openTxn0-2       2.419  2.161  1.720  5.867  32.43
openTxn0-10      2.578  2.289  1.973  7.204  28.74
openTxn0-100     6.948  6.835  5.254  11.03  15.91
openTxn0-1000    51.31  50.49  33.56  93.10  16.27
openTxn115k-1    26.94  23.69  22.24  169.6  56.13
openTxn115k-2    25.26  23.81  22.42  50.68  16.90
openTxn115k-10   26.20  24.29  23.01  60.73  21.94
openTxn125k-100  29.14  28.18  25.81  43.63  11.16

mysql patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       2.264  1.964  1.652  6.023  35.59
openTxn0-2       2.538  2.289  1.932  6.013  29.41
openTxn0-10      2.982  2.641  2.177  8.829  32.54
openTxn0-100     6.775  6.386  5.012  21.73  27.10
openTxn0-1000    42.96  42.93  30.89  61.92  14.46
openTxn115k-1    24.29  23.27  22.40  73.62  21.64
openTxn115k-2    24.05  23.58  22.46  28.60  5.651
openTxn115k-10   24.48  24.02  22.94  29.97  6.075
openTxn125k-100  27.91  27.51  25.78  42.50  6.905

postgres no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       3.734  2.883  2.506  11.46  55.16
openTxn0-2       3.834  3.111  2.633  15.50  53.22
openTxn0-10      5.005  4.178  3.449  16.80  47.56
openTxn0-100     9.823  7.755  6.833  79.34  79.96
openTxn0-1000    75.51  72.03  58.62  207.9  23.98
openTxn115k-1    21.79  19.45  18.43  66.76  29.10
openTxn115k-2    21.91  20.14  18.88  51.42  20.92
openTxn115k-10   22.43  20.85  19.38  45.18  18.58
openTxn125k-100  27.71  25.36  23.19  54.99  21.46

postgres patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       1.688  1.423  1.130  7.814  55.91
openTxn0-2       1.982  1.662  1.306  7.786  47.13
openTxn0-10      2.680  2.564  1.761  5.069  26.93
openTxn0-100     8.340  7.535  5.351  30.00  37.97
openTxn0-1000    41.73  37.55  24.38  107.8  33.87
openTxn115k-1    12.24  11.65  10.21  26.23  19.75
openTxn115k-2    13.07  11.86  10.76  68.95  47.37
openTxn115k-10   13.03  12.23  11.06  54.88  34.23
openTxn125k-100  15.62  14.03  12.46  102.9  58.21

Oracle no patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       14.85  13.91  11.50  27.26  19.49
openTxn0-2       17.89  17.13  14.56  27.00  13.53
openTxn0-10      23.12  21.38  17.91  67.37  25.46
openTxn0-100     114.1  99.03  82.62  214.0  35.61
openTxn0-1000    4123   3952   3593   5790   15.96
openTxn115k-1    16.74  16.88  14.01  21.75  14.52
openTxn115k-2    20.28  18.34  16.51  30.34  23.09
openTxn115k-10   22.42  21.07  19.87  31.39  15.74
openTxn125k-100  88.13  87.88  78.95  100.4  7.990

Oracle patch
Operation        Mean   Med    Min    Max    Err%
openTxn0-1       15.87  14.07  12.21  80.44  48.32
openTxn0-2       17.06  16.14  12.80  33.52  19.47
openTxn0-10      16.89  15.62  12.34  37.92  25.18
openTxn0-100     18.99  20.03  15.69  21.46  10.72
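
As a reading aid for the tables above, the relative change in the Mean column between a "no patch" row and the matching "patch" row can be computed as below. This is a hedged sketch: `BenchmarkDelta` is a hypothetical helper, and it assumes the reported values are latencies where lower is better, which the e-mail does not state explicitly.

```java
import java.util.Locale;

/** Illustrative helper for reading the benchmark tables; not part of the patch. */
final class BenchmarkDelta {

  /** Relative change from before to after, in percent (negative means the value went down). */
  static double percentChange(double before, double after) {
    return (after - before) / before * 100.0;
  }

  public static void main(String[] args) {
    // Mean values copied from the "no patch" and "patch" rows of the tables above.
    System.out.printf(Locale.ROOT, "postgres openTxn0-1000: %.1f%%%n",
        percentChange(75.51, 41.73));
    System.out.printf(Locale.ROOT, "mysql openTxn0-1000: %.1f%%%n",
        percentChange(51.31, 42.96));
  }
}
```

For the two rows used here the mean drops by roughly 45% on postgres and roughly 16% on mysql; given the Err% values in the tables, small differences should be read with caution.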

Re: Review Request 72109: HIVE-20948: Eliminate file rename in compactor

2020-02-17 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72109/#review219603
---



LGTM +1, just minor nits.


ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java
Lines 97-98 (original), 95-96 (patched)


nit: Do we need these formatting changes?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java
Line 107 (original), 105 (patched)


nit: Do we need these formatting changes?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java
Line 131 (original), 129 (patched)


nit: Do we need these formatting changes?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java
Line 136 (original), 134 (patched)


nit: Do we need these formatting changes?


- Peter Vary


On Feb. 11, 2020, 10:24 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72109/
> ---
> 
> (Updated Feb. 11, 2020, 10:24 a.m.)
> 
> 
> Review request for hive, Karen Coppage, Marta Kuczora, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20948: Eliminate file rename in compactor
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e7148226b91b0c759de54e251893d61725a3
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 076b77877ae748b757a4c9c08532a3ce029fed38
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 2f5ec5270c0bc7d1a591c9c8c15b1ecb7f9f6ace
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FileSinkDesc.java ecc7bdee4dacc03cf59ac5be4bed92a75f8e720b
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java bb70db452402dd690e2136a122e9b3bd11fa7522
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java f238eb5dd058fc79c5b7ad3b08920c774b1a7f8c
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java f96a0481b870b04cc97621cd62a43b07ecd5d7fd
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java d2349104902c2af3d6020c9599fd3fa20f9a64a5
> 
> 
> Diff: https://reviews.apache.org/r/72109/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71904: HIVE-21164: ACID: explore how we can avoid a move step during inserts/compaction

2020-02-04 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71904/#review219487
---



Thanks for the patch! This will be very, very useful.
Some minor comments, questions...


itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
Lines 55 (patched)


Is this import used?



ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
Lines 843 (patched)


Is inheritPerms still a working feature? I kinda remember that it was removed 
from Hive some time ago...



ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
Lines 1444 (patched)


Why is this null?



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
Lines 1732-1737 (patched)


What about using lambda here?
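
The patched lines are not shown in this thread, so the following is only a generic sketch of the suggestion: an anonymous inner class implementing a single-method interface next to the equivalent lambda. `java.io.FilenameFilter` and the hidden-file rule are stand-ins chosen for the illustration, not the code in Utilities.java.

```java
import java.io.File;
import java.io.FilenameFilter;

/** Illustrative only: anonymous class versus lambda for a single-method interface. */
final class LambdaSketch {

  /** Anonymous inner class form. */
  static final FilenameFilter ANONYMOUS = new FilenameFilter() {
    @Override
    public boolean accept(File dir, String name) {
      // Skip names that start with '_' or '.' (an assumed rule for this example).
      return !name.startsWith("_") && !name.startsWith(".");
    }
  };

  /** Equivalent lambda form: same behavior, less boilerplate. */
  static final FilenameFilter LAMBDA =
      (dir, name) -> !name.startsWith("_") && !name.startsWith(".");
}
```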



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
Lines 1799 (patched)


Maybe a slightly different log message, so we can easily distinguish between 
this and the line below



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 7379 (patched)


We might want to make this feature configurable, to turn it on/off in case 
we missed some edge cases
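
One way to picture such a switch is sketched below with `java.util.Properties` as a stand-in for the real configuration class. The property name and the class are hypothetical and exist only for this illustration; the actual patch may handle this differently.

```java
import java.util.Properties;

/** Illustrative only: a boolean switch guarding a new code path. */
final class FeatureFlagSketch {

  /** Hypothetical property name, invented for this example. */
  static final String DIRECT_INSERT_KEY = "example.acid.direct.insert.enabled";

  /** Returns true unless the property is explicitly set to something other than "true". */
  static boolean directInsertEnabled(Properties conf) {
    return Boolean.parseBoolean(conf.getProperty(DIRECT_INSERT_KEY, "true"));
  }
}
```

Defaulting to "true" keeps the new behavior on while still leaving a way to fall back to the old path if an edge case shows up.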



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 7442-7443 (original), 7456-7460 (patched)


nit: Maybe if/else



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 7526-7543 (patched)


Is this duplicated code?



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
Lines 7562-7563 (original), 7600-7604 (patched)


nit: Maybe if/else?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
Lines 493-494 (patched)


nit: Formatting? Really not important, just for completeness' sake :D



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
Lines 690-691 (patched)


nit: Formatting?



ql/src/test/org/apache/hadoop/hive/ql/TestTxnNoBuckets.java
Lines 77 (patched)


We created this variable; shouldn't we use it? Maybe even make it a 
constant?



ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java
Lines 1246 (patched)


Does this table always exist? Shall we use "drop table if exists" instead?


- Peter Vary


On Jan. 31, 2020, 4:12 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71904/
> ---
> 
> (Updated Jan. 31, 2020, 4:12 p.m.)
> 
> 
> Review request for hive, Gopal V and Peter Vary.
> 
> 
> Bugs: HIVE-21164
> https://issues.apache.org/jira/browse/HIVE-21164
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Extended the original patch to save the task attempt ids in the file 
> names, and also fixed some bugs in the original patch.
> With this fix, inserting into an ACID table no longer uses a move task to 
> place the generated files into the final directory. Instead, it writes every 
> file directly to the final directory and then cleans up the files that are 
> not needed (such as those written by failed task attempts).
> Also fixed the replication tests, which failed with the original patch as well.
> 
> 
> Diffs
> -
> 
>   hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java da677c7977
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/TestAcidOnTez.java 056cd27496
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/history/TestHiveHistory.java 31d15fdef9
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java c2aa73b5f1
>   itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java 4c0137
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 9a3258115b
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 06e4ebee82
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 6c67bc7dd8
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidInputFormat.java bba3960102
>   

Review Request 72081: HIVE-22805 Vectorization with conditional array or map is not implemented and throws an error

2020-02-04 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72081/
---

Review request for hive and Ramesh Kumar Thangarajan.


Bugs: HIVE-22805
https://issues.apache.org/jira/browse/HIVE-22805


Repository: hive-git


Description
---

Implemented the copySelected and shallowCopyTo methods


Diffs
-

  ql/src/test/queries/clientpositive/vectorization_multi_value.q PRE-CREATION
  ql/src/test/results/clientpositive/vectorization_multi_value.q.out PRE-CREATION
  storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/ListColumnVector.java 8cbcc029a5
  storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/MapColumnVector.java 3143a44ec8
  storage-api/src/java/org/apache/hadoop/hive/ql/exec/vector/MultiValuedColumnVector.java 028084cfc7


Diff: https://reviews.apache.org/r/72081/diff/1/


Testing
---

query tests


Thanks,

Peter Vary



Re: Review Request 72028: HIVE-22729: Provide a failure reason for failed compactions

2020-01-31 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72028/#review219450
---


Ship it!




- Peter Vary


On Jan. 30, 2020, 10:23 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72028/
> ---
> 
> (Updated Jan. 30, 2020, 10:23 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22729: Provide a failure reason for failed compactions
> 
> 
> Diffs
> -
> 
>   metastore/scripts/upgrade/hive/hive-schema-4.0.0.hive.sql 5421d4d8141becae4e0de6a039bf7c46f0b109bb
>   metastore/scripts/upgrade/hive/upgrade-3.1.0-to-4.0.0.hive.sql 041190653898a39ef96c6c2bf71c4f4485f6a1a5
>   ql/src/java/org/apache/hadoop/hive/ql/ddl/process/show/compactions/ShowCompactionsDesc.java 9348efc5a12b50f55f5952094882e941158405fd
>   ql/src/java/org/apache/hadoop/hive/ql/ddl/process/show/compactions/ShowCompactionsOperation.java 517d88237cc3b8f0316727bf1eebfc6535152fae
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 6f642901203ab73699ed694009d48ca77263fb10
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java dedc990d0f1e9123497f0fb7c7b9945c7b29bde2
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 5aff71e0e981c429f85663300d3e5c21089529a9
>   ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java e5895547e6006f30a37b5ba0b1ce42253129d3b6
>   ql/src/test/results/clientpositive/dbtxnmgr_showlocks.q.out 03c6724ec2e50ae1f7c642339c1806d0786a9ec5
>   standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CompactionInfoStruct.java 4aee45ce5f0e534823194bc84d13b88210ce0b3c
>   standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ShowCompactResponseElement.java 8a5682a013b24f8dcf7ad3fdb0b0b606d82cc7c0
>   standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php e8556dcea68f34336df2925c4108e71185d6377f
>   standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py b05e61e84a310273911ef592258bcd3b34e87734
>   standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb 868cf69200f69aa89e82b34e22ee0ad792e6d025
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 61a94fee4d82a714c12aeeb27f31e24774592c98
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionInfo.java ba45f3945274853fdc84487d93c4c00ff2982541
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java aded6f5486cc840f397347b39049310009fd3bad
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java da5dd61d08e2ca8fe5e80ffdf9fb4a6f4c4d0ba3
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java c2c97d96c6cc98f9746069fa725d17d12f6c8642
>   standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql 67102718867233f29ddb2ea8ec3fbcb6560c6c30
>   standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql ae0a32541a4bb9179b2bb71ae9f9098d7b35a88e
>   standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql 221d4f1fffb682aaec3af22a339e7a3077a75f6a
>   standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql bc98d5fc4a5637988c97f2e5a0e02d3be16ae0cb
>   standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql dd761a66db4826580a67d64879e4c85278b8e20c
>   standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql 6a040a6a64c2086b5eb68a397697c9e2d2ca4d76
>   standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql f5ec1ba1aff89d02b66d6a2cd1da8de1b3b08d06
>   standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql c7738be2732b839aa2b460733c092e368909f935
>   standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql 455f98b72578ff977e29301cd2fc595ae80ee4ca
>   standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql 5c39b0d9f4d27ab82ef44392818c1810cb7664ce
> 
> 
> Diff: https://reviews.apache.org/r/72028/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 72043: HIVE-21487: COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing appropriate indexes

2020-01-31 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72043/#review219380
---



Have you tested the SQL scripts on every DB?
If they run correctly on all supported DBs, then +1 from my side.

Thanks,
Peter

- Peter Vary


On Jan. 24, 2020, 10 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72043/
> ---
> 
> (Updated Jan. 24, 2020, 10 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21487: COMPLETED_COMPACTIONS and COMPACTION_QUEUE table missing 
> appropriate indexes
> 
> 
> Diffs
> -
> 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java da5dd61d08e2ca8fe5e80ffdf9fb4a6f4c4d0ba3
>   standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql 67102718867233f29ddb2ea8ec3fbcb6560c6c30
>   standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql ae0a32541a4bb9179b2bb71ae9f9098d7b35a88e
>   standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql 221d4f1fffb682aaec3af22a339e7a3077a75f6a
>   standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql bc98d5fc4a5637988c97f2e5a0e02d3be16ae0cb
>   standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql dd761a66db4826580a67d64879e4c85278b8e20c
>   standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql 6a040a6a64c2086b5eb68a397697c9e2d2ca4d76
>   standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql f5ec1ba1aff89d02b66d6a2cd1da8de1b3b08d06
>   standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql c7738be2732b839aa2b460733c092e368909f935
>   standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql 455f98b72578ff977e29301cd2fc595ae80ee4ca
>   standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql 5c39b0d9f4d27ab82ef44392818c1810cb7664ce
> 
> 
> Diff: https://reviews.apache.org/r/72043/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 72059: HIVE-22793: Update default settings in HMS Benchmarking tool

2020-01-30 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72059/#review219434
---


Ship it!




Ship It!

- Peter Vary


On Jan. 30, 2020, 9:35 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72059/
> ---
> 
> (Updated Jan. 30, 2020, 9:35 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22793: Update default settings in HMS Benchmarking tool
> 
> 
> Diffs
> -
> 
>   standalone-metastore/metastore-tools/metastore-benchmarks/README.md 
> a8c0a41f559261275aef48eb815c1b14f6cfdaed 
>   
> standalone-metastore/metastore-tools/tools-common/src/main/java/org/apache/hadoop/hive/metastore/tools/Constants.java
>  5a584f6adeccb5213916f23244277609ae8373cc 
> 
> 
> Diff: https://reviews.apache.org/r/72059/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 72028: HIVE-22729: Provide a failure reason for failed compactions

2020-01-29 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72028/#review219433
---



Could you please update the hive schema as well? Thanks, Peter
https://github.com/apache/hive/tree/master/metastore/scripts/upgrade/hive

- Peter Vary


On Jan. 29, 2020, 9:20 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72028/
> ---
> 
> (Updated Jan. 29, 2020, 9:20 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22729: Provide a failure reason for failed compactions
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/process/show/compactions/ShowCompactionsDesc.java
>  9348efc5a12b50f55f5952094882e941158405fd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/process/show/compactions/ShowCompactionsOperation.java
>  517d88237cc3b8f0316727bf1eebfc6535152fae 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java 
> 6f642901203ab73699ed694009d48ca77263fb10 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> dedc990d0f1e9123497f0fb7c7b9945c7b29bde2 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 
> 5aff71e0e981c429f85663300d3e5c21089529a9 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  e5895547e6006f30a37b5ba0b1ce42253129d3b6 
>   ql/src/test/results/clientpositive/dbtxnmgr_showlocks.q.out 
> 03c6724ec2e50ae1f7c642339c1806d0786a9ec5 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/CompactionInfoStruct.java
>  4aee45ce5f0e534823194bc84d13b88210ce0b3c 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ShowCompactResponseElement.java
>  8a5682a013b24f8dcf7ad3fdb0b0b606d82cc7c0 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-php/metastore/Types.php
>  e8556dcea68f34336df2925c4108e71185d6377f 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-py/hive_metastore/ttypes.py
>  b05e61e84a310273911ef592258bcd3b34e87734 
>   
> standalone-metastore/metastore-common/src/gen/thrift/gen-rb/hive_metastore_types.rb
>  868cf69200f69aa89e82b34e22ee0ad792e6d025 
>   standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift 
> 61a94fee4d82a714c12aeeb27f31e24774592c98 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionInfo.java
>  ba45f3945274853fdc84487d93c4c00ff2982541 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  aded6f5486cc840f397347b39049310009fd3bad 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
>  da5dd61d08e2ca8fe5e80ffdf9fb4a6f4c4d0ba3 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  c2c97d96c6cc98f9746069fa725d17d12f6c8642 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/hive-schema-4.0.0.derby.sql
>  67102718867233f29ddb2ea8ec3fbcb6560c6c30 
>   
> standalone-metastore/metastore-server/src/main/sql/derby/upgrade-3.2.0-to-4.0.0.derby.sql
>  ae0a32541a4bb9179b2bb71ae9f9098d7b35a88e 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/hive-schema-4.0.0.mssql.sql
>  221d4f1fffb682aaec3af22a339e7a3077a75f6a 
>   
> standalone-metastore/metastore-server/src/main/sql/mssql/upgrade-3.2.0-to-4.0.0.mssql.sql
>  bc98d5fc4a5637988c97f2e5a0e02d3be16ae0cb 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/hive-schema-4.0.0.mysql.sql
>  dd761a66db4826580a67d64879e4c85278b8e20c 
>   
> standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-3.2.0-to-4.0.0.mysql.sql
>  6a040a6a64c2086b5eb68a397697c9e2d2ca4d76 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/hive-schema-4.0.0.oracle.sql
>  f5ec1ba1aff89d02b66d6a2cd1da8de1b3b08d06 
>   
> standalone-metastore/metastore-server/src/main/sql/oracle/upgrade-3.2.0-to-4.0.0.oracle.sql
>  c7738be2732b839aa2b460733c092e368909f935 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/hive-schema-4.0.0.postgres.sql
>  455f98b72578ff977e29301cd2fc595ae80ee4ca 
>   
> standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-3.2.0-to-4.0.0.postgres.sql
>  5c39b0d9f4d27ab82ef44392818c1810cb7664ce 
> 
> 
> Diff: https://reviews.apache.org/r/72028/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71888: HIVE-22568: Process compaction candidates in parallel by the Initiator

2020-01-14 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71888/#review219251
---


Ship it!




Ship It!

- Peter Vary


On Dec. 6, 2019, 12:54 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71888/
> ---
> 
> (Updated Dec. 6, 2019, 12:54 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-22568
> https://issues.apache.org/jira/browse/HIVE-22568
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> `checkForCompaction` includes many file metadata checks and may be expensive. 
> Therefore, it makes sense to use a thread pool here and run 
> `checkForCompaction` in parallel.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4393a2825e 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 7a0e32463d 
>   ql/src/test/org/apache/hadoop/hive/ql/txn/compactor/TestInitiator.java 
> 564839324f 
> 
> 
> Diff: https://reviews.apache.org/r/71888/diff/1/
> 
> 
> Testing
> ---
> 
> unit test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>
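
The description quoted above proposes running the expensive per-candidate check on a thread pool. A minimal sketch of that idea follows, assuming a generic candidate type and a caller-supplied check; the class and method names (`ParallelCandidateCheck`, `selectInParallel`) and the `poolSize` parameter are hypothetical and are not taken from the patch or from Hive's `Initiator`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Predicate;

// Hypothetical stand-in for the Initiator loop; this is not Hive code.
class ParallelCandidateCheck {

  // Runs an expensive per-candidate check on a fixed-size pool and returns
  // the candidates for which the check passed, in input order.
  static <T> List<T> selectInParallel(List<T> candidates,
                                      Predicate<T> expensiveCheck,
                                      int poolSize) {
    ExecutorService pool = Executors.newFixedThreadPool(poolSize);
    try {
      // Submit every check first so they can overlap.
      List<CompletableFuture<Boolean>> futures = new ArrayList<>();
      for (T candidate : candidates) {
        futures.add(CompletableFuture.supplyAsync(
            () -> expensiveCheck.test(candidate), pool));
      }
      // Collect results; join() blocks until each check has finished.
      List<T> selected = new ArrayList<>();
      for (int i = 0; i < candidates.size(); i++) {
        if (futures.get(i).join()) {
          selected.add(candidates.get(i));
        }
      }
      return selected;
    } finally {
      pool.shutdown();
    }
  }
}
```

In this sketch the pool is created and shut down per call for simplicity; a long-running service thread would more likely keep one pool alive and size it from configuration.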



Re: Review Request 71949: HIVE-20934: ACID: Query based compactor for minor compaction

2020-01-08 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71949/#review219177
---


Ship it!




Ship It!

- Peter Vary


On Jan. 8, 2020, 10:40 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71949/
> ---
> 
> (Updated Jan. 8, 2020, 10:40 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20934: ACID: Query based compactor for minor compaction
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  PRE-CREATION 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  7ae33fadf7c8da83924e317287d160faf5364e3d 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  b7245e2c3570b362a00b65b23f3f84616d0a3d1e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 
> 33d723a02e28d69a69b88281038f69b5aecfe6a2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 
> dde8878769e10191c5ba61bd1ba44d9b16b172c1 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> 2ac6232460fedb8351b5f0cfae2ce2d0f2e2d948 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 
> 0a96fc30b359043293017b235a36cd044ddb176e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> ad6817c32bbfad1d27023b25912b1204f069a66a 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> e26794355ca1e73be5c103f8405c471da870bbe3 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 2253fda6c6ed5e2f70ef7c1166895eb49f600ea9 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> 38689ef86c607a36f8ec961a88578c13bfcd5b01 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 
> PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  9b8420902fb688b218fa432d70f71302f9f180e6 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> 1eab5b888deef2d0fb5c097941a1dafa51c7d46b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  41cb4b64fbc79dcf81919769c567b26a2e18cfe5 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForMmTable.java 
> d4c9121c9f17f8d083f1e1af1caf840678a3559d 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForOrcMmTable.java 
> d6435342aa1f56ba5495a657b4a43327fdc49645 
> 
> 
> Diff: https://reviews.apache.org/r/71949/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71949: HIVE-20934: ACID: Query based compactor for minor compaction

2020-01-08 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71949/#review219171
---




itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 1597 (original), 1602 (patched)


Question: Is it intentional that this writeBatch is not moved to the util?



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 1798 (original), 1675 (patched)


nit: still a formatting-only change?



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 1831 (original), 1685 (patched)


nit: Only formatting change?



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
Line 64 (original), 63 (patched)


nit: Only formatting change?



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
Line 70 (original), 68 (patched)


Why make this static if the initialization is still done in @Before?



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
Line 113 (original), 111 (patched)


nit: only formatting?



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
Line 174 (original), 171 (patched)


Formatting only, and there are more of these; I will not mark the others.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java
Lines 247 (patched)


Is it worth using a constant for "queryminorcomp" instead?



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java
Line 287 (original), 319 (patched)


formatting only changes


- Peter Vary


On Jan. 7, 2020, 2:24 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71949/
> ---
> 
> (Updated Jan. 7, 2020, 2:24 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20934: ACID: Query based compactor for minor compaction
> 
> 
> Diffs
> -
> 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  PRE-CREATION 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  445e39c260edc68f511550271a7ac471fae908fe 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  b7245e2c3570b362a00b65b23f3f84616d0a3d1e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 
> 33d723a02e28d69a69b88281038f69b5aecfe6a2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 
> 3c508ec6cf620aee6a7791c6ab52c331ad5ec6bd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> 2ac6232460fedb8351b5f0cfae2ce2d0f2e2d948 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 
> 0a96fc30b359043293017b235a36cd044ddb176e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> ad6817c32bbfad1d27023b25912b1204f069a66a 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 2b2cc1a2ba8377aa3681b1a3454a0d64369eef64 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 7a0e32463d28007cff5526ae037cc1447e50a50b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> 38689ef86c607a36f8ec961a88578c13bfcd5b01 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 
> PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  9b8420902fb688b218fa432d70f71302f9f180e6 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> 1eab5b888deef2d0fb5c097941a1dafa51c7d46b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  41cb4b64fbc79dcf81919769c567b26a2e18cfe5 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForMmTable.java 
> d4c9121c9f17f8d083f1e1af1caf840678a3559d 
>   ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForOrcMmTable.java 
> d6435342aa1f56ba5495a657b4a43327fdc49645 
> 
> 
> Diff: https://reviews.apache.org/r/71949/diff/2/
> 
> 
> Testing
> ---
> 
> 
> 

Re: Review Request 71963: HIVE-22700: Compactions may leak memory when unauthorized

2020-01-07 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71963/#review219145
---


Ship it!




Ship It!

- Peter Vary


On Jan. 7, 2020, 2:51 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71963/
> ---
> 
> (Updated Jan. 7, 2020, 2:51 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22700: Compactions may leak memory when unauthorized
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 7a0e32463d28007cff5526ae037cc1447e50a50b 
> 
> 
> Diff: https://reviews.apache.org/r/71963/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71949: HIVE-20934: ACID: Query based compactor for minor compaction

2020-01-07 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71949/#review219140
---




itests/hive-unit/pom.xml
Lines 440 (patched)


nit: formatting?
question: Do we really need Guava? I hate this dependency and, as a general 
rule, try to avoid it.



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 1 (original), 1 (patched)


Hard to review the changes because of the formatting differences... Let's 
talk



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 10 (original), 10 (patched)


nit: Do you know what this change is?



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 116 (original), 112 (patched)


' ' is needed after ','



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 198 (original), 191 (patched)


nit: unnecessary '+' in the middle



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 213 (original), 207 (patched)


nit: unnecessary '+'



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 259 (original), 252 (patched)


nit: unnecessary '+'



itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
Line 264 (original), 256 (patched)


'+' again



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java
Lines 282 (patched)


I do not see this in the original code. What is this for?



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java
Lines 1228 (patched)


Do we need this? With delete_delta we are not supposed to have synthetic 
rowIDs...



ql/src/test/org/apache/hadoop/hive/ql/TestTxnCommandsForMmTable.java
Lines 619 (patched)


nit: spaces...


- Peter Vary


On Jan. 4, 2020, 9:06 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71949/
> ---
> 
> (Updated Jan. 4, 2020, 9:06 a.m.)
> 
> 
> Review request for hive, Denys Kuzmenko, Karen Coppage, and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20934: ACID: Query based compactor for minor compaction
> 
> 
> Diffs
> -
> 
>   itests/hive-unit/pom.xml bc20cd6168dd61222c75fb866deada26328986dd 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorTestUtil.java
>  PRE-CREATION 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCompactor.java
>  445e39c260edc68f511550271a7ac471fae908fe 
>   
> itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
>  b7245e2c3570b362a00b65b23f3f84616d0a3d1e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/SplitGrouper.java 
> 33d723a02e28d69a69b88281038f69b5aecfe6a2 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 
> 3c508ec6cf620aee6a7791c6ab52c331ad5ec6bd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcRawRecordMerger.java 
> 2ac6232460fedb8351b5f0cfae2ce2d0f2e2d948 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSplit.java 
> 0a96fc30b359043293017b235a36cd044ddb176e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
> 20b0ccd94b5f08aa2c1dace1301a8315bd202bf7 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 2b2cc1a2ba8377aa3681b1a3454a0d64369eef64 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 7a0e32463d28007cff5526ae037cc1447e50a50b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> 38689ef86c607a36f8ec961a88578c13bfcd5b01 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MinorQueryCompactor.java 
> PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  9b8420902fb688b218fa432d70f71302f9f180e6 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> 1eab5b888deef2d0fb5c097941a1dafa51c7d46b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  

Re: Review Request 71844: HIVE-22554: ACID: Wait timeout for blocking compaction should be configurable

2019-11-29 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71844/#review218855
---


Ship it!




Ship It!

- Peter Vary


On Nov. 28, 2019, 1:49 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71844/
> ---
> 
> (Updated Nov. 28, 2019, 1:49 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22554: ACID: Wait timeout for blocking compaction should be configurable
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 
> 4393a2825e1f465781fc07a6678ebaa2bab906bd 
>   
> ql/src/java/org/apache/hadoop/hive/ql/ddl/table/storage/AlterTableCompactOperation.java
>  fd0ae3a3df731aa690d024dfdbf89f7754ca2a41 
> 
> 
> Diff: https://reviews.apache.org/r/71844/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71812: HIVE-22534: ACID: Improve Compactor thread logging

2019-11-27 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71812/#review218820
---



Since everybody can comment on logging, I have a few comments :)
Thanks!


ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
Lines 24-25 (patched)


Is this needed?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
Lines 122 (patched)


Maybe log the exception?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
Lines 140 (patched)


Log the exception



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java
Line 234 (original), 239 (patched)


There were other occurrences of logging InterruptedException; there we 
decided not to log the full exception, just the message. Maybe we should handle 
InterruptedException the same way everywhere: one or the other (I prefer the 
full stack, but whatever).
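
One way to keep InterruptedException handling uniform is to route every catch block through a single helper, so the "full stack or just the message" decision is made in one place. The sketch below is only an illustration: `InterruptHandling`, `onInterrupted` and `sleepQuietly` are hypothetical names, `java.util.logging` is used only to keep the example self-contained (the patch's own logger may differ), and restoring the interrupt flag is an assumption about what callers want, not something the patch is known to do.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; not part of the reviewed patch.
class InterruptHandling {
  private static final Logger LOG =
      Logger.getLogger(InterruptHandling.class.getName());

  // Single place that decides how an InterruptedException is reported.
  // Passing the exception as the last argument logs the full stack trace;
  // logging e.getMessage() instead would be the "message only" variant.
  static void onInterrupted(String what, InterruptedException e) {
    LOG.log(Level.WARNING, "Interrupted while " + what, e);
    // Assumption: re-set the flag so code further up can still see it.
    Thread.currentThread().interrupt();
  }

  // Example caller: sleeps, and reports an interrupt via the shared helper.
  // Returns true if the sleep completed, false if it was interrupted.
  static boolean sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
      return true;
    } catch (InterruptedException e) {
      onInterrupted("sleeping between checks", e);
      return false;
    }
  }
}
```

With this shape, switching between the two logging styles is a one-line change in `onInterrupted` rather than an edit to every catch block.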


- Peter Vary


On Nov. 25, 2019, 12:18 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71812/
> ---
> 
> (Updated Nov. 25, 2019, 12:18 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22534: ACID: Improve Compactor thread logging
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> ee2c0f3e23ed716f3de0a2740a96a7ec39251bc2 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> 10681c0202a32c338e58b3e2eede03657a00774f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  f7e0a85c1f595bb4f112aa051779db3f00c8e572 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> 80119de22f602d9e3cb7a1f60b48e05a37c6a047 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  41cb4b64fbc79dcf81919769c567b26a2e18cfe5 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Worker.java 
> 3270175a80992e0efb1e0bfd1f33ffd8a96fcf87 
> 
> 
> Diff: https://reviews.apache.org/r/71812/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71775: HIVE-22280: Q tests for partitioned temporary tables

2019-11-27 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71775/#review218819
---


Ship it!




Ship It!

- Peter Vary


On Nov. 20, 2019, 4:25 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71775/
> ---
> 
> (Updated Nov. 20, 2019, 4:25 p.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22280: Q tests for partitioned temporary tables
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties 
> 2918a6852c6f8448ea44472df0be9d521d5c3b27 
>   ql/src/test/queries/clientnegative/temp_table_addpart1.q PRE-CREATION 
>   
> ql/src/test/queries/clientnegative/temp_table_alter_rename_partition_failure.q
>  PRE-CREATION 
>   
> ql/src/test/queries/clientnegative/temp_table_alter_rename_partition_failure2.q
>  PRE-CREATION 
>   
> ql/src/test/queries/clientnegative/temp_table_alter_rename_partition_failure3.q
>  PRE-CREATION 
>   ql/src/test/queries/clientnegative/temp_table_drop_partition_failure.q 
> PRE-CREATION 
>   
> ql/src/test/queries/clientnegative/temp_table_drop_partition_filter_failure.q 
> PRE-CREATION 
>   ql/src/test/queries/clientnegative/temp_table_exchange_partitions.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_add_part_exist.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_add_part_multiple.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_add_part_with_loc.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_alter_partition_change_col.q 
> PRE-CREATION 
>   
> ql/src/test/queries/clientpositive/temp_table_alter_partition_clusterby_sortby.q
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_alter_partition_coltype.q 
> PRE-CREATION 
>   
> ql/src/test/queries/clientpositive/temp_table_alter_partition_onto_nocurrent_db.q
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_alter_rename_partition.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_avro_partitioned.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_avro_partitioned_native.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_default_partition_name.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_drop_multi_partitions.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_drop_partitions_filter.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_drop_partitions_filter2.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_drop_partitions_filter3.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_drop_partitions_filter4.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_exchange_partition.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_exchange_partition2.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_exchange_partition3.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_exchgpartition2lel.q 
> PRE-CREATION 
>   
> ql/src/test/queries/clientpositive/temp_table_insert1_overwrite_partitions.q 
> PRE-CREATION 
>   
> ql/src/test/queries/clientpositive/temp_table_insert2_overwrite_partitions.q 
> PRE-CREATION 
>   
> ql/src/test/queries/clientpositive/temp_table_insert_values_dynamic_partitioned.q
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_insert_values_partitioned.q 
> PRE-CREATION 
>   
> ql/src/test/queries/clientpositive/temp_table_insert_with_move_files_from_source_dir.q
>  PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_llap_partitioned.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_load_dyn_part1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_loadpart1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_loadpart2.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_merge_dynamic_partition.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_merge_dynamic_partition2.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_merge_dynamic_partition3.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_merge_dynamic_partition4.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_merge_dynamic_partition5.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_multi_insert_partitioned.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_orc_diff_part_cols.q 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/temp_table_orc_diff_part_cols2.q 
> PRE-CREATION 
>   
> 

Re: Review Request 71792: COMPLETED_TXN_COMPONENTS table is never cleaned up unless Compactor runs

2019-11-22 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71792/#review218762
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
Lines 83 (patched)


Please use the constants here.


- Peter Vary


On Nov. 21, 2019, 5:35 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71792/
> ---
> 
> (Updated Nov. 21, 2019, 5:35 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21917
> https://issues.apache.org/jira/browse/HIVE-21917
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The Initiator thread in the metastore repeatedly loops over entries in the 
> COMPLETED_TXN_COMPONENTS table to determine which partitions / tables might 
> need to be compacted. However, entries are never removed from this table 
> except by a completed Compactor run.
> 
> In a cluster where most tables / partitions are write-once read-many, this 
> results in stale entries in this table never being cleaned up. In a small 
> test cluster, we have observed approximately 45k entries in this table 
> (virtually equal to the number of partitions in the cluster) while < 100 of 
> these tables have delta files at all. Since most of the tables will never get 
> enough writes to trigger a compaction (and in fact have only ever been 
> written to once), the initiator thread keeps trying to evaluate them on every 
> loop.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java 
> 610cf05204 
>   
> ql/src/test/org/apache/hadoop/hive/metastore/txn/TestCompactionTxnHandler.java
>  b28b57779b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/CompactionTxnHandler.java
>  8253ccb9c9 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java
>  268038795b 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnStore.java
>  e840758c9d 
> 
> 
> Diff: https://reviews.apache.org/r/71792/diff/2/
> 
> 
> Testing
> ---
> 
> Unit tests
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71763: HIVE-22484: Remove Calls to printStackTrace

2019-11-13 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71763/#review218620
---




ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java
Line 646 (original)


Don't we want to LOG the Exception, at least at debug level?



ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java
Line 659 (original)


Don't we want to LOG the Exception, at least at debug level?



ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java
Line 673 (original)


Don't we want to LOG the Exception, at least at debug level?



ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java
Line 307 (original)


Don't we want to log the exception, at least at debug level?



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
Line 1888 (original), 1888 (patched)


Don't we want to log the exception, at least at debug level?
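
A minimal sketch of what these comments ask for: instead of calling printStackTrace() or silently swallowing the exception, pass it to the logger at a debug-like level so the stack trace is kept but stays out of normal output. The class and method names are hypothetical and not taken from the patch, and java.util.logging (Level.FINE) is used only to keep the example self-contained; the actual code may well use a different logging API.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration; not part of the patch under review.
final class QuietParser {
  private static final Logger LOG = Logger.getLogger(QuietParser.class.getName());

  // Returns the parsed value, or the fallback when the input is not a number.
  static int parseOrDefault(String raw, int fallback) {
    try {
      return Integer.parseInt(raw.trim());
    } catch (NumberFormatException e) {
      // Instead of e.printStackTrace(): keep the stack trace, but only at a debug-like level.
      LOG.log(Level.FINE, "Could not parse '" + raw + "', using " + fallback, e);
      return fallback;
    }
  }
}
```

Passing the exception object as the last argument is what preserves the stack trace; concatenating e.getMessage() into the message would lose it.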


- Peter Vary


On Nov. 13, 2019, 5:22 p.m., David Mollitor wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71763/
> ---
> 
> (Updated Nov. 13, 2019, 5:22 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22484: Remove Calls to printStackTrace
> 
> 
> Diffs
> -
> 
>   jdbc/src/java/org/apache/hive/jdbc/HiveBaseResultSet.java dfaa40fe23 
>   jdbc/src/java/org/apache/hive/jdbc/HiveDriver.java 102683ee18 
>   ql/src/java/org/apache/hadoop/hive/ql/QueryPlan.java 0d7b92d649 
>   ql/src/java/org/apache/hadoop/hive/ql/ddl/DDLSemanticAnalyzerFactory.java 
> c8aaec15d4 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 9ad4e71482 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableDummyOperator.java 
> e8f7dd067e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1aae142ba7 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ScriptOperator.java 3210ca5cf8 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java a7770b4e53 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java cd4f2a02a3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/CustomPartitionVertex.java 
> dfabfb81e5 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 077c94f82b 
>   ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistoryViewer.java 
> 616f2d6c10 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java 67996c6db9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java 3e45e45b27 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java
>  3e81ab5959 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/SortMergeJoinTaskDispatcher.java
>  fbf6852013 
>   serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDe.java 
> 948cddcb28 
>   
> serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeTypeMap.java
>  3f086cdde4 
>   
> serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeTypeSet.java
>  f41959b7d2 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 
> b87b670652 
> 
> 
> Diff: https://reviews.apache.org/r/71763/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> David Mollitor
> 
>



Re: Review Request 71671: HIVE-22401: Refactor CompactorMR

2019-10-28 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71671/#review218420
---



Thanks for the patch.
A nit and a question.


ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java
Line 239 (original), 220 (patched)


Could we remove the extra spaces while we are here, please?



ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
Lines 85-87 (patched)


Maybe this should be checked outside? Isn't this something general? Or am I 
mistaken?


- Peter Vary


On Oct. 24, 2019, 3:57 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71671/
> ---
> 
> (Updated Oct. 24, 2019, 3:57 p.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22401: Refactor CompactorMR
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/CompactorMR.java 
> 0f1579aa542f83b68f2efc92e08e6c0a32bd113d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MajorQueryCompactor.java 
> PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/MmMajorQueryCompactor.java
>  PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactor.java 
> PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/QueryCompactorFactory.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71671/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-17 Thread Peter Vary via Review Board


> On Oct. 17, 2019, 11:10 a.m., Peter Vary wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Line 992 (original), 1003 (patched)
> > 
> >
> > Why is this a List of Pairs, why not just a Map? Is the order important?
> 
> Denys Kuzmenko wrote:
> I have no idea :) Originally I refactored it to a Map and then reverted 
> back to a List of Pairs, as it required some more refactoring in Driver. If you 
> are raising the same concern, it probably makes sense to change this.

It might be worth cleaning this up here or in a follow-up; this seems very 
strange to me too.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218258
---


On Oct. 17, 2019, 8:41 a.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 17, 2019, 8:41 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/3/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> File Attachments
> 
> 
> HIVE-21114.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-17 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218258
---



nits and a single question


ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Lines 956 (patched)


Remove this " + " please :)



ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Line 985 (original), 989 (patched)


nit: I hate your naming :) tableCol is table column for me :D



ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Line 992 (original), 1003 (patched)


Why is this a List of Pairs, why not just a Map? Is the order important?
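
To make the trade-off behind this question concrete, here is a small sketch with hypothetical table names and values (nothing here is taken from Driver.java). A list of pairs keeps duplicates and insertion order; a map collapses duplicate keys, and only an order-preserving map such as LinkedHashMap keeps the iteration order.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the keys and values are made up.
final class PairsVersusMap {
  // Converts a list of pairs to a map; later pairs with the same key overwrite earlier ones.
  static Map<String, String> toOrderedMap(List<Map.Entry<String, String>> pairs) {
    Map<String, String> result = new LinkedHashMap<>();
    for (Map.Entry<String, String> pair : pairs) {
      result.put(pair.getKey(), pair.getValue());
    }
    return result;
  }

  // A list of pairs may hold the same key more than once and keeps insertion order.
  static List<Map.Entry<String, String>> samplePairs() {
    List<Map.Entry<String, String>> pairs = new ArrayList<>();
    pairs.add(new SimpleEntry<>("db.t2", "w2"));
    pairs.add(new SimpleEntry<>("db.t1", "w1"));
    pairs.add(new SimpleEntry<>("db.t2", "w3")); // duplicate key
    return pairs;
  }
}
```

If neither duplicates nor ordering matter to the callers, a plain Map is the simpler choice; if ordering matters but duplicates do not, LinkedHashMap covers it.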


- Peter Vary


On Oct. 17, 2019, 8:41 a.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 17, 2019, 8:41 a.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 91910d1c0c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/3/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> File Attachments
> 
> 
> HIVE-21114.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-14 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218201
---




ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
Lines 48 (patched)


Ohh, and another important one: As -> AS :D


- Peter Vary


On Oct. 10, 2019, 4:09 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 10, 2019, 4:09 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/2/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> File Attachments
> 
> 
> HIVE-21114.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-14 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218200
---



+1 pending tests.
And some nits, just to be constructive :D :D :D


ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Line 355 (original), 356 (patched)


Unnecessary change



ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
Lines 34 (patched)


Maybe: WITH a AS (SELECT...
But this is really the nit of nits :D :D :D



ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
Lines 41 (patched)


Maybe: WITH a AS (SELECT...
But this is really the nit of nits :D :D :D


- Peter Vary


On Oct. 10, 2019, 4:09 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 10, 2019, 4:09 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/2/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> File Attachments
> 
> 
> HIVE-21114.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2019/10/10/0929ed4a-17be-4098-8c61-0819a30613fd__HIVE-21114.1.patch
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-11 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/#review218190
---


Ship it!




Ship It!

- Peter Vary


On Oct. 10, 2019, 11:39 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71606/
> ---
> 
> (Updated Oct. 10, 2019, 11:39 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21407
> https://issues.apache.org/jira/browse/HIVE-21407
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The previous approach didn't solve all use cases. In this new approach, the 
> Hive type is passed to the Parquet PPD code, which trims the value pushed 
> to the predicate when the Hive type is CHAR.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
>  5b051dd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
> fc9188f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
> 033e26a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
>  ca5e085 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
>  0210a0a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
>  7c7c657 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
> 4c40908 
>   ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
>   ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71606/diff/2/
> 
> 
> Testing
> ---
> 
> Added new q test for testing the PPD for char and varchar types. Also 
> extended the unit tests for the 
> ParquetFilterPredicateConverter.toFilterPredicate method.
> 
> 
> The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are 
> both testing the same thing, the behavior of the 
> ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make 
> sense to have tests for the same use case in different test classes, so moved 
> the test cases from the TestParquetRecordReaderWrapper to 
> TestParquetFilterPredicate.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Peter Vary via Review Board


> On Oct. 10, 2019, 7:46 a.m., Peter Vary wrote:
> > ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
> > Lines 47 (patched)
> > 
> >
> > What about CREATE TABLE AS SELECT * FROM...?
> > We still might miss some cases.
> > 
> > I would add an assertion that no writeId is generated for read-only 
> > transactions (this would be useful anyway), and use ptest to run 
> > all of the test cases to see whether this assertion breaks anything.
> > 
> > What do you think?
> 
> Denys Kuzmenko wrote:
> Hi Peter, yes, I was thinking about something similar; however, most 
> probably it would be a one-time check (it won't be committed). 
> Somewhere in the Driver.compile method, after SemanticAnalyzer, add an assert 
> that a transaction marked as ReadOnly doesn't have associated write IDs. 
> This way we could make sure we do not misclassify some of the existing 
> queries. Does this make sense?

I would vote for the check to be committed for several reasons:
- We might cause strange/flaky errors if we assume that a transaction is RO, but 
in reality it writes something. This is easier to catch if we fail fast.
- When introducing new commands, it would be easy to forget to update this 
check, but if the assertion is there we will catch them at compile time; again, 
fail fast.

Just my 2 cents
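
The fail-fast check argued for above could look roughly like the following sketch. It is a hypothetical stand-in for a transaction manager: the class, enum, and method names are illustrative assumptions and are not Hive's actual API. The point is only that a transaction classified as read-only throws as soon as a write ID is requested for it, so a misclassified statement fails immediately instead of producing flaky behavior later.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a transaction manager; none of these names come from Hive.
final class SketchTxn {
  enum Type { DEFAULT, READ_ONLY }

  private final Type type;
  private final Map<String, Long> writeIds = new HashMap<>();
  private long nextWriteId = 1;

  SketchTxn(Type type) {
    this.type = type;
  }

  // Allocates (or returns the existing) write ID for a table; fails fast for read-only txns.
  long getTableWriteId(String table) {
    if (type == Type.READ_ONLY) {
      throw new IllegalStateException(
          "Read-only transaction requested a write ID for " + table);
    }
    return writeIds.computeIfAbsent(table, t -> nextWriteId++);
  }
}
```

Keeping such a check in the committed code, rather than running it once, means a newly added write command that is wrongly classified as read-only is caught by the first test that exercises it.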


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218174
---


On Oct. 8, 2019, 2:27 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 8, 2019, 2:27 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/1/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71606: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-10-10 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71606/#review218175
---



Thanks for chasing this down!
Really appreciate it!


ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
Lines 157 (patched)


Is this the best way to check this?
Does it always start with char? Or CHAR? Is anything else not possible?



ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
Lines 181 (patched)


I do not like this.
Either we only aim for spaces, or we aim for whitespace characters in general, 
but the check and the replace are inconsistent with each other.
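
A sketch of one consistent option, assuming CHAR padding consists only of ASCII spaces: both the check and the stripping look at the space character alone, so they can never disagree about tabs or newlines. The class and method names are hypothetical; the patched code in LeafFilterFactory is not shown in this thread and may differ.

```java
// Hypothetical helper for illustration; not the code from the patch.
final class CharPadding {
  // Strips only trailing ASCII spaces (U+0020); tabs and newlines are kept.
  static String stripTrailingSpaces(String value) {
    int end = value.length();
    while (end > 0 && value.charAt(end - 1) == ' ') {
      end--;
    }
    return value.substring(0, end);
  }

  // True when the value ends with at least one space, i.e. stripping would change it.
  static boolean hasTrailingSpaces(String value) {
    return !value.isEmpty() && value.charAt(value.length() - 1) == ' ';
  }
}
```

The alternative is equally valid as long as it is applied on both sides: test with Character.isWhitespace and strip with the same predicate.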


- Peter Vary


On Oct. 10, 2019, 8:44 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71606/
> ---
> 
> (Updated Oct. 10, 2019, 8:44 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21407
> https://issues.apache.org/jira/browse/HIVE-21407
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The previous approach didn't solve all use cases. In this new approach, the 
> Hive type is passed to the Parquet PPD code, which trims the value pushed 
> to the predicate when the Hive type is CHAR.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/FilterPredicateLeafBuilder.java
>  5b051dd 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
> fc9188f 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java 
> 033e26a 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetFilterPredicateConverter.java
>  ca5e085 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
>  0210a0a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
>  7c7c657 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/sarg/TestConvertAstToSearchArg.java 
> 4c40908 
>   ql/src/test/queries/clientpositive/parquet_ppd_char.q 386fb25 
>   ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71606/diff/1/
> 
> 
> Testing
> ---
> 
> Added new q test for testing the PPD for char and varchar types. Also 
> extended the unit tests for the 
> ParquetFilterPredicateConverter.toFilterPredicate method.
> 
> 
> The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are 
> both testing the same thing, the behavior of the 
> ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make 
> sense to have tests for the same use case in different test classes, so moved 
> the test cases from the TestParquetRecordReaderWrapper to 
> TestParquetFilterPredicate.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 71589: Create read-only transactions

2019-10-10 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71589/#review218174
---



One more question, about finding a way to identify write queries.
Otherwise, as discussed, I do not see a better way to check the type of the 
transaction :(


ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java
Lines 47 (patched)


What about CREATE TABLE AS SELECT * FROM...?
We still might miss some cases.

I would add an assertion that no writeId is generated for read-only 
transactions (this would be useful anyway), and use ptest to run all of 
the test cases to see whether this assertion breaks anything.

What do you think?


- Peter Vary


On Oct. 8, 2019, 2:27 p.m., Denys Kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71589/
> ---
> 
> (Updated Oct. 8, 2019, 2:27 p.m.)
> 
> 
> Review request for hive, Laszlo Pinter and Peter Vary.
> 
> 
> Bugs: HIVE-21114
> https://issues.apache.org/jira/browse/HIVE-21114
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> With HIVE-21036 we have a way to indicate that a txn is read only.
> We should (at least in auto-commit mode) determine if the single stmt is a 
> read and mark the txn accordingly.
> Then we can optimize TxnHandler.commitTxn() so that it doesn't do any checks 
> in write_set etc.
> 
> TxnHandler.commitTxn() already starts with lockTransactionRecord(stmt, txnid, 
> TXN_OPEN) so it can read the txn type in the same SQL stmt.
> 
> HiveOperation only has QUERY, which includes Insert and Select, so this 
> requires figuring out how to determine if a query is a SELECT. By the time 
> Driver.openTransaction(); is called, we have already parsed the query so 
> there should be a way to know if the statement only reads.
> 
> For multi-stmt txns (once these are supported) we should allow user to 
> indicate that a txn is read-only and then not allow any statements that can 
> make modifications in this txn. This should be a different jira.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java bcd4600683 
>   ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java fcf499d53a 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DbTxnManager.java 943aa383bb 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/DummyTxnManager.java 
> ac813c8288 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveTxnManager.java 
> 1c53426966 
>   ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager.java 
> cc86afedbf 
>   ql/src/test/org/apache/hadoop/hive/ql/parse/TestParseUtils.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/71589/diff/1/
> 
> 
> Testing
> ---
> 
> Unit + manual test
> 
> 
> Thanks,
> 
> Denys Kuzmenko
> 
>



Re: Review Request 71574: HIVE-22212: Implement append partition related methods on temporary tables

2019-10-02 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71574/#review218019
---


Ship it!




Ship It!

- Peter Vary


On Oct. 2, 2019, 11:49 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71574/
> ---
> 
> (Updated Oct. 2, 2019, 11:49 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22212: Implement append partition related methods on temporary tables
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  2467ee3cfb9af0d64653ad7a012ee6d3e68d6674 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAppendPartitionTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAppendPartitions.java
>  e2593fecdce67a2874a14065c9247428ae8852d8 
> 
> 
> Diff: https://reviews.apache.org/r/71574/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71550: HIVE-22137: Implement alter/rename partition related methods on temporary tables

2019-10-02 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71550/#review218017
---


Ship it!




Just one minor comment, we can fix that in a follow-up jira


ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAlterPartitionsTempTable.java
Lines 183-224 (patched)


I think in a follow-up jira we should move these tests to the original 
TestAlterPartitions test.


- Peter Vary


On Sept. 26, 2019, 4:54 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71550/
> ---
> 
> (Updated Sept. 26, 2019, 4:54 p.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22137: Implement alter/rename partition related methods on temporary 
> tables
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/PartitionTree.java 
> c84c3ef595a5f26232d4f003e46f74bb14a7ec99 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  a5b16d12a514439046929dde031bbe8e80f71a28 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/TempTable.java 
> fa6dddcbadec59579357b9bc3d4ea42e44a1ca6f 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAlterPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
>  6c7d80ea576d6b10af191d8ceb158c20b1f70b46 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAlterPartitions.java
>  4fc3688f2e5c06e254e2177dbf142b424dcfddd8 
> 
> 
> Diff: https://reviews.apache.org/r/71550/diff/2/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71558: HIVE-21987: Hive is unable to read Parquet int32 annotated with decimal

2019-09-30 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71558/#review217991
---


Ship it!




Ship It!

- Peter Vary


On Sept. 30, 2019, 11:53 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71558/
> ---
> 
> (Updated Sept. 30, 2019, 11:53 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21987
> https://issues.apache.org/jira/browse/HIVE-21987
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Added support to read INT32 Parquet decimals.
> 
> 
> Diffs
> -
> 
>   data/files/parquet_int_decimal_1.parquet PRE-CREATION 
>   data/files/parquet_int_decimal_2.parquet PRE-CREATION 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 
> 350ae2d 
>   
> ql/src/java/org/apache/hadoop/hive/ql/io/parquet/vector/ParquetDataColumnReaderFactory.java
>  320ce52 
>   ql/src/test/queries/clientpositive/parquet_int_decimal.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_int_decimal.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/type_change_test_fraction.q.out 07cf8fa 
> 
> 
> Diff: https://reviews.apache.org/r/71558/diff/1/
> 
> 
> Testing
> ---
> 
> Added new q tests for the use-case.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 71550: HIVE-22137: Implement alter/rename partition related methods on temporary tables

2019-09-26 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71550/#review217953
---




ql/src/java/org/apache/hadoop/hive/ql/metadata/PartitionTree.java
Lines 208-210 (patched)


What happens when only one of the alter partition calls fails? How is this 
handled for the tables handled by HMS?



ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAlterPartitionsTempTable.java
Lines 195 (patched)


What was the original error in the HMS version?
Why is it different here?



ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAlterPartitionsTempTable.java
Lines 201 (patched)


What was the original error in the HMS version?
Why is it different here?


- Peter Vary


On Sept. 26, 2019, 4:54 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71550/
> ---
> 
> (Updated Sept. 26, 2019, 4:54 p.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22137: Implement alter/rename partition related methods on temporary 
> tables
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/PartitionTree.java 
> c84c3ef595a5f26232d4f003e46f74bb14a7ec99 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  a5b16d12a514439046929dde031bbe8e80f71a28 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/TempTable.java 
> fa6dddcbadec59579357b9bc3d4ea42e44a1ca6f 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientAlterPartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestAlterPartitions.java
>  4fc3688f2e5c06e254e2177dbf142b424dcfddd8 
> 
> 
> Diff: https://reviews.apache.org/r/71550/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71506: HIVE-22084: Implement exchange partitions related methods on temporary tables.

2019-09-23 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71506/#review217910
---



Thanks Laszlo!
A few quick questions below.
We might want to test the cross-exchange methods as well (exchanging partitions 
between temp and non-temp tables).

Otherwise LGTM


ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 1580 (patched)


What is the error message when we want to exchange partition between a temp 
table and a normal table? Maybe we want to have a specific error message 
instead?



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 1601 (patched)


What is the error message when we want to exchange partition between a temp 
table and a normal table? Maybe we want to have a specific error message 
instead?



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 1652-1653 (patched)


Why are we doing it in 2 steps?


- Peter Vary


On Sept. 18, 2019, 2:10 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71506/
> ---
> 
> (Updated Sept. 18, 2019, 2:10 p.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-22084: Implement exchange partitions related methods on temporary tables.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  506bf5dfa8404f3645b8c5db7ea19c0b4add33a7 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientExchangePartitionsTempTable.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestExchangePartitions.java
>  1a2b7e4f9f7562e84006e86061b482d1b535197c 
> 
> 
> Diff: https://reviews.apache.org/r/71506/diff/3/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 71243: HIVE-21875: Implement drop partition related methods on temporary tables.

2019-09-16 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71243/#review217752
---



Fix it and ship it


ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 1524-1529 (patched)


Please move these checks after the assertTempTablePartitioned call.



ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientDropPartitionsTempTable.java
Lines 139-144 (patched)


Please ensure that temp tables behave the same way as normal tables 
here.



service/src/java/org/apache/hive/service/cli/thrift/ThreadPoolExecutorWithOomHook.java
Line 57 (original)


Please check this change


- Peter Vary


On Aug. 7, 2019, 8:35 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71243/
> ---
> 
> (Updated Aug. 7, 2019, 8:35 a.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21875: Implement drop partition related methods on temporary tables.
> 
> This is one of the subtasks for HIVE-21765, to support partitions on 
> temporary tables.
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  a2c84b4620fb1eb90069e294204f604565ffed9b 
>   
> ql/src/test/org/apache/hadoop/hive/ql/metadata/TestSessionHiveMetastoreClientDropPartitionsTempTable.java
>  PRE-CREATION 
>   
> service/src/java/org/apache/hive/service/cli/thrift/ThreadPoolExecutorWithOomHook.java
>  129413681045d790ae20bf4d8060f04162224565 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestDropPartitions.java
>  91c9edac95e9f9688dd2b806aeeb0823af02574f 
> 
> 
> Diff: https://reviews.apache.org/r/71243/diff/1/
> 
> 
> Testing
> ---
> 
> Unit testing is done via 
> TestSessionHiveMetastoreClientDropPartitionsTempTable.java
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 70474: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-05-09 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70474/#review215157
---


Ship it!




Ship It!

- Peter Vary


On May 9, 2019, 7:51 a.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70474/
> ---
> 
> (Updated May 9, 2019, 7:51 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21407
> https://issues.apache.org/jira/browse/HIVE-21407
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The idea behind the patch is that, for CHAR columns, the predicate which is 
> pushed to Parquet is extended with an “or” clause which contains the same 
> expression with a padded and a stripped value.
> Example:
> column c is a CHAR(10) type and the search expression is c='apple'
> The predicate which is pushed to Parquet looked like c='apple ' before the 
> patch and it would look like (c='apple ' or c='apple') after the patch.
> Since the value 'apple' is stored in Parquet without padding, the predicate 
> before the patch didn’t return any rows. With the patch it will return the 
> correct row. 
> Since on predicate level, there is no distinction between CHAR or VARCHAR, 
> the predicates for VARCHARs will be changed as well, so the result set 
> returned from Parquet will be wider than before.
> Example:
> A table contains a c VARCHAR(10) column and there is a row where c='apple' 
> and there is another row where c='apple '. If the search expression is 
> c='apple ', both rows will be returned from Parquet after the patch. But 
> since Hive is doing an additional filtering on the rows returned from 
> Parquet, it won’t be a problem, the result set returned by Hive will contain 
> only the row with the value 'apple '.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
> be4c0d5 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
>  0210a0a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
>  d464046 
>   ql/src/test/queries/clientpositive/parquet_ppd_char.q 4230d8c 
>   ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70474/diff/2/
> 
> 
> Testing
> ---
> 
> Added new q test for testing the PPD for char and varchar types. Also 
> extended the unit tests for the 
> ParquetFilterPredicateConverter.toFilterPredicate method.
> 
> The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are 
> both testing the same thing, the behavior of the 
> ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make 
> sense to have tests for the same use case in different test classes, so moved 
> the test cases from the TestParquetRecordReaderWrapper to 
> TestParquetFilterPredicate.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>
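
The CHAR handling described in the quoted patch description can be sketched 
in plain Java as below. This is only an illustration under stated 
assumptions: strings stand in for Parquet column values, and the class and 
method names (CharPushdownSketch, pad, stripTrailing, original, widened, 
scan) are hypothetical and are not taken from LeafFilterFactory or any other 
file in the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch: plain strings stand in for Parquet column values.
// This is not the LeafFilterFactory code from the patch.
final class CharPushdownSketch {

  // Right-pad a value with spaces to the declared CHAR(n) length.
  static String pad(String value, int length) {
    StringBuilder sb = new StringBuilder(value);
    while (sb.length() < length) {
      sb.append(' ');
    }
    return sb.toString();
  }

  // Remove trailing spaces only; leading and inner spaces are kept.
  static String stripTrailing(String value) {
    int end = value.length();
    while (end > 0 && value.charAt(end - 1) == ' ') {
      end--;
    }
    return value.substring(0, end);
  }

  // Predicate as described for "before the patch": exact match on the literal.
  static Predicate<String> original(String literal) {
    return stored -> stored.equals(literal);
  }

  // Predicate as described for "after the patch": literal OR its stripped form.
  static Predicate<String> widened(String literal) {
    String stripped = stripTrailing(literal);
    return stored -> stored.equals(literal) || stored.equals(stripped);
  }

  // Rows the storage layer would hand back for a pushed-down predicate.
  static List<String> scan(List<String> storedValues, Predicate<String> pushed) {
    List<String> rows = new ArrayList<>();
    for (String value : storedValues) {
      if (pushed.test(value)) {
        rows.add(value);
      }
    }
    return rows;
  }
}
```

For the VARCHAR case in the description, scanning with widened("apple ") 
returns both stored values, and applying original("apple ") to that result 
plays the role of the additional filtering done by Hive, which leaves only 
the row with the trailing space.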



Re: Review Request 70474: HIVE-21407: Parquet predicate pushdown is not working correctly for char column types

2019-04-15 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70474/#review214659
---


Fix it, then Ship it!




Just one little nit.
Otherwise LGTM +1


ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java
Lines 150 (patched)


please remove extra space


- Peter Vary


On April 14, 2019, 1:03 p.m., Marta Kuczora wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70474/
> ---
> 
> (Updated April 14, 2019, 1:03 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Bugs: HIVE-21407
> https://issues.apache.org/jira/browse/HIVE-21407
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The idea behind the patch is that, for CHAR columns, the predicate which is 
> pushed to Parquet is extended with an “or” clause which contains the same 
> expression with a padded and a stripped value.
> Example:
> column c is a CHAR(10) type and the search expression is c='apple'
> The predicate which is pushed to Parquet looked like c='apple ' before the 
> patch and it would look like (c='apple ' or c='apple') after the patch.
> Since the value 'apple' is stored in Parquet without padding, the predicate 
> before the patch didn’t return any rows. With the patch it will return the 
> correct row. 
> Since on predicate level, there is no distinction between CHAR or VARCHAR, 
> the predicates for VARCHARs will be changed as well, so the result set 
> returned from Parquet will be wider than before.
> Example:
> A table contains a c VARCHAR(10) column and there is a row where c='apple' 
> and there is another row where c='apple '. If the search expression is 
> c='apple ', both rows will be returned from Parquet after the patch. But 
> since Hive is doing an additional filtering on the rows returned from 
> Parquet, it won’t be a problem, the result set returned by Hive will contain 
> only the row with the value 'apple '.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/io/parquet/LeafFilterFactory.java 
> be4c0d5 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestParquetRecordReaderWrapper.java
>  0210a0a 
>   
> ql/src/test/org/apache/hadoop/hive/ql/io/parquet/read/TestParquetFilterPredicate.java
>  d464046 
>   ql/src/test/queries/clientpositive/parquet_ppd_char.q 4230d8c 
>   ql/src/test/queries/clientpositive/parquet_ppd_char2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/parquet_ppd_char2.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/70474/diff/1/
> 
> 
> Testing
> ---
> 
> Added new q test for testing the PPD for char and varchar types. Also 
> extended the unit tests for the 
> ParquetFilterPredicateConverter.toFilterPredicate method.
> 
> The TestParquetRecordReaderWrapper and the TestParquetFilterPredicate are 
> both testing the same thing, the behavior of the 
> ParquetFilterPredicateConverter.toFilterPredicate method. It doesn't make 
> sense to have tests for the same use case in different test classes, so moved 
> the test cases from the TestParquetRecordReaderWrapper to 
> TestParquetFilterPredicate.
> 
> 
> Thanks,
> 
> Marta Kuczora
> 
>



Re: Review Request 70256: HIVE-21480: Fixed flaky and broken test TestHiveMetaStore.testJDOPersistenceManagerCleanup

2019-03-22 Thread Peter Vary via Review Board


> On March 21, 2019, 10:18 a.m., Peter Vary wrote:
> > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
> > Lines 3153-3155 (patched)
> > 
> >
> > My concern here is that we are testing a different case with the patch.
> > 
> > Before, we tested that we open/use/close a client and do not have 
> > lingering objects. After the patch we test that closing the client will 
> > remove the 1 object - which says that the getAllDatabases will result in 
> > exactly 1 object.
> > 
> > How can there be lingering objects when we create a 
> > new client? Is there a way to get rid of the lingering object somehow, and 
> > then test the original use case?
> 
> Morio Ramdenbourg wrote:
> My knowledge on the PersistenceManager code isn't that great, but before 
> the new client is even created, the object count returned from 
> getJDOPersistenceManagerCacheSize() is 1. I believe this object comes from 
> when the HMS is initializing the database schema. There seems to be a 
> lingering object from that, at least when I run this test as a standalone 
> without the other tests in the class.
> 
> Morio Ramdenbourg wrote:
> Do you know of other ways I can clear / wait for the PMF cache to empty 
> itself?

I don't know either :(
Maybe this?
https://docs.oracle.com/cd/E13189_01/kodo/docs303/jdo-javadoc/javax/jdo/PersistenceManager.html#evictAll()
If that does not work, then maybe marking the objects for eviction and 
waiting for a while?


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70256/#review213878
---


On March 20, 2019, 6:43 p.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70256/
> ---
> 
> (Updated March 20, 2019, 6:43 p.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Karen Coppage, 
> Peter Vary, and Vihang Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This test was not correctly counting the number of
> objects in the PersistenceManager cache before and after 
> HiveMetaStoreClient.close(). The
> getJDOPersistenceManagerCacheSize() internal helper method did not use
> the updated fields present in the metastore classes, and was
> consistently returning -1. Additionally, there was a chance to cause
> flakiness since the object count before and after close() could
> differ depending on lingering objects from previous
> tests or setup.
> 
> Modified the helper method to use the new
> fields, and fixed the flakiness on this test.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
>  77e0c98265e7b561f2eb39536e3251dd92e9cab0 
> 
> 
> Diff: https://reviews.apache.org/r/70256/diff/1/
> 
> 
> Testing
> ---
> 
> Unit tests run
> 
> 
> Thanks,
> 
> Morio Ramdenbourg
> 
>



Re: Review Request 70256: HIVE-21480: Fixed flaky and broken test TestHiveMetaStore.testJDOPersistenceManagerCleanup

2019-03-21 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/70256/#review213878
---




standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
Lines 3153-3155 (patched)


My concern here is that we are testing a different case with the patch.

Before, we tested that we open/use/close a client and do not have lingering 
objects. After the patch we test that closing the client will remove the 1 
object - which says that the getAllDatabases will result in exactly 1 object.

How can there be lingering objects when we create a new 
client? Is there a way to get rid of the lingering object somehow, and then 
test the original use case?


- Peter Vary


On March 20, 2019, 6:43 p.m., Morio Ramdenbourg wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/70256/
> ---
> 
> (Updated March 20, 2019, 6:43 p.m.)
> 
> 
> Review request for hive, Adam Holley, Karthik Manamcheri, Karen Coppage, 
> Peter Vary, and Vihang Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> This test was not correctly counting the number of
> objects in the PersistenceManager cache before and after 
> HiveMetaStoreClient.close(). The
> getJDOPersistenceManagerCacheSize() internal helper method did not use
> the updated fields present in the metastore classes, and was
> consistently returning -1. Additionally, there was a chance to cause
> flakiness since the object count before and after close() could
> differ depending on lingering objects from previous
> tests or setup.
> 
> Modified the helper method to use the new
> fields, and fixed the flakiness on this test.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
>  77e0c98265e7b561f2eb39536e3251dd92e9cab0 
> 
> 
> Diff: https://reviews.apache.org/r/70256/diff/1/
> 
> 
> Testing
> ---
> 
> Unit tests run
> 
> 
> Thanks,
> 
> Morio Ramdenbourg
> 
>



Re: Review Request 69914: HIVE-21227: HIVE-20776 causes view access regression

2019-02-08 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69914/#review212667
---


Ship it!




Ship It!

- Peter Vary


On Feb. 8, 2019, 6:48 a.m., Na Li wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69914/
> ---
> 
> (Updated Feb. 8, 2019, 6:48 a.m.)
> 
> 
> Review request for hive and Vihang Karajgaonkar.
> 
> 
> Bugs: hive-21227
> https://issues.apache.org/jira/browse/hive-21227
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20776 introduces a change that causes regression for view access.
> 
> Before the change, a user with select access to a view that is derived from 
> a partitioned table could get all columns of that view.
> 
> With the change, that user cannot access that view.
> 
> The reason is as follows.
> 
> When a user accesses columns of a view, Hive needs to get the partitions of the 
> table that the view is derived from. The user name is the user who issues the 
> query to access the view.
> The change in HIVE-20776 checks if the user has access to a table before getting 
> its partitions. When the user only has access to a view, not to the 
> table itself, this change denies the user access to the view.
> The solution is to not filter on the table at the HMS client when getting 
> table partitions.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  a1826fa259d424c9f3d5a2f58a18f617355d586f 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestFilterHooks.java
>  6a0d0aa8840d9fc6799c16463a70ed6f0cc2c354 
> 
> 
> Diff: https://reviews.apache.org/r/69914/diff/2/
> 
> 
> Testing
> ---
> 
> TestGetPartitions and TestListPartitions pass
> 
> 
> Thanks,
> 
> Na Li
> 
>



Re: Review Request 69683: [HIVE-21071] Improve getInputSummary

2019-02-07 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69683/#review212622
---


Ship it!




Ship It!

- Peter Vary


On Feb. 6, 2019, 9:17 p.m., David Mollitor wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69683/
> ---
> 
> (Updated Feb. 6, 2019, 9:17 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve performance of method getInputSummary by changing data structures and 
> allowing multiple threads to do calculations.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 8937b43 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestGetInputSummary.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java 90eb45b 
> 
> 
> Diff: https://reviews.apache.org/r/69683/diff/4/
> 
> 
> Testing
> ---
> 
> Unit
> 
> 
> Thanks,
> 
> David Mollitor
> 
>



Re: Review Request 69642: HIVE-20977: Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-25 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69642/#review212343
---


Ship it!




Ship It!

- Peter Vary


On Jan. 3, 2019, 1:40 a.m., Karthik Manamcheri wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69642/
> ---
> 
> (Updated Jan. 3, 2019, 1:40 a.m.)
> 
> 
> Review request for hive, Adam Holley, Na Li, Morio Ramdenbourg, Naveen 
> Gangam, Peter Vary, Sergio Pena, and Vihang Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20977: Lazy evaluate the table object in PreReadTableEvent to improve 
> get_partition performance
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e7 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/events/PreReadTableEvent.java
>  beec72bc12 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/ThrowingSupplier.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
>  7429d18226 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
>  fe64a91b56 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java
>  4d7f7c1220 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java
>  a338bd4032 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/events/TestPreReadTableEvent.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69642/diff/4/
> 
> 
> Testing
> ---
> 
> Unit tests.
> Manual performance test with Cloudera BDR to notice improved backup 
> performance.
> 
> 
> Thanks,
> 
> Karthik Manamcheri
> 
>
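
The lazy evaluation named in the title of the quoted review request can be 
sketched as follows. This is a minimal illustration, assuming that the event 
receives a supplier and only loads the table on first access. The names 
ThrowingSupplierSketch and LazyTableEventSketch are hypothetical, a String 
stands in for the table object, and the real ThrowingSupplier and 
PreReadTableEvent in the patch may look different.

```java
// Hypothetical stand-in for a supplier that may throw a checked exception.
interface ThrowingSupplierSketch<T, E extends Exception> {
  T get() throws E;
}

// Hypothetical event that defers loading the table until it is requested.
final class LazyTableEventSketch<E extends Exception> {
  private final ThrowingSupplierSketch<String, E> tableLoader;
  private String table;    // cached result of the first load
  private boolean loaded;  // separate flag so a null result is cached too

  LazyTableEventSketch(ThrowingSupplierSketch<String, E> tableLoader) {
    // Constructing the event does not touch the loader.
    this.tableLoader = tableLoader;
  }

  // Loads the table on the first call and reuses the result afterwards.
  // Not thread-safe; a real implementation would need to consider that.
  String getTable() throws E {
    if (!loaded) {
      table = tableLoader.get();
      loaded = true;
    }
    return table;
  }
}
```

With this shape, a listener that never calls getTable() does not pay for the 
lookup, which is the kind of saving the review title refers to.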



Re: Review Request 69780: HIVE-21099 Do Not Print StackTraces to STDERR in ConditionalResolverMergeFiles

2019-01-21 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69780/#review212173
---


Ship it!




Ship It!

- Peter Vary


On Jan. 17, 2019, 10:22 a.m., MANI M wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69780/
> ---
> 
> (Updated Jan. 17, 2019, 10:22 a.m.)
> 
> 
> Review request for hive, Peter Vary and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-21099
> https://issues.apache.org/jira/browse/HIVE-21099
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-21099 Do Not Print StackTraces to STDERR in ConditionalResolverMergeFiles
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 
> 80f77b9f0c 
> 
> 
> Diff: https://reviews.apache.org/r/69780/diff/1/
> 
> 
> Testing
> ---
> 
> PRE-COMMIT testing done and the results are available in the JIRA
> https://issues.apache.org/jira/browse/HIVE-21099
> 
> 
> Thanks,
> 
> MANI M
> 
>
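
The general pattern behind "do not print stack traces to STDERR" can be 
sketched as below. This is only an illustration: java.util.logging is used 
as a self-contained stand-in for whatever logging facade the project uses, 
and StackTraceLoggingSketch and sizeOrZero are hypothetical names that do 
not come from ConditionalResolverMergeFiles.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of routing an exception to a logger.
final class StackTraceLoggingSketch {
  static final Logger LOG = Logger.getLogger("StackTraceLoggingSketch");

  // Runs the lookup and falls back to 0 when it fails.
  static long sizeOrZero(Callable<Long> sizeLookup) {
    try {
      return sizeLookup.call();
    } catch (Exception e) {
      // Instead of e.printStackTrace(), pass the exception to the logger so
      // the stack trace ends up wherever the logging configuration sends it.
      LOG.log(Level.WARNING, "Size lookup failed, falling back to 0", e);
      return 0L;
    }
  }
}
```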



Re: Review Request 69683: [HIVE-21071] Improve getInputSummary

2019-01-18 Thread Peter Vary via Review Board


> On Jan. 8, 2019, 10:04 a.m., Peter Vary wrote:
> > Thanks for the patch!
> > Two nits below.
> > Also a bit concerned about the size calculation - seems ok, but it would be 
> > good to have a few test cases which validate the content summary 
> > calculations (when every path is cached/only a few paths are cached/no path is 
> > cached), so we can be sure that further changes will not break the 
> > functionality.
> > 
> > What do you think?
> > 
> > Peter
> 
> David Mollitor wrote:
> Thank you Peter for the review. This functionality is unit tested 
> already. Do you have suggestions for additional unit tests?
> 
> 
> 
> https://github.com/apache/hive/blob/ae008b79b5d52ed6a38875b73025a505725828eb/ql/src/test/org/apache/hadoop/hive/ql/exec/TestUtilities.java

I missed the ones which test the full getInputSummary method, and found 
only the ones (testGetInputSummaryPool, testGetInputSummaryPoolAndFailure) 
using getInputSummaryWithPool.
Seeing those tests I feel much better. Maybe it is an edge case, but it might 
be good to add a test where we call getInputSummary twice: first with one 
set of paths (p1, p2) and then again with another set of paths (p1, p2, p3, p4), 
so we can check that the merge of the cached results and the newly fetched ones 
is working.
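
A rough sketch of the scenario I have in mind is below. It is only an 
illustration of the two-call test, not the Utilities code: 
InputSummaryCacheSketch, totalLength and the fetches counter are 
hypothetical names, and a plain map of path to length stands in for the real 
content summary cache.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical cache of per-path lengths; not the Hive implementation.
final class InputSummaryCacheSketch {
  private final Map<String, Long> cache = new HashMap<>();
  int fetches = 0; // how many paths had to be fetched instead of read from cache

  // Sums the lengths of all paths, fetching only the ones not cached yet.
  long totalLength(List<String> paths, ToLongFunction<String> fetcher) {
    long total = 0;
    for (String path : paths) {
      Long length = cache.get(path);
      if (length == null) {
        length = fetcher.applyAsLong(path);
        cache.put(path, length);
        fetches++;
      }
      total += length;
    }
    return total;
  }
}
```

A test along these lines would call totalLength first with (p1, p2) and then 
with (p1, p2, p3, p4), and check both the second total and that only p3 and 
p4 were fetched on the second call.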


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69683/#review211758
---


On Jan. 14, 2019, 3:36 p.m., David Mollitor wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69683/
> ---
> 
> (Updated Jan. 14, 2019, 3:36 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve performance of method getInputSummary by changing data structures and 
> allowing multiple threads to do calculations.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 2ff9ad3 
> 
> 
> Diff: https://reviews.apache.org/r/69683/diff/2/
> 
> 
> Testing
> ---
> 
> Unit
> 
> 
> Thanks,
> 
> David Mollitor
> 
>



Re: Review Request 69683: [HIVE-21071] Improve getInputSummary

2019-01-08 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69683/#review211758
---



Thanks for the patch!
Two nits below.
Also a bit concerned about the size calculation - seems ok, but it would be 
good to have a few test cases which validate the content summary calculations 
(when every path is cached/only a few paths are cached/no path is cached), so we 
can be sure that further changes will not break the functionality.

What do you think?

Peter


ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
Lines 2466-2467 (original), 2477-2478 (patched)


nit: This is just a formatting change. Please remove.



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
Line 2503 (original), 2514-2515 (patched)


nit: This is just a formatting change. Please remove.


- Peter Vary


On Jan. 7, 2019, 2:26 p.m., David Mollitor wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69683/
> ---
> 
> (Updated Jan. 7, 2019, 2:26 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Improve performance of method getInputSummary by changing data structures and 
> allowing multiple threads to do calculations.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java d0f6451 
> 
> 
> Diff: https://reviews.apache.org/r/69683/diff/1/
> 
> 
> Testing
> ---
> 
> Unit
> 
> 
> Thanks,
> 
> David Mollitor
> 
>



Re: Review Request 69633: HIVE-20159 Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin

2019-01-07 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69633/#review211722
---



Thanks for the fix!
Fix it and ship it!


ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java
Line 121 (original), 125 (patched)


Could you please remove the commented-out line?


- Peter Vary


On Jan. 5, 2019, 5:50 a.m., MANI M wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69633/
> ---
> 
> (Updated Jan. 5, 2019, 5:50 a.m.)
> 
> 
> Review request for hive, Peter Vary and Vihang Karajgaonkar.
> 
> 
> Bugs: HIVE-20159
> https://issues.apache.org/jira/browse/HIVE-20159
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Do Not Print StackTraces to STDERR in ConditionalResolverSkewJoin HIVE-20159
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverSkewJoin.java 
> 5d25007d77 
> 
> 
> Diff: https://reviews.apache.org/r/69633/diff/1/
> 
> 
> Testing
> ---
> 
> Patch Uploaded to JIRA HIVE-20159, PRE-COMMIT testing completed.
> https://issues.apache.org/jira/browse/HIVE-20159
> 
> 
> Thanks,
> 
> MANI M
> 
>



Re: Review Request 69642: HIVE-20977: Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-04 Thread Peter Vary via Review Board


> On Jan. 3, 2019, 1:42 a.m., Karthik Manamcheri wrote:
> > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java
> > Lines 374 (patched)
> > 
> >
> > The behavior of the getPartitions\* calls changed to be more in line with how 
> > the other getTable/getDatabase calls work.
> > 
> > Before this change, if you issued a getPartitionsByNames with an empty 
> > database, we threw an exception. After this change, we will return an empty 
> > list of partitions instead. This behavior is similar to what happens if you 
> > issue a getTablesByNames call (an empty list of tables is returned).
> 
> Peter Vary wrote:
> This change is also worrying. It is also an API change which might 
> cause backward incompatibility problems for customers expecting an empty list.
> We proposed this kind of change a while back, and the community consensus 
> was that we should not change even the Exception types that are thrown.
> 
> Karthik Manamcheri wrote:
> Actually I am fixing a regression bug. In the master branch (without my 
> change), if you remove the TransactionalValidationListener from the list of 
> pre-listeners, the GetPartitions API tests fail! The getPartitions API test 
> depends on the fact that there is a pre-event listener. Customers already expect 
> the API to throw an exception (or return an empty list) depending on if there 
> is a listener attached or not. So the API contract itself is that it can 
> throw an exception (OR return an empty list). This is how earlier versions of 
> Hive are. HIVE-12064 introduced a bug which changed the behavior of clients. 
> My change makes it so that it behaves similar to how earlier versions of Hive 
> behaved (before HIVE-12064).
> 
> We should also fix the tests so that they don't depend on the existence 
> (or non-existence) of listeners or plugins.
> 
> Basically what I am saying is that this API change is not new and is how 
> Hive 1.x/2.x behaved (before HIVE-12064). We wrote the unit tests around the 
> bug!

Bugs which exist for too long are prone to become features :D
I personally agree with you, but I would be more comfortable if we could find 1 
more committer to validate our view - maybe Alan Gates or Thejas Nair, who were 
the ones who actively participated in our "IMetaStoreClient and HMS Thrift API 
exception handling" discussion. You might want to ping them.

Thanks,
Peter


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69642/#review211625
---


On Jan. 3, 2019, 1:40 a.m., Karthik Manamcheri wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69642/
> ---
> 
> (Updated Jan. 3, 2019, 1:40 a.m.)
> 
> 
> Review request for hive, Adam Holley, Na Li, Morio Ramdenbourg, Naveen 
> Gangam, Peter Vary, Sergio Pena, and Vihang Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20977: Lazy evaluate the table object in PreReadTableEvent to improve 
> get_partition performance
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e7 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/events/PreReadTableEvent.java
>  beec72bc12 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/ThrowingSupplier.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
>  7429d18226 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
>  fe64a91b56 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java
>  4d7f7c1220 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java
>  a338bd4032 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/events/TestPreReadTableEvent.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69642/diff/4/
> 
> 
> Testing
> ---
> 
> Unit tests.
> Manual performance test with Cloudera BDR to notice improved backup 
> performance.
> 
> 
> Thanks,
> 
> Karthik Manamcheri
> 
>



Re: Review Request 69642: HIVE-20977: Lazy evaluate the table object in PreReadTableEvent to improve get_partition performance

2019-01-03 Thread Peter Vary via Review Board


> On Jan. 3, 2019, 1:42 a.m., Karthik Manamcheri wrote:
> > standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java
> > Lines 374 (patched)
> > 
> >
> > The behavior of the getPartitions* calls changed to be more in line with how 
> > the other getTable/getDatabase calls work.
> > 
> > Before this change, if you issued a getPartitionsByNames with an empty 
> > database, we threw an exception. After this change, we will return an empty 
> > list of partitions instead. This behavior is similar to what happens if you 
> > issue a getTablesByNames call (an empty list of tables is returned).

This change is also worrying. This is also an API change which might cause 
backward incompatibility problems with customers expecting an empty list.
We proposed this kind of change a while back, and the community consensus was 
that we should not change even the Exception types that are thrown.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69642/#review211625
---


On Jan. 3, 2019, 1:40 a.m., Karthik Manamcheri wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69642/
> ---
> 
> (Updated Jan. 3, 2019, 1:40 a.m.)
> 
> 
> Review request for hive, Adam Holley, Na Li, Morio Ramdenbourg, Naveen 
> Gangam, Peter Vary, Sergio Pena, and Vihang Karajgaonkar.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20977: Lazy evaluate the table object in PreReadTableEvent to improve 
> get_partition performance
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
>  a9398ae1e7 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/events/PreReadTableEvent.java
>  beec72bc12 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/ThrowingSupplier.java
>  PRE-CREATION 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
>  7429d18226 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestMetaStoreEventListener.java
>  fe64a91b56 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestGetPartitions.java
>  4d7f7c1220 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/client/TestListPartitions.java
>  a338bd4032 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/events/TestPreReadTableEvent.java
>  PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69642/diff/4/
> 
> 
> Testing
> ---
> 
> Unit tests.
> Manual performance test with Cloudera BDR to notice improved backup 
> performance.
> 
> 
> Thanks,
> 
> Karthik Manamcheri
> 
>



Re: Review Request 69410: HIVE-20330: HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs

2018-11-23 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69410/#review210823
---


Ship it!




Ship It!

- Peter Vary


On Nov. 20, 2018, 12:53 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69410/
> ---
> 
> (Updated Nov. 20, 2018, 12:53 p.m.)
> 
> 
> Review request for hive, Nandor Kollar and Peter Vary.
> 
> 
> Bugs: HIVE-20330
> https://issues.apache.org/jira/browse/HIVE-20330
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The change in this patch is that we're not just serializing and putting one 
> InputJobInfo into JobConf, but rather always append to a list (or create it 
> on the first occurrence) of InputJobInfo instances in it.
> This ensures that if multiple tables serve as inputs in a job, Pig can 
> retrieve information for each of the tables, not just the last one added.
> 
> I've also discovered a bug in InputJobInfo.writeObject() where the 
> ObjectOutputStream was closed by mistake after writing partition information 
> in a compressed manner. Closing the compressed writer inevitably closed the 
> OOS on the context and prevented any other objects from being written into OOS - I 
> had to fix that because it prevented serializing InputJobInfo instances 
> inside a list.
> 
> 
> Diffs
> -
> 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
> 8e72a1275a5cdcc2d778080fff6bb82198395f5f 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  195eaa367933990e3ef0ef879f34049c65822aee 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseInputFormat.java
>  8d7a8f9df9412105ec7d77fad9af0d7dd18f4323 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatInputFormat.java
>  ad6f3eb9f93338023863c6239d6af0449b20ff9c 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
>  364382d9ccf6eb9fc29689b0eb5f973f422051b4 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InputJobInfo.java
>  ac1dd54be821d32aa008d41514df05a41f16223c 
>   
> hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHCatUtil.java 
> 91aa4fa2693e0b0bd65c1667210af340619f552d 
>   
> hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/HCatLoader.java
>  c3bde2d2a3cbd09fb0b1ed758bf4f2b1041a23cb 
>   
> hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/AbstractHCatLoaderTest.java
>  58981f88ef6abfbf7a4b7ffc3116c53d47e86fde 
> 
> 
> Diff: https://reviews.apache.org/r/69410/diff/1/
> 
> 
> Testing
> ---
> 
> Added (true) unit tests to verify my method of adding/retrieving InputJobInfo 
> instances to/from config instances.
> Added (integration-like) unit tests to mock Pig calling HCatLoader for 
> multiple input tables, and checking the reported input sizes.
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Review Request 69432: HIVE-20964 Create a test that checks the level of the parallel compilation

2018-11-22 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69432/
---

Review request for hive, Denys Kuzmenko, Marta Kuczora, and Adam Szita.


Bugs: HIVE-20964
https://issues.apache.org/jira/browse/HIVE-20964


Repository: hive-git


Description
---

* Created 2 query types in the TestCompileLock mock driver. The original 
SHORT_QUERY finishes in 0.5s as before, but the new LONG_QUERY finishes 
only after 5s.
* Using the new 5s query, I have created a new test where the compile quota 
is 4 and the number of parallel requests is 10, so the test expects that 6 
queries will fail with a timeout.
* Added a new verifyThatTimedOutCompileOpsCount method to validate the number 
of timed-out queries.
* The other changes just push down the query string so the 
compileAndRespond method can decide which query to run.
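
The arithmetic behind the expected 6 timeouts can be sketched without Hive. The following is only an illustration, assuming a plain java.util.concurrent.Semaphore stands in for the compile quota; CompileQuotaSketch and countTimedOut are hypothetical names and this is not the TestCompileLock code. Requests that get a slot keep it until every request has been resolved, which plays the role of the LONG_QUERY.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the scenario; not the real TestCompileLock.
class CompileQuotaSketch {

  /** Returns how many of the parallel requests timed out waiting for a slot. */
  static int countTimedOut(int quota, int requests, long timeoutMs) {
    Semaphore compileSlots = new Semaphore(quota);
    CountDownLatch resolved = new CountDownLatch(requests);
    AtomicInteger timedOut = new AtomicInteger();
    ExecutorService pool = Executors.newFixedThreadPool(requests);
    try {
      for (int i = 0; i < requests; i++) {
        pool.execute(() -> {
          boolean acquired = false;
          try {
            acquired = compileSlots.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
            if (!acquired) {
              timedOut.incrementAndGet();
            }
            resolved.countDown();
            if (acquired) {
              // "Long query": keep the slot until every request is resolved.
              resolved.await();
            }
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          } finally {
            if (acquired) {
              compileSlots.release();
            }
          }
        });
      }
      pool.shutdown();
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdownNow();
    }
    return timedOut.get();
  }

  public static void main(String[] args) {
    System.out.println("timed out: " + countTimedOut(4, 10, 200));
  }
}
```

Because no slot is released before all requests are resolved, the count is requests minus quota whenever requests exceed the quota, so a quota of 4 with 10 requests gives 6.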


Diffs
-

  ql/src/test/org/apache/hadoop/hive/ql/TestCompileLock.java 8dc05ff480 


Diff: https://reviews.apache.org/r/69432/diff/1/


Testing
---

Ran the new test, and all the old tests in TestCompileLock.


Thanks,

Peter Vary



Re: Review Request 69410: HIVE-20330: HCatLoader cannot handle multiple InputJobInfo objects for a job with multiple inputs

2018-11-22 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69410/#review210792
---



My only concern is that some other components might use the HCAT_KEY_JOB_INFO 
property values as well. Was this a public property key?

Otherwise nicely done!

- Peter Vary


On Nov. 20, 2018, 12:53 p.m., Adam Szita wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69410/
> ---
> 
> (Updated Nov. 20, 2018, 12:53 p.m.)
> 
> 
> Review request for hive, Nandor Kollar and Peter Vary.
> 
> 
> Bugs: HIVE-20330
> https://issues.apache.org/jira/browse/HIVE-20330
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The change in this patch is that we're not just serializing and putting one 
> InputJobInfo into JobConf, but rather always append to a list (or create it 
> on the first occurrence) of InputJobInfo instances in it.
> This ensures that if multiple tables serve as inputs in a job, Pig can 
> retrieve information for each of the tables, not just the last one added.
> 
> I've also discovered a bug in InputJobInfo.writeObject() where the 
> ObjectOutputStream was closed by mistake after writing partition information 
> in a compressed manner. Closing the compressed writer inevitably closed the 
> OOS on the context and prevented any other objects from being written into OOS - I 
> had to fix that because it prevented serializing InputJobInfo instances 
> inside a list.
> 
> 
> Diffs
> -
> 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/common/HCatUtil.java 
> 8e72a1275a5cdcc2d778080fff6bb82198395f5f 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
>  195eaa367933990e3ef0ef879f34049c65822aee 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatBaseInputFormat.java
>  8d7a8f9df9412105ec7d77fad9af0d7dd18f4323 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatInputFormat.java
>  ad6f3eb9f93338023863c6239d6af0449b20ff9c 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InitializeInput.java
>  364382d9ccf6eb9fc29689b0eb5f973f422051b4 
>   
> hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/InputJobInfo.java
>  ac1dd54be821d32aa008d41514df05a41f16223c 
>   
> hcatalog/core/src/test/java/org/apache/hive/hcatalog/common/TestHCatUtil.java 
> 91aa4fa2693e0b0bd65c1667210af340619f552d 
>   
> hcatalog/hcatalog-pig-adapter/src/main/java/org/apache/hive/hcatalog/pig/HCatLoader.java
>  c3bde2d2a3cbd09fb0b1ed758bf4f2b1041a23cb 
>   
> hcatalog/hcatalog-pig-adapter/src/test/java/org/apache/hive/hcatalog/pig/AbstractHCatLoaderTest.java
>  58981f88ef6abfbf7a4b7ffc3116c53d47e86fde 
> 
> 
> Diff: https://reviews.apache.org/r/69410/diff/1/
> 
> 
> Testing
> ---
> 
> Added (true) unit tests to verify my method of adding/retrieving InputJobInfo 
> instances to/from config instances.
> Added (integration-like) unit tests to mock Pig calling HCatLoader for 
> multiple input tables, and checking the reported input sizes.
> 
> 
> Thanks,
> 
> Adam Szita
> 
>



Re: Review Request 69341: HIVE-20891: Call alter_partition in batch when dynamically loading partitions

2018-11-21 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69341/#review210752
---



Mostly just questions about logging


ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java
Lines 139 (patched)


nit: Might want to remove this extra line



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2087 (patched)


Is this really worth a change?
I usually use this only if generating the log message is costly. 
LOG.debug will definitely start with the same check...
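
To illustrate the point about the guard, here is a minimal sketch. It assumes a tiny hypothetical logger (DebugGuardSketch.debug) instead of the logging library Hive actually uses, so only the call pattern carries over. The explicit check pays off only when building the argument is itself expensive.

```java
// Hypothetical minimal logger, only to make the call pattern visible.
class DebugGuardSketch {
  static boolean debugEnabled = false;
  static int expensiveCalls = 0;

  static void debug(String format, Object arg) {
    if (debugEnabled) { // the logger performs this check itself
      System.out.println(format.replace("{}", String.valueOf(arg)));
    }
  }

  static String describePartitions() {
    expensiveCalls++; // pretend this walks a large structure
    return "p=1,p=2,p=3";
  }

  static void cheapMessage(String table) {
    // The argument already exists, so an explicit guard adds nothing.
    debug("Loading table {}", table);
  }

  static void costlyMessage() {
    // The guard skips building the argument when debug is off.
    if (debugEnabled) {
      debug("Partitions: {}", describePartitions());
    }
  }

  public static void main(String[] args) {
    cheapMessage("t1");
    costlyMessage();
    System.out.println("expensive calls: " + expensiveCalls);
  }
}
```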



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2161 (patched)


Is this really worth a change?
I usually use this only if generating the log message is costly. 
LOG.debug will definitely start with the same check...



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2211 (patched)


Is this really worth a change?
I usually use this only if generating the log message is costly. 
LOG.debug will definitely start with the same check...



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2345 (patched)


Is the output readable with multiple partitions?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2659 (patched)


Is this really worth a change?
I usually use this only if generating the log message is costly. 
LOG.debug will definitely start with the same check...



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2720 (patched)


Is this really worth a change?
I usually use this only if generating the log message is costly. 
LOG.debug will definitely start with the same check...


- Peter Vary


On Nov. 15, 2018, 8:52 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69341/
> ---
> 
> (Updated Nov. 15, 2018, 8:52 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20891: Call alter_partition in batch when dynamically loading partitions
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
> e8f362357537e73502f743a9df189dec9be2da5d 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> e185bf49d42da9d1497643c20bbd71edaf071bf1 
> 
> 
> Diff: https://reviews.apache.org/r/69341/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69254: HIVE-20818: Views created with a WHERE subquery will regard views referenced in the subquery as direct input

2018-11-05 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69254/#review210334
---



Just one question.
Thanks,
Peter


ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java
Lines 3321 (patched)


Does the problem affect only CBO, or RBO as well?
What happens when CBO is off?



ql/src/test/org/apache/hadoop/hive/ql/plan/TestViewEntity.java
Lines 198 (patched)


Please remove the extra spaces.


- Peter Vary


On Nov. 5, 2018, 3:14 p.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69254/
> ---
> 
> (Updated Nov. 5, 2018, 3:14 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Bugs: HIVE-20818
> https://issues.apache.org/jira/browse/HIVE-20818
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> If Hive is configured with an authorization hook like Sentry, and a view is 
> created with a WHERE clause referencing a different view' user has no access 
> to, user cannot access the view as view' is considered direct input.
> WHERE IN and WHERE EXISTS cause the same issue.
> Cascading views created with no WHERE clauses (i.e. with simple SELECTs and 
> FROM clauses) work fine.
> 
> See Jira for example
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/CalcitePlanner.java ab63ce2bc3 
>   ql/src/test/org/apache/hadoop/hive/ql/plan/TestViewEntity.java 6ad38b8467 
> 
> 
> Diff: https://reviews.apache.org/r/69254/diff/1/
> 
> 
> Testing
> ---
> 
> Added unit test
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 69167: HIVE-20796: jdbc URL can contain sensitive information that should not be logged

2018-10-25 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69167/#review210036
---


Ship it!




Ship It!

- Peter Vary


On Oct. 25, 2018, 1:36 p.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69167/
> ---
> 
> (Updated Oct. 25, 2018, 1:36 p.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20796: jdbc URL can contain sensitive information that should not be 
> logged
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java
>  9c158040497cd3d2762620ce35e2b46bb6d5fffe 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/utils/MetaStoreServerUtils.java
>  f3b38665676391fec9b85eb9a405c14632340dc6 
>   
> standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/utils/TestMetaStoreServerUtils.java
>  f4bdd734dc4e731dda01e6031a4115cde5571baf 
> 
> 
> Diff: https://reviews.apache.org/r/69167/diff/1/
> 
> 
> Testing
> ---
> 
> New unit test created.
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 69155: HIVE-20760: Reducing memory overhead due to multiple HiveConfs

2018-10-25 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69155/#review210029
---



Thanks for the patch Barna!
Indeed this will be a sizeable memory saving!


common/src/java/org/apache/hadoop/hive/common/HiveConfProperties.java
Lines 56 (patched)


Why are we using an interner here?



common/src/java/org/apache/hadoop/hive/common/HiveConfProperties.java
Lines 84 (patched)


Can we use interned.getProperty(key, default)?
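
For reference, the two-argument lookup mentioned above behaves as in the sketch below. It assumes that `interned` is a java.util.Properties instance, which the comment suggests but this thread does not show; the class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper; shows only the standard Properties call.
class GetPropertyDefaultSketch {

  static String lookup(Properties props, String key, String fallback) {
    // Returns the stored value, or fallback when the key is not found.
    return props.getProperty(key, fallback);
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty("some.key", "stored");
    System.out.println(lookup(props, "some.key", "fallback"));
    System.out.println(lookup(props, "missing.key", "fallback"));
  }
}
```

This replaces a manual null check after a single-argument getProperty call.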



common/src/java/org/apache/hadoop/hive/common/HiveConfProperties.java
Lines 198 (patched)


I think there are situations where this is not true: if we overwrite 
something, will the size not be smaller?


- Peter Vary


On Oct. 25, 2018, 8:31 a.m., Barnabas Maidics wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69155/
> ---
> 
> (Updated Oct. 25, 2018, 8:31 a.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> The issue is that every Hive task has to load its own version of HiveConf. 
> When running with a large number of cores per executor (HoS), there is a 
> significant (~10%) amount of memory wasted due to this duplication. 
> See more: https://issues.apache.org/jira/browse/HIVE-20760
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/HiveConfProperties.java 
> PRE-CREATION 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 07d5205bed 
>   common/src/test/org/apache/hadoop/hive/conf/TestHiveConfProperties.java 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/69155/diff/1/
> 
> 
> Testing
> ---
> 
> Created unit tests for the new Properties implementation.
> Tested multiple queries output.
> 
> 
> Thanks,
> 
> Barnabas Maidics
> 
>



Re: Review Request 68975: HIVE-20661: Dynamic partitions loading calls add partition for every partition 1-by-1

2018-10-18 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68975/#review209735
---



Thanks Laci!
1 serious question
1 mild one
and several annoying nits :)
Thanks,
Peter


ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1921 (patched)


nit: one extra space



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Line 1876 (original)


This perf logging is removed... Let's think through whether we need to add 
some in other places instead, considering the new calling structure.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2171 (patched)


What happens when only 1 of the multiple partitions already exists?

Also, do we need SynchronizedMSC?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2570 (patched)


Do we need to call this every time in the for loop? I kinda remember that 
we allow partitions for a single table only... Or I might be mistaken, but 
it still might be a good idea not to generate this every time...



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2687 (patched)


I am not really a big fan of printing a stacktrace and rethrowing an error, 
unless I am quite sure that the rethrown exception is not printed later. 
Otherwise this could be quite confusing.


- Peter Vary


On Oct. 18, 2018, 7:01 a.m., Laszlo Pinter wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68975/
> ---
> 
> (Updated Oct. 18, 2018, 7:01 a.m.)
> 
> 
> Review request for hive and Peter Vary.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> HIVE-20661: Dynamic partitions loading calls add partition for every 
> partition 1-by-1
> 
> 
> Diffs
> -
> 
>   
> ql/src/java/org/apache/hadoop/hive/metastore/SynchronizedMetaStoreClient.java 
> 0ab77e84c6222d35bcec9ce95ed02014911ef144 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 
> 4de038913a5c9a2c199f71702b8f70ca84d0856b 
>   
> ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
>  dd23d7db3e70c9540e48c42eb7b9a33ed775cea6 
>   
> standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
>  aba63f050b5b98a2aeeb0df6ff2de5e6e06761f2 
> 
> 
> Diff: https://reviews.apache.org/r/68975/diff/5/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Laszlo Pinter
> 
>



Re: Review Request 68975: HIVE-20661: Dynamic partitions loading calls add partition for every partition 1-by-1

2018-10-11 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68975/#review209443
---



Thanks Laszlo!
This is a big patch indeed.
Comments below.
Could you check that the test cases are covering all possible scenarios?
Thanks,
Peter


ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1896 (patched)


What does this method do?
Moving files and updating partition data if the partition already exists?
Javadoc would be good anyway.
Follow-up idea: update partition data with alter_partitions (multiple 
updates with 1 HMS call)?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Line 1909 (original), 1939 (patched)


nit: maybe do not reformat these lines if they are not needed



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Line 1939 (original), 1961 (patched)


nit: maybe do not reformat these lines if they are not needed



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Line 1948 (original), 1970 (patched)


nit: maybe do not reformat these lines if they are not needed



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Line 1966 (original), 1988 (patched)


nit: maybe do not reformat these lines if they are not needed



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1970-1971 (original), 1992-1993 (patched)


nit: maybe do not reformat these lines if they are not needed



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 1978-1979 (original), 2000-2001 (patched)


nit: maybe do not reformat these lines if they are not needed



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2102-2103 (patched)


Concerned about this.
When we call Hive.loadPartition we call 2 methods there:
- loadPartitionInternal - 1 snapshot
- addPartitionToMetastore - 1 snapshot
Are we sure that these calls:
- Are lightweight - so we can happily call them twice
- Return the same value on both occasions even if some other transaction is 
finished during this time?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2080-2082 (original), 2126-2128 (patched)


Maybe surround this with isDebugEnabled?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2083-2084 (original), 2129-2130 (patched)


We do not use batching for adding partitions?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2492-2506 (patched)


Do we need this? Can't we use equals method of maps to compare instead?
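
As a reminder of what the standard comparison gives, here is a small sketch. It assumes the partition specs are Map<String, String> instances with exact-match semantics; if the method in the patch does something extra (for example case-insensitive matching), Map.equals would not be a drop-in replacement. MapEqualsSketch and its helpers are hypothetical names.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; demonstrates only the java.util.Map.equals contract.
class MapEqualsSketch {

  /** Builds an insertion-ordered map from alternating key/value arguments. */
  static Map<String, String> spec(String... keyValues) {
    Map<String, String> result = new LinkedHashMap<>();
    for (int i = 0; i + 1 < keyValues.length; i += 2) {
      result.put(keyValues[i], keyValues[i + 1]);
    }
    return result;
  }

  static boolean sameSpec(Map<String, String> left, Map<String, String> right) {
    // Equal when both maps hold the same key/value pairs,
    // regardless of insertion order or Map implementation.
    return left.equals(right);
  }

  public static void main(String[] args) {
    Map<String, String> ordered = spec("ds", "2018-10-11", "hr", "12");
    Map<String, String> hashed = new HashMap<>(spec("hr", "12", "ds", "2018-10-11"));
    System.out.println(sameSpec(ordered, hashed));
    System.out.println(sameSpec(ordered, spec("ds", "2018-10-11", "hr", "13")));
  }
}
```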



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2511 (patched)


Note: We still might fail with OOM if too many new partitions are there. We 
store in memory all of the new and old partition specifications. Was this the 
same before?



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2604-2607 (patched)


Shouldn't we do this when uploading the files?
I think the original intent was to show the progress of the loading of the 
partitions. We might want to keep this functionality.



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Lines 2491-2492 (original), 2615 (patched)


nit: Please-please-please no unnecessary formatting changes... :)



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Line 2496 (original), 2619 (patched)


We need the PerfLogEnd for LOAD_DYNAMIC_PARTITIONS



ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java
Line 2500 (original), 2621 (patched)


Why do we swallow the original exception? We should add it as the root cause.
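
A minimal sketch of keeping the root cause when wrapping. LoadException is a hypothetical stand-in for whatever exception type the patch throws, and the failure is simulated.

```java
// Hypothetical types; only the cause-chaining pattern is the point.
class ExceptionCauseSketch {

  static class LoadException extends Exception {
    LoadException(String message, Throwable cause) {
      super(message, cause);
    }
  }

  static void loadPartitions() throws LoadException {
    try {
      throw new IllegalStateException("simulated metastore failure");
    } catch (IllegalStateException e) {
      // Pass e along so its message and stack trace are not lost.
      throw new LoadException("Exception when loading partitions", e);
    }
  }

  /** Returns the cause carried by the wrapper, or null if nothing failed. */
  static Throwable causeOfFailure() {
    try {
      loadPartitions();
      return null;
    } catch (LoadException e) {
      return e.getCause();
    }
  }

  public static void main(String[] args) {
    System.out.println("cause: " + causeOfFailure());
  }
}
```

With the cause attached, a single log statement at the top level shows both stack traces.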



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 910 (patched)


Do we need this as public?



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 952 (patched)


Do we need this as public?



ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java
Lines 1015 (patched)

Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-27 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review209081
---



Thanks Denys,
I like this new version.
My last comments are below.
What do you think, is it worth creating a new version of the patch?
Thanks,
Peter


ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java
Lines 44 (patched)


Would it be a good idea to remove the public constructor? We are using a 
factory to create CompileLock, so we might want to emphasize that.



ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java
Lines 64 (patched)


We do not use this anywhere - we might want to consider removing this 
altogether and keeping only the one without parameters?



ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java
Lines 110 (patched)


Can this cause problems if called after a failed tryAcquire? Since this 
method is not used anywhere outside this class, it might be a good idea to 
merge with close.
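
One way to make the release safe after a failed tryAcquire is to remember whether the lock is actually held and fold the release into close. The sketch below assumes a java.util.concurrent.Semaphore underneath, which may differ from the patch; with a Semaphore, an unguarded release() after a failed acquire would silently add a permit. GuardedLockSketch is a hypothetical name, not the CompileLock from the patch.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical lock wrapper; the real CompileLock is not shown in this thread.
class GuardedLockSketch implements AutoCloseable {
  private final Semaphore permits;
  private boolean held;

  GuardedLockSketch(Semaphore permits) {
    this.permits = permits;
  }

  boolean tryAcquire(long timeoutMs) {
    if (held) {
      return true; // already holding a permit, do not take a second one
    }
    try {
      held = permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      held = false;
    }
    return held;
  }

  @Override
  public void close() {
    // Release only what was acquired; a no-op after a failed tryAcquire
    // and on repeated calls.
    if (held) {
      held = false;
      permits.release();
    }
  }

  public static void main(String[] args) {
    Semaphore quota = new Semaphore(1);
    GuardedLockSketch first = new GuardedLockSketch(quota);
    GuardedLockSketch second = new GuardedLockSketch(quota);
    System.out.println(first.tryAcquire(10));
    System.out.println(second.tryAcquire(10));
    second.close();
    first.close();
    System.out.println(quota.availablePermits());
  }
}
```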


- Peter Vary


On Sept. 26, 2018, 1:08 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 26, 2018, 1:08 p.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLock.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/lock/CompileLockFactory.java 
> PRE-CREATION 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/7/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/335b0f4b-ea94-41d4-881a-ec8bb870a376__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/a92b6da2-eeba-46ee-9409-162653826172__HIVE-20535.14.patch
> HIVE-20535.14.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/25/9db4cf76-9188-48fb-bd3d-5b28e43a791b__HIVE-20535.14.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68828: HIVE-20601 : EnvironmentContext null in ALTER_PARTITION event in DbNotificationListener

2018-09-27 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68828/#review209069
---


Ship it!




Ship It!

- Peter Vary


On Sept. 24, 2018, 8:42 p.m., Bharathkrishna Guruvayoor Murali wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68828/
> ---
> 
> (Updated Sept. 24, 2018, 8:42 p.m.)
> 
> 
> Review request for hive and Alexander Kolbasov.
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> It will be useful to have the environmentContext passed to 
> DbNotificationListener in this case, to know if the alter happened due to a 
> stat change.
> 
> 
> Diffs
> -
> 
>   
> standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
>  f52ff91a8f2e7710801dcadc4a83ce454992a66a 
> 
> 
> Diff: https://reviews.apache.org/r/68828/diff/1/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Bharathkrishna Guruvayoor Murali
> 
>



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-24 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review208968
---



Hi Denys,

Could you please think a little about separating the Manager/Factory and the 
tryAcquire mess?

Incomplete thoughts, but I had to run

Thanks, and sorry :(
Peter


ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java
Lines 130 (patched)


nit: I prefer declaring static final variables at the beginning of the 
class, or at the first use. Do not create a new patch just for this, but if 
you have to do a new one, please move the declaration up to around line 51.



ql/src/java/org/apache/hadoop/hive/ql/Driver.java
Line 1854 (original), 1849-1850 (patched)


This still makes me itch...
I think we should separate the Manager / Factory and the actual lock object.
I would prefer the following:
- CompileLockManager should create the lock object
- Use the lock object as Zoltan suggested (try-with-resources)
- If we decide to keep tryAcquire, can we do it as a wrapper around the 
tryLock method?
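
To make the three points above concrete, here is a minimal sketch of the split, not the patch itself. The class names CompileLockSketch and CompileLockManagerSketch, the Semaphore-based pool, and the method shapes are all assumptions made for illustration; the real CompileLockManager may look quite different.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical lock handle: closing it releases the permit at most once. */
final class CompileLockSketch implements AutoCloseable {
  private final Semaphore permits;
  // A handle is meant to be used by a single thread, so a plain field is enough.
  private boolean held;

  CompileLockSketch(Semaphore permits) {
    this.permits = permits;
  }

  /** Core primitive: try to take one permit within the timeout. */
  boolean tryLock(long timeout, TimeUnit unit) throws InterruptedException {
    if (!held) {
      held = permits.tryAcquire(timeout, unit);
    }
    return held;
  }

  /** Thin wrapper around tryLock: reports interruption as "not acquired". */
  boolean tryAcquire(long timeoutSeconds) {
    try {
      return tryLock(timeoutSeconds, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the interrupt flag set
      return false;
    }
  }

  /** Releases the permit only if this handle actually holds one. */
  @Override
  public void close() {
    if (held) {
      held = false;
      permits.release();
    }
  }
}

/** Hypothetical manager/factory: owns the pool size and creates lock objects. */
final class CompileLockManagerSketch {
  private final Semaphore permits;

  CompileLockManagerSketch(int poolSize) {
    this.permits = new Semaphore(poolSize, true);
  }

  CompileLockSketch newLock() {
    return new CompileLockSketch(permits);
  }

  int available() {
    return permits.availablePermits();
  }
}

final class CompileLockSketchDemo {
  public static void main(String[] args) {
    CompileLockManagerSketch manager = new CompileLockManagerSketch(1);
    try (CompileLockSketch lock = manager.newLock()) {
      if (lock.tryAcquire(0)) {
        System.out.println("compiling under the lock");
      }
    } // the permit is returned here even if compilation throws
  }
}
```

With this shape the manager only knows about the pool size, the caller only sees the lock object, and a failed acquire can never release a permit it does not own.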


- Peter Vary


On Sept. 19, 2018, 9:37 a.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 19, 2018, 9:37 a.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 8c39de3e77 
>   ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 737debd2ad 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/5/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 68683: Add new configuration to set the size of the global compile lock

2018-09-18 Thread Peter Vary via Review Board


> On Sept. 17, 2018, 9:15 a.m., Zoltan Haindrich wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java
> > Line 507 (original), 666 (patched)
> > 
> >
> > please don't make this method more visible; use compile("sel") or 
> > something...it should work
> 
> denys kuzmenko wrote:
> it's impossible to mock and test compile lock behaviour. Entry point is 
> Driver.compileAndRespond("query"). I do not want to use PowerMock. Actually I 
> tried and faced many issues with hadoop classes.

What about @VisibleForTesting annotation? It could show the intention at 
least...
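
A rough sketch of what that could look like. The annotation below is a local stand-in so the example compiles on its own (in Hive one would presumably import Guava's com.google.common.annotations.VisibleForTesting instead), and DriverSketch with its methods is hypothetical, not the real Driver.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/** Local stand-in for a "visible for testing" marker; purely documentation. */
@Retention(RetentionPolicy.SOURCE)
@Target({ElementType.METHOD, ElementType.FIELD, ElementType.CONSTRUCTOR, ElementType.TYPE})
@interface VisibleForTestingSketch {}

/** Hypothetical driver, only to show where the marker goes. */
final class DriverSketch {
  private int compileCount;

  /** Public entry point that production callers use. */
  public int compileAndRespond(String query) {
    return compileInternal(query);
  }

  /** Package-private instead of private only so that a test can call it directly. */
  @VisibleForTestingSketch
  int compileInternal(String query) {
    compileCount++;
    // Return code 0 for a non-blank query, 1 otherwise (made up for the sketch).
    return query.trim().isEmpty() ? 1 : 0;
  }

  @VisibleForTestingSketch
  int getCompileCount() {
    return compileCount;
  }
}
```

The annotation does not restrict access by itself; it only records why the method is wider than private, so reviewers and static-analysis tools can flag production callers.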


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68683/#review208625
---


On Sept. 17, 2018, 5:55 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68683/
> ---
> 
> (Updated Sept. 17, 2018, 5:55 p.m.)
> 
> 
> Review request for hive, Zoltan Haindrich, Zoltan Haindrich, Naveen Gangam, 
> and Peter Vary.
> 
> 
> Bugs: HIVE-20535
> https://issues.apache.org/jira/browse/HIVE-20535
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When removing the compile lock, it is quite risky to remove it entirely.
> 
> It would be good to provide a pool size for the concurrent compilation, so 
> the administrator can limit the load
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3fb8e76 
>   ql/src/java/org/apache/hadoop/hive/ql/CompileLockManager.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dad2035 
>   ql/src/test/org/apache/hadoop/hive/ql/CompileLockTest.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68683/diff/4/
> 
> 
> Testing
> ---
> 
> Added CompileLockTest
> 
> 
> File Attachments
> 
> 
> HIVE-20535.1.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/13/41f5a84a-70e5-4882-99c1-1cf98c4364e4__HIVE-20535.1.patch
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>



Re: Review Request 61663: WebUI query plan graphs

2018-09-14 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61663/#review208622
---



Fix it then ship it!

Thanks for the revitalization of this change!


ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java
Line 508 (original), 509 (patched)


Might want to use LOG.error(String, Throwable)
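
For illustration, the difference in a self-contained form. This sketch uses java.util.logging as a stand-in so it runs without extra jars; with an SLF4J logger the equivalent call would be LOG.error(String, Throwable). The class and method names are made up for the example.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class ThrowableLoggingSketch {
  static final Logger LOG = Logger.getLogger("ThrowableLoggingSketch");

  /** Lossy: only the message text survives, the stack trace is gone. */
  static void logLossy(Exception e) {
    LOG.severe("Job failed: " + e.getMessage());
  }

  /** Preferred: the throwable travels with the log record, stack trace included. */
  static void logWithCause(Exception e) {
    LOG.log(Level.SEVERE, "Job failed", e);
  }

  public static void main(String[] args) {
    Exception e = new IllegalStateException("boom");
    logLossy(e);
    logWithCause(e);
  }
}
```

Passing the exception as its own argument lets the logging backend print the full stack trace and any chained causes, which string concatenation throws away.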



common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
Lines 3099 (patched)


After committing the change, please do not forget to update the 
configuration Wiki. :)


- Peter Vary


On Sept. 7, 2018, 3:24 p.m., Karen Coppage wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61663/
> ---
> 
> (Updated Sept. 7, 2018, 3:24 p.m.)
> 
> 
> Review request for hive, Peter Vary and Xuefu Zhang.
> 
> 
> Bugs: HIVE-17300
> https://issues.apache.org/jira/browse/HIVE-17300
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> I’m working on a feature of the Hive WebUI Query Plan tab that would provide 
> the option to display the query plan as a nice graph (scroll down for 
> screenshots). If you click on one of the graph’s stages, the plan for that 
> stage appears as text below.
> Stages are color-coded if they have a status (Success, Error, Running), and 
> the rest are grayed out. Coloring is based on status already available in the 
> WebUI, under the Stages tab.
> There is an additional option to display stats for MapReduce tasks. This 
> includes the job’s ID, tracking URL (where the logs are found), and mapper 
> and reducer numbers/progress, among other info.
> The library I’m using for the graph is called vis.js (http://visjs.org/). It 
> has an Apache license, and the only necessary file to be included from this 
> library is about 700 KB.
> I tried to keep server-side changes minimal, and graph generation is taken 
> care of by the client. Plans with more than a given number of stages 
> (default: 25) won't be displayed in order to preserve resources.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/common/LogUtils.java 5068eb5be7 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 40ea3ac0c5 
>   
> itests/hive-unit/src/test/java/org/apache/hive/service/cli/session/TestQueryDisplay.java
>  95b46a8149 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dad2035362 
>   ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java ac45ec46de 
>   ql/src/java/org/apache/hadoop/hive/ql/QueryDisplay.java 9a77c2969e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java 
> eb6cbf71e2 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapRedTask.java a71faf8576 
>   service/src/jamon/org/apache/hive/tmpl/QueryProfileTmpl.jamon f04d655440 
>   service/src/resources/hive-webapps/static/css/query-plan-graph.css 
> PRE-CREATION 
>   service/src/resources/hive-webapps/static/js/query-plan-graph.js 
> PRE-CREATION 
>   service/src/resources/hive-webapps/static/js/vis.min.js PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/61663/diff/5/
> 
> 
> Testing
> ---
> 
> 
> File Attachments
> 
> 
> HIVE-17300.7.patch
>   
> https://reviews.apache.org/media/uploaded/files/2018/09/07/e8ada965-ea33-48a2-a0b1-e56e8185c4fa__HIVE-17300.7.patch
> 
> 
> Thanks,
> 
> Karen Coppage
> 
>



Re: Review Request 68523: Improve org.apache.hadoop.hive.ql.exec.FunctionTask Experience

2018-08-28 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68523/#review208033
---



Thanks for the patch.
LGTM, just one question regarding the tests.

Thanks,
Peter


ql/src/test/queries/clientnegative/create_unknown_permanent_udf.q
Lines 1 (patched)


What is the difference between the create_function_nonexistent_class.q and 
this new test?
Do we need both?


- Peter Vary


On Aug. 28, 2018, 1:50 p.m., denys kuzmenko wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/68523/
> ---
> 
> (Updated Aug. 28, 2018, 1:50 p.m.)
> 
> 
> Review request for hive, Marta Kuczora, Peter Vary, and Adam Szita.
> 
> 
> Bugs: HIVE-20466
> https://issues.apache.org/jira/browse/HIVE-20466
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> When a create function statement is submitted, it may fail with the following 
> error:
> 
> Error while processing statement: FAILED: Execution Error, return code 1 from 
> org.apache.hadoop.hive.ql.exec.FunctionTask
> 
> This is not a user-friendly error message.
> 
> 
> Diffs
> -
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 44591842bb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Registry.java 6f8a8f5504 
>   ql/src/test/queries/clientnegative/create_unknown_permanent_udf.q 
> PRE-CREATION 
>   ql/src/test/results/clientnegative/create_function_nonexistent_class.q.out 
> 77467f66e3 
>   ql/src/test/results/clientnegative/create_unknown_permanent_udf.q.out 
> PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/68523/diff/2/
> 
> 
> Testing
> ---
> 
> added negative qtest to cover this scenario
> 
> 
> Thanks,
> 
> denys kuzmenko
> 
>


