Re: Review Request 72392: HIVE-23103 Oracle statement batching

2020-04-21 Thread Marton Bod

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72392/#review220401
---


Ship it!





- Marton Bod


On April 21, 2020, 12:41 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72392/
> ---
> 
> (Updated April 21, 2020, 12:41 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Marton Bod.
> 
> 
> Bugs: HIVE-23103
> https://issues.apache.org/jira/browse/HIVE-23103
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Examine how to really get better performance for oracle statement batches.
> 
> Oracle JDBC doc describes:
> 
> The Oracle implementation of standard update batching does not implement true 
> batching for generic statements and callable statements. Even though Oracle 
> JDBC supports the use of standard batching for Statement and 
> CallableStatement objects, you are unlikely to see performance improvement.
> 
> I would look for connection properties to set, so it is handled anyway, or if 
> not, then use:
> 
> begin
>   query1;
>   query2;
>   query3;
> end;
> so we will have only a single round trip to the db.
> 
> 
> Diffs
> ---
> 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java bb29410e7d 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java d080df417b 
> 
> 
> Diff: https://reviews.apache.org/r/72392/diff/2/
> 
> 
> Testing
> ---
> 
> Baseline:
> Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
> TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  42.988 ± 4.569  ms/op
> TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  45.029 ± 4.686  ms/op
> 
> After patch:
> Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
> TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  36.208 ± 3.869  ms/op
> TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  37.038 ± 3.746  ms/op
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



Re: Review Request 72392: HIVE-23103 Oracle statement batching

2020-04-21 Thread Peter Vary via Review Board


> On April 21, 2020, 7:51 a.m., Marton Bod wrote:
> > standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
> > Lines 662 (patched)
> > 
> >
> > Once this part executes, we wouldn't have 'begin' for the next batch, 
> > no? Also, the sb would need to be cleared I think

good find!
Fixed it


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72392/#review220387
---


On April 21, 2020, 12:41 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72392/
> ---
> 
> (Updated April 21, 2020, 12:41 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Marton Bod.
> 
> 
> Bugs: HIVE-23103
> https://issues.apache.org/jira/browse/HIVE-23103
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Examine how to really get better performance for oracle statement batches.
> 
> Oracle JDBC doc describes:
> 
> The Oracle implementation of standard update batching does not implement true 
> batching for generic statements and callable statements. Even though Oracle 
> JDBC supports the use of standard batching for Statement and 
> CallableStatement objects, you are unlikely to see performance improvement.
> 
> I would look for connection properties to set, so it is handled anyway, or if 
> not, then use:
> 
> begin
>   query1;
>   query2;
>   query3;
> end;
> so we will have only a single round trip to the db.
> 
> 
> Diffs
> ---
> 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java bb29410e7d 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java d080df417b 
> 
> 
> Diff: https://reviews.apache.org/r/72392/diff/2/
> 
> 
> Testing
> ---
> 
> Baseline:
> Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
> TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  42.988 ± 4.569  ms/op
> TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  45.029 ± 4.686  ms/op
> 
> After patch:
> Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
> TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  36.208 ± 3.869  ms/op
> TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  37.038 ± 3.746  ms/op
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



Re: Review Request 72392: HIVE-23103 Oracle statement batching

2020-04-21 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72392/
---

(Updated April 21, 2020, 12:41 p.m.)


Review request for hive, Denys Kuzmenko and Marton Bod.


Changes
---

Addressed Marton's comment


Bugs: HIVE-23103
https://issues.apache.org/jira/browse/HIVE-23103


Repository: hive-git


Description
---

Examine how to really get better performance for oracle statement batches.

Oracle JDBC doc describes:

The Oracle implementation of standard update batching does not implement true 
batching for generic statements and callable statements. Even though Oracle 
JDBC supports the use of standard batching for Statement and CallableStatement 
objects, you are unlikely to see performance improvement.

I would look for connection properties to set, so it is handled anyway, or if 
not, then use:

begin
  query1;
  query2;
  query3;
end;
so we will have only a single round trip to the db.
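
As an illustration of the begin/end idea above, here is a minimal sketch of how several statements could be wrapped into one anonymous PL/SQL block. It is not the code from the patch: the class name OracleBlockSketch, the method toAnonymousBlock and the sample SQL are hypothetical, and the statements are assumed to be passed in without trailing semicolons.

```java
import java.util.List;

// Hypothetical helper, not taken from TxnDbUtil: shows only the string building.
class OracleBlockSketch {

    /** Wraps the given statements in a single anonymous PL/SQL block. */
    static String toAnonymousBlock(List<String> statements) {
        if (statements.isEmpty()) {
            // PL/SQL does not accept an empty "begin end;" block.
            throw new IllegalArgumentException("at least one statement is required");
        }
        StringBuilder sb = new StringBuilder("begin ");
        for (String stmt : statements) {
            // Assumption: each statement arrives without its own terminating ';'.
            sb.append(stmt).append("; ");
        }
        sb.append("end;");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative statements only; table and column names are made up.
        System.out.println(toAnonymousBlock(List.of(
            "UPDATE t SET c = 1 WHERE id = 1",
            "DELETE FROM t WHERE id = 2")));
    }
}
```

The resulting string would then be sent in one call, for example through java.sql.Statement.execute(String), instead of one addBatch call per statement.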


Diffs (updated)
---

  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java bb29410e7d 
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java d080df417b 


Diff: https://reviews.apache.org/r/72392/diff/2/

Changes: https://reviews.apache.org/r/72392/diff/1-2/


Testing
---

Baseline:
Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  42.988 ± 4.569  ms/op
TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  45.029 ± 4.686  ms/op

After patch:
Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  36.208 ± 3.869  ms/op
TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  37.038 ± 3.746  ms/op
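
For reference, the relative change between the two runs can be worked out from the scores above. The sketch below only does that arithmetic; the class BenchmarkDeltaSketch and its methods are hypothetical names, and the overlap check simply treats each "score ± error" pair as a plain interval.

```java
import java.util.Locale;

// Hypothetical helper for reading the numbers in the Testing section.
class BenchmarkDeltaSketch {

    /** Fraction by which the score dropped, e.g. 0.10 means 10% less time per op. */
    static double relativeReduction(double before, double after) {
        return (before - after) / before;
    }

    /** True if [before - errBefore, ...] and [..., after + errAfter] overlap (after < before). */
    static boolean intervalsOverlap(double before, double errBefore,
                                    double after, double errAfter) {
        return after + errAfter >= before - errBefore;
    }

    public static void main(String[] args) {
        // Scores copied from the Baseline and After patch tables.
        System.out.println(String.format(Locale.ROOT, "DEFAULT:   %.1f%%",
            100 * relativeReduction(42.988, 36.208)));   // prints "DEFAULT:   15.8%"
        System.out.println(String.format(Locale.ROOT, "READ_ONLY: %.1f%%",
            100 * relativeReduction(45.029, 37.038)));   // prints "READ_ONLY: 17.7%"
        System.out.println(intervalsOverlap(42.988, 4.569, 36.208, 3.869)); // prints "true"
    }
}
```

So the patch shows roughly 16-18% less time per commitTxn call, with the caveat that the reported error ranges still overlap.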


Thanks,

Peter Vary



Re: Review Request 72392: HIVE-23103 Oracle statement batching

2020-04-21 Thread Marton Bod

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72392/#review220387
---




standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java
Lines 662 (patched)


Once this part executes, we wouldn't have 'begin' for the next batch, no? 
Also, the sb would need to be cleared I think
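
To make the concern concrete, here is a minimal sketch of splitting statements into several blocks while clearing the builder and re-adding 'begin' for each one. It is only an illustration under assumptions, not the patched TxnDbUtil code: the class OracleBatchSplitSketch, the method toBlocks and the maxBatchSize parameter are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the two points raised in the comment above.
class OracleBatchSplitSketch {

    /** Splits statements into anonymous PL/SQL blocks of at most maxBatchSize statements. */
    static List<String> toBlocks(List<String> statements, int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be positive");
        }
        List<String> blocks = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        int inBlock = 0;
        for (String stmt : statements) {
            if (inBlock == 0) {
                // Every block, not just the first one, has to start with 'begin'.
                sb.append("begin ");
            }
            sb.append(stmt).append("; ");
            inBlock++;
            if (inBlock == maxBatchSize) {
                sb.append("end;");
                blocks.add(sb.toString());
                // Clear the builder so the next block does not repeat old statements.
                sb.setLength(0);
                inBlock = 0;
            }
        }
        if (inBlock > 0) {
            // Flush the last, partially filled block.
            sb.append("end;");
            blocks.add(sb.toString());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Placeholder statements; three statements with a batch size of two give two blocks.
        toBlocks(List.of("query1", "query2", "query3"), 2).forEach(System.out::println);
    }
}
```

Without the reset, the second block would contain the statements of the first one again; without the 'begin', it would not be a valid block.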


- Marton Bod


On April 20, 2020, 2:53 p.m., Peter Vary wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/72392/
> ---
> 
> (Updated April 20, 2020, 2:53 p.m.)
> 
> 
> Review request for hive, Denys Kuzmenko and Marton Bod.
> 
> 
> Bugs: HIVE-23103
> https://issues.apache.org/jira/browse/HIVE-23103
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Examine how to really get better performance for oracle statement batches.
> 
> Oracle JDBC doc describes:
> 
> The Oracle implementation of standard update batching does not implement true 
> batching for generic statements and callable statements. Even though Oracle 
> JDBC supports the use of standard batching for Statement and 
> CallableStatement objects, you are unlikely to see performance improvement.
> 
> I would look for connection properties to set, so it is handled anyway, or if 
> not, then use:
> 
> begin
>   query1;
>   query2;
>   query3;
> end;
> so we will have only a single round trip to the db.
> 
> 
> Diffs
> ---
> 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java bb29410e7d 
>   standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java d080df417b 
> 
> 
> Diff: https://reviews.apache.org/r/72392/diff/1/
> 
> 
> Testing
> ---
> 
> Baseline:
> Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
> TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  42.988 ± 4.569  ms/op
> TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  45.029 ± 4.686  ms/op
> 
> After patch:
> Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
> TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  36.208 ± 3.869  ms/op
> TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  37.038 ± 3.746  ms/op
> 
> 
> Thanks,
> 
> Peter Vary
> 
>



Review Request 72392: HIVE-23103 Oracle statement batching

2020-04-20 Thread Peter Vary via Review Board

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/72392/
---

Review request for hive, Denys Kuzmenko and Marton Bod.


Bugs: HIVE-23103
https://issues.apache.org/jira/browse/HIVE-23103


Repository: hive-git


Description
---

Examine how to really get better performance for oracle statement batches.

Oracle JDBC doc describes:

The Oracle implementation of standard update batching does not implement true 
batching for generic statements and callable statements. Even though Oracle 
JDBC supports the use of standard batching for Statement and CallableStatement 
objects, you are unlikely to see performance improvement.

I would look for connection properties to set, so it is handled anyway, or if 
not, then use:

begin
  query1;
  query2;
  query3;
end;
so we will have only a single round trip to the db.


Diffs
---

  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnDbUtil.java bb29410e7d 
  standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnHandler.java d080df417b 


Diff: https://reviews.apache.org/r/72392/diff/1/


Testing
---

Baseline:
Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  42.988 ± 4.569  ms/op
TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  45.029 ± 4.686  ms/op

After patch:
Benchmark                        (dbProduct)  (txnType)  Mode  Cnt   Score   Error  Units
TxnHandlerBenchRunner.commitTxn       ORACLE    DEFAULT    ss  100  36.208 ± 3.869  ms/op
TxnHandlerBenchRunner.commitTxn       ORACLE  READ_ONLY    ss  100  37.038 ± 3.746  ms/op


Thanks,

Peter Vary