Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-12-01 Thread András Piros via Review Board


> On July 4, 2017, 11:24 a.m., Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 95 (patched)
> > 
> >
> > Should we make this configurable?
> > 
> > Can a bigger value possibly speed up the whole process?

It depends on the available heap, and the average size of an entity. 1000 I 
found in practice fast enough and also consuming not an awful lot of heap.

Anyway, made this configurable.


- András


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179561
---


On July 17, 2017, 4:05 p.m., András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated July 17, 2017, 4:05 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   core/src/main/resources/oozie-default.xml 
> 832bbe14a4b027e198061527a956e2992cbec174 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
> 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
> c43223ef05aa702be49565ba2626314628e63749 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json  
>   tools/src/test/resources/dumpData/ooziedb_bna.json  
>   tools/src/test/resources/dumpData/ooziedb_bnj.json  
>   tools/src/test/resources/dumpData/ooziedb_ca.json  
>   tools/src/test/resources/dumpData/ooziedb_cj.json  
>   tools/src/test/resources/dumpData/ooziedb_slareg.json  
>   tools/src/test/resources/dumpData/ooziedb_slasum.json  
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
>   tools/src/test/resources/dumpData/ooziedb_wf.json  
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/52782/diff/6/
> 
> 
> Testing
> ---
> 
> See `TestDBLoadDump` for further reference.
> 
> 
> Thanks,
> 
> András Piros
> 
>



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-31 Thread Peter Cseh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review181790
---


Ship it!




Ship It!

- Peter Cseh


On July 17, 2017, 4:05 p.m., András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated July 17, 2017, 4:05 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   core/src/main/resources/oozie-default.xml 
> 832bbe14a4b027e198061527a956e2992cbec174 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
> 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
> c43223ef05aa702be49565ba2626314628e63749 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json  
>   tools/src/test/resources/dumpData/ooziedb_bna.json  
>   tools/src/test/resources/dumpData/ooziedb_bnj.json  
>   tools/src/test/resources/dumpData/ooziedb_ca.json  
>   tools/src/test/resources/dumpData/ooziedb_cj.json  
>   tools/src/test/resources/dumpData/ooziedb_slareg.json  
>   tools/src/test/resources/dumpData/ooziedb_slasum.json  
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
>   tools/src/test/resources/dumpData/ooziedb_wf.json  
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/52782/diff/6/
> 
> 
> Testing
> ---
> 
> See `TestDBLoadDump` for further reference.
> 
> 
> Thanks,
> 
> András Piros
> 
>



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-17 Thread András Piros

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/
---

(Updated July 17, 2017, 4:05 p.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and Robert 
Kanter.


Changes
---

Addressing review comments.


Repository: oozie-git


Description
---

We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
size. Furthermore, of at least one row inside that tx fails, we retry the whole 
batch into separate `EntityTransaction`s each.

Following error handling is implemented:

1. check if all necessary tables are present and empty
2. rows are imported till the end even if there are skipped rows in the 
meanwhile
3. if at least one row is skipped in the meanwhile for some 
`ConstraintViolationException`, we delete all rows of all necessary tables. 
That enables the user to have the log messages of all the erroneous rows in one 
run, and Oozie database is never in an inconsistent state of some rows present, 
some not present of an import


Diffs (updated)
-

  core/src/main/resources/oozie-default.xml 
832bbe14a4b027e198061527a956e2992cbec174 
  tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
  tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
c43223ef05aa702be49565ba2626314628e63749 
  tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
  tools/src/test/resources/dumpData/ooziedb_ac.json  
  tools/src/test/resources/dumpData/ooziedb_bna.json  
  tools/src/test/resources/dumpData/ooziedb_bnj.json  
  tools/src/test/resources/dumpData/ooziedb_ca.json  
  tools/src/test/resources/dumpData/ooziedb_cj.json  
  tools/src/test/resources/dumpData/ooziedb_slareg.json  
  tools/src/test/resources/dumpData/ooziedb_slasum.json  
  tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
  tools/src/test/resources/dumpData/ooziedb_wf.json  
  tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 


Diff: https://reviews.apache.org/r/52782/diff/6/

Changes: https://reviews.apache.org/r/52782/diff/5-6/


Testing
---

See `TestDBLoadDump` for further reference.


Thanks,

András Piros



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-13 Thread Peter Cseh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review180420
---




tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 204-208 (original), 298-317 (patched)


By reading in all the entries for one table into a list first then 
persisting it in oozie.db.import.batch.size batches the OOME won't go away. The 
oozie.db.import.batch.size should be taken into account while reading in the 
json file so we don't have to keep more than a oozie.db.import.batch.size 
entities in the memory at once.


- Peter Cseh


On July 4, 2017, noon, András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated July 4, 2017, noon)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   core/src/main/resources/oozie-default.xml 
> c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
> 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
> c43223ef05aa702be49565ba2626314628e63749 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json  
>   tools/src/test/resources/dumpData/ooziedb_bna.json  
>   tools/src/test/resources/dumpData/ooziedb_bnj.json  
>   tools/src/test/resources/dumpData/ooziedb_ca.json  
>   tools/src/test/resources/dumpData/ooziedb_cj.json  
>   tools/src/test/resources/dumpData/ooziedb_slareg.json  
>   tools/src/test/resources/dumpData/ooziedb_slasum.json  
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
>   tools/src/test/resources/dumpData/ooziedb_wf.json  
>   tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/52782/diff/5/
> 
> 
> Testing
> ---
> 
> See `TestDBLoadDump` for further reference.
> 
> 
> Thanks,
> 
> András Piros
> 
>



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-04 Thread András Piros

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/
---

(Updated July 4, 2017, noon)


Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and Robert 
Kanter.


Changes
---

Addressing new review comments.


Repository: oozie-git


Description
---

We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
size. Furthermore, of at least one row inside that tx fails, we retry the whole 
batch into separate `EntityTransaction`s each.

Following error handling is implemented:

1. check if all necessary tables are present and empty
2. rows are imported till the end even if there are skipped rows in the 
meanwhile
3. if at least one row is skipped in the meanwhile for some 
`ConstraintViolationException`, we delete all rows of all necessary tables. 
That enables the user to have the log messages of all the erroneous rows in one 
run, and Oozie database is never in an inconsistent state of some rows present, 
some not present of an import


Diffs (updated)
-

  core/src/main/resources/oozie-default.xml 
c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c 
  tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
  tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
c43223ef05aa702be49565ba2626314628e63749 
  tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
  tools/src/test/resources/dumpData/ooziedb_ac.json  
  tools/src/test/resources/dumpData/ooziedb_bna.json  
  tools/src/test/resources/dumpData/ooziedb_bnj.json  
  tools/src/test/resources/dumpData/ooziedb_ca.json  
  tools/src/test/resources/dumpData/ooziedb_cj.json  
  tools/src/test/resources/dumpData/ooziedb_slareg.json  
  tools/src/test/resources/dumpData/ooziedb_slasum.json  
  tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
  tools/src/test/resources/dumpData/ooziedb_wf.json  
  tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 


Diff: https://reviews.apache.org/r/52782/diff/5/

Changes: https://reviews.apache.org/r/52782/diff/4-5/


Testing
---

See `TestDBLoadDump` for further reference.


Thanks,

András Piros



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-04 Thread András Piros


> On July 4, 2017, 11:21 a.m., Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 488 (patched)
> > 
> >
> > Is it possible that currentTransaction() is already open at this point? 
> > If I'm not mistaken, calling begin() again results in an exception being 
> > thrown.

`persistAndTryCommit()` should only be called after the previous 
`EntityTransaction` has already been rolled back. So no, `currentTransaction` 
should not be open. Adding a check.


- András


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179560
---


On July 4, 2017, 10:09 a.m., András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated July 4, 2017, 10:09 a.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
> 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
> c43223ef05aa702be49565ba2626314628e63749 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json  
>   tools/src/test/resources/dumpData/ooziedb_bna.json  
>   tools/src/test/resources/dumpData/ooziedb_bnj.json  
>   tools/src/test/resources/dumpData/ooziedb_ca.json  
>   tools/src/test/resources/dumpData/ooziedb_cj.json  
>   tools/src/test/resources/dumpData/ooziedb_slareg.json  
>   tools/src/test/resources/dumpData/ooziedb_slasum.json  
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
>   tools/src/test/resources/dumpData/ooziedb_wf.json  
>   tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/52782/diff/4/
> 
> 
> Testing
> ---
> 
> See `TestDBLoadDump` for further reference.
> 
> 
> Thanks,
> 
> András Piros
> 
>



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-04 Thread Peter Bacsko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179561
---




tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 95 (patched)


Should we make this configurable?

Can a bigger value possibly speed up the whole process?


- Peter Bacsko


On júl. 4, 2017, 10:09 de, András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated júl. 4, 2017, 10:09 de)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
> 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
> c43223ef05aa702be49565ba2626314628e63749 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json  
>   tools/src/test/resources/dumpData/ooziedb_bna.json  
>   tools/src/test/resources/dumpData/ooziedb_bnj.json  
>   tools/src/test/resources/dumpData/ooziedb_ca.json  
>   tools/src/test/resources/dumpData/ooziedb_cj.json  
>   tools/src/test/resources/dumpData/ooziedb_slareg.json  
>   tools/src/test/resources/dumpData/ooziedb_slasum.json  
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
>   tools/src/test/resources/dumpData/ooziedb_wf.json  
>   tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/52782/diff/4/
> 
> 
> Testing
> ---
> 
> See `TestDBLoadDump` for further reference.
> 
> 
> Thanks,
> 
> András Piros
> 
>



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-04 Thread Peter Bacsko


> On jún. 28, 2017, 11:14 de, Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 353 (patched)
> > 
> >
> > Do we have to instantiate the batch handling mechanism just for the 
> > sake of a Tx begin/commit?
> 
> András Piros wrote:
> I think `BatchTransactionGuard` is pretty lightweight, and it delivers 
> some statistics that may be of good use. It's also a good practice to have 
> appropriate levels of abstraction, and use them.

To me it's a bit distracting. If a read the code, it's not immediately clear 
why we have a "batch handler" if we don't do any batching. We already have the 
entityManager instance in use in the same scope.


> On jún. 28, 2017, 11:14 de, Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 395 (patched)
> > 
> >
> > Define the size of the list?
> 
> András Piros wrote:
> To know the exact size we need to iterate through the file anyway, 
> unfortunately. In practice I didn't encounter measurable performance 
> degradation because of using an auto-growing `ArrayList`.

Was thinking about a practical value like 1024 or 2048, but nevermind.


- Peter


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179086
---


On júl. 4, 2017, 10:09 de, András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated júl. 4, 2017, 10:09 de)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
> 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
> c43223ef05aa702be49565ba2626314628e63749 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json  
>   tools/src/test/resources/dumpData/ooziedb_bna.json  
>   tools/src/test/resources/dumpData/ooziedb_bnj.json  
>   tools/src/test/resources/dumpData/ooziedb_ca.json  
>   tools/src/test/resources/dumpData/ooziedb_cj.json  
>   tools/src/test/resources/dumpData/ooziedb_slareg.json  
>   tools/src/test/resources/dumpData/ooziedb_slasum.json  
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
>   tools/src/test/resources/dumpData/ooziedb_wf.json  
>   tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/52782/diff/4/
> 
> 
> Testing
> ---
> 
> See `TestDBLoadDump` for further reference.
> 
> 
> Thanks,
> 
> András Piros
> 
>



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-04 Thread Peter Bacsko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179560
---




tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 413 (patched)


Could you rephrese the first sentence pls? It sounds a bit cryptic.

Possible sentence I can think of: "Persists entities in batches, that is, 
the actual commit will be done when the number of persistable entities reach 
the batch limit." Just an example.

Also explain why we need to commit in batches (IIRC it was because of OOME, 
right?)



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 467 (patched)


Please check if extra text is necessary after newEntity (JDK8 javadoc is 
very strict)



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 484 (patched)


Same here



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 488 (patched)


Is it possible that currentTransaction() is already open at this point? If 
I'm not mistaken, calling begin() again results in an exception being thrown.



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 505 (patched)


Java8 javadoc checker will whine about the missing text after @return.



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 513 (patched)


Same here



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Lines 234 (patched)


Another Class :)



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Lines 241 (patched)


If the class passed here can only be WorkflowActionBean, does it make sense 
to have this argument at all?



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Lines 264 (patched)


Class


- Peter Bacsko


On júl. 4, 2017, 10:09 de, András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated júl. 4, 2017, 10:09 de)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
> 0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
> c43223ef05aa702be49565ba2626314628e63749 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json  
>   tools/src/test/resources/dumpData/ooziedb_bna.json  
>   tools/src/test/resources/dumpData/ooziedb_bnj.json  
>   tools/src/test/resources/dumpData/ooziedb_ca.json  
>   tools/src/test/resources/dumpData/ooziedb_cj.json  
>   tools/src/test/resources/dumpData/ooziedb_slareg.json  
>   tools/src/test/resources/dumpData/ooziedb_slasum.json  
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
>   tools/src/test/resources/dumpData/ooziedb_wf.json  
>   tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources

Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-04 Thread András Piros

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/
---

(Updated July 4, 2017, 10:09 a.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and Robert 
Kanter.


Changes
---

Addressing review comments.


Repository: oozie-git


Description
---

We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
size. Furthermore, of at least one row inside that tx fails, we retry the whole 
batch into separate `EntityTransaction`s each.

Following error handling is implemented:

1. check if all necessary tables are present and empty
2. rows are imported till the end even if there are skipped rows in the 
meanwhile
3. if at least one row is skipped in the meanwhile for some 
`ConstraintViolationException`, we delete all rows of all necessary tables. 
That enables the user to have the log messages of all the erroneous rows in one 
run, and Oozie database is never in an inconsistent state of some rows present, 
some not present of an import


Diffs (updated)
-

  tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 
0e14a30693a76b8b2bdc2f7ceaf3f045d69f4155 
  tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java 
c43223ef05aa702be49565ba2626314628e63749 
  tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
  tools/src/test/resources/dumpData/ooziedb_ac.json  
  tools/src/test/resources/dumpData/ooziedb_bna.json  
  tools/src/test/resources/dumpData/ooziedb_bnj.json  
  tools/src/test/resources/dumpData/ooziedb_ca.json  
  tools/src/test/resources/dumpData/ooziedb_cj.json  
  tools/src/test/resources/dumpData/ooziedb_slareg.json  
  tools/src/test/resources/dumpData/ooziedb_slasum.json  
  tools/src/test/resources/dumpData/ooziedb_sysinfo.json  
  tools/src/test/resources/dumpData/ooziedb_wf.json  
  tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 


Diff: https://reviews.apache.org/r/52782/diff/4/

Changes: https://reviews.apache.org/r/52782/diff/3-4/


Testing
---

See `TestDBLoadDump` for further reference.


Thanks,

András Piros



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-07-04 Thread András Piros


> On June 28, 2017, 11:14 a.m., Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Line 116 (original), 154 (patched)
> > 
> >
> > Wouldn't it be better if this was a public non-static method? That 
> > would make unit testing easier because we wouldn't have to mess around with 
> > calling main() method with proper args to perform the DB export/import. The 
> > main() could still be tested separately. 
> > 
> > Altough it would require modifying the export part, too (so that the 
> > two classes have the same style).
> > 
> > Anyway we can consider this improvement.

Since `TestDBLoadDump` is more of and end-to-end at the moment, I'd leave like 
it is.


> On June 28, 2017, 11:14 a.m., Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 353 (patched)
> > 
> >
> > Do we have to instantiate the batch handling mechanism just for the 
> > sake of a Tx begin/commit?

I think `BatchTransactionGuard` is pretty lightweight, and it delivers some 
statistics that may be of good use. It's also a good practice to have 
appropriate levels of abstraction, and use them.


> On June 28, 2017, 11:14 a.m., Peter Bacsko wrote:
> > tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
> > Lines 395 (patched)
> > 
> >
> > Define the size of the list?

To know the exact size we need to iterate through the file anyway, 
unfortunately. In practice I didn't encounter measurable performance 
degradation because of using an auto-growing `ArrayList`.


> On June 28, 2017, 11:14 a.m., Peter Bacsko wrote:
> > tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
> > Lines 156 (patched)
> > 
> >
> > Is this extra method call necessary?

IMO it makes code more readable. Zero parameters are always better than one ;)


- András


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179086
---


On June 27, 2017, 10:14 a.m., András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated June 27, 2017, 10:14 a.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 0e14a30 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java c43223e 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_bna.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_bnj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_slasum.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json 22bbdc2 
>   tools/src/test/resources/dumpData/ooziedb_bna.json e69de29 
>   tools/src/test/resources/dumpData/ooziedb_bnj.json e69de29 
>   tools/src/test/resources/dumpData/ooziedb_ca.json 2715b94 
>   tools/src/test/resources/dumpData/ooziedb_cj.json 979c10e 
>   tools/src/test/resources/dumpData/ooziedb_slareg.json e69de29 
>   tools/src/test/resources/dumpData/ooziedb_slasum.json e69de29 
>   tools/src/test/resources/dumpData/ooz

Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-06-28 Thread Peter Bacsko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review179086
---




tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Line 51 (original), 68 (patched)


I know you didn't do this, but if we're enhancing stuff, we should import 
classes separately.



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 87 (patched)


Does it make sense to make this configurable?



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 95 (patched)


Shouldn't be Class to avoid raw type warning?



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Line 116 (original), 154 (patched)


Wouldn't it be better if this was a public non-static method? That would 
make unit testing easier because we wouldn't have to mess around with calling 
main() method with proper args to perform the DB export/import. The main() 
could still be tested separately. 

Altough it would require modifying the export part, too (so that the two 
classes have the same style).

Anyway we can consider this improvement.



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 271 (patched)


Please make sure you properly document this method. At first glance, it's 
very unclear what happens here.



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 275 (patched)


Minor: what about defining the size of the underlying array?



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 300 (patched)


If importEntry is null, this piece of code runs regardless - which isn't a 
big deal, but we could return earlier.



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 349 (patched)


Class



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 353 (patched)


Do we have to instantiate the batch handling mechanism just for the sake of 
a Tx begin/commit?



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 389 (patched)


Minor: add new line



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 390 (patched)


Please document the purpose of this class.

I would class it BatchTransactionHandler. I feel the word "guard" is 
slightly misleading.



tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java
Lines 395 (patched)


Define the size of the list?



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Lines 21 (patched)


No stars :)



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Line 26 (original), 29 (patched)


No stars :)



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Lines 75 (patched)


Please add a short comment why this is necessary (and what is does)



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Lines 108 (patched)


Could you move all private methods below the public test methods?



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Lines 116 (patched)


Minor: line is not properly aligned



tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java
Lines 156 (patched)


Is this extra method call necessary?


- Peter Bacsko


On jún. 27, 2017, 10:14 de, András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated jún. 27, 2017, 10:14 de)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> 

Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2017-06-27 Thread András Piros

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/
---

(Updated June 27, 2017, 10:14 a.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and Robert 
Kanter.


Changes
---

Syncing w/ upstream patch.


Repository: oozie-git


Description
---

We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
size. Furthermore, of at least one row inside that tx fails, we retry the whole 
batch into separate `EntityTransaction`s each.

Following error handling is implemented:

1. check if all necessary tables are present and empty
2. rows are imported till the end even if there are skipped rows in the 
meanwhile
3. if at least one row is skipped in the meanwhile for some 
`ConstraintViolationException`, we delete all rows of all necessary tables. 
That enables the user to have the log messages of all the erroneous rows in one 
run, and Oozie database is never in an inconsistent state of some rows present, 
some not present of an import


Diffs (updated)
-

  tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 0e14a30 
  tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java c43223e 
  tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_bna.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_bnj.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_slareg.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_slasum.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
  tools/src/test/resources/dumpData/ooziedb_ac.json 22bbdc2 
  tools/src/test/resources/dumpData/ooziedb_bna.json e69de29 
  tools/src/test/resources/dumpData/ooziedb_bnj.json e69de29 
  tools/src/test/resources/dumpData/ooziedb_ca.json 2715b94 
  tools/src/test/resources/dumpData/ooziedb_cj.json 979c10e 
  tools/src/test/resources/dumpData/ooziedb_slareg.json e69de29 
  tools/src/test/resources/dumpData/ooziedb_slasum.json e69de29 
  tools/src/test/resources/dumpData/ooziedb_sysinfo.json 15de009 
  tools/src/test/resources/dumpData/ooziedb_wf.json 05e7e36 
  tools/src/test/resources/dumpData/valid/ooziedb_ac.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_ca.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_cj.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_sysinfo.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_wf.json PRE-CREATION 


Diff: https://reviews.apache.org/r/52782/diff/3/

Changes: https://reviews.apache.org/r/52782/diff/2-3/


Testing
---

See `TestDBLoadDump` for further reference.


Thanks,

András Piros



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2016-10-12 Thread András Piros

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/
---

(Updated Oct. 12, 2016, 3:01 p.m.)


Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and Robert 
Kanter.


Changes
---

Some `JPAService` modifications due to needed schema creation is not needed 
anymore.


Repository: oozie-git


Description
---

We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
size. Furthermore, of at least one row inside that tx fails, we retry the whole 
batch into separate `EntityTransaction`s each.

Following error handling is implemented:

1. check if all necessary tables are present and empty
2. rows are imported till the end even if there are skipped rows in the 
meanwhile
3. if at least one row is skipped in the meanwhile for some 
`ConstraintViolationException`, we delete all rows of all necessary tables. 
That enables the user to have the log messages of all the erroneous rows in one 
run, and Oozie database is never in an inconsistent state of some rows present, 
some not present of an import


Diffs (updated)
-

  tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 0e14a30 
  tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java c43223e 
  tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_bna.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_bnj.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_slareg.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_slasum.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
  tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
  tools/src/test/resources/dumpData/ooziedb_ac.json 22bbdc2 
  tools/src/test/resources/dumpData/ooziedb_bna.json e69de29 
  tools/src/test/resources/dumpData/ooziedb_bnj.json e69de29 
  tools/src/test/resources/dumpData/ooziedb_ca.json 2715b94 
  tools/src/test/resources/dumpData/ooziedb_cj.json 979c10e 
  tools/src/test/resources/dumpData/ooziedb_slareg.json e69de29 
  tools/src/test/resources/dumpData/ooziedb_slasum.json e69de29 
  tools/src/test/resources/dumpData/ooziedb_sysinfo.json 15de009 
  tools/src/test/resources/dumpData/ooziedb_wf.json 05e7e36 
  tools/src/test/resources/dumpData/valid/ooziedb_ac.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_ca.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_cj.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_sysinfo.json PRE-CREATION 
  tools/src/test/resources/dumpData/valid/ooziedb_wf.json PRE-CREATION 

Diff: https://reviews.apache.org/r/52782/diff/


Testing
---

See `TestDBLoadDump` for further reference.


Thanks,

András Piros



Re: Review Request 52782: OOZIE-2662 DB migration fails if DB is too big

2016-10-12 Thread Peter Cseh

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52782/#review152308
---




core/src/main/java/org/apache/oozie/service/JPAService.java (lines 183 - 186)


Can you add this into hsqldb-oozie-site.xml used for the tests rather than 
the JPAService?


- Peter Cseh


On Oct. 12, 2016, 1:51 p.m., András Piros wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52782/
> ---
> 
> (Updated Oct. 12, 2016, 1:51 p.m.)
> 
> 
> Review request for oozie, Attila Sasvari, Peter Cseh, Peter Bacsko, and 
> Robert Kanter.
> 
> 
> Repository: oozie-git
> 
> 
> Description
> ---
> 
> We get each 1000 rows into a separate JPA `EntityTransaction` to reduce heap 
> size. Furthermore, of at least one row inside that tx fails, we retry the 
> whole batch into separate `EntityTransaction`s each.
> 
> Following error handling is implemented:
> 
> 1. check if all necessary tables are present and empty
> 2. rows are imported till the end even if there are skipped rows in the 
> meanwhile
> 3. if at least one row is skipped in the meanwhile for some 
> `ConstraintViolationException`, we delete all rows of all necessary tables. 
> That enables the user to have the log messages of all the erroneous rows in 
> one run, and Oozie database is never in an inconsistent state of some rows 
> present, some not present of an import
> 
> 
> Diffs
> -
> 
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d 
>   tools/src/main/java/org/apache/oozie/tools/OozieDBImportCLI.java 0e14a30 
>   tools/src/test/java/org/apache/oozie/tools/TestDBLoadDump.java c43223e 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_bna.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_bnj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_slasum.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/invalid/ooziedb_wf.json PRE-CREATION 
>   tools/src/test/resources/dumpData/ooziedb_ac.json 22bbdc2 
>   tools/src/test/resources/dumpData/ooziedb_bna.json e69de29 
>   tools/src/test/resources/dumpData/ooziedb_bnj.json e69de29 
>   tools/src/test/resources/dumpData/ooziedb_ca.json 2715b94 
>   tools/src/test/resources/dumpData/ooziedb_cj.json 979c10e 
>   tools/src/test/resources/dumpData/ooziedb_slareg.json e69de29 
>   tools/src/test/resources/dumpData/ooziedb_slasum.json e69de29 
>   tools/src/test/resources/dumpData/ooziedb_sysinfo.json 15de009 
>   tools/src/test/resources/dumpData/ooziedb_wf.json 05e7e36 
>   tools/src/test/resources/dumpData/valid/ooziedb_ac.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_bna.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_bnj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_ca.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_cj.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slareg.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_slasum.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_sysinfo.json PRE-CREATION 
>   tools/src/test/resources/dumpData/valid/ooziedb_wf.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/52782/diff/
> 
> 
> Testing
> ---
> 
> See `TestDBLoadDump` for further reference.
> 
> 
> Thanks,
> 
> András Piros
> 
>