> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/service/JPAService.java
> > Line 272 (original), 329 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768467#file1768467line332>
> >
> >     I really don't like this code! It's been here for a long time, but it's very smelly.
> >
> >     Should we remove this? Could you please check whether we have tests that take advantage of this fault injection?

Unfortunately there are lots of tests based on the `SkipCommitFaultInjection` system property:

```
TestActionFailover
TestBatchQueryExecutor
TestBundleJobsDeleteJPAExecutor
TestCoordActionsDeleteJPAExecutor
TestCoordJobsDeleteJPAExecutor
TestWorkflowJobsDeleteJPAExecutor
TestSLACalculationJPAExecutor
```

so I'm letting it stay :-(


> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/service/JPAService.java
> > Line 334 (original), 400 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768467#file1768467line403>
> >
> >     Check this injection usage

Unfortunately there are lots of tests based on the `SkipCommitFaultInjection` system property:

```
TestActionFailover
TestBatchQueryExecutor
TestBundleJobsDeleteJPAExecutor
TestCoordActionsDeleteJPAExecutor
TestCoordJobsDeleteJPAExecutor
TestWorkflowJobsDeleteJPAExecutor
TestSLACalculationJPAExecutor
```

so I'm letting it stay :-(


> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/service/JPAService.java
> > Line 432 (original), 515 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768467#file1768467line518>
> >
> >     Just like above - check if this is worth keeping

Unfortunately there are lots of tests based on the `SkipCommitFaultInjection` system property:

```
TestActionFailover
TestBatchQueryExecutor
TestBundleJobsDeleteJPAExecutor
TestCoordActionsDeleteJPAExecutor
TestCoordJobsDeleteJPAExecutor
TestWorkflowJobsDeleteJPAExecutor
TestSLACalculationJPAExecutor
```

so I'm letting it stay :-(


> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java
> > Lines 28 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768477#file1768477line28>
> >
> >     Let's rewrite this explanation and discuss f2f.
>
> Peter Bacsko wrote:
>     My idea:
>
>     "This class tracks nested OperationRetryHandler calls. Some JPAExecutor implementations call other JPAExecutors. This results in two (or possibly more) OperationRetryHandler.executeWithRetry() calls. If the innermost retry handler has exhausted all attempts and re-throws the exception, then the outer handler catches it and would re-start the JPA operation again. In order to avoid this, RetryHandlers must communicate with each other on the same thread by incrementing/decrementing the nesting level and signalling whether the maximum number of attempts has been reached.
>
>     We use thread locals because RetryHandlers might be called from different threads in parallel. If the nesting level is 0, it's important to reset "retryAttemptsExhausted" back to false since this variable is re-used in the thread pool."

Thanks, incorporated into the class javadocs.


> On July 4, 2017, 1:02 p.m., Peter Bacsko wrote:
> > minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java
> > Lines 80 (patched)
> > <https://reviews.apache.org/r/60544/diff/2/?file=1768488#file1768488line80>
> >
> >     This assertion will not be evaluated by JUnit because it's called on a different thread. Instead, set a volatile boolean flag to signal that something wasn't quite right.

Using a `private volatile Exception` here.


- András


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60544/#review179570
-----------------------------------------------------------


On July 3, 2017, 3:15 p.m., András Piros wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60544/
> -----------------------------------------------------------
>
> (Updated July 3, 2017, 3:15 p.m.)
>
>
> Review request for oozie, Attila Sasvari, Peter Cseh, and Peter Bacsko.
>
>
> Repository: oozie-git
>
>
> Description
> -------
>
> https://issues.apache.org/jira/browse/OOZIE-2854
>
>
> Diffs
> -----
>
>   core/pom.xml acddf349a89cf09a7fc4f384ebcaec56dfd0ab48
>   core/src/main/java/org/apache/oozie/executor/jpa/JsonBeanPersisterExecutor.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/executor/jpa/QueryExecutor.java 8d94c23e40d1281864db40e141b200ca207a6324
>   core/src/main/java/org/apache/oozie/service/JPAService.java 028381d3b72bcc3b8c2cd27cacb3e0ac6d48d146
>   core/src/main/java/org/apache/oozie/sla/SLASummaryBean.java cfe1522a4b1f89085eb29e7f1281c2abd631bdc2
>   core/src/main/java/org/apache/oozie/store/WorkflowStore.java c565e74893b863caef6c93015cfe38fe520d04ec
>   core/src/main/java/org/apache/oozie/util/db/BasicDataSourceWrapper.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/util/db/DatabaseRetryPredicate.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/util/db/FailingConnectionWrapper.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/util/db/FailingHSQLDBDriverWrapper.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/util/db/FailingMySQLDriverWrapper.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/util/db/OperationRetryHandler.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/util/db/PersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/util/db/RetryAttemptCounter.java PRE-CREATION
>   core/src/main/java/org/apache/oozie/util/db/RuntimeExceptionInjector.java PRE-CREATION
>   core/src/main/resources/META-INF/persistence.xml bad9278597fcd4f93b4cc482afae8af14beaa922
>   core/src/main/resources/oozie-default.xml c60a4581a84d4c67a1ac1cf3dfdc252b85ccd01c
>   core/src/main/resources/oozie-log4j.properties c065f3cd4c5a3df1308b69d7c16e8fcfa8796efc
>   core/src/test/java/org/apache/oozie/test/XTestCase.java 161927ac8f1132b3080d2924844826fcc7b807a5
>   core/src/test/java/org/apache/oozie/util/db/TestOozieDmlStatementPredicate.java PRE-CREATION
>   core/src/test/java/org/apache/oozie/util/db/TestOperationRetryHandler.java PRE-CREATION
>   core/src/test/java/org/apache/oozie/util/db/TestPersistenceExceptionSubclassFilterRetryPredicate.java PRE-CREATION
>   core/src/test/java/org/apache/oozie/util/db/TestRetryAttemptCounter.java PRE-CREATION
>   minitest/pom.xml 9515284bb5f32c279a93161c10e6571680e4f9fc
>   minitest/src/test/java/org/apache/oozie/test/TestParallelJPAOperationRetries.java PRE-CREATION
>   minitest/src/test/java/org/apache/oozie/test/TestWorkflowRetries.java PRE-CREATION
>   minitest/src/test/java/org/apache/oozie/test/WorkflowTest.java 2845f0af6efb9ef75fdbfcb326115c62e6fb3bdd
>   minitest/src/test/resources/hsqldb-oozie-site.xml fa5fe9c3185e973e8247d7bf10b126119d9c02c9
>   minitest/src/test/resources/oozie-log4j.properties c142d725140930bfa89cd2b163d0768a4c3a750a
>   minitest/src/test/resources/parallel-fs-and-shell.xml PRE-CREATION
>   minitest/src/test/resources/wf-test.xml 20c4946862039a65c76ed7f49991345e90a694de
>   pom.xml 16c5137d44d7db891da46f80adb51c85e4c1b214
>
>
> Diff: https://reviews.apache.org/r/60544/diff/2/
>
>
> Testing
> -------
>
> Tests covered in code:
>
> Unit tests
> ==========
>
> * testing the retry handler, the retry predicate filter, and parallel calls to JPA `EntityManager` (mostly Oozie database reads and writes) when injecting failures
>
> Integration tests
> =================
>
> * using the `MiniOozieTestCase` framework
> * fixing it so that asynchronous workflow applications (the ones that use `CallableQueueService`) can also be run
> * the following workflow scenarios:
>   * a very simple one consisting only of a `<start/>` and an `<end/>` node
>   * a more sophisticated one consisting of multiple synchronous `<fs/>` nodes and a `<decision/>` node
>   * the ultimate one consisting of a `<decision/>` node and two branches, one with an `<fs/>` node and one with an asynchronous `<shell/>` node
>
> Test cases run:
> ```
> mvn clean test -Dtest=TestOperationRetryHandler,TestPersistenceExceptionSubclassFilterRetryPredicate,TestParallelJPAOperationRetries,TestWorkflow,TestWorkflowRetries,TestJPAService,TestRetryAttemptCounter
> ```
>
> Functional and stress tests were performed on a 4-node MySQL cluster. The MySQL daemon was stopped / killed / restarted several times. Firewall rules were also modified temporarily to simulate network outages.
>
>
> Thanks,
>
> András Piros
>
>
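
For readers following the `RetryAttemptCounter` discussion above, here is a minimal sketch of the nesting idea described in the proposed javadoc. It is not the code from the patch: the class name `NestedRetrySketch`, the fields `NESTING` and `EXHAUSTED`, and this simplified `executeWithRetry` signature are all assumptions made for illustration. It only shows how per-thread nesting depth plus an "attempts exhausted" flag can stop an outer handler from restarting work that an inner handler has already given up on.

```java
import java.util.concurrent.Callable;

public class NestedRetrySketch {
    // Hypothetical per-thread state: nesting depth and "attempts exhausted" flag.
    private static final ThreadLocal<Integer> NESTING = ThreadLocal.withInitial(() -> 0);
    private static final ThreadLocal<Boolean> EXHAUSTED = ThreadLocal.withInitial(() -> false);

    // Counts how often the failing operation actually ran (demo only, single-threaded).
    static int invocations = 0;

    static <V> V executeWithRetry(int maxAttempts, Callable<V> operation) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        NESTING.set(NESTING.get() + 1);
        try {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return operation.call();
                } catch (Exception e) {
                    last = e;
                    // A nested handler already used up its attempts: do not restart.
                    if (EXHAUSTED.get()) {
                        throw e;
                    }
                }
            }
            // This handler ran out of attempts; signal that to any enclosing handler.
            EXHAUSTED.set(true);
            throw last;
        } finally {
            int level = NESTING.get() - 1;
            NESTING.set(level);
            if (level == 0) {
                // Pooled threads are reused, so clear the flag at the outermost level.
                EXHAUSTED.set(false);
            }
        }
    }

    public static void main(String[] args) {
        Callable<Void> failing = () -> {
            invocations++;
            throw new IllegalStateException("simulated database failure");
        };
        Callable<Void> nested = () -> executeWithRetry(3, failing);
        try {
            executeWithRetry(3, nested);
        } catch (Exception e) {
            // prints "failed after 3 invocations"
            System.out.println("failed after " + invocations + " invocations");
        }
    }
}
```

Without the shared flag, the outer handler in this sketch would rerun the inner one three times, giving nine invocations instead of three.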

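
The comment on `TestParallelJPAOperationRetries` (a check made on a worker thread is not seen by JUnit) and the reply about using a `private volatile Exception` can be illustrated roughly as follows. This is a plain-Java sketch rather than the test from the patch: the class `WorkerFailureSketch`, the field `workerFailure`, and the deliberately failing check are hypothetical, and JUnit is left out so the example stays self-contained.

```java
public class WorkerFailureSketch {
    // Written by the worker thread, read by the calling (test) thread.
    // join() below already guarantees visibility; volatile also covers
    // callers that poll the field without joining the worker.
    private volatile Exception workerFailure;

    void runWorker() throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                int result = 1 + 1;
                if (result != 3) { // deliberately failing check, for illustration only
                    throw new IllegalStateException("expected 3 but was " + result);
                }
            } catch (Exception e) {
                // Record the problem instead of asserting on this thread.
                workerFailure = e;
            }
        });
        worker.start();
        worker.join();
    }

    Exception getWorkerFailure() {
        return workerFailure;
    }

    public static void main(String[] args) throws Exception {
        WorkerFailureSketch sketch = new WorkerFailureSketch();
        sketch.runWorker();
        // Back on the calling thread, where a test framework could fail the test.
        if (sketch.getWorkerFailure() != null) {
            // prints "worker failed: expected 3 but was 2"
            System.out.println("worker failed: " + sketch.getWorkerFailure().getMessage());
        }
    }
}
```

In a real JUnit test the final check would be an assertion on the test thread. Note that JUnit assertion failures are `AssertionError`s, so a worker that calls assertions directly would need to catch `Throwable` (or `AssertionError`) rather than `Exception` to record them.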