[ 
https://issues.apache.org/jira/browse/OOZIE-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101683#comment-16101683
 ] 

Andras Piros edited comment on OOZIE-3009 at 7/26/17 2:46 PM:
--------------------------------------------------------------

The root cause is the following:
* there are at least five test classes that currently retry their database 
operations for an extended period of time, although they shouldn't (since they 
use the {{FaultInjection}} facility, or perform some other kind of faulty 
{{JPAService.execute()}} call):
** {{TestBundleJobsDeleteJPAExecutor}}
** {{TestCoordJobsDeleteJPAExecutor}}
** {{TestPurgeService}}
** {{TestPurgeXCommand}}
** {{TestWorkflowJobsDeleteJPAExecutor}}
* as a consequence, these tests often time out
* when that happens, all the other modules are skipped because of the timeout
* the test cases of the skipped modules are not rerun on the second run

We need to extend {{PersistenceExceptionSubclassFilterRetryPredicate}} as 
follows:
* if the {{Throwable}} does not have a root cause, and is either a 
{{RuntimeException}} (in case of {{FaultInjection}}), or a 
{{JPAExecutorException}} (e.g. {{Query.getResultList()}} returned an empty 
{{List}}), the predicate returns {{false}}
* in all other cases the existing behavior applies

{{PersistenceExceptionSubclassFilterRetryPredicate}} also needs this 
enhancement because there are many places in the codebase where a 
{{JPAExecutorException}} is thrown without a cause although a reasonable cause 
exists, e.g. {{NoResultException}}. In those cases we will not retry the 
database operation.
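
The rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual patch: the class {{CauselessRetrySketch}}, the method {{shouldRetry()}} and the {{EXISTING_BEHAVIOR}} placeholder are hypothetical names, and {{JPAExecutorException}} is redeclared here as a stand-in so that the snippet compiles without Oozie on the classpath.

```java
import java.util.function.Predicate;

class CauselessRetrySketch {

    /** Hypothetical stand-in for Oozie's JPAExecutorException, so the sketch is self-contained. */
    static class JPAExecutorException extends Exception {
        JPAExecutorException(String message) {
            super(message);
        }

        JPAExecutorException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Placeholder for the predicate's existing logic (assumption: it simply retries here). */
    static final Predicate<Throwable> EXISTING_BEHAVIOR = t -> true;

    /** Returns false for causeless RuntimeException / JPAExecutorException, else delegates. */
    static boolean shouldRetry(Throwable t) {
        boolean causeless = t.getCause() == null;
        boolean faultOrExecutorFailure =
                t instanceof RuntimeException || t instanceof JPAExecutorException;

        if (causeless && faultOrExecutorFailure) {
            // FaultInjection failures and causeless executor exceptions are not retried
            return false;
        }

        // every other case falls through to the existing behavior
        return EXISTING_BEHAVIOR.test(t);
    }

    public static void main(String[] args) {
        // causeless RuntimeException, as thrown in the FaultInjection case
        System.out.println(shouldRetry(new RuntimeException("injected fault")));
        // causeless JPAExecutorException, e.g. an empty result list
        System.out.println(shouldRetry(new JPAExecutorException("no result")));
        // exception that carries a cause: the existing behavior decides
        System.out.println(shouldRetry(
                new JPAExecutorException("db error", new IllegalStateException("connection lost"))));
    }
}
```

With the placeholder above, only the third call reaches the existing logic; the first two are rejected before any retry is attempted, which is what should keep the five test classes from running into the timeout.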


> Number of Oozie tests executed dropped after OOZIE-2854
> -------------------------------------------------------
>
>                 Key: OOZIE-3009
>                 URL: https://issues.apache.org/jira/browse/OOZIE-3009
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Attila Sasvari
>            Assignee: Andras Piros
>            Priority: Blocker
>
> I noticed that the number of executed tests has dropped significantly 
> after OOZIE-2854.
> - Tests run: *1080* https://issues.apache.org/jira/browse/OOZIE-2854
> Previous tests:
> - OOZIE-2371 - Tests run: *1965* 
> https://issues.apache.org/jira/browse/OOZIE-2371?focusedCommentId=16076996&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16076996
> - OOZIE-2911 - Tests run: *1966* 
> https://issues.apache.org/jira/browse/OOZIE-2911?focusedCommentId=16078078&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16078078
> In OOZIE-2854, we can also see that the number of test cases varies from patch 
> to patch.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)