[ 
https://issues.apache.org/jira/browse/OOZIE-1978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15375671#comment-15375671
 ] 

Hadoop QA commented on OOZIE-1978:
----------------------------------

Testing JIRA OOZIE-1978

Cleaning local git workspace

----------------------------

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:red}-1 RAW_PATCH_ANALYSIS{color}
.    {color:green}+1{color} the patch does not introduce any @author tags
.    {color:green}+1{color} the patch does not introduce any tabs
.    {color:red}-1{color} the patch contains 1 line(s) with trailing spaces
.    {color:red}-1{color} the patch contains 2 line(s) longer than 132 
characters
.    {color:green}+1{color} the patch does adds/modifies 2 testcase(s)
{color:green}+1 RAT{color}
.    {color:green}+1{color} the patch does not seem to introduce new RAT 
warnings
{color:green}+1 JAVADOC{color}
.    {color:green}+1{color} the patch does not seem to introduce new Javadoc 
warnings
{color:green}+1 COMPILE{color}
.    {color:green}+1{color} HEAD compiles
.    {color:green}+1{color} patch compiles
.    {color:green}+1{color} the patch does not seem to introduce new javac 
warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.    {color:green}+1{color} the patch does not change any JPA 
Entity/Colum/Basic/Lob/Transient annotations
.    {color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.    Tests run: 1787
.    Tests failed: 1
.    Tests errors: 0

.    The patch failed the following testcases:

.      testActionCheckTransientDuringLauncher(org.apache.oozie.command.wf.TestActionCheckXCommand)

{color:green}+1 DISTRO{color}
.    {color:green}+1{color} distro tarball builds with the patch 

----------------------------
{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/3037/

> Forkjoin validation code is ridiculously slow in some cases
> -----------------------------------------------------------
>
>                 Key: OOZIE-1978
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1978
>             Project: Oozie
>          Issue Type: Bug
>          Components: core
>    Affects Versions: trunk, 4.0.1
>            Reporter: Robert Kanter
>            Assignee: Peter Bacsko
>             Fix For: trunk
>
>         Attachments: OOZIE-1978-001.patch, OOZIE-1978-002.patch, 
> OOZIE-1978_wip.001.patch, workflow.xml
>
>
> We've had a few users who have run into problems where submitting a workflow 
> appears to hang (in the case of a subworkflow, it's similar but stuck in 
> PREP).  It turns out that if you wait long enough, it will actually go 
> through and the workflow will run normally.  The problem is that the forkjoin 
> validation code is taking a really long time.
> The attached example has a series of 20 forks where each fork has 6 actions 
> (it's based on an actual workflow, but all of the names were changed and the 
> actions were all replaced by simple shell actions).  One of our support guys 
> said it took 1-2 hours, but on my computer it was taking {color:red}*15+ 
> hours*{color} (I had to cancel it).
> While this example doesn't have any nested forks, those can also take a long 
> time.
> It's easy to verify that it's the forkjoin validation code that's taking so 
> long by looking at a jstack of the Oozie server and seeing deep recursive 
> calls to 
> {{org.apache.oozie.workflow.lite.LiteWorkflowAppParser.validateForkJoin}}.  I 
> also noticed a lot of time being spent in calls to {{LinkedList.contains}}.
> I think we have 3 options:
> # See if we can make the existing code faster somehow.  Perhaps there's a way 
> to parallelize it?  Maybe there's some redundant checking that we can 
> identify and skip?  Change some data structures?  etc.
> # See if we can write a new way to do this validation.  I had originally 
> completely rewritten this code a while ago, and we've since made a few fixes 
> to catch edge cases and things.  Perhaps it needs another rewrite?
> # Try to identify when it's taking a long time and at least let the user know 
> what's happening or something.  Right now, it just appears that the Oozie CLI 
> has hung and the job doesn't show up in the Oozie server.  Most users aren't 
> going to wait more than a minute or two.
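To illustrate why option 1 (skipping redundant checks and changing data structures) could matter, here is a minimal sketch, not Oozie's actual validation code. It assumes a hypothetical graph model (a map from node name to successor names) and a hypothetical helper buildChain that lays out a series of fork/join pairs like the attached example. It compares a recursive walk that re-explores everything below a node each time it is reached against one that remembers visited nodes in a HashSet rather than a LinkedList.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ForkJoinTraversalSketch {

    // Hypothetical model: node name -> successor names.
    // Builds fork-0 -> actions -> join-0 -> fork-1 -> ... -> end.
    static Map<String, List<String>> buildChain(int forks, int actionsPerFork) {
        Map<String, List<String>> graph = new HashMap<>();
        for (int f = 0; f < forks; f++) {
            List<String> actions = new ArrayList<>();
            for (int a = 0; a < actionsPerFork; a++) {
                String action = "action-" + f + "-" + a;
                actions.add(action);
                graph.put(action, List.of("join-" + f));
            }
            graph.put("fork-" + f, actions);
            String next = (f + 1 < forks) ? "fork-" + (f + 1) : "end";
            graph.put("join-" + f, List.of(next));
        }
        graph.put("end", List.of());
        return graph;
    }

    // Naive walk: every path into a node re-walks everything below it,
    // so the visit count grows exponentially with the number of forks.
    static long naiveVisits(Map<String, List<String>> graph, String node) {
        long visits = 1;
        for (String next : graph.get(node)) {
            visits += naiveVisits(graph, next);
        }
        return visits;
    }

    // Memoized walk: each node is expanded once. HashSet.add is O(1) on
    // average, unlike LinkedList.contains, which scans the whole list.
    static long memoizedVisits(Map<String, List<String>> graph, String node,
                               Set<String> seen) {
        if (!seen.add(node)) {
            return 0; // already expanded via another branch
        }
        long visits = 1;
        for (String next : graph.get(node)) {
            visits += memoizedVisits(graph, next, seen);
        }
        return visits;
    }

    public static void main(String[] args) {
        // Kept small on purpose: 8 forks of 6 actions, not the 20 from the report.
        Map<String, List<String>> graph = buildChain(8, 6);
        System.out.println("naive visits:    " + naiveVisits(graph, "fork-0"));
        System.out.println("memoized visits: "
                + memoizedVisits(graph, "fork-0", new HashSet<>()));
    }
}
```

In this toy model the naive count multiplies by roughly 6 for every additional fork, while the memoized count grows linearly with the number of nodes. Whether the real validateForkJoin can skip already-validated nodes this way depends on what state it carries along each path, so this only shows the shape of the problem.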



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
