[ 
https://issues.apache.org/jira/browse/OOZIE-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003427#comment-14003427
 ] 

Hadoop QA commented on OOZIE-1849:
----------------------------------

Testing JIRA OOZIE-1849

Cleaning local git workspace

----------------------------

{color:green}+1 PATCH_APPLIES{color}
{color:green}+1 CLEAN{color}
{color:green}+1 RAW_PATCH_ANALYSIS{color}
.    {color:green}+1{color} the patch does not introduce any @author tags
.    {color:green}+1{color} the patch does not introduce any tabs
.    {color:green}+1{color} the patch does not introduce any trailing spaces
.    {color:green}+1{color} the patch does not introduce any line longer than 132
.    {color:green}+1{color} the patch adds/modifies 1 testcase(s)
{color:green}+1 RAT{color}
.    {color:green}+1{color} the patch does not seem to introduce new RAT warnings
{color:green}+1 JAVADOC{color}
.    {color:green}+1{color} the patch does not seem to introduce new Javadoc warnings
{color:green}+1 COMPILE{color}
.    {color:green}+1{color} HEAD compiles
.    {color:green}+1{color} patch compiles
.    {color:green}+1{color} the patch does not seem to introduce new javac warnings
{color:green}+1 BACKWARDS_COMPATIBILITY{color}
.    {color:green}+1{color} the patch does not change any JPA Entity/Column/Basic/Lob/Transient annotations
.    {color:green}+1{color} the patch does not modify JPA files
{color:red}-1 TESTS{color}
.    Tests run: 1447
.    Tests failed: 3
.    Tests errors: 4

.    The patch failed the following testcases:

.      testBundleStatusTransitRunningFromKilled(org.apache.oozie.service.TestStatusTransitService)
.      testCoordMaterializeTriggerService3(org.apache.oozie.service.TestCoordMaterializeTriggerService)
.      testCoordinatorActionCommandsSubmitAndStart(org.apache.oozie.sla.TestSLAEventGeneration)

{color:green}+1 DISTRO{color}
.    {color:green}+1{color} distro tarball builds with the patch 

----------------------------
{color:red}*-1 Overall result, please check the reported -1(s)*{color}


The full output of the test-patch run is available at

.   https://builds.apache.org/job/oozie-trunk-precommit-build/1246/

> If the underlying job finishes while a Workflow is suspended, Oozie can take a while to realize it
> --------------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-1849
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1849
>             Project: Oozie
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 4.0.1
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: OOZIE-1849.patch
>
>
> Suppose you have a Workflow and you suspend it while one of the actions is 
> still RUNNING.  The underlying MR/Pig/etc job will continue running (as 
> expected, because we can't pause those).  However, if that job finishes while 
> the workflow is SUSPENDED, the CallbackServlet will receive the callback, but 
> the ActionCheckXCommand won't update the action:
> {noformat}
> 2014-05-16 17:40:57,959  INFO CallbackServlet:541 - SERVER[rkanter-mbp.local] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000002-140516173529928-oozie-rkan-W] ACTION[0000002-140516173529928-oozie-rkan-W@mr-node] callback for action [0000002-140516173529928-oozie-rkan-W@mr-node]
> 2014-05-16 17:40:57,985  WARN ActionCheckXCommand:544 - SERVER[rkanter-mbp.local] USER[rkanter] GROUP[-] TOKEN[] APP[map-reduce-wf] JOB[0000002-140516173529928-oozie-rkan-W] ACTION[0000002-140516173529928-oozie-rkan-W@mr-node] E0818: Action [0000002-140516173529928-oozie-rkan-W@mr-node] status is running but WF Job [0000002-140516173529928-oozie-rkan-W] status is [SUSPENDED]. Expected status is RUNNING., Error Code: E0818
> {noformat}
> If you then resume the workflow, the action will stay RUNNING for up to 10 
> minutes (the default fallback polling interval), at which point the 
> ActionCheckerService will run an ActionCheckXCommand that will pass, check 
> the job, and finally mark the action as SUCCESSFUL.
> We should fix this by one of the following:
> # ResumeXCommand should also queue an ActionCheckXCommand (if the workflow was 
> SUSPENDED) so we don't have to wait for the ActionCheckerService
> # ActionCheckXCommand's precondition check should allow SUSPENDED workflows
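
The two options listed above can be compared with a small stand-alone model. This is only an illustrative sketch, not the attached patch and not Oozie code: the class name, the enum, and both helper methods are hypothetical and exist only to show the difference between queuing a check on resume and relaxing the precondition.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Queue;
import java.util.Set;

// Hypothetical model; none of these names are real Oozie classes or methods.
class SuspendedCheckSketch {
    enum WfStatus { RUNNING, SUSPENDED, KILLED }

    // Workflow states in which an action check is allowed to proceed.
    // CURRENT mirrors the E0818 message: only RUNNING is accepted.
    static final Set<WfStatus> CURRENT = EnumSet.of(WfStatus.RUNNING);
    // Option 2: the precondition also accepts SUSPENDED workflows.
    static final Set<WfStatus> OPTION_2 = EnumSet.of(WfStatus.RUNNING, WfStatus.SUSPENDED);

    // Returns true if the check would get past its precondition.
    static boolean checkPasses(Set<WfStatus> allowed, WfStatus wfStatus) {
        return allowed.contains(wfStatus);
    }

    // Option 1: resuming a SUSPENDED workflow queues one check per action that
    // is still marked RUNNING, instead of waiting for the periodic checker.
    static Queue<String> resume(WfStatus statusBeforeResume, Iterable<String> runningActionIds) {
        Queue<String> queued = new ArrayDeque<>();
        if (statusBeforeResume == WfStatus.SUSPENDED) {
            for (String id : runningActionIds) {
                queued.add("check:" + id);
            }
        }
        return queued;
    }

    public static void main(String[] args) {
        // Callback arrives while the workflow is SUSPENDED.
        System.out.println(checkPasses(CURRENT, WfStatus.SUSPENDED));  // false
        System.out.println(checkPasses(OPTION_2, WfStatus.SUSPENDED)); // true
        // Option 1: the check is queued at resume time instead.
        System.out.println(resume(WfStatus.SUSPENDED, Arrays.asList("wf@mr-node")));
    }
}
```

In this model, option 1 leaves the precondition untouched and only shortens the wait after a resume, while option 2 lets the callback-triggered check update the action even while the workflow is still suspended.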



--
This message was sent by Atlassian JIRA
(v6.2#6252)
