[ 
https://issues.apache.org/jira/browse/OOZIE-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735125#comment-13735125
 ] 

Virag Kothari commented on OOZIE-1448:
--------------------------------------

[~tucu00], [~rkanter], OOZIE-1424 has nothing to do with this. (I searched 
for 'new CoordActionUpdateXCommand' but found no match.) I believe this has 
been in Oozie for a long time. 
I think we have this hacky retry logic because, if there is an exception while 
retrieving the bean, we don't want to fail the command immediately. We retry a 
few times because, if the command fails, there is no way to retrieve it again. 
Ideally, this piece should be removed and the RecoveryService should have a way 
to recover this command. Also, because this command cannot be recovered, it is 
always called directly instead of being queued.
Robert, the check for the coord action being null seems to have been required 
only before your patch. It can now be removed, since you take care of that case 
in updateParentIfNecessary.
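
For illustration only, the bounded-retry pattern described above could look roughly like the sketch below. This is not the Oozie code: the class RetrySketch, the method loadWithRetry, and its parameters are hypothetical names, and the loader stands in for whatever call retrieves the bean.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch, not taken from Oozie: retry a load a few times
// before giving up, because a failed command cannot be recovered later.
final class RetrySketch {

    // Calls loader up to maxAttempts times and returns the first successful
    // result. If every attempt throws, the last exception is rethrown.
    static <T> T loadWithRetry(Callable<T> loader, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return loader.call();
            } catch (Exception e) {
                last = e;
                // Pause between attempts, but not after the final one.
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }
}
```

A recovery path in the RecoveryService would make a loop like this unnecessary, since a failed command could simply be picked up again later.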

> A CoordActionUpdateXCommand gets queued for all workflows even if they were 
> not launched by a coordinator
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-1448
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1448
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: OOZIE-1448.patch, OOZIE-1448.patch, OOZIE-1448.patch, 
> OOZIE-1448.patch, OOZIE-1448.patch
>
>
> Once a workflow (that wasn't started by a coordinator) ends, there's almost 
> always a warning/error logged that looks like this:
> {noformat}
> 2013-07-09 16:16:54,711  WARN CoordActionUpdateXCommand:542 - USER[rkanter] 
> GROUP[-] TOKEN[] APP[pig-wf] JOB[0000000-130709161625948-oozie-rkan-W] 
> ACTION[-] E1100: Command precondition does not hold before execution, [, 
> coord action is null], Error Code: E1100
> {noformat}
> The error is harmless, but it tends to confuse users who think that something 
> went wrong.  It also means that we have an extra unnecessary command in the 
> queue for every workflow that wasn't started by a coordinator.
> In SignalXCommand, there is a line like this:
> {code:java}
> new CoordActionUpdateXCommand(wfJob).call();    //Note: Called even if wf is not necessarily instantiated by coordinator
> {code}
> The comment is part of the original code, and makes me think that this was 
> done on purpose or perhaps when there wasn't a good way to check if a 
> workflow was started by a coordinator?
> I think we can fix this by simply checking if the parent of {{wfJob}} is a 
> coordinator.  
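
A minimal sketch of the parent check proposed in the description, for illustration only. It assumes that a coordinator action ID has the form "<coordinator job ID ending in -C>@<number>", by analogy with the "-W" workflow ID in the log above. The class and method names are hypothetical and this is not the attached patch.

```java
// Hypothetical sketch, not the OOZIE-1448 patch: decide from the parent ID
// whether a workflow was launched by a coordinator action.
final class ParentCheckSketch {

    // Assumption: coordinator action IDs contain "-C@", e.g.
    // "0000000-130709161625948-oozie-rkan-C@1". A null parent ID or a
    // workflow parent (ID ending in "-W") is not a coordinator.
    static boolean isLaunchedByCoordinator(String parentId) {
        return parentId != null && parentId.contains("-C@");
    }
}
```

With a guard like this, the CoordActionUpdateXCommand would only be created when the check returns true, so workflows started on their own would not log the E1100 warning.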

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
