[ 
https://issues.apache.org/jira/browse/OOZIE-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13873217#comment-13873217
 ] 

Shwetha G S commented on OOZIE-885:
-----------------------------------

[~virag], I'm wondering why this check was added in RecoveryService.runBundleRecovery():
{noformat}
                    if (baction.getCoordId() == null) {
                        log.error("CoordId is null for Bundle action " + baction.getBundleActionId());
                        continue;
                    }
{noformat}
BundleStartXCommand creates a row in the bundle action table (with a null coord 
id) and then queues CoordSubmitXCommand. If that CoordSubmitXCommand is lost, 
the recovery service should pick up this bundle action and queue 
CoordSubmitXCommand again. But this if condition skips the action whenever the 
coord id is null. How does recovery of such a bundle action work with this 
check in place? What am I missing here? 
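
To make the concern concrete, here is a minimal, self-contained sketch of the 
scenario. It is not Oozie code: BundleRecoverySketch, BundleAction and 
actionsRecovered are hypothetical names used only for illustration, and the 
loop only mirrors the shape of the null check quoted above.

```java
import java.util.ArrayList;
import java.util.List;

public class BundleRecoverySketch {

    /** Hypothetical stand-in for a bundle action row; not Oozie's real bean. */
    static final class BundleAction {
        final String bundleActionId;
        final String coordId; // stays null until the coordinator submit succeeds

        BundleAction(String bundleActionId, String coordId) {
            this.bundleActionId = bundleActionId;
            this.coordId = coordId;
        }
    }

    /**
     * Returns the ids of the actions that a recovery pass would act on,
     * assuming it applies the same null check as the snippet quoted above.
     */
    static List<String> actionsRecovered(List<BundleAction> pending) {
        List<String> recovered = new ArrayList<>();
        for (BundleAction baction : pending) {
            if (baction.coordId == null) {
                // Skipped: nothing re-queues the coordinator submit for this row.
                continue;
            }
            recovered.add(baction.bundleActionId);
        }
        return recovered;
    }

    public static void main(String[] args) {
        List<BundleAction> pending = new ArrayList<>();
        pending.add(new BundleAction("bundle-1@coordA", "coord-1"));
        // Row exists, but the submit command was lost, so coord id is still null.
        pending.add(new BundleAction("bundle-1@coordB", null));
        System.out.println(actionsRecovered(pending));
    }
}
```

In this sketch the second action is never returned, which is the case I am 
asking about: a bundle action whose submit command was lost keeps a null coord 
id and so is skipped on every recovery pass.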

> A race condition can cause the workflow/coordinator to run even after the 
> bundle job is killed
> ----------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-885
>                 URL: https://issues.apache.org/jira/browse/OOZIE-885
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>             Fix For: 3.3.0
>
>         Attachments: OOZIE-885-v2.patch, OOZIE-885.patch
>
>
> Steps to reproduce:
> 1) Start the bundle job with a bunch of coordinators
> 2) Immediately kill it
> Observation:
> Some coordinators still keep on running
> Reason:
> Bundle cannot kill a coordinator unless a coord-id is associated with it.


