[ 
https://issues.apache.org/jira/browse/FALCON-740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14139229#comment-14139229
 ] 

Balu Vellanki commented on FALCON-740:
--------------------------------------

[~shwethags] - The bug Venkatesh referred to is an internal bug. When a Falcon 
user tries to delete an entity, Falcon catches and rethrows the following 
exception thrown by Oozie, which causes the entity delete to fail. 

{code}
2014-09-10 07:50:06,260 ERROR BundleJobChangeXCommand:540 - USER- GROUP- TOKEN[] APP- JOB0000743-140910031253668-oozie-oozi-B ACTION- XException, 
org.apache.oozie.command.CommandException: E1320: Bundle Job change error, [[ 0000744-140910031253668-oozie-oozi-C : Coord is in killed state ]]
at org.apache.oozie.command.bundle.BundleJobChangeXCommand.execute(BundleJobChangeXCommand.java:208)
at org.apache.oozie.command.bundle.BundleJobChangeXCommand.execute(BundleJobChangeXCommand.java:50)
at org.apache.oozie.command.XCommand.call(XCommand.java:281)
at org.apache.oozie.BundleEngine.change(BundleEngine.java:85)
at org.apache.oozie.servlet.V1JobServlet.changeBundleJob(V1JobServlet.java:585)
{code}

Bowen from the Oozie team confirmed that this happens because Falcon kills the 
coord jobs of a bundle, then tries to change the bundle job's end time, and 
only then kills the bundle job. The change call now fails on the killed coords 
because Oozie changed how it handles the bundle change command. The related 
Oozie JIRA is 
https://issues.apache.org/jira/browse/OOZIE-1807

Since you confirmed that we can now remove the "set end time" code block, I 
will do that, then create a patch and test it before submitting.
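
As a rough sketch of that change (not the actual patch), the kill sequence without the end time change could look like the following. JobClient and RecordingClient are hypothetical stand-ins so the snippet runs without an Oozie server; the real killBundle calls OozieClient directly and handles OozieClientException, which is left out here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the two OozieClient calls that killBundle uses.
// The real client methods throw OozieClientException; omitted in this sketch.
interface JobClient {
    void kill(String jobId);
    void change(String jobId, String changeValue);
}

// Hypothetical test double: records calls instead of talking to Oozie.
class RecordingClient implements JobClient {
    final List<String> calls = new ArrayList<>();

    @Override
    public void kill(String jobId) {
        calls.add("kill " + jobId);
    }

    @Override
    public void change(String jobId, String changeValue) {
        calls.add("change " + jobId + " " + changeValue);
    }
}

class KillBundleSketch {
    // Kill the coords, then the bundle. The change(endtime) call that sat
    // between these two steps is removed, so Oozie is never asked to change
    // a bundle whose coords are already in killed state.
    static void killBundle(JobClient client, String bundleId, List<String> coordIds) {
        for (String coordId : coordIds) {
            client.kill(coordId);
        }
        client.kill(bundleId);
    }

    public static void main(String[] args) {
        RecordingClient client = new RecordingClient();
        killBundle(client, "bundle-B", Arrays.asList("coord-C1", "coord-C2"));
        // Print the recorded call sequence, one call per line.
        for (String call : client.calls) {
            System.out.println(call);
        }
    }
}
```

Running main prints the two coord kills followed by the bundle kill, with no change call in between.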

Thanks



> Entity kill job calls OozieClient.kill on bundle coord job ids before calling 
> kill on bundle job id
> ---------------------------------------------------------------------------------------------------
>
>                 Key: FALCON-740
>                 URL: https://issues.apache.org/jira/browse/FALCON-740
>             Project: Falcon
>          Issue Type: Bug
>          Components: webapp
>    Affects Versions: 0.6
>            Reporter: Balu Vellanki
>            Assignee: Balu Vellanki
>
> When a Falcon user makes an entity kill API call, Falcon does the following in 
> org.apache.falcon.workflow.engine.OozieWorkflowEngine.killBundle(String 
> clusterName, BundleJob job):
> {code}
>             //kill all coords
>             for (CoordinatorJob coord : job.getCoordinators()) {
>                 client.kill(coord.getId());
>                 LOG.debug("Killed coord {} on cluster {}", coord.getId(), clusterName);
>             }
>             //set end time of bundle
>             client.change(job.getId(), OozieClient.CHANGE_VALUE_ENDTIME + "=" + SchemaHelper.formatDateUTC(new Date()));
>             LOG.debug("Changed end time of bundle {} on cluster {}", job.getId(), clusterName);
>             //kill bundle
>             client.kill(job.getId());
>             LOG.debug("Killed bundle {} on cluster {}", job.getId(), clusterName);
> {code}
> Two questions.
> 1. Why should we kill the coordinator jobs before killing the bundle job? 
> OozieClient.kill(bundle_job_id) should kill all the bundle's coord jobs.
> 2. Why is the end time changed for the bundle job? 
> https://oozie.apache.org/docs/4.0.1/DG_CommandLineTool.html#Changing_pausetime_of_a_Bundle_Job
>  does not say that the end time can be changed for a bundle job. 
> I think this code should be updated. Please comment if you think I made any 
> wrong assumptions.
> Thank you



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
