[ 
https://issues.apache.org/jira/browse/OOZIE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034939#comment-14034939
 ] 

Shwetha G S commented on OOZIE-1532:
------------------------------------

[~puru], I agree with the requirements that you outlined; I understand them and 
have no issues with them. The problem is with the way 
CoordJobGetActionsSubsetJPAExecutor is implemented. The API takes start and len 
and returns coord actions from C@(start-1) to C@(start+len-1). The API should 
return the same actions irrespective of whether purging is enabled.

The API is implemented using the query 
{code}
select a.id, a.actionNumber, a.consoleUrl, a.errorCode, a.errorMessage, 
a.externalId, a.externalStatus, a.jobId, a.trackerUri, a.createdTimestamp, 
a.nominalTimestamp, a.statusStr, a.lastModifiedTimestamp, 
a.missingDependencies, a.pushMissingDependencies, a.timeOut from 
CoordinatorActionBean a where a.jobId = :jobId order by a.nominalTimestamp
{code}
and start and len are handled using
{code}
        query.setFirstResult(start - 1);
        query.setMaxResults(len);
{code}

Once coord actions are purged, the API will instead return actions C@(min 
action in DB + start - 1) to C@(min action in DB + start+len-1). So the API 
needs to be fixed as well, so that it still returns C@(start-1) to 
C@(start+len-1).
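
A minimal in-memory sketch of the shift described above. It is not the Oozie 
code: PagingSketch, byOffset, byActionNumber and range are hypothetical names 
used only for illustration. byOffset mimics setFirstResult(start - 1) and 
setMaxResults(len) on rows ordered by nominal time; byActionNumber is one 
possible way to keep the result stable, assuming action numbers are unchanged 
by purging.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class PagingSketch {
    // Mimics query.setFirstResult(start - 1) and query.setMaxResults(len)
    // applied to rows that are already ordered by nominal time.
    static List<Integer> byOffset(List<Integer> actionNumbers, int start, int len) {
        int from = Math.min(Math.max(start - 1, 0), actionNumbers.size());
        int to = Math.min(from + len, actionNumbers.size());
        return new ArrayList<>(actionNumbers.subList(from, to));
    }

    // Hypothetical alternative (an assumption, not the actual fix): select
    // rows by their action number, so deleted older rows do not shift the window.
    static List<Integer> byActionNumber(List<Integer> actionNumbers, int start, int len) {
        return actionNumbers.stream()
                .filter(n -> n >= start && n < start + len)
                .collect(Collectors.toList());
    }

    // Action numbers first..last, standing in for the rows left in the DB.
    static List<Integer> range(int first, int last) {
        List<Integer> out = new ArrayList<>();
        for (int i = first; i <= last; i++) {
            out.add(i);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> beforePurge = range(1, 10); // actions 1..10 in the DB
        List<Integer> afterPurge = range(4, 10);  // actions 1..3 purged

        System.out.println("offset, before purge:        " + byOffset(beforePurge, 5, 2));
        System.out.println("offset, after purge:         " + byOffset(afterPurge, 5, 2));
        System.out.println("action number, after purge:  " + byActionNumber(afterPurge, 5, 2));
    }
}
```

With the same start and len, the offset-based call returns different actions 
once the oldest rows are gone, while the selection by action number does not.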

Consider a daily coord that starts on May 1st, and suppose today is June 18th. 
I want to get all coord actions from June 10th to June 18th. The way to do it 
is to call the API with start = 41 and len = 9. If the purge service deletes 
all actions older than 30 days, the min action in DB is for May 18th. So the 
same API call with start = 41 and len = 9 will not work anymore. To get the 
required actions from June 10th to June 18th, the user needs to know how many 
actions have been deleted, which is too much to expect from the user.
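
The arithmetic in this example can be checked with a short sketch. The year 
2014 is an assumption (the comment gives only month and day), and 
DailyCoordExample, actionNumber and offsetStart are hypothetical helper names, 
not part of Oozie.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public class DailyCoordExample {
    // 1-based action number of a daily coord action with the given nominal date.
    static int actionNumber(LocalDate coordStart, LocalDate nominal) {
        return (int) ChronoUnit.DAYS.between(coordStart, nominal) + 1;
    }

    // The 'start' a caller would have to pass to offset-based paging so that the
    // first returned row is wantedAction, given the oldest action left in the DB.
    static int offsetStart(int wantedAction, int minActionInDb) {
        return wantedAction - minActionInDb + 1;
    }

    public static void main(String[] args) {
        LocalDate coordStart = LocalDate.of(2014, 5, 1); // year is an assumption

        int first = actionNumber(coordStart, LocalDate.of(2014, 6, 10));
        int last = actionNumber(coordStart, LocalDate.of(2014, 6, 18));
        int len = last - first + 1;

        // Oldest action remaining after the purge, per the example: May 18th.
        int minAfterPurge = actionNumber(coordStart, LocalDate.of(2014, 5, 18));

        System.out.println("start without purge = " + offsetStart(first, 1) + ", len = " + len);
        System.out.println("start after purge   = " + offsetStart(first, minAfterPurge));
        System.out.println("actions deleted     = " + (minAfterPurge - 1));
    }
}
```

Under these assumptions the caller would have to subtract the number of 
deleted actions from start to reach June 10th again, which is exactly the 
bookkeeping the API should not push onto the user.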

Yes, recent n is useful, but start and len are still required. So please don't 
remove these params.

> Purging should remove completed children job for long running coordinator jobs
> ------------------------------------------------------------------------------
>
>                 Key: OOZIE-1532
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1532
>             Project: Oozie
>          Issue Type: New Feature
>            Reporter: Srikanth Sundarrajan
>            Assignee: Bowen Zhang
>         Attachments: oozie-1532.patch, oozie-1532.patch
>
>
> Specifically, this is for long-running coordinator jobs with high frequency. 
> Their child workflows are never purged as the coord job is still running.
> Add an Oozie server configuration that indicates how many coordinator action 
> frequency ticks to keep. By doing this it would be possible to purge running 
> coord jobs. By default this would not be enabled and the current logic would 
> remain.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
