[ 
https://issues.apache.org/jira/browse/OOZIE-1401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13966365#comment-13966365
 ] 

Jaydeep Vishwakarma commented on OOZIE-1401:
--------------------------------------------

[~chitnis],
I looked at the code for this. It first fetches all workflows eligible for 
deletion and then removes them one by one.
The purge code as currently written might not cause problems when the number 
of workflows is small, but with more than a million workflows it will run 
very slowly and put extra load on the DB. I think all eligible workflows 
should be deleted by a single query. 
Although I have a small patch ready for this bug, I still feel we should 
consider other approaches as well. 
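
To make the single-query idea concrete, here is a rough sketch, not the 
actual patch. It only builds a JPQL bulk-delete string and a cutoff timestamp. 
WorkflowJobBean and endTimestamp come from the query in the issue description; 
the createdTimestamp field name, the :cutoff parameter, and the 
PurgeQuerySketch class are my assumptions for illustration. It also ignores 
workflow actions and any status check that a real purge would probably need.

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class PurgeQuerySketch {

    // Hypothetical JPQL bulk delete: one statement instead of select-then-remove.
    // Falls back to the creation time when the end time is missing.
    // "createdTimestamp" is an assumed field name, not confirmed by the issue.
    public static final String BULK_DELETE_JPQL =
            "delete from WorkflowJobBean w "
          + "where w.endTimestamp < :cutoff "
          + "or (w.endTimestamp is null and w.createdTimestamp < :cutoff)";

    // Everything that ended (or, lacking an end time, was created) before
    // this instant is eligible for purging.
    public static Timestamp cutoff(Instant now, int olderThanDays) {
        return Timestamp.from(now.minus(olderThanDays, ChronoUnit.DAYS));
    }

    public static void main(String[] args) {
        // With a JPA EntityManager "em", the delete would run as one statement:
        //   em.createQuery(BULK_DELETE_JPQL)
        //     .setParameter("cutoff", cutoff(Instant.now(), 30))
        //     .executeUpdate();
        System.out.println(BULK_DELETE_JPQL);
        System.out.println("cutoff = " + cutoff(Instant.now(), 30));
    }
}
```

With this shape the DB does the filtering and deletion in one round trip, so 
the cost no longer grows with one remove call per workflow.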

> PurgeCommand should purge the workflow jobs w/o end_time
> --------------------------------------------------------
>
>                 Key: OOZIE-1401
>                 URL: https://issues.apache.org/jira/browse/OOZIE-1401
>             Project: Oozie
>          Issue Type: Sub-task
>          Components: bundle, coordinator, workflow
>    Affects Versions: trunk
>            Reporter: Mona Chitnis
>             Fix For: trunk
>
>
> Currently, the purge logic does not handle workflow jobs with 
> end_time=null. The command needs to take care of those jobs as well. This 
> happens with jobs left stuck for a long time after Hadoop restarts or DB 
> failures. It could be done by checking created_time when end_time is not 
> available.
> The current query:
> select w from WorkflowJobBean w where w.endTimestamp < :endTime



--
This message was sent by Atlassian JIRA
(v6.2#6252)
