[
https://issues.apache.org/jira/browse/OOZIE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172028#comment-14172028
]
Shwetha G S commented on OOZIE-1803:
------------------------------------
Currently, a workflow is deleted only if all its child workflows are complete,
and this is checked recursively because sub-workflows can (theoretically)
contain sub-workflows again. The same goes for coordinator actions: the purge
makes sure all the related workflows are complete, and so on. So the code is
pretty complicated and runs slowly. How about we simplify this and delete all
workflows whose created time is older than, say, 15 days (configurable)? We can
use the same logic for coord actions, but instead of a constant 15 days, the
cutoff can be some function of the coord frequency (maybe how many instances to
retain). For coordinators and bundles, we can use the end time. They are small
tables anyway.
If no one has looked at a stuck workflow for more than 15 days (configurable), I
don't think they will need it anyway. This is the only way it will work with
partitioning.
This logic serves the purpose, is simple, and runs faster. Both Yahoo and InMobi
run this as a cron job outside Oozie. Why not make it part of Oozie and provide
it as an alternative purging logic? Users can choose depending on their use case.
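
To make the proposal concrete, here is a rough sketch of how the two cutoffs
could be computed. It is only an illustration, not existing Oozie code: the
class name, the method names, and the "instances to retain" parameter are all
made up for this example, and the 15-day default is just the value suggested
above.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of Oozie: computes age-based purge cutoffs.
class AgeBasedPurgeSketch {

    // Assumed default taken from the comment above; meant to be configurable.
    static final int DEFAULT_WORKFLOW_RETENTION_DAYS = 15;

    // Workflows created before this instant would be purged,
    // regardless of the state of their parent or child jobs.
    static Instant workflowCutoff(Instant now, int retentionDays) {
        return now.minus(Duration.ofDays(retentionDays));
    }

    // For coord actions the retention window scales with the coordinator
    // frequency: keep roughly the last `instancesToRetain` instances.
    static Instant coordActionCutoff(Instant now, Duration frequency, int instancesToRetain) {
        return now.minus(frequency.multipliedBy(instancesToRetain));
    }

    // A row is purgeable when its timestamp (created time for workflows and
    // coord actions, end time for coordinators and bundles) is before the cutoff.
    static boolean isPurgeable(Instant timestamp, Instant cutoff) {
        return timestamp.isBefore(cutoff);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2014-10-15T00:00:00Z");

        Instant wfCutoff = workflowCutoff(now, DEFAULT_WORKFLOW_RETENTION_DAYS);
        System.out.println("workflow cutoff: " + wfCutoff);

        // Hourly coordinator, keep about the last 24 instances.
        Instant caCutoff = coordActionCutoff(now, Duration.ofHours(1), 24);
        System.out.println("coord action cutoff: " + caCutoff);

        Instant created = Instant.parse("2014-09-01T00:00:00Z");
        System.out.println("purgeable: " + isPurgeable(created, wfCutoff));
    }
}
```

Since the check only compares one timestamp column against a cutoff, it needs
no recursive parent/child lookups, which is also why it fits a table
partitioned by time.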
> Improvement in Purge service
> ----------------------------
>
> Key: OOZIE-1803
> URL: https://issues.apache.org/jira/browse/OOZIE-1803
> Project: Oozie
> Issue Type: Improvement
> Components: core
> Reporter: Jaydeep Vishwakarma
> Assignee: Jaydeep Vishwakarma
> Attachments: OOZIE-1803-v1.patch, OOZIE-1803-v2.patch,
> OOZIE-1803-v3.patch, OOZIE-1803.patch, purgeservice-1.patch,
> purgeservice.patch
>
>
> The current purge service of Oozie has some performance issues, and it might
> help to look at the queries and indexes to improve the purge service.