Hello everyone, I am thinking about how to solve the following problem. I have an Oozie workflow that produces some intermediate results and some final results on HDFS. I would like to ensure that those files get deleted after a certain time, and I would like to achieve that using only Oozie and the Hadoop ecosystem. My workflow receives a working directory as an input, so I know that all files will be created within this directory. My idea is that the first step of the workflow would create a coordinator job. This coordinator would be configured to fire exactly once, after a configured period, and it would execute a very simple Oozie workflow that just removes the given working directory.
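To make that concrete, here is a minimal sketch of what I have in mind. All the names and variables (cleanup-coord, ${start}, ${end}, ${workDir}, ${cleanupWfPath}) are placeholders I made up. The trick would be to set start to the submission time plus the desired TTL, and end only a few minutes later, so that with a large frequency only a single action ever materializes:

<!-- cleanup-coordinator.xml: intended to fire exactly once, at ${start} -->
<coordinator-app name="cleanup-coord" frequency="${coord:days(1)}"
                 start="${start}" end="${end}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
  <action>
    <workflow>
      <!-- path to the one-action cleanup workflow shown below -->
      <app-path>${cleanupWfPath}</app-path>
      <configuration>
        <property>
          <name>workDir</name>
          <value>${workDir}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>

The cleanup workflow itself would be just a single fs action:

<!-- cleanup-wf.xml: removes the given working directory -->
<workflow-app name="cleanup-wf" xmlns="uri:oozie:workflow:0.4">
  <start to="remove-workdir"/>
  <action name="remove-workdir">
    <fs>
      <delete path="${workDir}"/>
    </fs>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Cleanup failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
  </kill>
  <end name="end"/>
</workflow-app>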
What do you think about this approach? I know that there is no support for creating a coordinator from within a workflow, so I would probably have to implement that as a Java action. It also means that there will be one coordinator per workflow run. Is there any limit on how many coordinators can be active at once? Thank you, Martin
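P.S. For illustration, this is roughly what I imagine the Java action doing. OozieClient, createConfiguration() and run() are the standard Oozie client API, but the class name, the argument layout and the property names (start, end, workDir, cleanupWfPath) are placeholders of mine matching the sketch above; a real submission would usually also need properties such as nameNode and user.name set:

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Properties;
import java.util.TimeZone;
import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.OozieClientException;

public class ScheduleCleanup {
    public static void main(String[] args) throws OozieClientException {
        // args: Oozie URL, HDFS path of the coordinator app, working dir, TTL in hours
        String oozieUrl = args[0];
        String coordAppPath = args[1];
        String workDir = args[2];
        long ttlHours = Long.parseLong(args[3]);

        OozieClient client = new OozieClient(oozieUrl);
        Properties conf = client.createConfiguration();
        conf.setProperty(OozieClient.COORDINATOR_APP_PATH, coordAppPath);

        // fire once, ttlHours from now; 'start'/'end' feed the coordinator XML above
        long startMs = System.currentTimeMillis() + ttlHours * 3600_000L;
        conf.setProperty("start", toOozieTime(new Date(startMs)));
        conf.setProperty("end", toOozieTime(new Date(startMs + 10 * 60_000L)));
        conf.setProperty("workDir", workDir);
        // assume the cleanup workflow sits next to the coordinator definition
        conf.setProperty("cleanupWfPath", coordAppPath);

        String coordJobId = client.run(conf);
        System.out.println("Scheduled cleanup coordinator: " + coordJobId);
    }

    // Oozie expects UTC times with minute precision, e.g. 2014-01-31T10:00Z
    private static String toOozieTime(Date d) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm'Z'");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(d);
    }
}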
