Have you ever looked at Apache Falcon?

On 13 Jul 2015 23:15, "Martin Chalupa" <[email protected]> wrote:
> Hello everyone,
>
> I am thinking about how to solve the following problem. I have an Oozie
> workflow which produces some intermediate results and some final results
> on HDFS. I would like to ensure that those files are deleted after a
> certain time, and I would like to achieve that with just Oozie and the
> Hadoop ecosystem. My workflow gets a working directory as an input, so I
> know that all files will be created within that directory. My idea is to
> create a coordinator job in the first step of the workflow. This
> coordinator would be configured to fire exactly once after a configured
> period, and it would execute a very simple Oozie workflow that just
> removes the given working directory.
>
> What do you think about this approach?
>
> I know that there is no support for creating a coordinator from within a
> workflow, so I will probably have to implement that as a Java action. It
> also means that there will be one coordinator per workflow. Is there any
> limit on how many coordinators can be active?
>
> Thank you,
> Martin
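A minimal sketch of the Java-action approach described above, using the
standard OozieClient submission API. The Oozie URL, the coordinator app
path, and the property names (start, end, workingDir) are placeholders
for illustration; the coordinator app would need to parameterize its
start/end times in coordinator.xml and pass workingDir on to the cleanup
workflow it launches:

import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Properties;

import org.apache.oozie.client.OozieClient;
import org.apache.oozie.client.OozieClientException;

public class ScheduleCleanup {
    // Oozie expects UTC timestamps of the form 2015-07-14T00:00Z.
    private static final DateTimeFormatter OOZIE_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm'Z'");

    public static void main(String[] args) throws OozieClientException {
        String workingDir = args[0];              // directory to delete later
        long retentionDays = Long.parseLong(args[1]);

        // URL and HDFS paths below are assumptions for the sketch.
        OozieClient oozie = new OozieClient("http://oozie-host:11000/oozie");
        Properties conf = oozie.createConfiguration();
        conf.setProperty(OozieClient.COORDINATOR_APP_PATH,
                "hdfs://namenode/apps/cleanup-coord");

        // Fire once, retentionDays from now. A one-minute start/end
        // window combined with a longer frequency (e.g. a day) in
        // coordinator.xml materializes exactly one action.
        ZonedDateTime fireAt =
                ZonedDateTime.now(ZoneOffset.UTC).plusDays(retentionDays);
        conf.setProperty("start", OOZIE_TIME.format(fireAt));
        conf.setProperty("end", OOZIE_TIME.format(fireAt.plusMinutes(1)));
        conf.setProperty("workingDir", workingDir);

        // Submit and start the coordinator; run() returns the job id.
        String jobId = oozie.run(conf);
        System.out.println("Scheduled cleanup coordinator " + jobId);
    }
}

The cleanup workflow the coordinator launches would then only need a
single fs action that deletes ${workingDir}.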
