Or maybe have a look at Apache Falcon, a data management and processing platform (falcon.incubator.apache.org).
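
If you would rather stay with the cron approach suggested in the quoted reply, here is a rough sketch of what the cron-driven script might look like. It is only an illustration under assumptions: the threshold, the paths, the jar name and the main class are all hypothetical placeholders, and it relies on the first field of `hdfs dfs -du -s` output being the directory size in bytes. As far as I know, a MapReduce job using the standard file input formats computes its input splits when it is submitted, so files that land in the directory afterwards would not be picked up by that run.

```python
import subprocess

# Hypothetical trigger point: submit once the directory exceeds 10 GiB.
THRESHOLD_BYTES = 10 * 1024 ** 3


def parse_du_summary(output):
    """Return the size in bytes from `hdfs dfs -du -s <dir>` output.

    Assumes the first whitespace-separated field is the byte count.
    """
    return int(output.split()[0])


def should_submit(size_bytes, threshold=THRESHOLD_BYTES):
    """True when the directory size has gone above the threshold."""
    return size_bytes > threshold


def build_job_command(jar, main_class, input_dir, output_dir):
    """Build the `hadoop jar` command line for the (hypothetical) job."""
    return ["hadoop", "jar", jar, main_class, input_dir, output_dir]


def check_and_submit(input_dir, output_dir, jar, main_class):
    """Scan the directory size and submit the job if necessary.

    Intended to be called from a script that cron runs periodically.
    Returns True if a job was submitted, False otherwise.
    """
    du = subprocess.run(
        ["hdfs", "dfs", "-du", "-s", input_dir],
        capture_output=True, text=True, check=True,
    )
    if not should_submit(parse_du_summary(du.stdout)):
        return False
    subprocess.run(
        build_job_command(jar, main_class, input_dir, output_dir),
        check=True,
    )
    return True
```

You would call `check_and_submit(...)` from a small script scheduled in cron. In practice you would probably also want to move or mark the processed files so the next run does not count them again.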

From: Stanley Shi <[email protected]>
>To: "[email protected]" <[email protected]>
>Sent: Thursday, August 28, 2014 1:15 AM
>Subject: Re: What happens when .....?
>
>Normally an MR job is used for batch processing, so I don't think this is a
>good use case for MR.
>Since you need to run the program periodically, you cannot submit a single
>MapReduce job for this.
>A possible way is to create a cron job to scan the folder size and submit an
>MR job if necessary.
>
>On Wed, Aug 27, 2014 at 7:38 PM, Kandoi, Nikhil <[email protected]> wrote:
>
>>Hi All,
>>
>>I have a system where files arrive in HDFS at regular intervals, and I
>>perform an operation every time the directory size goes above a particular
>>point.
>>My question is: when I submit a MapReduce job, would it only work on the
>>files present at that point?
>>
>>Regards,
>>Nikhil Kandoi
>
>--
>Regards,
>Stanley Shi
