I agree with Bejoy. What you describe sounds like a workflow problem: start a job only once its input files have been completely written. That kind of dependency is what Oozie is designed to handle.
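
To make the idea concrete, here is a minimal sketch of a "done flag" check, the same general pattern an Oozie coordinator dataset can wait on. This is not Oozie's implementation. It assumes the writing process drops a marker file (named `_SUCCESS` here, an assumption) into the directory once all files are complete, which may not hold if you have no control over the writers. The class and method names are hypothetical, and the sketch uses the local filesystem via java.nio so it runs without a cluster; on HDFS the equivalent test would go through `FileSystem.exists(Path)`.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical helper: treats a directory as ready only when a marker
 * ("done flag") file is present. Local-filesystem stand-in for HDFS.
 */
final class DoneFlagCheck {
    private DoneFlagCheck() {}

    /** Returns true if the marker file exists as a regular file inside dir. */
    static boolean isReady(Path dir, String doneFlag) {
        return Files.isRegularFile(dir.resolve(doneFlag));
    }

    /**
     * Polls for the marker up to maxAttempts times, sleeping sleepMillis
     * between attempts. Returns false if the marker never appears.
     */
    static boolean waitUntilReady(Path dir, String doneFlag,
                                  int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (isReady(dir, doneFlag)) {
                return true;
            }
            // Do not sleep after the final failed attempt.
            if (attempt < maxAttempts - 1) {
                Thread.sleep(sleepMillis);
            }
        }
        return false;
    }
}
```

A job launcher would call `waitUntilReady(dir, "_SUCCESS", ...)` and submit the job only when it returns true. Oozie lets you declare this kind of dependency instead of hand-rolling the polling loop.
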

Thanks
hemanth

On Wed, Sep 26, 2012 at 12:22 AM, Bejoy Ks <[email protected]> wrote:

> Hi Peter
>
> AFAIK Oozie has a mechanism to achieve this. You can trigger your jobs as
> soon as the files are written to a certain HDFS directory.
>
>
> On Tue, Sep 25, 2012 at 10:23 PM, Peter Sheridan <
> [email protected]> wrote:
>
>> These are log files being deposited by other processes, which we may
>> not have control over.
>>
>> We don't want multiple processes to write to the same files; we just
>> don't want to start our jobs until they have been completely written.
>>
>> Sorry for lack of clarity & thanks for the response.
>>
>>
>> --Pete
>>
>> From: Bertrand Dechoux <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, September 25, 2012 12:33 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Detect when file is not being written by another process
>>
>> Hi,
>>
>> Multiple files and aggregation or something like HBase?
>>
>> Could you tell us more about your context? What are the volumes? Why do
>> you want multiple processes to write to the same file?
>>
>> Regards
>>
>> Bertrand
>>
>> On Tue, Sep 25, 2012 at 6:28 PM, Peter Sheridan <
>> [email protected]> wrote:
>>
>>> Hi all.
>>>
>>> We're using Hadoop 1.0.3. We need to pick up a set of large (4+GB)
>>> files when they've finished being written to HDFS by a different process.
>>> There doesn't appear to be an API specifically for this. We had
>>> discovered through experimentation that the FileSystem.append() method can
>>> be used for this purpose: it will fail if another process is writing to
>>> the file.
>>>
>>> However: when running this on a multi-node cluster, using that API
>>> actually corrupts the file. Perhaps this is a known issue? Looking at the
>>> bug tracker I see https://issues.apache.org/jira/browse/HDFS-265 and a
>>> bunch of similar-sounding things.
>>>
>>> What's the right way to solve this problem? Thanks.
>>>
>>>
>>> --Pete
>>>
>>>
>>
>>
>> --
>> Bertrand Dechoux
>>
>
>
