@Raj: so, updating the data and storing it into the same destination would work?
@Shahab the file is very small, so I am expecting to read it all at once. What would you suggest?

On Fri, May 31, 2013 at 5:30 PM, Shahab Yunus <[email protected]> wrote:

> I might not have understood your use case properly, so I apologize for that.
>
> But what I think you need here is something outside of Hadoop/HDFS. I am
> presuming that you need to read the whole updated file when you are going
> to process it with your never-ending job, right? You don't expect to read
> it piecemeal or in chunks. If that is indeed the case, then your
> never-ending job can use generic techniques to check whether the file's
> signature or any other property has changed since the last time, and only
> process it if it has changed. Your file-writing/updating process can update
> the file independently of the reading/processing one.
>
> Regards,
> Shahab
>
>
> On Fri, May 31, 2013 at 11:23 AM, Adamantios Corais <[email protected]> wrote:
>
>> I am new to Hadoop, so apologies beforehand for my very fundamental
>> question.
>>
>> Let's assume that I have a file stored in Hadoop that gets updated once a
>> day. Also assume that there is a task running at the back end of Hadoop
>> that never stops. How could I reload this file so that Hadoop starts
>> considering the updated values rather than the old ones?
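Shahab's suggestion above (have the long-running job check whether a file property such as the modification time has changed, and reload only then) can be sketched roughly as follows. This is a minimal, hypothetical illustration using the local filesystem via Python's standard library; on HDFS you would use the equivalent metadata call from the Hadoop `FileSystem` API (e.g. `getFileStatus(...).getModificationTime()`) instead of `os.path.getmtime`:

```python
import os

def check_and_reload(path, last_mtime):
    """Reload the file only if its modification time has changed.

    Returns (contents, mtime) when the file was updated since last_mtime,
    or (None, last_mtime) when it is unchanged.
    """
    mtime = os.path.getmtime(path)
    if mtime != last_mtime:
        # The file is small, so it is read in one shot, as described above.
        with open(path) as f:
            return f.read(), mtime
    return None, last_mtime
```

The never-ending job would call `check_and_reload` at the top of each processing cycle, keeping the returned `mtime` as its state; the writer process can replace the file independently, since only a completed write bumps the modification time it observes.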
