I might not have understood your use case properly, so I apologize for that. But I think what you need here is something outside of Hadoop/HDFS. I am presuming that you need to read the whole updated file when you process it with your never-ending job, right? You don't expect to read it piecemeal or in chunks. If that is indeed the case, then your never-ending job can use generic techniques to check whether the file's signature or any other property has changed since the last time, and only process it if it has changed. Your file-writing/updating process can update the file independently of the reading/processing one.
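To make the idea concrete, here is a minimal sketch of such a "check the signature before reprocessing" loop. It is a generic illustration in Python against a local temporary file standing in for your HDFS file (the function name `file_signature` and the demo file name are my own, not anything Hadoop provides); on HDFS proper you would read the modification time from the Java `FileSystem`/`FileStatus` API instead, but the change-detection logic is the same.

```python
import hashlib
import os
import tempfile

def file_signature(path):
    """Return (mtime, md5) for path; a change in either signals an update."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            md5.update(chunk)
    return (os.path.getmtime(path), md5.hexdigest())

# Demo: a temporary local file stands in for the daily-updated HDFS file.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "daily_input.txt")
    with open(path, "w") as f:
        f.write("day 1 data\n")
    last_seen = file_signature(path)

    # ... later, on each pass of the never-ending job's loop:
    current = file_signature(path)
    if current != last_seen:
        print("file changed; reprocessing")
        last_seen = current
    else:
        # nothing was written between the two reads here,
        # so this branch is taken
        print("file unchanged; skipping")
```

Comparing a content checksum rather than only the timestamp guards against the case where the writer touches the file without actually changing the data.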
Regards,
Shahab

On Fri, May 31, 2013 at 11:23 AM, Adamantios Corais <[email protected]> wrote:

> I am new to Hadoop, so apologies beforehand for my very fundamental
> question.
>
> Let's assume that I have a file stored in Hadoop that gets updated once a
> day. Also assume that there is a task running at the back end of Hadoop
> that never stops. How could I reload this file so that Hadoop starts
> considering the updated values rather than the old ones?
