Re: file manipulation

Alan Gates Sat, 02 Jun 2012 19:58:43 -0700

MapReduce (and hence Pig) does not support file append.  This is because in 
MapReduce a task may be run more than once, either after a failure or because 
of speculative execution, and each run would append its output again, leaving 
duplicate data in the file.  Also, if the job fails, there is no way for it to 
remove the data it has already appended.
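
To make the duplicate-append problem concrete, here is a toy simulation in 
plain Python.  It is not Hadoop or Pig code, and every name in it 
(append_task, commit_task, simulate, the file names) is made up for 
illustration.  It runs the same "task" twice, as a retry or a speculative 
attempt would, first appending to a shared file and then writing a private 
per-attempt file that is renamed into place on commit, which is roughly the 
idea behind how MapReduce output is committed.

```python
import os
import tempfile


def append_task(path, records):
    """Naive task: appends straight to a shared file, so it is not idempotent."""
    with open(path, "a") as f:
        for r in records:
            f.write(r + "\n")


def commit_task(out_dir, attempt_id, records):
    """Each attempt writes its own file, then renames it into place on commit."""
    tmp = os.path.join(out_dir, f"_attempt_{attempt_id}")
    with open(tmp, "w") as f:
        for r in records:
            f.write(r + "\n")
    # The rename replaces any earlier attempt's output instead of adding to it.
    os.replace(tmp, os.path.join(out_dir, "part-00000"))


def read_lines(path):
    with open(path) as f:
        return f.read().splitlines()


def simulate():
    records = ["a", "b"]
    with tempfile.TemporaryDirectory() as d:
        # Case 1: the same task runs twice and appends to an existing file.
        shared = os.path.join(d, "existing.txt")
        with open(shared, "w") as f:
            f.write("old\n")
        append_task(shared, records)
        append_task(shared, records)  # retry or speculative attempt
        appended = read_lines(shared)

        # Case 2: the same task runs twice but commits by rename.
        out_dir = os.path.join(d, "out")
        os.mkdir(out_dir)
        commit_task(out_dir, 0, records)
        commit_task(out_dir, 1, records)  # retry or speculative attempt
        committed = read_lines(os.path.join(out_dir, "part-00000"))
    return appended, committed


if __name__ == "__main__":
    appended, committed = simulate()
    print("append:", appended)
    print("commit:", committed)
```

In the append case the records from the second attempt land in the file a 
second time, while in the commit-by-rename case the output holds them once no 
matter how many attempts ran.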

As for updating your data, what kind of updates do you want to do?  Stores 
like HBase (which can be accessed from Pig) support updates, but whether HBase 
is a good fit depends on your use case.

Alan.

On Jun 1, 2012, at 11:54 AM, Michael G. wrote:

> Hi all
> I'm new to Pig and Hadoop.
> Can you tell me how I can:
> 1. append to an existing file on HDFS with Pig
> 2. update a file with Pig, if that is possible.
> 
> 10x.
> 
> -- 
> -- Michael G. --
