Hi,

Alan has already given you the background on append in Hadoop.
As another suggestion for merging two files, you can also look at Pig's UNION operator, which merges the contents of two or more relations:

http://pig.apache.org/docs/r0.10.0/basic.html#union

A simple workflow would be:

  load A
  load B
  store the union of A and B

Have a look at how Pig UNION works.

On Sun, Jun 3, 2012 at 8:28 AM, Alan Gates <[email protected]> wrote:

> MapReduce (and hence Pig) does not support file append. This is because,
> in MapReduce, tasks may be run multiple times in the case of failure or due
> to speculative execution. This would result in duplicate appends. Also,
> if the job fails, it would not be able to remove the appended data.
>
> As far as updating your data, what kind of updates do you want to do?
> Stores like HBase (which can be accessed from Pig) support updates. But
> whether this is a good fit depends on your use case.
>
> Alan.
>
> On Jun 1, 2012, at 11:54 AM, Michael G. wrote:
>
> > Hi all
> > I'm new to Pig and Hadoop.
> > Can you tell me how I can:
> > 1. append to an existing file on HDFS with Pig
> > 2. update a file with Pig, if that is possible.
> >
> > 10x.
> >
> > --
> > -- Michael G. --
> >
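
For reference, the load/load/store workflow with UNION could look roughly like the Pig script below. This is only a minimal sketch: the paths, the tab delimiter, and the (id, value) schema are made-up placeholders, so adjust them to your own data.

```pig
-- Hypothetical input paths and schema, for illustration only.
A = LOAD '/data/existing' USING PigStorage('\t') AS (id:int, value:chararray);
B = LOAD '/data/new'      USING PigStorage('\t') AS (id:int, value:chararray);

-- Merge the contents of both relations into one.
C = UNION A, B;

-- Write the merged result to a new (hypothetical) output location.
STORE C INTO '/data/merged' USING PigStorage('\t');
```

Note that STORE writes a new output directory and fails if that directory already exists, so this produces a merged copy rather than appending in place. UNION neither preserves the order of tuples nor removes duplicates; apply DISTINCT to the result if you need that.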
