Re: file manipulation

Michael G. Sat, 02 Jun 2012 23:11:57 -0700

Thanks for all your answers.
Michael G.


2012/6/3 Jagat <[email protected]>

> Hi
>
> Alan has already given you the background on append in Hadoop.
>
> As another suggestion, to merge two files you can also look at Pig's UNION
> operator:
>
> http://pig.apache.org/docs/r0.10.0/basic.html#union
>
> The UNION operator merges the contents of two or more relations.
> A simple workflow can be:
>
> -- the paths below are placeholders, not real locations
> A = LOAD 'fileA';
> B = LOAD 'fileB';
> C = UNION A, B;
> STORE C INTO 'merged';
>
> Have a look at how Pig Union works
> On Sun, Jun 3, 2012 at 8:28 AM, Alan Gates <[email protected]> wrote:
>
> > MapReduce (and hence Pig) does not support file append.  This is because
> > in MapReduce tasks may be run multiple times in the case of failure or
> > due to speculative execution.  This would result in duplicate appends.
> > Also, if the job fails, it would not be able to remove the appended data.
> >
> > As far as updating your data, what kind of updates do you want to do?
> >  Stores like HBase (which can be accessed from Pig) support updates.  But
> > whether this is a good fit depends on your use case.
> >
> > Alan.
> >
> > On Jun 1, 2012, at 11:54 AM, Michael G. wrote:
> >
> > > Hi all
> > > I'm new to Pig and Hadoop.
> > > Can you tell me how I can:
> > > 1. append to an existing file on HDFS with Pig
> > > 2. update a file with Pig, if that is possible.
> > >
> > > 10x.
> > >
> > > --
> > > -- Michael G. --
> >
> >
>
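
To make Alan's point about retried tasks concrete, here is a minimal Python
sketch. It is only a local-file simulation, not Hadoop or Pig code, and the
function names (append_task, write_once_task, run_demo) and file names are
made up for illustration. It assumes the same task is executed twice, as can
happen after a failure or with speculative execution, and compares appending
to a shared file with writing one file per attempt and keeping only one.

```python
import os
import tempfile


def append_task(path, records):
    """Hypothetical task that appends its output directly to a shared file."""
    with open(path, "a") as f:
        for r in records:
            f.write(r + "\n")


def write_once_task(out_dir, attempt_id, records):
    """Hypothetical task that writes to its own per-attempt file instead."""
    attempt_path = os.path.join(out_dir, f"attempt-{attempt_id}")
    with open(attempt_path, "w") as f:
        for r in records:
            f.write(r + "\n")
    return attempt_path


def read_lines(path):
    with open(path) as f:
        return f.read().splitlines()


def run_demo():
    records = ["x", "y"]
    with tempfile.TemporaryDirectory() as tmp:
        # Case 1: append. The same task runs twice (retry or speculative copy),
        # so its records end up in the shared file twice.
        shared = os.path.join(tmp, "data.txt")
        with open(shared, "w") as f:
            f.write("existing\n")
        append_task(shared, records)
        append_task(shared, records)
        appended = read_lines(shared)

        # Case 2: each attempt writes its own file; only one is kept.
        first = write_once_task(tmp, 0, records)
        second = write_once_task(tmp, 1, records)
        final = os.path.join(tmp, "part-00000")
        os.replace(first, final)  # keep the output of one attempt
        os.remove(second)         # discard the duplicate attempt
        committed = read_lines(final)
    return appended, committed
```

Calling run_demo() returns the contents of the appended file and of the
committed file, so the duplicated records in the first case can be compared
with the single copy in the second.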



-- 
-- Michael G. --
