Prashanth Pappu
Tue, 30 Sep 2008 09:42:51 -0700
More importantly, can you please tell us the svn version of the build you
are using?
Some of us use PIG extensively for our applications and it would be nice if
we could start doing some kind of release management.
That is, though I want to upgrade PIG (to work with Hadoop 17/18, etc.), using
the top of SVN seems a little risky.
So, if we have some idea of which svn builds we consider to be fairly
stable, we can upgrade a little conservatively.
FYI, I've been using:
For hadoop 16: svn 653894 (split command is buggy; rest seems ok)
I wasn't sure which version was best for the 1.0 release (is there a separate
branch?), but I've been using:
For hadoop 17: svn 694017 (I haven't had a chance to fully test this
version)
Prashanth
On Tue, Sep 30, 2008 at 9:03 AM, Olga Natkovich <[EMAIL PROTECTED]> wrote:
> Can you try the code in the types branch and see if the problem goes away?
>
> Olga
>
> > -----Original Message-----
> > From: Arthur Zwiegincew [EMAIL PROTECTED]
> > Sent: Monday, September 29, 2008 7:05 PM
> > To: pig-user@incubator.apache.org
> > Subject: Re: Union support
> >
> > Oh, I didn't realize cat was a Pig command... The result is
> > still the same.
> >
> > Is this a regression? If so, what's the right build to use?
> >
> > Thanks,
> > Arthur
> >
> > On Mon, Sep 29, 2008 at 6:59 PM, Mridul Muralidharan
> > <[EMAIL PROTECTED]> wrote:
> >
> > >
> > > > I suggested store instead of dump to see if the problem is related to
> > > > dump only or whether it is a general issue.
> > >
> > > > cat in pig works the same whether it is a file or a directory
> > > > (appropriate files in dir, of course).
> > >
> > > Though looking at your ls output, I suspect the map output did not
> > > produce the required result ...
> > >
> > > - Mridul
> > >
> > >
> > > Arthur Zwiegincew wrote:
> > >
> > >> I created two scripts: the first one the same as before, but using
> > >> STORE instead of DUMP, and the second one, which loads the stored
> > > >> file and dumps it. No difference: I just get {(4, 20), (5, 20)}.
> > >>
> > >> Also, your suggestion of using cat indicates that you might be
> > >> thinking about local mode (where union works). As I see it,
> > > >> PigStorage() in Hadoop mode ends up creating a directory,
> > > >> not just a single file:
> > >>
> > >> $ dir union.out
> > >> total 8
> > >> -rw-r--r-- 1 arthur staff 10B Sep 29 18:40 map-map
> > >> -rw-r--r-- 1 arthur staff 0B Sep 29 18:40 part-00000
> > >>
> > >> -Arthur
> > >>
> > > >> On Mon, Sep 29, 2008 at 6:36 PM, Mridul Muralidharan
> > > >> <[EMAIL PROTECTED]> wrote:
> > >>
> > > >>> Hi Arthur,
> > > >>>
> > > >>> Does a store instead of dump work?
> > > >>> Something like:
> > >>>
> > >>> -- start --
> > >>> data = load '/Users/arthur/tmp/data' as (x, y);
> > > >>> data2 = load '/Users/arthur/tmp/data-2' as (x, y);
> > > >>> both = union data, data2;
> > > >>> store both into 'temp' using PigStorage();
> > >>>
> > >>> cat temp
> > >>> -- end --
> > >>>
> > >>>
> > >>>
> > >>> Regards,
> > >>> Mridul
> > >>>
> > >>> Arthur Zwiegincew wrote:
> > >>>
> > > >>>> I've come across a very basic problem: unions simply do not work in
> > > >>>> Hadoop mode.
> > >>>>
> > >>>> data files:
> > >>>>
> > >>>> $ cat ~/tmp/data
> > >>>> 1 1
> > >>>> 2 1
> > >>>> 3 10
> > >>>>
> > >>>> $ cat ~/tmp/data-2
> > >>>> 4 20
> > >>>> 5 20
> > >>>>
> > >>>> pig script:
> > >>>> data = load '/Users/arthur/tmp/data' as (x, y);
> > > >>>> data2 = load '/Users/arthur/tmp/data-2' as (x, y);
> > > >>>> both = union data, data2;
> > > >>>> dump both;
> > >>>>
> > >>>> result:
> > >>>> (4, 20)
> > >>>> (5, 20)
> > >>>>
> > >>>>
> > > >>>> I've opened a bug <https://issues.apache.org/jira/browse/PIG-390>
> > > >>>> on this, but there has been no response.
> > >>>>
> > >>>>
> > >>>> Am I missing anything?
> > >>>>
> > >>>>
> > >>>> Thanks,
> > >>>> Arthur
> > >>>>
> > >>>>
> > >>>>
> > >>
> > >
> >
>