pig-user  

RE: Union support

Olga Natkovich
Fri, 03 Oct 2008 09:31:05 -0700

Hi Kevin,

Thanks for checking in. We have done most of the major work on the types
branch. We want the new code to stabilize for a few weeks before we move
it onto trunk, probably in late October or early November.

Olga

> -----Original Message-----
> From: Kevin Weil [EMAIL PROTECTED] 
> Sent: Friday, October 03, 2008 1:19 AM
> To: pig-user@incubator.apache.org
> Subject: Re: Union support
> 
> Olga,
> 
> I have verified that union does indeed work in the types
> branch (specifically, the release tag in the types branch)
> running in hadoop mode on hadoop-18. This is great!
> 
> Is there a document somewhere with the release/merge 
> timelines for this next version?  We can run off the branch 
> for now, but we'll probably switch to trunk as soon as trunk 
> supports unions.
> 
> Thanks,
> Kevin
> 
> On Tue, Sep 30, 2008 at 9:03 AM, Olga Natkovich 
> <[EMAIL PROTECTED]> wrote:
> 
> > Can you try the code in the types branch and see if the problem
> > goes away?
> >
> > Olga
> >
> > > -----Original Message-----
> > > From: Arthur Zwiegincew [EMAIL PROTECTED]
> > > Sent: Monday, September 29, 2008 7:05 PM
> > > To: pig-user@incubator.apache.org
> > > Subject: Re: Union support
> > >
> > > Oh, I didn't realize cat was a Pig command... The result is still 
> > > the same.
> > >
> > > Is this a regression? If so, what's the right build to use?
> > >
> > > Thanks,
> > > Arthur
> > >
> > > On Mon, Sep 29, 2008 at 6:59 PM, Mridul Muralidharan
> > > > <[EMAIL PROTECTED]> wrote:
> > >
> > > >
> > > > > I suggested store instead of dump to see if the problem is
> > > > > related to dump only or whether it is a general issue.
> > > >
> > > > > cat in Pig works the same whether it is a file or a directory
> > > > > (appropriate files in dir, of course).
> > > >
> > > > > Though looking at your ls output, I suspect the map output did
> > > > > not produce the required result ...
> > > >
> > > > - Mridul
> > > >
> > > >
> > > > Arthur Zwiegincew wrote:
> > > >
> > > > >> I created two scripts: the first one the same as before, but
> > > > >> using STORE instead of DUMP, and the second one, which loads the
> > > > >> stored file and dumps it. No difference: I just get
> > > > >> {(4, 20), (5, 20)}.
> > > >>
> > > >> Also, your suggestion of using cat indicates that you might be 
> > > >> thinking about local mode (where union works). As I see it,
> > > >> PigStorage() in Hadoop mode ends up creating a directory,
> > > > >> not just a single file:
> > > >>
> > > >> $ dir union.out
> > > >> total 8
> > > >> -rw-r--r--  1 arthur  staff    10B Sep 29 18:40 map-map
> > > >> -rw-r--r--  1 arthur  staff     0B Sep 29 18:40 part-00000
> > > >>
> > > >> -Arthur
> > > >>
> > > >> On Mon, Sep 29, 2008 at 6:36 PM, Mridul Muralidharan
> > > > >> <[EMAIL PROTECTED]> wrote:
> > > >>
> > > >>  Hi Arthur,
> > > >>>
> > > > >>>  Does a store instead of dump work?
> > > > >>> Something like:
> > > >>>
> > > >>> -- start --
> > > > >>> data = load '/Users/arthur/tmp/data' as (x, y);
> > > > >>> data2 = load '/Users/arthur/tmp/data-2' as (x, y);
> > > > >>> both = union data, data2;
> > > > >>> store both into 'temp' using PigStorage();
> > > >>>
> > > >>> cat temp
> > > >>> -- end --
> > > >>>
> > > >>>
> > > >>>
> > > >>> Regards,
> > > >>> Mridul
> > > >>>
> > > >>> Arthur Zwiegincew wrote:
> > > >>>
> > > > >>>> I've come across a very basic problem: unions simply do not work
> > > > >>>> in Hadoop mode.
> > > >>>>
> > > >>>> data files:
> > > >>>>
> > > >>>> $ cat ~/tmp/data
> > > >>>> 1 1
> > > >>>> 2 1
> > > >>>> 3 10
> > > >>>>
> > > >>>> $ cat ~/tmp/data-2
> > > >>>> 4 20
> > > >>>> 5 20
> > > >>>>
> > > >>>> pig script:
> > > > >>>> data = load '/Users/arthur/tmp/data' as (x, y);
> > > > >>>> data2 = load '/Users/arthur/tmp/data-2' as (x, y);
> > > > >>>> both = union data, data2;
> > > > >>>> dump both;
> > > >>>>
> > > >>>> result:
> > > >>>> (4, 20)
> > > >>>> (5, 20)
> > > >>>>
> > > >>>>
> > > > >>>> I've opened a bug
> > > > >>>> <https://issues.apache.org/jira/browse/PIG-390>
> > > > >>>> on this, but there has been no response.
> > > >>>>
> > > >>>>
> > > >>>> Am I missing anything?
> > > >>>>
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Arthur
> > > >>>>
> > > >>>>
> > > >>>>
> > > >>
> > > >
> > >
> >
>