Hi, yeah, I thought so.

The only slightly confusing issue is that the output would be:
bar.dat bar.dat

(i.e. showing both a.filename and b.filename)?

Rob.
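
A minimal sketch of how the joined result could be narrowed to a single column, so that only one filename is emitted per match. It reuses the file names and aliases from the original question quoted below; the alias `joined` and the assumption that the inputs are tab-separated (the PigStorage default) are illustrative only.

```pig
-- Assumes tab-separated inputs; PigStorage() splits on tab by default.
tableOne = LOAD 'input1.dat' USING PigStorage() AS (id:int, filename:chararray);
tableTwo = LOAD 'input2.dat' USING PigStorage() AS (id:int, filename:chararray);

-- Inner join: only rows whose filename appears in both relations survive.
-- The joined relation carries every field from both sides:
-- (tableOne::id, tableOne::filename, tableTwo::id, tableTwo::filename)
joined = JOIN tableOne BY filename, tableTwo BY filename;

-- Project just one of the filename columns, disambiguated with '::'.
query = FOREACH joined GENERATE tableOne::filename AS filename;

STORE query INTO 'Output.pig' USING PigStorage();
```

With the two sample tables from the question, the stored result should contain only bar.dat.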



2010/1/12 Dmitriy Ryaboy <[email protected]>

> Rob, it's just a join.
>
> a = load 'rel1' using FooStorage() as (id, filename);
> b = load 'rel2' using FooStorage() as (id, filename);
> c = join a by filename, b by filename;
>
> Rows that don't match won't make it.
> If you DO want them to make it in, you need to use "outer" for the
> relations whose non-matching rows you want retained (the rest of the
> fields in the resulting relation will be filled in with nulls).
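>
> A minimal sketch of the outer variant, assuming the same two relations
> as above (FooStorage is still a placeholder loader). LEFT keeps every
> row of a; RIGHT or FULL could be substituted depending on which side's
> non-matching rows should be retained.
>
> ```pig
> a = load 'rel1' using FooStorage() as (id, filename);
> b = load 'rel2' using FooStorage() as (id, filename);
> -- keep all rows of a; b's fields are null where no filename matches
> c = join a by filename left outer, b by filename;
> ```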
>
> Naturally, since Pig can do it, MR can do it.
>
> -D
>
> On Tue, Jan 12, 2010 at 2:57 PM, Rob Stewart
> <[email protected]> wrote:
> > Hi folks,
> >
> > I have a somewhat obvious question that needs asking (for my sake).
> >
> > Pig can do Joins, I realise that. But take for example:
> > Table_1
> > ----------------------
> > | ID | fileName  |
> > | 1  | foo.dat   |
> > | 2  | bar.dat   |
> > | 3  | harry.dat |
> >
> > Table_2
> > ----------------------
> > | ID | fileName  |
> > | 1  | tom.dat   |
> > | 2  | bar.dat   |
> > | 3  | gamma.dat |
> >
> >
> > SQL Syntax for conditional select:
> > "select t1.fileName from Table_1 t1, Table_2 t2 where t1.fileName =
> > t2.fileName"
> >
> > Result
> > --------
> > bar.dat
> >
> > How is such a query represented in Pig?
> > tableOne = LOAD 'input1.dat' USING PigStorage() AS (id:int,
> > filename:chararray);
> > tableTwo = LOAD 'input2.dat' USING PigStorage() AS (id:int,
> > filename:chararray);
> > [Now what??]
> > STORE query INTO 'Output.pig' USING PigStorage();
> >
> >
> > As a bonus question, can anybody tell me if this sort of conditional
> > select query is possible to write in Java MapReduce?
> >
> > thanks,
> >
> > Rob Stewart
> >
>
