Thanks, I appreciate the example - what happens if File A and B have many
more columns (all different data types)?  The logic doesn't seem to work in
that case - unless we set up the values in the Map function to include the
file name (maybe the output value is a HashMap or something, which might
work).
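
To make the tagging idea concrete, here is a rough plain-Java sketch of what I mean (no Hadoop classes, just the logic). Each value carries a source tag ("A" or "B") in front of the remaining columns, so the reduce side can separate the two files without looking at data types. The class and method names (TaggedJoinSketch, map, reduce, join) are made up for illustration, and I'm assuming tab-separated lines with Field1 first. The TreeMap only stands in for the shuffle that groups values by key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: simulates a reduce-side join with source-tagged values.
class TaggedJoinSketch {

    // "Map" step: key on Field1, emit the source tag plus the remaining columns.
    static void map(String tag, String line, Map<String, List<String>> shuffle) {
        String[] parts = line.split("\t", 2);
        String rest = parts.length > 1 ? parts[1] : "";
        shuffle.computeIfAbsent(parts[0], k -> new ArrayList<>()).add(tag + "\t" + rest);
    }

    // "Reduce" step for one key: split the values by tag, then emit the cross product.
    static List<String> reduce(String key, List<String> taggedValues) {
        List<String> fromA = new ArrayList<>();
        List<String> fromB = new ArrayList<>();
        for (String v : taggedValues) {
            String[] parts = v.split("\t", 2);
            (parts[0].equals("A") ? fromA : fromB).add(parts[1]);
        }
        List<String> out = new ArrayList<>();
        for (String a : fromA) {
            for (String b : fromB) {
                out.add(key + "\t" + a + "\t" + b);
            }
        }
        return out; // empty when the key appears in only one file (inner join)
    }

    // Runs map over both inputs, groups by key, then reduces each group.
    static List<String> join(List<String> linesA, List<String> linesB) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        for (String line : linesA) map("A", line, shuffle);
        for (String line : linesB) map("B", line, shuffle);
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : shuffle.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("A\t12", "B\t13", "C\t22", "A\t24");
        List<String> b = Arrays.asList("A\tCar\tx", "B\tTruck\ty", "B\tSUV\tz");
        for (String row : join(a, b)) {
            System.out.println(row);
        }
    }
}
```

Two caveats with this approach: both sides for a key get buffered in memory before the cross product, and in a real job the order of values within a key is not guaranteed unless you set up a secondary sort, so the tag has to be checked on every value.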

Also, I was asking to see a reduce-side join as we have other things going
on in the Mapper and I'm not sure if we can tweak its output (we send
output to multiple places).  Does anyone have an example using the
contrib/DataJoin or something similar?

thanks

On Mon, Apr 5, 2010 at 7:03 PM, He Chen <[email protected]> wrote:

> For the Map function:
> Input key: default
> input value: File A and File B lines
>
> output key: A, B, C,....(first column of the final result)
> output value: 12, 24, Car, 13, Van, SUV...
>
> Reduce function:
> take the Map output and do:
> for each key
> {
>      for each value of the key
>           if the value is an integer
>                then save it to array1;
>           else save it to array2
>      for ith element in array1
>           for jth element in array2
>                output(key, array1[i]+"\t"+array2[j]);
> }
> done
>
> Hope this helps.
>
>
> On Mon, Apr 5, 2010 at 4:10 PM, M B <[email protected]> wrote:
>
> > Hi, I need a good java example to get me started with some joining we
> > need to do, any examples would be appreciated.
> >
> > File A:
> > Field1  Field2
> > A        12
> > B        13
> > C        22
> > A        24
> >
> > File B:
> > Field1  Field2   Field3
> > A        Car       ...
> > B        Truck    ...
> > B        SUV     ...
> > B        Van      ...
> >
> > So, we need to first join File A and B on Field1 (say both are string
> > fields).  The result would just be:
> > A   12   Car   ...
> > A   24   Car   ...
> > B   13   Truck   ...
> > B   13   SUV   ...
> > B   13   Van   ...
> > and so on - with all the fields from both files returning.
> >
> > Once we have that, we sometimes need to then transform it so we have a
> > single record per key (Field1):
> > A (12,Car) (24,Car)
> > B (13,Truck) (13,SUV) (13,Van)
> > --however it looks, basically tuples for each key (we'll modify this
> > later to return a concatenated set of fields from B, etc)
> >
> > At other times, instead of transforming to a single row, we just need to
> > modify rows based on values.  So if B.Field2 equals "Van", we need to set
> > Output.Field2 = whatever then output to file ...
> >
> > Are there any good examples of this in native java (we can't use
> > pig/hive/etc)?
> >
> > thanks.
> >
>
>
>
> --
> Best Wishes!
>
>
> --
> Chen He
>  PhD. student of CSE Dept.
> Holland Computing Center
> University of Nebraska-Lincoln
> Lincoln NE 68588
>
