Thanks, I appreciate the example. What happens if File A and B have many more columns (all of different data types)? The logic doesn't seem to work in that case, unless we set up the values in the Map function to include the file name (maybe the output value is a HashMap or something similar, which might work).
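To make the tagging idea concrete, here is a minimal sketch in plain Java with no Hadoop dependency. It only simulates the map, shuffle and reduce steps in memory. Every name in it (ReduceSideJoinSketch, TaggedRecord, map, reduce, join) is hypothetical, and it assumes tab-separated lines with the join key in the first column. All other columns are carried along as strings, so the reducer tells the two files apart by the tag rather than by data type. File B's elided Field3 is left out of the sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of a reduce-side join; no Hadoop classes are used.
// All class and method names here are hypothetical, for illustration only.
class ReduceSideJoinSketch {

    // A map output value: the source-file tag plus the non-key columns as strings.
    static final class TaggedRecord {
        final String source;
        final List<String> fields;

        TaggedRecord(String source, List<String> fields) {
            this.source = source;
            this.fields = fields;
        }
    }

    // "Map" step: split one tab-separated line and emit (Field1, tagged remainder).
    static void map(String source, String line, Map<String, List<TaggedRecord>> shuffle) {
        String[] cols = line.split("\t");
        List<String> rest = new ArrayList<>();
        for (int i = 1; i < cols.length; i++) {
            rest.add(cols[i]);
        }
        shuffle.computeIfAbsent(cols[0], k -> new ArrayList<>())
               .add(new TaggedRecord(source, rest));
    }

    // "Reduce" step for one key: separate values by tag, then emit the cross product.
    static List<String> reduce(String key, List<TaggedRecord> values) {
        List<List<String>> fromA = new ArrayList<>();
        List<List<String>> fromB = new ArrayList<>();
        for (TaggedRecord r : values) {
            if (r.source.equals("A")) {
                fromA.add(r.fields);
            } else {
                fromB.add(r.fields);
            }
        }
        List<String> out = new ArrayList<>();
        for (List<String> a : fromA) {
            for (List<String> b : fromB) {
                out.add(key + "\t" + String.join("\t", a) + "\t" + String.join("\t", b));
            }
        }
        return out;
    }

    // Runs map over both inputs, groups by key (a stand-in for the shuffle), then reduces.
    static List<String> join(List<String> fileA, List<String> fileB) {
        Map<String, List<TaggedRecord>> shuffle = new TreeMap<>();
        for (String line : fileA) {
            map("A", line, shuffle);
        }
        for (String line : fileB) {
            map("B", line, shuffle);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<TaggedRecord>> e : shuffle.entrySet()) {
            out.addAll(reduce(e.getKey(), e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> fileA = Arrays.asList("A\t12", "B\t13", "C\t22", "A\t24");
        List<String> fileB = Arrays.asList("A\tCar", "B\tTruck", "B\tSUV", "B\tVan");
        for (String row : join(fileA, fileB)) {
            System.out.println(row);
        }
    }
}
```

Key C appears only in File A, so this inner join drops it. In a real job the tag would presumably be derived from the input file name inside the Mapper; that wiring is not shown here.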
Also, I was asking to see a reduce-side join, as we have other things going on in the Mapper and I'm not sure we can tweak its output (we send output to multiple places). Does anyone have an example using contrib/DataJoin or something similar?

thanks

On Mon, Apr 5, 2010 at 7:03 PM, He Chen <[email protected]> wrote:

> For the Map function:
> Input key: default
> input value: File A and File B lines
>
> output key: A, B, C,....(first column of the final result)
> output value: 12, 24, Car, 13, Van, SUV...
>
> Reduce function:
> take the Map output and do:
> for each key
> {   if the value of a key is integer
>         then save it to array1;
>     else save it to array2
> }
> for ith element in array1
>     for jth element in array2
>         output(key, array1[i]+"\t"+array2[j]);
> done
>
> Hope this helps.
>
>
> On Mon, Apr 5, 2010 at 4:10 PM, M B <[email protected]> wrote:
>
> > Hi, I need a good java example to get me started with some joining we need
> > to do, any examples would be appreciated.
> >
> > File A:
> > Field1  Field2
> > A       12
> > B       13
> > C       22
> > A       24
> >
> > File B:
> > Field1  Field2  Field3
> > A       Car     ...
> > B       Truck   ...
> > B       SUV     ...
> > B       Van     ...
> >
> > So, we need to first join File A and B on Field1 (say both are string
> > fields). The result would just be:
> > A       12      Car     ...
> > A       24      Car     ...
> > B       13      Truck   ...
> > B       13      SUV     ...
> > B       13      Van     ...
> > and so on - with all the fields from both files returning.
> >
> > Once we have that, we sometimes need to then transform it so we have a
> > single record per key (Field1):
> > A (12,Car) (24,Car)
> > B (13,Truck) (13,SUV) (13,Van)
> > --however it looks, basically tuples for each key (we'll modify this later
> > to return a concatenated set of fields from B, etc)
> >
> > At other times, instead of transforming to a single row, we just need to
> > modify rows based on values. So if B.Field2 equals "Van", we need to set
> > Output.Field2 = whatever then output to file ...
> >
> > Are there any good examples of this in native java (we can't use
> > pig/hive/etc)?
> >
> > thanks.
> >
>
>
> --
> Best Wishes!
>
>
> --
> Chen He
> PhD. student of CSE Dept.
> Holland Computing Center
> University of Nebraska-Lincoln
> Lincoln NE 68588
>
