Have a solution which I personally don't like, but it seems to work for now:
T = LOAD 'T.dat' AS (a:chararray, b:chararray, c:chararray, d:chararray, x:chararray, y:chararray); U = LOAD 'U.dat' AS (a:chararray, b:chararray, c:chararray, d:chararray, x:chararray, y:chararray); A = JOIN T BY (a,b,c,d) FULL OUTER, U BY (a,b,c,d); B = FOREACH A GENERATE (T::a IS NOT NULL AND T::b IS NOT NULL AND T::c IS NOT NULL AND T::d IS NOT NULL ? T::a : U::a) AS a:chararray, (T::a IS NOT NULL AND T::b IS NOT NULL AND T::c IS NOT NULL AND T::d IS NOT NULL ? T::b : U::b) AS b:chararray, (T::a IS NOT NULL AND T::b IS NOT NULL AND T::c IS NOT NULL AND T::d IS NOT NULL ? T::c : U::c) AS c:chararray, (T::a IS NOT NULL AND T::b IS NOT NULL AND T::c IS NOT NULL AND T::d IS NOT NULL ? T::d : U::d) AS d:chararray, (T::a IS NOT NULL AND T::b IS NOT NULL AND T::c IS NOT NULL AND T::d IS NOT NULL ? T::x : U::x) AS x:chararray, (T::a IS NOT NULL AND T::b IS NOT NULL AND T::c IS NOT NULL AND T::d IS NOT NULL ? T::y : U::y) AS y:chararray; DUMP B; ---- (a1,b1,c1,d1,g1,h1) (a2,b2,c2,d2,e2,f2) (a3,b3,c3,d3,g3,h3) (a4,b4,c4,d4,e4,f4) (a5,b5,c5,d5,e5,f5) (a6,b6,c6,d6,g6,h6) Bumping for an optimization here. Thanks, -Dan On Fri, Apr 19, 2013 at 2:24 PM, Dan DeCapria, CivicScience < [email protected]> wrote: > Consider two aliases (T) and (U), loaded from data with schema defined > below. I was considering a left outer join to 'merge' the two records, > overriding those in U with the join fields in T, but the result isn't what > I desired. Possibly a filter then union is required? I'm just not sure; > looking for a quick way to get to this result (R): > > R: ?[JOIN T BY (a,b,c,d) LEFT OUTER, U BY (a,b,c,d)]: schema : a,b,c,d,x,y > a1 b1 c1 d1 *g1 h1* > a2 b2 c2 d2 e2 f2 > a3 b3 c3 d3 *g3 h3* > a4 b4 c4 d4 e4 f4 > a5 b5 c5 d5 e5 f5 > a6 b6 c6 d6 g6 h6 > > T : schema : a,b,c,d,x,y > a1 b1 c1 d1 g1 h1 > a3 b3 c3 d3 g3 h3 > a6 b6 c6 d6 g6 h6 > > U: schema : a,b,c,d,x,y > a1 b1 c1 d1 e1 f1 > a2 b2 c2 d2 e2 f2 > a3 b3 c3 d3 e3 f3 > a4 b4 c4 d4 e4 f4 > a5 b5 c5 d5 e5 f5 > > Many thanks, -Dan >
