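Yeah, something like this should work:

    o = foreach cg generate FLATTEN(A.a_column) as a_column,
        ((IsEmpty(B)) ? toBag(toTuple(null)) : (B.b_column)) as B2;
    o2 = foreach o generate a_column, FLATTEN(B2);

The idea is to swap the empty B bag for a bag holding a single null tuple, so that FLATTEN still emits a row.
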
Although, I seem to recall a more elegant way of doing this, but it escapes me at the moment ...

How come you didn't try the outer join?

    cg = JOIN A by aid LEFT OUTER, B by bid;

On Tue, Jun 1, 2010 at 11:31 AM, Dave Viner <[email protected]> wrote:
> I am having some trouble getting cogroup and flattening to work as I'd
> like.
> The cogroup statement looks like:
>
>   cg = COGROUP A BY aid INNER, B BY bid;
>
> The cg group has rows in which the information in B may be empty (as
> expected). I'd like to output a series of rows each of which has the same
> number of columns. If the cg group has empty information for B, then it
> should output either NULL or an empty string. But, I can't seem to make it
> work.
>
>
>   for_output = FOREACH cg
>       GENERATE FLATTEN(A.aid) AS aid,
>       FLATTEN(B.optional_b_col);
>
> If the cogroup cg has empty values in the B bag, then there is no
> corresponding row in for_output.
>
> How do I get the row to be added to for_output with an empty value for
> "optional_b_col"?
>
> I also tried something like:
>
>   for_output = FOREACH cg
>       GENERATE FLATTEN(A.aid) AS aid,
>       (B.optional_b_col IS NOT NULL ? B.optional_b_col : '');
>
> But, this gives an error when trying to dump the results:
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1050: Unsupported input type
> for BinCond: left hand side: bag; right hand side: chararray
>
>
> I imagine there must be some way to output empty strings, I just can't seem
> to figure it out.
>
> Thanks
> Dave Viner
>
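
P.S. For reference, a rough sketch of the outer-join route for the question quoted above. Since the COGROUP keeps every key of A (A is INNER) and only lets the B bag be empty, the join that matches it keeps all of A, i.e. a LEFT OUTER join with A on the left. The file names and field types here are assumptions for illustration only, they are not from the original message; as far as I know Pig needs a declared schema on the relation that gets null-padded.

```pig
-- Hypothetical inputs: file names and field types are assumed, not from the thread.
A = LOAD 'a.txt' AS (aid:int, a_column:chararray);
B = LOAD 'b.txt' AS (bid:int, optional_b_col:chararray);

-- Keep every row of A; B's fields come back null where no bid matches.
j = JOIN A BY aid LEFT OUTER, B BY bid;

-- Every output row has the same two columns, matched or not.
for_output = FOREACH j GENERATE A::aid AS aid,
                                B::optional_b_col AS optional_b_col;
DUMP for_output;
```

If an empty string is preferred over null, the second column could be written as (B::optional_b_col IS NULL ? '' : B::optional_b_col). Both sides of that bincond are chararray, so it should not hit the ERROR 1050 type mismatch quoted above.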
