yeah, something like this should work:

o = foreach cg generate FLATTEN(A.a_column) as a_column,
    ((IsEmpty(B)) ? toBag(toTuple(null)) : (B.b_column)) as B2;
o2 = foreach o generate a_column, FLATTEN(B2);


I seem to recall a more elegant way of doing this, but it escapes me
at the moment ...
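
For what it's worth, a variant that skips the helper functions is to
substitute a constant bag when B is empty. This is only a sketch using the
names from your mail: it assumes optional_b_col is a chararray and that your
Pig version accepts a bag constant inside a bincond. The aliases o and b_vals
are made up for illustration.

-- Replace an empty B bag with a one-tuple bag holding '' so that
-- FLATTEN still emits a row for that group.
o = FOREACH cg GENERATE FLATTEN(A.aid) AS aid,
    (IsEmpty(B) ? {('')} : B.optional_b_col) AS b_vals;
for_output = FOREACH o GENERATE aid, FLATTEN(b_vals) AS optional_b_col;

Both branches of the bincond are bags here, which is what the ERROR 1050
in your second attempt was complaining about (bag on one side, chararray on
the other).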


How come you didn't try an outer join? A left outer join keeps every row of A,
with nulls where B has no match:

cg = JOIN A by aid LEFT OUTER, B by bid;

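
A sketch of the join route, assuming the goal is to keep every row of A
(which calls for LEFT OUTER) and that both relations were loaded with a
schema, which as far as I know Pig needs for outer joins. The alias j is made
up for illustration; the other names are taken from your mail.

-- Keep all rows of A; B's fields come back as null where no bid matches.
j = JOIN A BY aid LEFT OUTER, B BY bid;
for_output = FOREACH j GENERATE A::aid AS aid,
    B::optional_b_col AS optional_b_col;

Rows of A with no match in B should then show up with a null
optional_b_col, so every output row has the same number of columns.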


On Tue, Jun 1, 2010 at 11:31 AM, Dave Viner <[email protected]> wrote:

> I am having some trouble getting cogroup and flattening to work as I'd
> like.
>  The cogroup statement looks like:
>
> cg = COGROUP A BY aid INNER,  B BY bid;
>
> The cg group has rows in which the information in B may be empty (as
> expected).  I'd like to output a series of rows each of which has the same
> number of columns.  If the cg group has empty information for B, then it
> should output either NULL or an empty string.  But, I can't seem to make it
> work.
>
>
> for_output = FOREACH cg
>    GENERATE FLATTEN(A.aid) AS aid,
>        FLATTEN(B.optional_b_col);
>
> If the cogroup cg has empty values in the B bag, then there is no
> corresponding row in for_output.
>
> How do I get the row to be added to for_output with an empty value for
> "optional_b_col"?
>
> I also tried something like:
>
> for_output = FOREACH cg
>    GENERATE FLATTEN(A.aid) AS aid,
>        (B.optional_b_col IS NOT NULL ? B.optional_b_col : '');
>
> But, this gives an error when trying to dump the results:
> ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1050: Unsupported input type
> for BinCond: left hand side: bag; right hand side: chararray
>
>
> I imagine there must be some way to output empty strings, I just can't seem
> to figure it out.
>
> Thanks
> Dave Viner
>
