Hi Dave,

Try:

C = FOREACH B GENERATE (t.y, FLATTEN(t.CUSTS) AS (anothery:chararray, custbag:bag));

On Sat, Sep 8, 2012 at 8:41 AM, David Lapsley <[email protected]> wrote:

> Hi Folks:
>
> I am new to the pig world. I have been using it for about a week and I am
> completely blown away with how good it is.
>
> I have a question about Schemas. I have a processing chain similar to the
> following:
>
> A = LOAD 'data' USING PigStorage('\u0001') AS (y:chararray, cust1:int, cust2:int);
> B = FOREACH A GENERATE (y, {(cust1), (cust2)}) AS t: tuple(y, CUSTS);
> C = FOREACH B GENERATE(t.y, FLATTEN(t.CUSTS));
>
> So, basically, my raw data contains multiple customer records per row, and
> some common data. I would like to "explode" each row, so that I have one
> row per customer data (which also includes the common data).
>
> The code above does this, however, I am not able to supply a schema for C.
> Whenever I try to do this, I get an error regarding mismatched schemas.
>
> I would greatly appreciate any pointers you may have.
>
> Best regards,
>
> Dave.
