Not sure if this is exactly the same issue, but when I've created tuples within tuples in UDFs from bag input (to preserve the order of pairs), Pig has allowed it, but I couldn't work with that data in subsequent steps.
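
To make the two readings of FLATTEN from the quoted thread concrete, here is a rough Python model. It is not Pig code and says nothing about how Pig implements FLATTEN. The function names and the encoding (a bag as a list, a Pig tuple as a Python tuple) are my own assumptions for illustration only.

```python
# Toy model of the two FLATTEN behaviours discussed in the thread.
# Assumption: a bag is a Python list, a Pig tuple is a Python tuple.
# The function names are hypothetical, not part of any Pig API.

def flatten_bag_once(prefix, bag):
    """One row per tuple in the bag; only the bag level is removed,
    so a tuple nested inside a bag element stays a tuple."""
    return [prefix + tuple(element) for element in bag]


def flatten_bag_deep(prefix, bag):
    """One row per tuple in the bag; every nested tuple is spliced
    into the row, however deep it sits."""
    def splice(value):
        if isinstance(value, tuple):
            fields = ()
            for item in value:
                fields += splice(item)
            return fields
        return (value,)

    return [prefix + splice(element) for element in bag]


prefix = ("id", "data")
# The bag from the thread: {((1,2)), ((2,3)), ((4,5))}
bag = [((1, 2),), ((2, 3),), ((4, 5),)]

once = flatten_bag_once(prefix, bag)
deep = flatten_bag_deep(prefix, bag)

for row in once:
    print(row)   # rows keep the inner pair as a tuple
for row in deep:
    print(row)   # rows have the pair spliced into separate fields
```

`once` corresponds to the output hc busy expected, with `(1,2)` kept as a tuple. `deep` corresponds to what he reports seeing on CDH2, with `1,2` as separate fields.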

On Fri, Apr 2, 2010 at 12:37 PM, hc busy <hc.b...@gmail.com> wrote:

> Yeah, I'm sure it has nested tuples. Pig doesn't natively support
> introduction of tuples
>
> h = foreach g generate ((x,y,z)), (x), ((((x))))
>
> doesn't work, but i have a udf that does that.... don't ask why...., and
> I've seen it print double pair of paren's when I took a dump.
>
> Our hadoop guys here says it's CDH2 and that the "upgrade" was just
> re-installation of CDH2... ("same jars") But certainly my script suddenly
> started doing weird things when it flattened that all the way through.
>
> I'd support the prior behavior as well, because that seems to match my
> reading of documentation on behavior of FLATTEN.
>
> Has anybody else had this problem with recent cloudera/pig versions?
>
> thnx!!
>
> On Fri, Apr 2, 2010 at 11:43 AM, zaki rahaman <zaki.raha...@gmail.com> wrote:
>
> > Stupid question but are you sure your bag has the dual sets of parentheses?
> > (And if I may ask, why is that the case?)
> >
> > On Fri, Apr 2, 2010 at 2:11 PM, zaki rahaman <zaki.raha...@gmail.com> wrote:
> >
> > > If I'm not mistaken, the output is the expected behavior. Flatten should
> > > unnest bags. I'm assuming your statement is something like FOREACH ...
> > > GENERATE field1, field2, FLATTEN(bag1) which would 'duplicate' the first two
> > > fields of a tuple for every tuple in the nested bag.
> > >
> > > On Fri, Apr 2, 2010 at 2:02 PM, hc busy <hc.b...@gmail.com> wrote:
> > >
> > >> doh!!!! s/map/bag/g
> > >>
> > >> I seem to get maps and bags mixed up or some reason...
> > >>
> > >> Guys, I have a row containing a *bag*
> > >>
> > >> 'id','data', {((1,2)), ((2,3)), ((4,5))}
> > >>
> > >> What is the expected behavior when I flatten on that bag? I had expected it
> > >> to result in
> > >>
> > >> 'id','data', (1,2)
> > >> 'id','data', (2,3)
> > >> 'id','data', (4,5)
> > >>
> > >> But it appears to me that the result of applying FLATTEN to that bag is this
> > >> instead:
> > >>
> > >> 'id','data', 1,2
> > >> 'id','data', 2,3
> > >> 'id','data', 4,5
> > >>
> > >> The latter is returned by the current cloudera's CDH2 and I've seen the
> > >> prior behavior on other versions of pig.
> > >>
> > >> Which is the correct behavior by design?
> > >>
> > >> What will pig 0.6 do when it is released?
> > >>
> > >> thanks!
> > >>
> > >> On Fri, Apr 2, 2010 at 11:29 AM, hc busy <hc.b...@gmail.com> wrote:
> > >>
> > >> > Guys, I have a row containing a map
> > >> >
> > >> > 'id','data', {((1,2)), ((2,3)), ((4,5))}
> > >> >
> > >> > What is the expected behavior when I flatten on that bag? I had expected it
> > >> > to result in
> > >> >
> > >> > 'id','data', (1,2)
> > >> > 'id','data', (2,3)
> > >> > 'id','data', (4,5)
> > >> >
> > >> > But it appears to me that the result of applying FLATTEN to that bag is
> > >> > this instead:
> > >> >
> > >> > 'id','data', 1,2
> > >> > 'id','data', 2,3
> > >> > 'id','data', 4,5
> > >> >
> > >> > The latter is returned by the current cloudera's CDH2 and I've seen the
> > >> > prior behavior on other versions of pig.
> > >> >
> > >> > Which is the correct behavior by design?
> > >> >
> > >> > What will pig 0.6 do when it is released?
> > >> >
> > >> > thanks!
> > >
> > > --
> > > Zaki Rahaman
> >
> > --
> > Zaki Rahaman