Thanks for the info, guys! Will look into using a recent snapshot.

Thanks!
Amit

On 2/16/11 11:46 AM, "Daniel Dai" <[email protected]> wrote:

> Yes, it is fixed by PIG-998. Doing a describe on trunk will get:
>
> data: {f0: chararray,b1::t1: (f1: chararray,f2: int),b3: {(f3: chararray)}}
>
> Daniel
>
> Alan Gates wrote:
>> The issue here is that describe is incorrectly removing the second
>> level of tuple, even though dump is doing the right thing. I tested
>> it against the top of trunk code, and describe now does the right
>> thing. I suspect this is a side effect of the semantics work that's
>> been going on (see https://issues.apache.org/jira/browse/PIG-998).
>>
>> Alan.
>>
>> On Feb 15, 2011, at 3:10 AM, amramesh wrote:
>>
>>> Hi,
>>>
>>> I have been using Pig for a few months now, was using 0.7 earlier and
>>> recently migrated to 0.8. In a script I am working on right now I hit
>>> a snag where the script failed. I investigated some more, and have
>>> been able to generate a small illustrative example which I think is
>>> the cause of the problem. Here it is:
>>>
>>> data = LOAD '$INPUT' USING PigStorage(',') AS (f0:chararray, f1:chararray, f2:int, f3:chararray);
>>>
>>> DUMP data;
>>> (A, apple,1, alpha)
>>> (A, airplane,1, alpha)
>>> (B, ball,2, beta)
>>> (C, cat,3, gamma)
>>> (C, candle,3, gamma)
>>> (D, dog,4, delta)
>>>
>>> data = FOREACH data GENERATE f0, TOTUPLE(f1, f2) AS t1, f3;
>>> data = GROUP data BY f0;
>>> data = FOREACH data GENERATE group AS f0, data.t1 AS b1, data.f3 AS b3;
>>> data = FOREACH data GENERATE f0, FLATTEN(b1), b3;
>>>
>>> DESCRIBE data;
>>> data: {f0: chararray,b1::f1: chararray,b1::f2: int,b3: {f3: chararray}}
>>>
>>> DUMP data;
>>> (A,( apple,1),{( alpha),( alpha)})
>>> (A,( airplane,1),{( alpha),( alpha)})
>>> (B,( ball,2),{( beta)})
>>> (C,( cat,3),{( gamma),( gamma)})
>>> (C,( candle,3),{( gamma),( gamma)})
>>> (D,( dog,4),{( delta)})
>>>
>>> DESCRIBE appears to claim that the tuple would also be flattened out
>>> into two fields, while DUMP keeps the tuple as is (which should be the
>>> correct behavior). When I try a subsequent FLATTEN on the tuple the
>>> script fails with error 2229.
>>>
>>> Any insights/solutions would be very helpful!
>>>
>>> Thanks!
>>> Amit
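
The flattening behavior discussed in this thread can be sketched outside Pig. The following is only a toy Python model, not Pig code and not how Pig is implemented: tuples are modeled as Python tuples, bags as lists, and the function name `group_and_flatten` is made up for illustration. The input rows are the ones from the original message with the stray leading spaces dropped. It is meant to show why flattening the bag `b1` should leave `t1` as a single tuple-valued field, which matches the DUMP output and the schema reported from trunk.

```python
# Toy model of the script in the thread (an illustration, not Pig itself).
# Assumption: tuples are Python tuples, bags are Python lists.
rows = [
    ("A", "apple", 1, "alpha"),
    ("A", "airplane", 1, "alpha"),
    ("B", "ball", 2, "beta"),
    ("C", "cat", 3, "gamma"),
    ("C", "candle", 3, "gamma"),
    ("D", "dog", 4, "delta"),
]


def group_and_flatten(rows):
    # FOREACH data GENERATE f0, TOTUPLE(f1, f2) AS t1, f3
    projected = [(f0, (f1, f2), f3) for f0, f1, f2, f3 in rows]

    # GROUP data BY f0 (dicts preserve insertion order, so groups follow input order)
    groups = {}
    for f0, t1, f3 in projected:
        groups.setdefault(f0, []).append((t1, f3))

    out = []
    for f0, members in groups.items():
        # data.t1 and data.f3 are bags of one-field tuples
        b1 = [(t1,) for t1, _ in members]
        b3 = [(f3,) for _, f3 in members]
        # FLATTEN(b1): one output row per tuple in the bag, with that
        # tuple's fields spliced in. Its only field is t1, itself a tuple,
        # so t1 stays nested as a single field.
        for inner in b1:
            out.append((f0, *inner, b3))
    return out


result = group_and_flatten(rows)
for row in result:
    print(row)
```

Each row in this model has three fields (f0, the nested tuple, and the bag b3). That is the shape of the trunk schema quoted above, not the four-field shape that DESCRIBE reported on 0.8.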
