Actually, FLATTEN(FLATTEN(....)) is not syntactically valid, at least in 0.8. Semantically it's not what I wanted either, because FLATTEN works on bags, while I wanted to project ALL fields of a tuple.
I ended up adding a T:tuple( ) to the AS clause, and adding an explicit projection after the udf call.

Thanks
Yang

On Mon, Jun 25, 2012 at 6:45 AM, Yang wrote:

> thanks Robert, I'll try it
> On Jun 25, 2012 3:56 AM, "Norbert Burger" wrote:
>
>> Yang -- I think you'll get the representation you're looking for by
>> applying the FLATTEN a second time. Each instance of a FLATTEN strips off
>> a single layer.
>>
>> Norbert
>>
>> On Sun, Jun 24, 2012 at 5:57 PM, Jonathan Coveney wrote:
>>
>> > generate K.(x1), K.(x2), K.(x3) .... , K.(x100); and generate
>> > K(x1,...,x100) are actually very different.
>> >
>> > The latter is a bag, with columns x1, x2..x100. This is generally what is
>> > desired.
>> >
>> > The former is a bag of column x1, then a bag of column x2, then a bag of
>> > column x3, etc. Each will be unordered and independent.
>> >
>> > 2012/6/24 yonghu
>> >
>> > > You can also write like
>> > >
>> > > K1.(x1,x2,...,x100).
>> > >
>> > > regards!
>> > >
>> > > Yong
>> > >
>> > > On Sun, Jun 24, 2012 at 8:40 PM, Yang wrote:
>> > > > thanks,
>> > > >
>> > > > but this is a bit more cumbersome: if I have
>> > > >
>> > > > generate K.(x1), K.(x2), K.(x3) .... , K.(x100);
>> > > >
>> > > > I'd have to re-write each xn by adding K.( )
>> > > >
>> > > >
>> > > > it would be nice if the schema of K can strip off the surrounding {( )}.
>> > > > actually it should,
>> > > > since this is after a FLATTEN()
>> > > >
>> > > >
>> > > > Yang
>> > > >
>> > > > On Sun, Jun 24, 2012 at 11:17 AM, yonghu wrote:
>> > > >
>> > > >> So, I think you want to project the x in K. You can write the pig as:
>> > > >>
>> > > >> M = foreach K generate K.(x) as X;
>> > > >>
>> > > >> Hope this can help you.
>> > > >>
>> > > >> Yong
>> > > >>
>> > > >> On Sun, Jun 24, 2012 at 12:40 PM, Yang wrote:
>> > > >> > my UDF returns a bag of tuples : mybag:bag{ mytuple: tuple ( x: int, y:int)}
>> > > >> >
>> > > >> > in my pig script:
>> > > >> >
>> > > >> > I do
>> > > >> >
>> > > >> > K = foreach blah generate UDF( xxx);
>> > > >> >
>> > > >> > M = foreach K generate x;
>> > > >> >
>> > > >> >
>> > > >> > here PIG 0.8.1 says x can not be found in schema, since
>> > > >> >
>> > > >> > describe K
>> > > >> >
>> > > >> > shows:
>> > > >> > { mytuple:tuple(x:int , y:int) }
>> > > >> >
>> > > >> > while 0.10.0
>> > > >> >
>> > > >> > shows
>> > > >> > {x:int, y:int}
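For reference, the workaround described at the top of the thread (a T:tuple( ) in the AS clause, followed by an explicit projection) might look roughly like the sketch below. This is an assumption pieced together from the messages, not the poster's actual script: blah, UDF, xxx, x and y are the placeholder names used in the thread, the FLATTEN call is inferred from the remark that the schema is seen "after a FLATTEN()", and the script has not been checked against 0.8.1.

```pig
-- Sketch only: relation, UDF and field names are the placeholders from the thread.
-- Assumes the UDF returns mybag:bag{mytuple:tuple(x:int, y:int)} as described.

-- Name the tuple that remains after FLATTEN, declaring its fields in the AS clause.
K = FOREACH blah GENERATE FLATTEN(UDF(xxx)) AS T:tuple(x:int, y:int);

-- Project the tuple's fields explicitly so later statements see plain x and y.
M = FOREACH K GENERATE T.x AS x, T.y AS y;

DESCRIBE M;
```

This targets the 0.8.1 behaviour reported in the thread, where K still carries the tuple wrapper. On a version where describe K already shows {x:int, y:int}, the plain "generate x" from the original question would presumably be enough, and the AS clause above would likely need adjusting.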
