What should FLATTEN do?

2010-04-02 Thread hc busy
Guys, I have a row containing a map: 'id','data', {((1,2)), ((2,3)), ((4,5))}. What is the expected behavior when I flatten on that bag? I had expected it to result in: 'id','data', (1,2); 'id','data', (2,3); 'id','data', (4,5). But it appears to me that the result of applying FLATTEN to that bag

Re: What should FLATTEN do?

2010-04-02 Thread hc busy
Doh, s/map/bag/g. I seem to get maps and bags mixed up for some reason... Guys, I have a row containing a *bag*: 'id','data', {((1,2)), ((2,3)), ((4,5))}. What is the expected behavior when I flatten on that bag? I had expected it to result in: 'id','data', (1,2); 'id','data', (2,3); 'id','data',

Re: What should FLATTEN do?

2010-04-02 Thread Dmitriy Ryaboy
CDH2 or CDH3? CDH2 is basically 0.{4,5}. CDH3 is in between 5 and 6. I expect the first result -- a flattened bag of tuples results in multiple rows, each containing the (not-flattened) tuple. Btw, Pig 0.6 is out. -D On Fri, Apr 2, 2010 at 11:32 AM, hc busy hc.b...@gmail.com wrote: doh
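
The expected behavior described above (one output row per tuple in the bag, with only one level of nesting removed) can be modeled outside Pig. The following is a minimal Python sketch, not Pig code and not Pig's implementation: it assumes a row is a Python tuple and a bag is a list of tuples, and the helper name flatten_bag is hypothetical.

```python
def flatten_bag(row, bag_index):
    """Model FLATTEN on the bag stored at row[bag_index].

    Assumption for this sketch: a row is a Python tuple and a bag is a
    list of tuples. This only illustrates the semantics discussed in the
    thread; it is not how Pig implements FLATTEN.
    """
    prefix = row[:bag_index]
    suffix = row[bag_index + 1:]
    bag = row[bag_index]
    # One output row per tuple in the bag. The tuple's own fields are
    # spliced into the row; anything nested inside those fields is kept.
    return [prefix + tuple(t) + suffix for t in bag]


# The bag from the thread: each element is a tuple wrapping another tuple.
row = ("id", "data", [((1, 2),), ((2, 3),), ((4, 5),)])
rows = flatten_bag(row, 2)
for r in rows:
    print(r)
```

Under these assumptions, the doubly nested bag produces three rows that each still carry an inner tuple, such as ('id', 'data', (1, 2)), while a bag of plain tuples like {(1,2)} would produce rows with the fields fully spliced in.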

Re: What should FLATTEN do?

2010-04-02 Thread hc busy
Yeah, I'm sure it has nested tuples. Pig doesn't natively support introducing tuples: h = foreach g generate ((x,y,z)), (x), x doesn't work, but I have a UDF that does that (don't ask why), and I've seen it print a double pair of parens when I took a dump. Our hadoop guys here

Re: What should FLATTEN do?

2010-04-02 Thread Russell Jurney
Not sure if this is exactly the same, but when I've created tuples within tuples in UDFs (to preserve order of pairs), from bag input, Pig has allowed it - but I can't work with that data in subsequent steps. On Fri, Apr 2, 2010 at 12:37 PM, hc busy hc.b...@gmail.com wrote: Yeah, I'm sure it

Re: What should FLATTEN do?

2010-04-02 Thread hc busy
Yeah, you have to implement the outputSchema() method on the UDF in order to make the content of the tuple visible... There's a nice example in the UDF Manual http://hadoop.apache.org/pig/docs/r0.6.0/udf.html search for 'package myudf' until u

Re: What should FLATTEN do?

2010-04-02 Thread hc busy
Okay guys, some details after some digging. We've got this version of Pig from CDH2 installed: hadoop-pig-0.5.0+11.1-1. The list of patches that they applied on top of 0.5.0 is here: http://archive.cloudera.com/cdh/2/pig-0.5.0+11.1.CHANGES.txt

Re: What should FLATTEN do?

2010-04-02 Thread hc busy
The hadoop version: hadoop-0.20-0.20.1+169.68-1 On Fri, Apr 2, 2010 at 2:33 PM, hc busy hc.b...@gmail.com wrote: Okay guys some details after some digging. We've got this version of pig from CDH2 installed: hadoop-pig-0.5.0+11.1-1 the list of patches that they applied on top of 0.5.0