They run within the same map or reduce step, as consecutive operators.
I don't think Pig merges the operators into one, but they do not become separate jobs, if that is what you are worried about. Think of it as a pipeline ...
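
One way to check this for a given script is the explain command in Grunt, which prints the logical, physical, and MapReduce plans for an alias. A minimal sketch, assuming the aliases from Ankur's script quoted below are already defined in the session:

```pig
-- Assumes r1, tmp, tmp1 and tmp2 are defined as in the quoted script.
-- Prints the plans Pig would use to produce tmp2, without running them.
explain tmp2;
```

If both foreach statements appear inside the same map or reduce plan of a single MapReduce job, they are running as a pipeline within that step rather than as separate jobs.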


Regards,
Mridul

On Monday 01 March 2010 05:32 PM, prasenjit mukherjee wrote:
Thanks that will work.

I was hoping to avoid that additional foreach loop. BTW, are these
statements internally optimized by Pig, so that it's done in a single
iteration?

-Prasen

On Mon, Mar 1, 2010 at 4:47 PM, Ankur C. Goel<[email protected]>  wrote:

Would this work for you?

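r1 = load 'data' AS
(f1:chararray, f2:chararray, f3:chararray, i1:int, i2:int, i3:int);
-- Group on the two key fields.
tmp = group r1 by (f1, f2);
-- Flatten the group key and the projected bag into plain columns.
tmp1 = foreach tmp {
   generate flatten(group), flatten(r1.(i1, i2));
}
-- With plain columns, the arithmetic expression is allowed.
tmp2 = foreach tmp1 generate f1, f2, i1, i1 + i2;
dump tmp2;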
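
If the result should stay as one row per group with a bag of (i1, i1+i2), another option might be to compute the sum before grouping, so that the bag projection only names plain columns. This is a sketch under assumptions, not a tested script: r2 and sum12 are names made up for illustration, and 'data' is the same placeholder path as above. It still uses an extra foreach, just placed before the group.

```pig
-- Same input as above; 'data' is the placeholder path from the script.
r1 = load 'data' AS
     (f1:chararray, f2:chararray, f3:chararray, i1:int, i2:int, i3:int);
-- Compute the sum per row first; sum12 is a name chosen for this sketch.
r2 = foreach r1 generate f1, f2, i1, i1 + i2 AS sum12;
tmp = group r2 by (f1, f2);
-- Plain column projection on the bag, the form that already parses.
tmp1 = foreach tmp generate flatten(group), r2.(i1, sum12);
dump tmp1;
```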

-...@nkur

On 3/1/10 3:21 PM, "prasenjit mukherjee"<[email protected]>  wrote:

grunt>    r1 = load '/tmp/agg_qat.txt' USING PigStorage (',') AS
(f1:chararray, f2:chararray,f3:chararray, i1:int,i2:int,i3:int);
grunt>  tmp = group r1 by (f1,f2);
grunt>  tmp1 = foreach tmp generate  flatten(group), r1.(i1,i1+i2);

The last line throws an error:
2010-03-01 15:17:20,053 [main] ERROR org.apache.pig.tools.grunt.Grunt
- ERROR 1000: Error during parsing. Encountered " "+" "+ "" at line 1,
column 52.
Was expecting one of:
    ")" ...
    "," ...
    "," ...
    ")" ...

The following line works fine, though:
grunt>  tmp1 = foreach tmp generate  flatten(group), r1.(i1,i2);

Any pointers on how to achieve r1.(i1, i1+i2) in the group?

-thanks,
Prasen


