They run within the same map or reduce step, as consecutive operators.
I don't think Pig combines the operators, but they are not different jobs,
if that is what you are worried about. Think of it as a pipeline ...
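One way to check this yourself is to ask Pig for the execution plan. A minimal sketch, assuming the schema and the two FOREACH statements from the script quoted further down in this thread; the explicit AS clauses and the name i12 are my additions, there only to keep the field names unambiguous:

```pig
-- Same shape as the script quoted in this thread; AS clauses added for clarity.
r1 = LOAD 'data' AS (f1:chararray, f2:chararray, f3:chararray, i1:int, i2:int, i3:int);
tmp = GROUP r1 BY (f1, f2);
tmp1 = FOREACH tmp GENERATE FLATTEN(group) AS (f1, f2), FLATTEN(r1.(i1, i2)) AS (i1, i2);
-- i12 is a hypothetical field name for the sum.
tmp2 = FOREACH tmp1 GENERATE f1, f2, i1, i1 + i2 AS i12;
-- Print the logical, physical and map-reduce plans instead of running the job.
EXPLAIN tmp2;
```

If both FOREACH operators appear in the reduce plan of a single map-reduce job, they are being pipelined rather than run as separate jobs.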
Regards,
Mridul
On Monday 01 March 2010 05:32 PM, prasenjit mukherjee wrote:
Thanks, that will work.
I was hoping to avoid that additional foreach loop. BTW, are these
statements internally optimized by Pig, so that it's done in a single
iteration?
-Prasen
On Mon, Mar 1, 2010 at 4:47 PM, Ankur C. Goel<[email protected]> wrote:
Would this work for you?
r1 = load 'data' AS
    (f1:chararray, f2:chararray, f3:chararray, i1:int, i2:int, i3:int);
tmp = group r1 by (f1,f2);
tmp1 = foreach tmp {
    generate flatten(group), FLATTEN(r1.(i1,i2));
}
tmp2 = FOREACH tmp1 GENERATE f1, f2, i1, i1+i2;
dump tmp2;
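
If the bag should stay inside the group instead of being flattened, another option is to compute the sum before grouping. The parse error in the original question suggests that the bag projection r1.(...) accepts only field references, not expressions, so the expression has to live in a FOREACH. A sketch under that assumption; r2 and i12 are hypothetical names, and the path and delimiter are taken from the original question:

```pig
r1 = LOAD '/tmp/agg_qat.txt' USING PigStorage(',') AS
    (f1:chararray, f2:chararray, f3:chararray, i1:int, i2:int, i3:int);
-- Compute i1 + i2 per row first, so the sum becomes an ordinary field.
r2 = FOREACH r1 GENERATE f1, f2, i1, i1 + i2 AS i12;
tmp = GROUP r2 BY (f1, f2);
-- Each output row holds the group key plus a bag of (i1, i12) tuples.
tmp1 = FOREACH tmp GENERATE FLATTEN(group), r2.(i1, i12);
DUMP tmp1;
```

This still uses two FOREACH statements, but the first one sits before the GROUP, so it should run on the map side ahead of the shuffle.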
-@nkur
On 3/1/10 3:21 PM, "prasenjit mukherjee"<[email protected]> wrote:
grunt> r1 = load '/tmp/agg_qat.txt' USING PigStorage (',') AS
(f1:chararray, f2:chararray,f3:chararray, i1:int,i2:int,i3:int);
grunt> tmp = group r1 by (f1,f2);
grunt> tmp1 = foreach tmp generate flatten(group), r1.(i1,i1+i2);
The last line throws an error:
2010-03-01 15:17:20,053 [main] ERROR org.apache.pig.tools.grunt.Grunt
- ERROR 1000: Error during parsing. Encountered " "+" "+ "" at line 1,
column 52.
Was expecting one of:
")" ...
"," ...
"," ...
")" ...
The following line works fine, though:
grunt> tmp1 = foreach tmp generate flatten(group), r1.(i1,i2);
Any pointers on how to achieve r1.(i1, i1+i2) in the group?
-thanks,
Prasen