Yes, but even using E = GROUP B *ALL PARALLEL 100;*
I got only one reducer (and,
obviously <http://www.youtube.com/watch?v=hMtZfW2z9dw>, not enough space to
process everything).

I tried GROUP BY on some field instead of ALL, and that worked.
Could this be some optimization issue?
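
A minimal sketch of one possible workaround, not a confirmed fix. With
GROUP ... ALL every record falls into the same group, so a single reducer
receives all the data regardless of PARALLEL. A top-level DISTINCT runs as its
own reduce phase, where PARALLEL is honored. The aliases cIps, cIps_all and
n_cIps are made up for illustration; B and cIp come from the script quoted
below.

```pig
-- Sketch only: assumes B has the cIp field used in the quoted script.
-- Project the column first so DISTINCT shuffles as little data as possible.
cIps      = FOREACH B GENERATE cIp;

-- Top-level DISTINCT is a separate reduce phase, so PARALLEL applies here.
dist_cIps = DISTINCT cIps PARALLEL 100;

-- GROUP ALL still ends on one reducer, but it now only counts rows
-- that are already distinct.
cIps_all  = GROUP dist_cIps ALL;
n_cIps    = FOREACH cIps_all GENERATE COUNT(dist_cIps);
```

The same pattern should apply to sIp. Whether it is enough for 2TB of input
depends on how many distinct values there are.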


On Fri, Feb 11, 2011 at 3:10 PM, Alan Gates <[email protected]> wrote:

> Possible, but it will be ignored.  Anything done inside a nested foreach
> block will be executed at the parallel level of the preceding group by.
>
> Alan.
>
>
> On Feb 11, 2011, at 8:57 AM, Charles Gonçalves wrote:
>
>> Is it possible to use a PARALLEL statement inside a nested foreach block,
>> like in:
>>
>> 28 E = GROUP B ALL PARALLEL 100;
>> 29
>> 30 edge_breakdown = FOREACH E {
>> 31   dist_cIps = DISTINCT B.cIp *PARALLEL X * ;
>> 32   dist_sIps = DISTINCT B.sIp ;
>> 33   urls_ok = FILTER B BY valid(url);
>> 34   GENERATE COUNT(dist_cIps), COUNT(dist_sIps), COUNT(urls_ok.url),
>>      COUNT(B.url), SUM(B.scBytes);
>> 35 }
>>
>> I got this error:
>> ERROR 1000: Error during parsing. Encountered " "parallel" "PARALLEL "" at
>> line 36, column 36.
>> Was expecting:
>>   ";" ...
>>
>> My problem is that I'm using PARALLEL in line 28 and also setting the
>> default:
>> 14 SET DEFAULT_PARALLEL 30;
>>
>> But even so, I'm getting just one reducer!
>>
>> Is there some optimization that I can disable?
>> I already tried playing with pig.exec.reducers.bytes.per.reducer, with no
>> effect.
>> I'm processing 2TB of data, and the single reducer is failing with a
>> "no space left on device" error!
>>
>> Any
>>
>>
>> --
>> *Charles Ferreira Gonçalves *
>> http://homepages.dcc.ufmg.br/~charles/
>> UFMG - ICEx - Dcc
>> Cel.: 55 31 87741485
>> Tel.:  55 31 34741485
>> Lab.: 55 31 34095840
>>
>
>


-- 
*Charles Ferreira Gonçalves *
http://homepages.dcc.ufmg.br/~charles/
UFMG - ICEx - Dcc
Cel.: 55 31 87741485
Tel.:  55 31 34741485
Lab.: 55 31 34095840
