"Group all" puts everything into one group, so kinda hard to find anything for the other 99 reducers :)
Which is ok if you are applying algebraic functions to it, like counting or finding maxes of things, as those operations will be pushed out to the mappers instead of building the group on a single reducer.

D

On Fri, Feb 11, 2011 at 10:10 AM, Charles Gonçalves wrote:

> Yes, but even using:
>     E = GROUP B ALL PARALLEL 100;
> I got only one reducer (and obviously
> <http://www.youtube.com/watch?v=hMtZfW2z9dw> no space to
> process everything).
>
> I tried grouping by something else and it worked.
> Could it be some optimization issue?
>
> On Fri, Feb 11, 2011 at 3:10 PM, Alan Gates wrote:
>
>> Possible, but it will be ignored. Anything done inside a nested foreach
>> block will be executed at the parallel level of the preceding group by.
>>
>> Alan.
>>
>> On Feb 11, 2011, at 8:57 AM, Charles Gonçalves wrote:
>>
>>> Is it possible to use a PARALLEL statement inside a nested foreach
>>> block, as in:
>>>
>>> 28 E = GROUP B ALL PARALLEL 100;
>>> 29
>>> 30 edge_breakdown = FOREACH E {
>>> 31     dist_cIps = DISTINCT B.cIp PARALLEL X;
>>> 32     dist_sIps = DISTINCT B.sIp;
>>> 33     urls_ok = FILTER B BY valid(url);
>>> 34     GENERATE COUNT(dist_cIps), COUNT(dist_sIps), COUNT(urls_ok.url), COUNT(B.url), SUM(B.scBytes);
>>> 35 }
>>>
>>> I got an error:
>>> ERROR 1000: Error during parsing. Encountered " "parallel" "PARALLEL "" at
>>> line 36, column 36.
>>> Was expecting:
>>> ";" ...
>>>
>>> My problem is that I'm using PARALLEL in line 28 and also setting:
>>> 14 SET DEFAULT_PARALLEL 30;
>>>
>>> But even so I'm getting just one reducer!
>>>
>>> Is there some optimization that I can disable?
>>> I already tried to play with pig.exec.reducers.bytes.per.reducer and
>>> nothing.
>>> I'm processing 2TB of data and the one reducer is failing with a
>>> "no space left on device" error!
>>>
>>> Any
>>>
>>> --
>>> *Charles Ferreira Gonçalves*
>>> http://homepages.dcc.ufmg.br/~charles/
>>> UFMG - ICEx - Dcc
>>> Cel.: 55 31 87741485
>>> Tel.: 55 31 34741485
>>> Lab.: 55 31 34095840
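
P.S. One way to keep the distinct counts off the single "group all" reducer is to do the DISTINCT as a top-level statement, where PARALLEL is accepted, and only then group all and COUNT. This is a rough sketch, not something I have run against your data: B and cIp come from your script, the aliases cIps, uniq_cIps, cIp_all and cIp_count are made up for illustration, and I am assuming the nested DISTINCTs are what force the whole bag onto one reducer in your Pig version.

```pig
-- Sketch only: B and its cIp field come from the quoted script;
-- the aliases cIps, uniq_cIps, cIp_all and cIp_count are hypothetical.
cIps      = FOREACH B GENERATE cIp;
uniq_cIps = DISTINCT cIps PARALLEL 100;  -- top-level DISTINCT accepts PARALLEL
cIp_all   = GROUP uniq_cIps ALL;         -- still one group, but it only feeds COUNT
cIp_count = FOREACH cIp_all GENERATE COUNT(uniq_cIps);
```

The same pattern would apply to sIp. COUNT(B.url) and SUM(B.scBytes) are algebraic, so they can stay in a separate "group all" foreach that has no nested DISTINCT or FILTER in it.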
