mapred.local.dir  temp dir. space allocation limited by smallest area
---------------------------------------------------------------------

         Key: NUTCH-181
         URL: http://issues.apache.org/jira/browse/NUTCH-181
     Project: Nutch
        Type: Bug
  Components: indexer  
    Versions: 0.8-dev    
 Environment: all
    Reporter: Paul Baclace


When mapred.local.dir is used to specify multiple temp dir. areas, space 
allocation is limited by the smallest area because the temp dir. selection 
algorithm is "round robin starting from a randomish point". When round robin 
is used with approximately constant-sized chunks, the smallest area runs out 
of space first, even though the larger areas still have room, and this is a 
fatal error. 
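
To illustrate the failure mode described above, here is a minimal simulation 
sketch. It is not Nutch code: the class name, method name, and the free-space 
figures are all made up for illustration, and it assumes every chunk has 
exactly the same size.

```java
import java.util.Arrays;

// Hypothetical illustration only; not taken from Nutch.
class RoundRobinFill {
    /**
     * Hands out fixed-size chunks to the areas in round-robin order,
     * starting at index start, and returns how much was allocated in
     * total before the chosen area could not hold the next chunk.
     */
    static long allocatedBeforeFailure(long[] freeSpace, long chunkSize, int start) {
        long[] free = Arrays.copyOf(freeSpace, freeSpace.length);
        long total = 0;
        int i = start % free.length;
        while (free[i] >= chunkSize) {
            free[i] -= chunkSize;
            total += chunkSize;
            i = (i + 1) % free.length; // next area, regardless of its free space
        }
        return total;
    }

    public static void main(String[] args) {
        long[] free = {10, 100, 1000}; // assumed free space per area, arbitrary units
        long used = allocatedBeforeFailure(free, 1, 0);
        long capacity = Arrays.stream(free).sum();
        System.out.println("allocated " + used + " of " + capacity);
        // prints "allocated 30 of 1110"
    }
}
```

With these assumed numbers the usable space is roughly the number of areas 
times the smallest area, not the sum of all areas.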

Workaround: only list local fs dirs in mapred.local.dir with similarly-sized 
available areas.

I wrote a patch to JobConf (currently being tested) which uses df to check 
available space (once a minute or less often) and then uses an efficient 
roulette selection to do allocation weighted by magnitude of available space. 
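
For readers unfamiliar with roulette selection, the following is a rough 
sketch of the idea, not the patch itself. The class WeightedDirPicker and its 
methods are hypothetical names. It also assumes free space is read with 
java.io.File.getUsableSpace() instead of running df, and it leaves out the 
once-a-minute refresh.

```java
import java.io.File;
import java.util.Random;

// Hypothetical helper for illustration; not the code from the patch.
class WeightedDirPicker {
    private final String[] dirs;
    private final long[] cumulative; // cumulative[i] = free space of dirs[0..i]

    WeightedDirPicker(String[] dirs, long[] freeBytes) {
        if (dirs.length == 0 || dirs.length != freeBytes.length) {
            throw new IllegalArgumentException("need one free-space value per directory");
        }
        this.dirs = dirs.clone();
        this.cumulative = new long[dirs.length];
        long sum = 0;
        for (int i = 0; i < dirs.length; i++) {
            sum += Math.max(0L, freeBytes[i]);
            cumulative[i] = sum;
        }
        if (sum == 0) {
            throw new IllegalStateException("no free space in any directory");
        }
    }

    /** Assumption: stands in for the df call mentioned in the report. */
    static long[] usableSpace(String[] dirs) {
        long[] free = new long[dirs.length];
        for (int i = 0; i < dirs.length; i++) {
            free[i] = new File(dirs[i]).getUsableSpace();
        }
        return free;
    }

    /** Maps a point in [0, total) to the directory whose slice contains it. */
    String pickAt(long point) {
        int lo = 0;
        int hi = cumulative.length - 1;
        while (lo < hi) { // binary search: first index with cumulative > point
            int mid = (lo + hi) >>> 1;
            if (cumulative[mid] > point) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return dirs[lo];
    }

    /** Picks a directory with probability proportional to its free space. */
    String pick(Random rnd) {
        long total = cumulative[cumulative.length - 1];
        long point = Math.min((long) (rnd.nextDouble() * total), total - 1);
        return pickAt(point);
    }

    public static void main(String[] args) {
        String[] dirs = {"/disk1/tmp", "/disk2/tmp", "/disk3/tmp"}; // made-up paths
        long[] free = {10, 100, 1000};                              // made-up sizes
        WeightedDirPicker picker = new WeightedDirPicker(dirs, free);
        System.out.println(picker.pick(new Random()));
    }
}
```

In this sketch a directory with no free space gets an empty slice and is 
never chosen, and each selection costs a binary search over the cumulative 
totals. 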


