mapred.local.dir  temp dir. space allocation limited by smallest area
---------------------------------------------------------------------

         Key: NUTCH-181
         URL: http://issues.apache.org/jira/browse/NUTCH-181
     Project: Nutch
        Type: Bug
  Components: indexer  
    Versions: 0.8-dev    
 Environment: all
    Reporter: Paul Baclace


When mapred.local.dir is used to specify multiple temp dir. areas, space 
allocation is limited by the smallest area, because the temp dir. selection 
algorithm is "round robin starting from a randomish point". When round robin 
is used with approximately constant-sized chunks, every area fills at about 
the same rate, so the smallest area runs out of space first, and this is a 
fatal error. 
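
The failure mode can be illustrated with a small simulation. This is only a 
sketch, not Nutch code: the class name, method name, and the free-space 
figures are hypothetical, and chunks are assumed to be exactly constant in 
size (and larger than zero).

```java
import java.util.Arrays;

// Hypothetical illustration; not part of Nutch or of the patch.
class RoundRobinFill {
    /**
     * Hands out fixed-size chunks round robin across the areas, starting at
     * startIndex, and returns the total allocated before some area cannot
     * hold the next chunk (the fatal error described above).
     * Assumes a non-empty array and chunkBytes > 0.
     */
    public static long allocatedBeforeFailure(long[] freeBytes, long chunkBytes,
                                              int startIndex) {
        long[] free = Arrays.copyOf(freeBytes, freeBytes.length);
        long total = 0;
        int i = startIndex;
        while (true) {
            if (free[i] < chunkBytes) {
                return total; // this area is full: allocation fails here
            }
            free[i] -= chunkBytes;
            total += chunkBytes;
            i = (i + 1) % free.length;
        }
    }

    public static void main(String[] args) {
        long[] free = {100, 1000, 1000}; // made-up free space, arbitrary units
        long used = allocatedBeforeFailure(free, 10, 0);
        System.out.println("allocated " + used + " of " + (100 + 1000 + 1000));
    }
}
```

With these made-up numbers the run stops after 300 of 2100 units, roughly 
(number of areas) x (size of the smallest area), no matter how much space the 
larger areas still have.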

Workaround: only list local fs dirs in mapred.local.dir with similarly-sized 
available areas.

I wrote a patch to JobConf (currently being tested) which uses df to check 
available space (once a minute or less often) and then uses an efficient 
roulette selection to do allocation weighted by magnitude of available space. 
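
A minimal sketch of roulette selection weighted by available space follows. 
It is not the patch itself: the class and method names are hypothetical, and 
the available-space figures are passed in by the caller (they could come from 
df as described above, or from java.io.File.getUsableSpace(), which is an 
assumption here). It uses cumulative sums plus a binary search so that each 
pick costs O(log n).

```java
import java.util.Random;

// Hypothetical illustration; not the JobConf patch.
class WeightedDirPicker {
    private final String[] dirs;
    private final long[] cumulative; // running sum of available bytes
    private final Random random;

    public WeightedDirPicker(String[] dirs, long[] availableBytes, Random random) {
        this.dirs = dirs.clone();
        this.cumulative = new long[dirs.length];
        long sum = 0;
        for (int i = 0; i < dirs.length; i++) {
            sum += Math.max(0, availableBytes[i]); // treat negatives as empty
            cumulative[i] = sum;
        }
        if (sum <= 0) {
            throw new IllegalStateException("no space available in any area");
        }
        this.random = random;
    }

    /** Picks a directory with probability proportional to its free space. */
    public String pick() {
        long total = cumulative[cumulative.length - 1];
        long ticket = (long) (random.nextDouble() * total);
        return dirs[indexFor(ticket)];
    }

    /** Smallest index whose cumulative weight exceeds ticket. */
    public int indexFor(long ticket) {
        int lo = 0;
        int hi = cumulative.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cumulative[mid] > ticket) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }
}
```

In this sketch the weights are fixed at construction time, so a caller would 
build a new picker each time the available space is re-measured (for example 
once a minute). An area with no free space gets zero weight and is not 
selected.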





