mapred.local.dir temp dir. space allocation limited by smallest area
---------------------------------------------------------------------
Key: NUTCH-181
URL: http://issues.apache.org/jira/browse/NUTCH-181
Project: Nutch
Type: Bug
Components: indexer
Versions: 0.8-dev
Environment: all
Reporter: Paul Baclace
When mapred.local.dir is used to specify multiple temp dir. areas, usable
space is limited by the smallest area, because the temp dir. selection
algorithm is "round robin starting from a randomish point". When round robin
is used with approximately constant-sized chunks, every area fills at about
the same rate, so the smallest area runs out of space first, and this is a
fatal error even though the larger areas still have room.
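
The failure mode can be illustrated with a small simulation. This is only a
sketch, not Nutch code: the class RoundRobinExhaustion, its method name, and
the sizes used are hypothetical, and each temp area is modelled as a plain
free-space counter with a non-negative starting index.

```java
// Hypothetical simulation of round-robin allocation over temp areas.
// Not taken from Nutch; names and numbers are for illustration only.
final class RoundRobinExhaustion {

    /**
     * Writes fixed-size chunks to the areas in round-robin order, beginning
     * at index start (assumed >= 0), and returns how many chunks were written
     * before the first allocation that does not fit.
     */
    static long chunksBeforeFailure(long[] freeSpace, long chunkSize, int start) {
        long[] free = freeSpace.clone(); // do not modify the caller's array
        long written = 0;
        int i = start % free.length;
        while (free[i] >= chunkSize) {
            free[i] -= chunkSize;
            written++;
            i = (i + 1) % free.length;
        }
        return written; // the next chunk would not fit in area i
    }

    public static void main(String[] args) {
        // One small area and two large ones, in arbitrary units.
        long[] free = {100, 1000, 1000};
        // Total free space would hold 210 chunks of size 10, but the small
        // area is exhausted after only 10 rounds.
        System.out.println(chunksBeforeFailure(free, 10, 0)); // prints 30
    }
}
```

In this example only 30 of the 210 chunks that would fit in total are written
before the smallest area fills up.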
Workaround: only list local fs dirs in mapred.local.dir with similarly-sized
available areas.
I wrote a patch to JobConf (currently being tested) which uses df to check
available space (once a minute or less often) and then uses an efficient
roulette selection to do allocation weighted by magnitude of available space.
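
For readers unfamiliar with roulette selection, here is a minimal sketch of
the general technique. It is not the patch described above: the class
WeightedDirPicker and its methods are hypothetical, and it reads free space
with java.io.File.getUsableSpace() rather than df. A caller would presumably
rebuild the picker periodically (for example once a minute) to refresh the
weights.

```java
import java.io.File;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch of roulette (weight-proportional) selection.
// Not the JobConf patch; names are for illustration only.
final class WeightedDirPicker {
    // cumulative[i] = total free space of areas 0..i
    private final long[] cumulative;

    WeightedDirPicker(long[] freeSpace) {
        cumulative = new long[freeSpace.length];
        long sum = 0;
        for (int i = 0; i < freeSpace.length; i++) {
            sum += Math.max(0L, freeSpace[i]); // negative readings count as 0
            cumulative[i] = sum;
        }
        if (sum <= 0) {
            throw new IllegalStateException("no free space in any area");
        }
    }

    /** Assumption: getUsableSpace() stands in for the df call. */
    static WeightedDirPicker fromDirs(File[] dirs) {
        long[] free = new long[dirs.length];
        for (int i = 0; i < dirs.length; i++) {
            free[i] = dirs[i].getUsableSpace();
        }
        return new WeightedDirPicker(free);
    }

    /**
     * Maps a ticket in [0, total free space) to the first area whose
     * cumulative weight exceeds it, using binary search (O(log n)).
     */
    int indexFor(long ticket) {
        int lo = 0;
        int hi = cumulative.length - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (cumulative[mid] > ticket) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return lo;
    }

    /** Picks an area with probability proportional to its free space. */
    int pick() {
        long total = cumulative[cumulative.length - 1];
        return indexFor(ThreadLocalRandom.current().nextLong(total));
    }
}
```

With free space of 100, 0 and 900 units, the first area would be chosen about
10% of the time, the full area never, and the third about 90% of the time, so
larger areas absorb proportionally more chunks.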