mapred.local.dir temp dir. space allocation limited by smallest area
---------------------------------------------------------------------
Key: NUTCH-181
URL: http://issues.apache.org/jira/browse/NUTCH-181
Project: Nutch
Type: Bug
Components: indexer
Versions: 0.8-dev
Environment: all
Reporter: Paul Baclace
When mapred.local.dir is used to specify multiple temp dir. areas, space
allocation is limited by the smallest area, because the temp dir. selection
algorithm is "round robin starting from a randomish point". When round robin
is used with approximately constant-sized chunks, every area fills at about
the same rate, so the smallest area runs out of space first, and this is a
fatal error.
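
To illustrate the failure mode, here is a toy model (not Nutch or Hadoop
code). The class name, method name, and the capacities used in main are
made up for illustration; the model assumes every chunk has exactly the
same size and that the first failed allocation is fatal.

```java
// Toy model of round-robin allocation over areas of different sizes.
// All names and numbers here are hypothetical, chosen for illustration.
class RoundRobinExhaustion {

    /**
     * Stores fixed-size chunks round robin, beginning at index start,
     * and returns how many chunks fit before the area whose turn it is
     * cannot hold the next chunk.
     */
    public static int chunksUntilFailure(long[] capacity, long chunkSize, int start) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long[] free = capacity.clone();
        int stored = 0;
        int i = start % free.length;
        while (free[i] >= chunkSize) {
            free[i] -= chunkSize;          // place one chunk in area i
            stored++;
            i = (i + 1) % free.length;     // move on to the next area
        }
        return stored;
    }

    public static void main(String[] args) {
        long[] capacity = {10, 100};       // one small area, one large area
        int stored = chunksUntilFailure(capacity, 1, 0);
        System.out.println(stored + " of 110 units used before failure");
    }
}
```

In this model the allocation fails once the small area is full, while
most of the large area is still unused.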
Workaround: only list local fs dirs in mapred.local.dir with similarly-sized
available areas.
I wrote a patch to JobConf (currently being tested) which uses df to check
available space (once a minute or less often) and then uses an efficient
roulette selection to do allocation weighted by magnitude of available space.
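
A minimal sketch of the weighted selection idea follows. It is not the
JobConf patch itself. The class and method names are hypothetical, it
uses java.io.File.getUsableSpace() as a stand-in for running df, it
assumes a one-minute refresh interval, and it uses a simple linear scan
rather than any particular "efficient" roulette variant.

```java
import java.io.File;
import java.util.Random;

// Hypothetical helper, not the patch described above.
class WeightedDirSelector {
    // Assumption: refresh the free-space figures at most once a minute.
    private static final long REFRESH_MILLIS = 60_000L;

    private final File[] dirs;
    private final long[] free;
    private final Random random = new Random();
    private long lastRefresh;

    WeightedDirSelector(String[] paths) {
        dirs = new File[paths.length];
        free = new long[paths.length];
        for (int i = 0; i < paths.length; i++) {
            dirs[i] = new File(paths[i]);
        }
        refresh(System.currentTimeMillis());
    }

    private void refresh(long now) {
        for (int i = 0; i < dirs.length; i++) {
            free[i] = dirs[i].getUsableSpace(); // stand-in for parsing df output
        }
        lastRefresh = now;
    }

    /**
     * Roulette-wheel pick: for a ticket in [0, sum of weights), returns
     * the index whose cumulative weight range contains the ticket.
     * Returns -1 if the ticket is not below the total weight.
     */
    public static int pickIndex(long[] weights, long ticket) {
        long cumulative = 0;
        for (int i = 0; i < weights.length; i++) {
            cumulative += weights[i];
            if (ticket < cumulative) {
                return i;
            }
        }
        return -1;
    }

    /** Picks a directory with probability proportional to its free space. */
    synchronized File nextDir() {
        long now = System.currentTimeMillis();
        if (now - lastRefresh >= REFRESH_MILLIS) {
            refresh(now);
        }
        long total = 0;
        for (long w : free) {
            total += w;
        }
        if (total <= 0) {
            throw new IllegalStateException("no free space in any configured dir");
        }
        // Clamp guards against floating-point rounding up to total.
        long ticket = Math.min((long) (random.nextDouble() * total), total - 1);
        return dirs[pickIndex(free, ticket)];
    }

    public static void main(String[] args) {
        long[] weights = {100, 300};        // made-up free-space figures
        Random rng = new Random();
        int[] hits = new int[weights.length];
        for (int n = 0; n < 10_000; n++) {
            hits[pickIndex(weights, (long) (rng.nextDouble() * 400))]++;
        }
        System.out.println(hits[0] + " vs " + hits[1]);
    }
}
```

With weights like these, the larger area should receive roughly three
times as many picks as the smaller one, so areas tend to fill in
proportion to their free space instead of at the same absolute rate.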