Rod Taylor wrote:
> The attached patches for Generator.java and Injector.java allow a
> specific temporary directory to be specified. This gives Nutch the full
> path to these temporary directories and seems to fix the "No input
> directories" issue when using a local filesystem with multiple task
> trackers.

This looks like a good patch. I've committed it.
This is a recent bug. The nutch-daemon.sh script starts all daemons
in the Nutch root directory, so that relative paths resolve
consistently. Previously, child processes always ran in the same
working directory as the parent process. But I changed that recently so
that child processes now run in the directory where their job's jar (if
any) is unpacked. This was so that if the jar contains scripts (e.g., a
parse-ext plugin script) then these scripts are easy to run.
In NDFS the current working directory is always /user/$USER. On the
local filesystem with a local jobtracker, paths are relative to the
current working directory of the process (since there's only one
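process). The problematic case is when the local filesystem is used
with multiple processes: each child process now has its own working
directory, so the same relative path can name a different location in
each one. The prior convention of making paths relative to the Nutch
root was fragile. Better to supply absolute paths, as your patch does.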
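
To illustrate the failure mode, here is a minimal Java sketch. It is
not the code from the patch: the directory names and the resolveAgainst
helper are hypothetical, chosen only to show that one relative path
resolves to different locations from different working directories,
while an absolute path resolves to the same location everywhere.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class AbsolutePathSketch {

    // Hypothetical helper: resolve a path the way a process whose working
    // directory is workingDir would. Path.resolve returns the argument
    // unchanged when it is already absolute.
    static Path resolveAgainst(Path workingDir, String path) {
        return workingDir.resolve(path).normalize();
    }

    public static void main(String[] args) {
        // Assumed locations, for illustration only.
        Path daemonDir = Paths.get("/opt/nutch");               // parent daemon
        Path childDir = Paths.get("/opt/nutch/work/job_0001");  // unpacked job jar

        String relative = "tmp/generate-temp";
        String absolute = "/opt/nutch/tmp/generate-temp";

        // The relative path names two different directories.
        System.out.println(resolveAgainst(daemonDir, relative));
        System.out.println(resolveAgainst(childDir, relative));

        // The absolute path names the same directory from both processes.
        System.out.println(resolveAgainst(daemonDir, absolute));
        System.out.println(resolveAgainst(childDir, absolute));
    }
}
```

A process that only has a relative path can make it absolute once, at
job setup time, with Paths.get(relative).toAbsolutePath(), and pass the
result on to its child processes.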
Doug