Rod Taylor wrote:
The attached patches for Generator.java and Injector.java allow a
specific temporary directory to be specified. This gives Nutch the full
path to these temporary directories and seems to fix the "No input
directories" issue when using a local filesystem with multiple task
trackers.

This looks like a good patch.  I've committed it.

This is a recent bug. The nutch-daemon.sh script starts all daemons with the Nutch root as their working directory, so that relative paths resolve consistently. Previously, child processes always ran in the same working directory as their parent process. But I changed that recently so that child processes now run in the directory where their job's jar (if any) is unpacked. This was so that if the jar contains scripts (e.g., a parse-ext plugin script) then these scripts are easy to run.

In NDFS the current working directory is always /user/$USER. On the local filesystem with a local jobtracker, paths are relative to the current working directory of the process (since there's only one process). The problematic case is when the local filesystem is used with multiple processes. The prior convention of making paths relative to the nutch root was fragile. Better to supply absolute paths, as your patch does.
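To illustrate why relative paths break here, below is a minimal sketch in plain Java (java.nio, not Nutch or NDFS code). The class name, the helper resolveFrom, and all directory names are hypothetical and chosen only for illustration. It assumes Unix-style paths and shows that the same relative path resolves to different locations from two different working directories, while an absolute path resolves identically from both.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class WorkingDirDemo {

    // Resolve a path the way a process would: a relative path is interpreted
    // against the given working directory, an absolute path is returned as-is.
    static Path resolveFrom(String workingDir, String path) {
        return Paths.get(workingDir).resolve(path).normalize();
    }

    public static void main(String[] args) {
        // Hypothetical working directories (assumptions, not real Nutch defaults).
        String daemonCwd = "/opt/nutch";              // daemon started in the Nutch root
        String childCwd = "/tmp/nutch/job_0001/work"; // child run where the job jar is unpacked

        String relativeTemp = "tmp/generate-temp";            // hypothetical relative temp dir
        String absoluteTemp = "/opt/nutch/tmp/generate-temp"; // same dir, given absolutely

        // The relative path lands in two different places.
        System.out.println("daemon, relative: " + resolveFrom(daemonCwd, relativeTemp));
        System.out.println("child,  relative: " + resolveFrom(childCwd, relativeTemp));

        // The absolute path is the same no matter which process resolves it.
        System.out.println("daemon, absolute: " + resolveFrom(daemonCwd, absoluteTemp));
        System.out.println("child,  absolute: " + resolveFrom(childCwd, absoluteTemp));
    }
}
```

With a relative path, one process can write its output where another process never looks for it, which is consistent with the "No input directories" symptom. With an absolute path every process agrees on the location.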

Doug
