Dan Bretherton wrote:
> We do not need to use the distributed filesystem in Hadoop because the data and home directories are available on every machine via NFS.

Writing everything over NFS will seriously hurt Hadoop's performance and is not recommended.

> 060330 130100 task_m_f8jt6q  SEVERE FSError from child
> 060330 130100 task_m_f8jt6q org.apache.hadoop.fs.FSError: java.io.IOException: Stale NFS file handle

This looks related to your use of NFS.

> java.io.IOException: Task process exit with nonzero status.
>         at org.apache.hadoop.mapred.TaskRunner.runChild(TaskRunner.java:273)
>         at ...etc.

That means the JVM running your task crashed. Enabling core dumps might help you figure out why it crashed.
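
If you want to try that, here is a rough sketch (the exact start scripts and directory layout vary between releases, so treat these as examples): raise the core-file size limit in the shell that launches the daemons on each node, so that the child task JVMs inherit it.

  ulimit -c unlimited    # allow core files of unlimited size
  bin/stop-all.sh        # restart so new child task JVMs inherit the limit
  bin/start-all.sh
  # If a child crashes again, look for a "core" file in that task's working
  # directory (usually under mapred.local.dir on the node where it ran).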

> java.io.FileNotFoundException: /users/dab/Hadoop/input/Occam/.nfs000047ab00000001
>         at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:114)
>         at ...etc.

This looks like another NFS-related problem.

> though this was not a true test of the DFS because of our NFS setup (i.e. all the DFS blocks actually end up in my home directory on a single disk). I should also point out that the input data involved in the DFS was just a list of file names, not the temperature data itself. Using the DFS I found that the jobs often failed because of problems with missing blocks of data. Here is a typical error message from the job tracker log file.

> java.io.IOException: Could not obtain block blk_-3035035931951255964
>         at org.apache.hadoop.dfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:362)
>         at ...etc.

> As soon as these errors start to appear, it means that the DFS is broken.

This could be related to running DFS on top of NFS. Again, I would not recommend that.

Generally, I would try running things without NFS, with local volumes for all of the mapred and dfs directories. That is the intended use.
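
As a rough sketch of what that might look like in conf/hadoop-site.xml (property names may differ slightly between releases; the paths below are placeholders for directories on each node's local disks, not NFS mounts):

  <configuration>
    <property>
      <name>mapred.local.dir</name>
      <value>/data0/hadoop/mapred/local</value>  <!-- local disk, not NFS -->
    </property>
    <property>
      <name>dfs.name.dir</name>
      <value>/data0/hadoop/dfs/name</value>      <!-- namenode image and edits -->
    </property>
    <property>
      <name>dfs.data.dir</name>
      <value>/data0/hadoop/dfs/data</value>      <!-- datanode block storage -->
    </property>
  </configuration>

With the blocks stored on each node's own disk, it is DFS replication, rather than NFS, that makes the data available on every machine.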

Doug
