I believe the "-fs local" should be removed too.
The reason is that even if you have a dedicated JobTracker after
removing "-jt local", with "-fs local" I believe all
the mappers will still run sequentially.
"-fs local" forces the MapReduce job to run in "local" mode,
which is really a test mode.
What you can do is remove both "-fs local" and "-jt local",
but give the FULL URI of the input and output paths, to tell
Hadoop that they are on the local filesystem instead of HDFS.
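For example, reusing the jar and paths from the original command
below (a sketch; adjust the paths to your setup):

    hadoop jar /hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
        wordcount file:///hduser/mount_point/ file:///results

The file:// scheme tells Hadoop to resolve these paths against the
local (here, NFS-mounted) filesystem rather than HDFS.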
Keep the following in mind:
1) The NFS mount needs to be available on all your task
nodes, and mounted in the same way (same path everywhere; a
quick check is sketched below).
2) Even if you can do that, your shared storage will become
the bottleneck. NFS won't scale well.
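A quick sanity check for 1) (node1/node2/node3 are hypothetical
hostnames; substitute your own):

    for host in node1 node2 node3; do
        ssh "$host" 'mount | grep mount_point'
    done

Every node should report the same NFS export mounted at the same
path.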
Yong
Date: Fri, 20 Dec 2013 09:01:32 -0500
Subject: Re: Running Hadoop v2 clustered mode MR on an NFS mounted filesystem
From: [email protected]
To: [email protected]
I think most of your problem is coming from
the options you are setting:
"hadoop jar
/hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar
wordcount -fs local -jt local
/hduser/mount_point/ /results"
You appear to be directing your job to run in the LOCAL job
runner and to read from the LOCAL filesystem. Drop the -jt
argument and it should run in distributed mode if your cluster
is set up right.

You don't need to do anything special to point Hadoop at an NFS
location, other than setting the NFS mount up properly and, if
you are addressing it by name, making sure the name resolves to
the right address. Hadoop doesn't care where the data lives, as
long as it can read from and write to it. The fact that you are
reading from and writing to an NFS location that happens to be
mounted as a local filesystem object doesn't matter. You could
point the job at the local /hduser/ path with the -fs local
option, and the output would still end up on the NFS mount,
because that's where the mount actually lives; or you could
point it at the absolute location of the folder you want. It
shouldn't make a difference (see the examples below).
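For example (sketches based on the original command; both drop
the -jt flag, and the paths are illustrative):

    hadoop jar /hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
        wordcount -fs local /hduser/mount_point/ /results

    hadoop jar /hduser/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
        wordcount file:///hduser/mount_point/ file:///results

The first keeps -fs local with plain local paths; the second
spells out file:// URIs instead. Either way the job should end
up reading from and writing to the same NFS-backed directories.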