Eelco Lempsink wrote:
> Inspired by http://www.mail-archive.com/[EMAIL PROTECTED]/msg02394.html
> I'm trying to run Hadoop on multiple CPUs, but without using HDFS.

To be clear: you need some sort of shared filesystem; if not HDFS, then NFS, S3, or something else. For example, the job client communicates with the jobtracker by copying files into the shared filesystem named by fs.default.name, and job inputs and outputs are assumed to live there as well.

So, if you're using NFS, then you'd set fs.default.name to something like "file:///mnt/shared/hadoop/". Note also that as your cluster grows, NFS will soon become a bottleneck. That's why HDFS is provided: there aren't other readily available shared filesystems that scale appropriately.

Doug
