At 9:41 am -0700 4/16/07, Doug Cutting wrote:
Eelco Lempsink wrote:
Inspired by http://www.mail-archive.com/[EMAIL PROTECTED]/msg02394.html, I'm trying to run Hadoop on multiple CPUs, but without using HDFS.

To be clear: you need some sort of shared filesystem, if not HDFS, then NFS, S3, or something else. For example, the job client interacts with the job tracker by copying files to the shared filesystem named by fs.default.name, and job inputs and outputs are assumed to come from a shared filesystem.
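
To illustrate, here is a rough sketch of a job driver using the classic org.apache.hadoop.mapred API (class and method names are as in later Hadoop releases and may differ in the version discussed here; the "input" and "output" paths are made up for the example). The point is that those paths are resolved against the default filesystem named by fs.default.name, so the same data has to be visible there from every node:

    // Rough sketch, classic org.apache.hadoop.mapred API (names from later
    // releases; exact calls may differ in the version discussed in this thread).
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.FileInputFormat;
    import org.apache.hadoop.mapred.FileOutputFormat;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class SharedFsJob {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SharedFsJob.class);
        conf.setJobName("shared-fs-example");

        // These paths resolve against fs.default.name, so "input" and "output"
        // must live on a filesystem every node can see (HDFS, NFS, S3, ...).
        FileInputFormat.setInputPaths(conf, new Path("input"));
        FileOutputFormat.setOutputPath(conf, new Path("output"));

        // With no mapper/reducer set this runs an identity job; it only works
        // if the job tracker and all task trackers share the filesystem above.
        JobClient.runJob(conf);
      }
    }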

So, if you're using NFS, then you'd set fs.default.name to something like "file:///mnt/shared/hadoop/". Note also that as your cluster grows, NFS will soon become a bottleneck. That's why HDFS is provided: there aren't other readily available shared filesystems that scale appropriately.
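
For example, the hadoop-site.xml on every node might contain something like the following (a minimal sketch; it assumes each machine mounts the same NFS export at /mnt/shared/hadoop, the path used above):

    <!-- hadoop-site.xml: minimal sketch for an NFS-backed setup.
         Assumes every node mounts the same export at /mnt/shared/hadoop. -->
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>file:///mnt/shared/hadoop/</value>
      </property>
    </configuration>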

Has anybody been using Hadoop with ZFS? Would ZFS count as a readily available shared filesystem that scales appropriately?

Thanks,

-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"
