At 9:41 am -0700 4/16/07, Doug Cutting wrote:
>Eelco Lempsink wrote:
>>Inspired by
>>http://www.mail-archive.com/[EMAIL PROTECTED]/msg02394.html
>>I'm trying to run Hadoop on multiple CPUs, but without using HDFS.
>
>To be clear: you need some sort of shared filesystem, if not HDFS,
>then NFS, S3, or something else. For example, the job client
>interacts with the job tracker by copying files to the shared
>filesystem named by fs.default.name, and job inputs and outputs are
>assumed to come from a shared filesystem.
>
>So, if you're using NFS, then you'd set fs.default.name to something
>like "file:///mnt/shared/hadoop/". Note also that as your cluster
>grows, NFS will soon become a bottleneck. That's why HDFS is
>provided: there aren't other readily available shared filesystems
>that scale appropriately.
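For anyone following along, a minimal sketch of what Doug describes might look like this in hadoop-site.xml — assuming every node mounts the same NFS export at /mnt/shared/hadoop (the path and the mapred.system.dir setting here are illustrative, not from the original message):

```xml
<!-- hadoop-site.xml: point Hadoop at a shared NFS mount instead of HDFS.
     /mnt/shared/hadoop is a hypothetical mount point; it must be the
     same path on every node in the cluster. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>file:///mnt/shared/hadoop/</value>
  </property>
  <property>
    <!-- keep the jobtracker's system directory on the shared mount too,
         so the job client and job tracker see the same files -->
    <name>mapred.system.dir</name>
    <value>/mnt/shared/hadoop/mapred/system</value>
  </property>
</configuration>
```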
Has anybody been using Hadoop with ZFS? Would ZFS count as a readily
available shared file system that scales appropriately?
Thanks,
-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"