Ken Krugler wrote:
Has anybody been using Hadoop with ZFS? Would ZFS count as a readily available shared file system that scales appropriately?

Sun's ZFS? I don't think that's distributed, is it? Does it provide a single namespace across an arbitrarily large cluster? From the documentation I can find it just sounds like a better single-node filesystem. It'd be good for, e.g., mounting 40 1TB drives on a big Sun box, but I don't see how it's meant to, e.g., stitch together 4,000 drives across a cluster of 1,000 nodes into a single filesystem.

I'd seen references to using ZFS as a "poor man's cluster", e.g. http://blogs.sun.com/erickustarz/entry/poor_man_s_cluster_end and http://www.opensolaris.org/jive/message.jspa?messageID=22182#22182.

From reading them, it's clear that ZFS isn't a distributed file system, nor does it cleanly support shared access...though people are hacking on it to achieve some of these goals.

But for Hadoop users who don't have the requirement to access 4K drives on 1K servers (which would be, oh, maybe 99.9% of the universe :)) it might be an interesting option for a high-performance, high-reliability FS that scales further than NFS.

Having said that, we don't use Solaris (everything is Linux-based). There's a port in motion to Linux, from what I've read, so it might become more interesting then.

-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"Find Code, Find Answers"
