Ross wrote:
> Hey folks,
>
> I guess this is an odd question to be asking here, but I could do with some 
> feedback from anybody who's actually using ZFS in anger.
>   

I've been using ZFS for nearly 3 years now.  It has been my (mirrored :-)
home directory for that time.  I've never lost any of that data, though I do
spend some time torturing both ZFS and the hardware.  Inside Sun, we use
ZFS home directories for a large number of developers, and those servers
are upgraded with every build.  As marketing would say, we eat our own
dog food.
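
For what it's worth, there is nothing exotic about such a setup.  A
minimal sketch, with placeholder device names that will differ on your
hardware:

  # create a mirrored pool from two whole disks
  zpool create home mirror c0t0d0 c0t1d0
  # one filesystem per user keeps quotas and snapshots per-user
  zfs create home/richard
  zfs set quota=20g home/richard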

> I'm about to go live with ZFS in our company on a new fileserver, but I have 
> some real concerns about whether I can really trust ZFS to keep my data alive 
> if things go wrong.  This is a big step for us, we're a 100% windows company 
> and I'm really going out on a limb by pushing Solaris.
>   

I'm not that familiar with running Windows file systems for large
numbers of users, but my personal experience with them has been
fraught with data loss and, going back a few years, "ABORT, RETRY,
GIVE UP."

> The problems with zpool status hanging concern me, knowing that I can't hot 
> plug drives is an issue, and the long resilver times bug is also a potential 
> problem.  I suspect I can work around the hot plug drive bug with a big 
> warning label on the server, but knowing the pool can hang so easily makes me 
> worry about how well ZFS will handle other faults.
>   

While you've demonstrated hot-unplug problems with USB drives, that
path through the software is very different from the one used by the
more traditional hot-pluggable SAS/FC/UltraSCSI devices.  USB devices
are treated as removable media, and their use case is very different
from what is normally expected of enterprise-class storage devices.
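
For the traditional hot-plug path on Solaris, the usual drill is to
unconfigure the disk before pulling it.  Roughly, with an attachment
point name that will vary on your hardware:

  # list attachment points to find the disk
  cfgadm -al
  # take the disk offline before physically removing it
  cfgadm -c unconfigure sata1/3
  # after inserting the replacement
  cfgadm -c configure sata1/3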

> On my drive home tonight I was wondering whether I'm going to have to swallow 
> my pride and order a hardware raid controller for this server, letting that 
> deal with the drive issues, and just using ZFS as a very basic filesystem.
>   

If you put all of your trust in a hardware RAID controller, then one
day you may be disappointed.  This is why we tend to recommend adding
some sort of data protection at the ZFS level, regardless of the
hardware underneath.  If you look through this forum's archives, you
will find people who have discovered a faulty RAID controller, switch,
HBA, or other device precisely because ZFS flagged the corruption.
With most other file systems, such faults are very difficult to isolate.
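
Even when a RAID controller presents LUNs, you can still let ZFS
mirror across them and scrub regularly.  A sketch with placeholder
LUN names:

  # mirror across two LUNs, ideally on separate controllers
  zpool create tank mirror c2t0d0 c3t0d0
  # a scrub verifies every block against its checksum
  zpool scrub tank
  # persistent CKSUM errors here point at the misbehaving component
  zpool status -v tank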

> What has me re-considering ZFS though is that on the other hand I know the 
> Thumpers have sold well for Sun, and they pretty much have to use ZFS.  So 
> there's a big installed base out there using it, and that base has been using 
> it for a few years.  I know from the Thumper manual that you have to 
> unconfigure drives before removal on those servers, which goes a long 
> way towards making me think that should be a relatively safe way to work. 
>   

You can run Windows, RHEL, FreeBSD, and probably another dozen or
two OSes on Thumpers.  We have customers who run many different OSes
on our open, industry-standard hardware.

> The question is whether I can make a server I can be confident in.  I'm now 
> planning a very basic OpenSolaris server just using ZFS as a NFS server, is 
> there anybody out there who can re-assure me that such a server can work well 
> and handle real life drive failures?
>   

Going back to your USB-removal test: if you protect that disk at the
ZFS level, for example with a mirror, then when the disk is pulled it
will be detected as removed.  zpool status will show the device's
state as REMOVED and the pool as DEGRADED, but the pool will continue
to function, as expected.  Plugging the USB device back in will bring
it online again, also as expected, and it should resilver
automatically.  To reiterate, it is best to let ZFS do the data
protection regardless of the storage used.
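
The recovery steps are minimal.  A sketch, assuming a mirrored pool
named tank and a placeholder device name:

  # the pool stays up; status shows REMOVED/DEGRADED while the disk is out
  zpool status tank
  # after reinserting the same disk, bring it online if needed
  zpool online tank c1t0d0
  # or, for a brand-new disk in the same slot, replace and resilver
  zpool replace tank c1t0d0
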
 -- richard

