Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-16 Thread Miles Nordin
nw == Nicolas Williams nicolas.willi...@sun.com writes: nw For NFSv4 clients that support mirror mounts it's not a problem nw at all. no, 3000 - 10,000 users is common for a large campus, and according to posters here, sometimes that many users actually can fit into the bandwidth of a

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-16 Thread John Kaitschuck
Miles Nordin wrote: nw == Nicolas Williams nicolas.willi...@sun.com writes: nw You're not required to go with one-filesystem-per-user though! It was pitched as an architectural advantage, but never fully delivered, and worse, used to justify removing traditional Unix quotas.

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Ross Smith
Forgive me for not understanding the details, but couldn't you also work backwards through the blocks with ZFS and attempt to recreate the uberblock? So if you lost the uberblock, could you (memory and time allowing) start scanning the disk, looking for orphan blocks that aren't referenced

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Casper . Dik
I think the problem for me is not that there's a risk of data loss if a pool becomes corrupt, but that there are no recovery tools available. With UFS, people expect that if the worst happens, fsck will be able to recover their data in most cases. Except, of course, that fsck lies. In fixes

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Ross
I think the problem for me is not that there's a risk of data loss if a pool becomes corrupt, but that there are no recovery tools available. With UFS, people expect that if the worst happens, fsck will be able to recover their data in most cases. With ZFS you have no such tools, yet Victor

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Bob Friesenhahn
On Mon, 15 Dec 2008, Ross wrote: My concern is that ZFS has all this information on disk, it has the ability to know exactly what is and isn't corrupted, and it should (at least for a system with snapshots) have many, many potential uberblocks to try. It should be far, far better than UFS

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Ross Smith
I'm not sure I follow how that can happen, I thought ZFS writes were designed to be atomic? They either commit properly on disk or they don't? On Mon, Dec 15, 2008 at 6:34 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Mon, 15 Dec 2008, Ross wrote: My concern is that ZFS has all

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Bob Friesenhahn
On Mon, 15 Dec 2008, Ross Smith wrote: I'm not sure I follow how that can happen, I thought ZFS writes were designed to be atomic? They either commit properly on disk or they don't? Yes, this is true. One reason why people complain about corrupted ZFS pools is because they have hardware

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Nicolas Williams
On Mon, Dec 15, 2008 at 01:36:46PM -0600, Bob Friesenhahn wrote: On Mon, 15 Dec 2008, Ross Smith wrote: I'm not sure I follow how that can happen, I thought ZFS writes were designed to be atomic? They either commit properly on disk or they don't? Yes, this is true. One reason why

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Miles Nordin
nw == Nicolas Williams nicolas.willi...@sun.com writes: nw Your thesis is that all corruption problems observed with ZFS nw on SANs are: a) phantom writes that never reached the rotating nw rust, b) not bit rot, corruption in the I/O paths, ... nw Correct? yeah. by ``all'' I

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Miles Nordin
bc == Bryan Cantrill b...@eng.sun.com writes: jz == Joseph Zhou j...@excelsioritsolutions.com writes: bc most of the people I talk to are actually _using_ NetApp's bc technology, a practice that tends to leave even the most bc stalwart proponents realistic about the (many)

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Nicolas Williams
On Mon, Dec 15, 2008 at 05:04:03PM -0500, Miles Nordin wrote: As Tim said, the one-filesystem-per-user thing is not working out. For NFSv3 clients that truncate MOUNT protocol answers (and v4 clients that still rely on the MOUNT protocol), yes, one-filesystem-per-user is a problem. For NFSv4

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-15 Thread Toby Thain
Maybe the format allows unlimited O(1) snapshots, but it's at best O(1) to take them. All over the place it's probably O(n) or worse to _have_ them. to boot with them, to scrub with them. Why would a scrub be O(n snapshots)? The O(n filesystems) effects reported from time to time in

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Joseph Zhou
Richard, I have been glancing through the posts, saw more hardware RAID vs ZFS discussion, some are very useful. However, as you advised me the other day, we should think about the overall solution architect, not just the feature itself. I believe the spirit of ZFS snapshot is more significant

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Tim
On Fri, Dec 12, 2008 at 8:16 PM, Jeff Bonwick jeff.bonw...@sun.com wrote: I'm going to pitch in here as devil's advocate and say this is hardly revolution. 99% of what zfs is attempting to do is something NetApp and WAFL have been doing for 15 years+. Regardless of the merits of their

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Jeff Bonwick
Off the top of my head nearly all of them. Some of them have artificial limitations because they learned the hard way that if you give customers enough rope they'll hang themselves. For instance unlimited snapshots. Oh, that's precious! It's not an arbitrary limit, it's a safety feature!

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Bryan Cantrill
Seriously? Do you know anything about the NetApp platform? I'm hoping this is a genuine question... Off the top of my head nearly all of them. Some of them have artificial limitations because they learned the hard way that if you give customers enough rope they'll hang themselves. For

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Bob Friesenhahn
On Sat, 13 Dec 2008, Tim wrote: Seriously? Do you know anything about the NetApp platform? I'm hoping this is a genuine question... I believe that esteemed Sun engineers like Jeff are quite familiar with the NetApp platform. Besides NetApp being one of the primary storage competitors, it

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Joseph Zhou
Message - From: Bob Friesenhahn bfrie...@simple.dallas.tx.us To: Tim t...@tcsac.net Cc: zfs-discuss@opensolaris.org Sent: Saturday, December 13, 2008 11:03 AM Subject: Re: [zfs-discuss] Split responsibility for data with ZFS On Sat, 13 Dec 2008, Tim wrote: Seriously? Do you know anything

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Bob Friesenhahn
On Sat, 13 Dec 2008, Joseph Zhou wrote: In that spirit, and looking at the NetApp virtual server support architecture, I would say -- as much as the ONTAP/WAFL thing (even with GX integration) is elegant, it would make more sense to utilize the file system capabilities with kernel

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Anton B. Rang
I wasn't joking, though as is well known, the plural of anecdote is not data. Both UFS and ZFS, in common with all file systems, have design flaws and bugs. To lose an entire UFS file system (barring the loss of the entire underlying storage) requires a great deal of corruption; there are

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Anton B. Rang
Some RAID systems compare checksums on reads, though this is usually only for RAID-4 configurations (e.g. DataDirect) because of the performance hit otherwise. End-to-end checksums are not yet common. The SCSI committee recently ratified T10 DIF, which allows either an operating system or

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Richard Elling
Anton B. Rang wrote: I find it naïve to imagine that Sun customers expect their UFS (or other) file systems to be unrecoverable. OK, I'll bite. If we believe the disk vendors who rate their disks as having an unrecoverable error rate of 1 bit per 10^14 bits read, and knowing that UFS has

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-13 Thread Richard Elling
Anton B. Rang wrote: Some RAID systems compare checksums on reads, though this is usually only for RAID-4 configurations (e.g. DataDirect) because of the performance hit otherwise. For the record, Solaris had a (mirrored) RAID system which would compare data from both sides of the mirror

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Gary Mills
On Thu, Dec 11, 2008 at 10:41:26PM -0600, Bob Friesenhahn wrote: On Thu, 11 Dec 2008, Gary Mills wrote: The split responsibility model is quite appealing. I'd like to see ZFS address this model. Is there not a way that ZFS could delegate responsibility for both error detection and correction

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Ross
It really comes down to how much you trust the SAN and transport technology. If you're happy that you've got a good SAN, and you have a transport that guarantees the integrity of the data then there's no reason ZFS shouldn't be reliable. Personally I'd be happier once some of the recovery

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Miles Nordin
tt == Toby Thain t...@telegraphics.com.au writes: mg == Mike Gerdts mger...@gmail.com writes: tt I think we have to assume Anton was joking - otherwise his tt measure is uselessly unscientific. I think it's rude to talk about someone who's present in the third person, especially when

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Richard Elling
Gary Mills wrote: On Thu, Dec 11, 2008 at 10:41:26PM -0600, Bob Friesenhahn wrote: On Thu, 11 Dec 2008, Gary Mills wrote: The split responsibility model is quite appealing. I'd like to see ZFS address this model. Is there not a way that ZFS could delegate responsibility for both

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Nicolas Williams
On Fri, Dec 12, 2008 at 01:52:54PM -0600, Gary Mills wrote: On Fri, Dec 12, 2008 at 04:30:51PM +1300, Ian Collins wrote: No matter how good your SAN is, it won't spot a flaky cable or bad RAM. Of course it will. There's an error-checking protocol that runs over the SAN cable. Memory will

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Johan Hartzenberg
On Fri, Dec 12, 2008 at 10:10 PM, Miles Nordin car...@ivy.net wrote: 0. The reports I read were not useless in the way some have stated, because for example Mike sampled his own observations: [snip] I don't see when the single-LUN SAN corruption problems were fixed. I think the

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Toby Thain
On 12-Dec-08, at 3:10 PM, Miles Nordin wrote: tt == Toby Thain t...@telegraphics.com.au writes: mg == Mike Gerdts mger...@gmail.com writes: tt I think we have to assume Anton was joking - otherwise his tt measure is uselessly unscientific. I think it's rude to talk about someone

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Toby Thain
On 12-Dec-08, at 3:38 PM, Johan Hartzenberg wrote: ... The only bit that I understand about why HW raid might be bad is that if it had access to the disks behind a HW RAID LUN, then _IF_ zfs were to encounter corrupted data in a read, it will probably be able to re-construct that data.

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Bob Friesenhahn
On Fri, 12 Dec 2008, Toby Thain wrote: 1. I don't think your impressions nor Anton's and mine are ``useless'' Alright, I agree I should retract the 'useless' but I would keep the 'unscientific'. There is no need to retract the 'useless'. By the same useless measure, George Bush Jr has done

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Bob Friesenhahn
On Fri, 12 Dec 2008, Toby Thain wrote: Except that it's not just parity - ZFS checksums where RAID-N does not (although I've heard that some RAID systems checksum somewhere - not end-to-end of course). It will soon be quite easy to build a RAID system like this using OpenSolaris and a

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Richard Elling
Johan Hartzenberg wrote: There are so many unsupported claims and so much noise on both sides that everybody is sounding like a bunch of fanboys. I don't think there are two sides. Anyone who has been around computing for any length of time has lost data due to various failures. The question isn't

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Tim
On Fri, Dec 12, 2008 at 2:51 PM, Toby Thain t...@telegraphics.com.au wrote: On 12-Dec-08, at 3:38 PM, Johan Hartzenberg wrote: ... The only bit that I understand about why HW raid might be bad is that if it had access to the disks behind a HW RAID LUN, then _IF_ zfs were to encounter

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Tim
On Fri, Dec 12, 2008 at 3:36 PM, Ian Collins i...@ianshome.com wrote: The ideas aren't new, but the combination of the ideas is. NetApp is still a box at the end of a bit of wire that the OS has to blindly trust. -- Ian. I'm not aware of many, if any large shops that are moving to a

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Ian Collins
Tim wrote: On Fri, Dec 12, 2008 at 3:36 PM, Ian Collins i...@ianshome.com wrote: The ideas aren't new, but the combination of the ideas is. NetApp is still a box at the end of a bit of wire that the OS has to blindly trust. -- Ian. I'm

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Miles Nordin
nw == Nicolas Williams nicolas.willi...@sun.com writes: tt == Toby Thain t...@telegraphics.com.au writes: jh == Johan Hartzenberg jhart...@gmail.com writes: nw If you can fully trust the SAN then there's no reason not to nw run ZFS on top of it with no ZFS mirrors and no RAID-Z. The

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Nicolas Williams
On Fri, Dec 12, 2008 at 05:31:37PM -0500, Miles Nordin wrote: nw If you can fully trust the SAN then there's no reason not to nw run ZFS on top of it with no ZFS mirrors and no RAID-Z. The best practice I understood is currently to use zpool-layer redundancy especially with SAN even

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Ed Spencer
I find this thread both interesting and disturbing. I'm fairly new to this list so please excuse me if my comments/opinions are simplistic or just incorrect. I think there's been too much FC SAN bashing so let me change the example. What if you buy a 7000 Series server (complete with zfs) and

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Jeff Bonwick
I'm going to pitch in here as devil's advocate and say this is hardly revolution. 99% of what zfs is attempting to do is something NetApp and WAFL have been doing for 15 years+. Regardless of the merits of their patents and prior art, etc., this is not something revolutionarily new. It may

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-12 Thread Richard Elling
[sigh, here we go again... isn't this in a FAQ somewhere, it certainly is in the archives...] Ed Spencer wrote: I find this thread both interesting and disturbing. I'm fairly new to this list so please excuse me if my comments/opinions are simplistic or just incorrect. I think there's been

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-11 Thread Bob Friesenhahn
On Wed, 10 Dec 2008, Anton B. Rang wrote: It sounds like you have access to a source of information that the rest of us don't have access to. I think if you read the archives of this mailing list, and compare it to the discussions on the other Solaris mailing lists re UFS, it's a

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-11 Thread Robert Milkowski
Hello Anton, Thursday, December 11, 2008, 4:17:15 AM, you wrote: It sounds like you have access to a source of information that the rest of us don't have access to. ABR I think if you read the archives of this mailing list, and ABR compare it to the discussions on the other Solaris mailing

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-11 Thread Toby Thain
On 11-Dec-08, at 12:28 PM, Robert Milkowski wrote: Hello Anton, Thursday, December 11, 2008, 4:17:15 AM, you wrote: It sounds like you have access to a source of information that the rest of us don't have access to. ABR I think if you read the archives of this mailing list, and ABR

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-11 Thread Gary Mills
On Wed, Dec 10, 2008 at 12:58:48PM -0800, Richard Elling wrote: Nicolas Williams wrote: On Wed, Dec 10, 2008 at 01:30:30PM -0600, Nicolas Williams wrote: On Wed, Dec 10, 2008 at 12:46:40PM -0600, Gary Mills wrote: On the server, a variety of filesystems can be created on this virtual

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-11 Thread Ian Collins
Gary Mills wrote: The split responsibility model is quite appealing. I'd like to see ZFS address this model. Is there not a way that ZFS could delegate responsibility for both error detection and correction to the storage device, at least one more sophisticated than a physical disk?

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-11 Thread Bob Friesenhahn
On Thu, 11 Dec 2008, Gary Mills wrote: The split responsibility model is quite appealing. I'd like to see ZFS address this model. Is there not a way that ZFS could delegate responsibility for both error detection and correction to the storage device, at least one more sophisticated than a

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-11 Thread Richard Elling
Gary Mills wrote: On Wed, Dec 10, 2008 at 12:58:48PM -0800, Richard Elling wrote: Nicolas Williams wrote: On Wed, Dec 10, 2008 at 01:30:30PM -0600, Nicolas Williams wrote: On Wed, Dec 10, 2008 at 12:46:40PM -0600, Gary Mills wrote: On the server, a variety

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-11 Thread Nicolas Williams
On Thu, Dec 11, 2008 at 09:54:36PM -0800, Richard Elling wrote: I'm not really sure what you mean by split responsibility model. I think you will find that previous designs have more (blind?) trust in the underlying infrastructure. ZFS is designed to trust, but verify. I think he means ZFS w/

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Bob Friesenhahn
On Wed, 10 Dec 2008, Gary Mills wrote: This is a split responsibility configuration where the storage device is responsible for integrity of the storage and ZFS is responsible for integrity of the filesystem. How can it be made to behave in a reliable manner? Can ZFS be better than UFS in

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Will Murnane
On Wed, Dec 10, 2008 at 18:46, Gary Mills [EMAIL PROTECTED] wrote: The storage device provides reliability and integrity for the blocks of data that it serves, and does this well. But not well enough. Even if the storage does a perfect job keeping its bits correct on disk, there are a lot of

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Nicolas Williams
On Wed, Dec 10, 2008 at 12:46:40PM -0600, Gary Mills wrote: On the server, a variety of filesystems can be created on this virtual disk. UFS is most common, but ZFS has a number of advantages over UFS. Two of these are dynamic space management and snapshots. There are also a number of

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Ross
I agree completely with your assessment of the problems Gary, when ZFS can't correct your data you do seem to be at high risk of losing data, although some people are able to recover it with the help of a couple of helpful souls on this forum. I can think of one scenario where you might be

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Nicolas Williams
On Wed, Dec 10, 2008 at 01:30:30PM -0600, Nicolas Williams wrote: On Wed, Dec 10, 2008 at 12:46:40PM -0600, Gary Mills wrote: On the server, a variety of filesystems can be created on this virtual disk. UFS is most common, but ZFS has a number of advantages over UFS. Two of these are

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Richard Elling
Nicolas Williams wrote: On Wed, Dec 10, 2008 at 01:30:30PM -0600, Nicolas Williams wrote: On Wed, Dec 10, 2008 at 12:46:40PM -0600, Gary Mills wrote: On the server, a variety of filesystems can be created on this virtual disk. UFS is most common, but ZFS has a number of advantages

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Nicolas Williams
On Wed, Dec 10, 2008 at 12:58:48PM -0800, Richard Elling wrote: Nicolas Williams wrote: But note that the setup you describe puts ZFS in no worse a situation than any other filesystem. Well, actually, it does. ZFS is susceptible to a class of failure modes I classify as kill the canary

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Miles Nordin
nw == Nicolas Williams [EMAIL PROTECTED] writes: wm == Will Murnane [EMAIL PROTECTED] writes: nw ZFS has very strong error detection built-in, nw ZFS can also store multiple copies of data and metadata even nw in non-mirrored/non-RAID-Z pools. nw Whoever is making those

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Miles Nordin
re == Richard Elling [EMAIL PROTECTED] writes: re ZFS will detect errors and complain about them, which results re in people blaming ZFS (the canary). this is some really sketchy spin. Sometimes you will say ZFS stores multiple copies of metadata, so even on an unredundant pool a few

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Bob Friesenhahn
On Wed, 10 Dec 2008, Miles Nordin wrote: The objection, to review, is that people are losing entire ZFS pools on SAN's more often than UFS pools on the same SAN. This is It sounds like you have access to a source of information that the rest of us don't have access to. Perhaps it is a secret

Re: [zfs-discuss] Split responsibility for data with ZFS

2008-12-10 Thread Anton B. Rang
It sounds like you have access to a source of information that the rest of us don't have access to. I think if you read the archives of this mailing list, and compare it to the discussions on the other Solaris mailing lists re UFS, it's a reasonable conclusion. -- This message posted from