> apologies in advance for prolonging this thread ..

Why do you feel any need to?  If you were contributing posts as completely devoid of technical content as some of the morons here have recently been submitting I could understand it, but my impression is that the purpose of this forum is to explore the kind of questions that you're interested in discussing.
> i had considered taking this completely offline, but thought of a few
> people at least who might find this discussion somewhat interesting

And any who don't are free to ignore it, so no harm done there either.

> .. at the least i haven't seen any mention of Merkle trees yet as the
> nerd in me yearns for

I'd never heard of them myself until recently, despite having independently come up with the idea of using a checksumming mechanism very similar to ZFS's.  Merkle seems to be an interesting guy - his home page is worth a visit.

> On Dec 5, 2007, at 19:42, bill todd - aka can you guess? wrote:
>
> >> what are you terming as "ZFS' incremental risk reduction"? ..
> >> (seems like a leading statement toward a particular assumption)
>
> > Primarily its checksumming features, since other open source
> > solutions support simple disk scrubbing (which given its ability to
> > catch most deteriorating disk sectors before they become unreadable
> > probably has a greater effect on reliability than checksums in any
> > environment where the hardware hasn't been slapped together so
> > sloppily that connections are flaky).
>
> ah .. okay - at first reading "incremental risk reduction" seems to
> imply an incomplete approach to risk

The intent was to suggest a step-wise approach to risk, in which some steps are far more significant than others (though of course some degree of overlap between steps is also possible).  *All* approaches to risk are incomplete.

> ... i do believe that an interesting use of the merkle tree with a
> sha256 hash is somewhat of an improvement over conventional
> volume-based data scrubbing techniques

Of course it is: that's why I described it as 'incremental' rather than as 'redundant'.  The question is just how *significant* an improvement it offers.

> since there can be a unique integration between the hash tree for the
> filesystem block layout and a hierarchical data validation method.
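For anyone who hasn't run into Merkle trees before, the idea is small enough to sketch in a few lines: hash each data block, then hash pairs of digests upward until a single root digest remains.  This is only a toy illustration of the general technique (the function names and block contents are mine), not ZFS's actual on-disk layout:

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(blocks):
    """Hash each data block, then repeatedly hash concatenated pairs
    of child digests until a single root digest remains."""
    level = [sha256(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: carry the last digest up
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"block0", b"block1", b"block2", b"block3"]
root = merkle_root(blocks)

# Flipping any single block changes the root: one 32-byte digest
# vouches for every block beneath it.
assert merkle_root([b"blockX", b"block1", b"block2", b"block3"]) != root
```

The attraction for a filesystem is exactly the 'unique integration' you mention: the hash tree can follow the indirect-block tree, so validating a path from the root validates the layout and the data together.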
> In addition to finding unknown areas with the scrub, you're also
> doing relatively inexpensive data validation checks on every read.

Yup.

...

> sure - we've seen many transport errors,

I'm curious what you mean by that, since CRCs on the transports usually virtually eliminate them as problems.  Unless you mean that you've seen many *corrected* transport errors (indicating that the CRC and retry mechanisms are doing their job and that additional ZFS protection in this area is probably redundant).

> as well as firmware implementation errors

Quantitative and specific examples are always good for this kind of thing; the specific hardware involved is especially significant to discussions of the sort that we're having (given ZFS's emphasis on eliminating the need for much special-purpose hardware).

> .. in fact with many arrays we've seen data corruption issues with
> the scrub

I'm not sure exactly what you're saying here: is it that the scrub has *uncovered* many apparent instances of data corruption (as distinct from, e.g., merely unreadable disk sectors)?

> (particularly if the checksum is singly stored along with the data
> block)

Since (with the possible exception of the superblock) ZFS never stores a checksum 'along with the data block', I'm not sure what you're saying there either.

> - just like spam you really want to eliminate false positives that
> could indicate corruption where there isn't any.

The only risk that ZFS's checksums run is the infinitesimal possibility that corruption won't be detected, not that they'll return a false positive.

> if you take some time to read the on-disk format for ZFS you'll see
> that there's a tradeoff that's done in favor of storing more
> checksums in many different areas instead of making more room for
> direct block pointers.
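To make the 'never stores a checksum along with the data block' point concrete: each block's checksum lives in the block pointer held by its *parent*, so a read is validated against a digest fetched from a block that was itself already validated.  A toy sketch of that read path (the names and the dict-as-disk are my own illustration, not ZFS structures):

```python
import hashlib
from dataclasses import dataclass

@dataclass
class BlockPointer:
    address: int          # where the child block lives
    checksum: bytes       # digest of the child, held by the PARENT

disk = {}                 # toy "disk": address -> raw bytes

def write_block(address: int, data: bytes) -> BlockPointer:
    """Store the data and hand the caller (the parent) its checksum."""
    disk[address] = data
    return BlockPointer(address, hashlib.sha256(data).digest())

def read_block(ptr: BlockPointer) -> bytes:
    """Validate the data against the checksum from the parent pointer."""
    data = disk[ptr.address]
    if hashlib.sha256(data).digest() != ptr.checksum:
        raise IOError("checksum mismatch: block is corrupt")
    return data

ptr = write_block(0, b"payload")
assert read_block(ptr) == b"payload"

disk[0] = b"bit-rot"      # silent corruption on the "disk"
try:
    read_block(ptr)
except IOError:
    pass                  # the corruption is caught on read
```

Since a block that rots in place can no longer corrupt its own checksum, a mismatch means the data really is bad - which is why the false-positive worry doesn't apply here.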
While I haven't read that yet, I'm familiar with the trade-off between using extremely wide checksums (as ZFS does - I'm not really sure why, since cryptographic-level security doesn't seem necessary in this application) and limiting the depth of the indirect block tree.  But (yet again) I'm not sure what you're trying to get at here.

...

> on this list we've seen a number of consumer-level products
> including sata controllers, and raid cards (which are also becoming
> more commonplace in the consumer realm) that can be confirmed to
> throw data errors.

Your phrasing here is a bit unusual ('throwing errors' - or exceptions - is not commonly associated with corrupting data).  If you're referring to some kind of silent data corruption, once again specifics are important: to put it bluntly, a lot of the people here are only semi-competent technically and appear to have a personal interest in finding ways to justify their enthusiasm for ZFS, so their anecdotal reports require scrutiny (especially since the only formal studies that I'm familiar with in this area don't seem to reflect their reported experiences).

> Code maturity issues aside, there aren't very many array vendors
> that are open-sourcing their array firmware - and if you consider
> zfs as a feature-set that could function as a multi-purpose storage
> array (systems are cheap) - i find it refreshing that everything
> that's being done under the covers is really out in the open.

As, of course, is equally the case with other open-source software solutions that offer similar reliability: ZFS is not exactly anything new in this respect, save for its specific checksumming mechanism (the incremental value of which is the question here).
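Returning for a moment to the pointer-width trade-off mentioned above, the arithmetic is easy to see: the wider each block pointer (a 256-bit checksum plus addresses and metadata), the fewer pointers fit in an indirect block, and the deeper the tree must be to address a file of a given size.  The block and pointer sizes below are illustrative round numbers of my choosing, not figures from the ZFS on-disk specification:

```python
def tree_depth(file_blocks: int, indirect_block_size: int, ptr_size: int) -> int:
    """Levels of indirection needed to address file_blocks data blocks
    when each indirect block holds indirect_block_size // ptr_size
    pointers."""
    fanout = indirect_block_size // ptr_size
    depth, reach = 0, 1
    while reach < file_blocks:       # widen reach one level at a time
        reach *= fanout
        depth += 1
    return depth

blocks = 2 ** 30                     # e.g. 2^30 data blocks in one file

# Hypothetical pointer sizes: a slim pointer with a short checksum vs.
# a fat pointer carrying a 256-bit checksum plus other metadata.
print(tree_depth(blocks, 16 * 1024, 16))    # slim pointers: depth 3
print(tree_depth(blocks, 16 * 1024, 128))   # fat pointers:  depth 5
```

Two extra levels of indirection on every cold lookup is the price paid for the wide checksums - which is why I question whether cryptographic width buys anything useful here.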
> > And otherwise undetected disk errors occur with negligible
> > frequency compared with software errors that can silently trash
> > your data in ZFS cache or in application buffers (especially in PC
> > environments: enterprise software at least tends to be more stable
> > and more carefully controlled - not to mention their typical use of
> > ECC RAM).
> >
> > So depending upon ZFS's checksums to protect your data in most PC
> > environments is sort of like leaving on a vacation and locking and
> > bolting the back door of your house while leaving the front door
> > wide open: yes, a burglar is less likely to enter by the back
> > door, but thinking that the extra bolt there made you much safer is
> > likely foolish.
>
> granted - it's not an all-in-one solution, but by combining the
> merkle tree approach with the sha256 checksum along with periodic
> data scrubbing - it's a darn good approach ..

As are the other open-source solutions that offer similar reliability: you can keep touting ZFS's checksums until the cows come home, but unless you *quantify* their incremental value (as Anton just took a stab at doing elsewhere) it's just more fanboy babble (which again may seem a bit blunt, but when you keep asserting that something is significantly valuable without ever being willing to step up to the plate and quantify that value - especially in comparison to other remaining risks - after repeated challenges to do so, a little bluntness seems to be in order).

> particularly since it also tends to cost a lot less than what you
> might have to pay elsewhere for something you can't really see
> inside.

But no more than the other open-source solutions that you *can* see inside.

...

> you do seem to be repeating the phrase "incremental protection"
> recently which i think i take issue with.

Then I hope that the above has helped you understand it better.

...
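To show what I mean by quantifying incremental value, here's the 'bolted back door' argument in numbers.  The probabilities are purely illustrative figures I've made up for the sake of the arithmetic, not measurements - the point is the shape of the calculation, not the inputs:

```python
# Purely illustrative annual probabilities (NOT measurements) of losing
# a given piece of data to each independent failure mode.
software_bug   = 1e-3    # OS or application bug silently trashes data
operator_error = 1e-3    # accidental deletion or overwrite
silent_disk    = 1e-5    # on-disk corruption that scrubbing alone misses

def total_risk(*risks: float) -> float:
    """Probability that at least one independent failure occurs."""
    p_ok = 1.0
    for r in risks:
        p_ok *= 1.0 - r
    return 1.0 - p_ok

without_cksum = total_risk(software_bug, operator_error, silent_disk)
with_cksum    = total_risk(software_bug, operator_error)   # checksums
                                         # eliminate the silent_disk term

# The checksums remove the smallest term, so the overall risk barely
# moves - the bolted back door, in numbers.
print(f"{without_cksum:.6f} -> {with_cksum:.6f}")
```

Plug in whatever figures you believe are right: unless silent disk corruption dominates the other failure modes, the residual risk hardly changes.  That's the quantification I keep asking for.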
> checksums and scrubbing are only a piece of the larger data
> protection schemes.

And a piece whose utility is (I'll say it again) quantifiable.

> This should really be used along with snapshots, replication, and
> backup

Only if the checksums contribute significantly to the overall result - otherwise, they're just another 'nice to have, all other things being equal' feature.  Waving one's hands about snapshots, replication, and backup is not enough: you have to quantify how much each adds to the overall level of protection and then see how much *more* a feature like ZFS's checksums reduces the residual risk.  Do you have more than one backup copy, do you compare each with the original after creating it, are at least some of them off site, and do you periodically verify that their contents are readable?  If any of those is not true, then your residual risk probably won't be markedly reduced by the existence of ZFS-style checksums.

- bill

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss