Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Jim Klimov
2012-01-13 5:30, Daniel Carosone wrote: Corrupted file data that is then accurately checksummed and readable as valid? Speaking of which, is there currently any simple way to disable checksum validation during data reads (and not cause a kernel panic when reading garbage under the guise of metad

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Jim Klimov
2012-01-13 5:01, Richard Elling wrote: On Jan 12, 2012, at 2:34 PM, Jim Klimov wrote: Metadata is at least doubly redundant and checksummed. True, and this helps if it is valid in the first place (in RAM). As has been reported by many blog-posts researching ZDB, there do happen case

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Jim Klimov
2012-01-13 5:30, Daniel Carosone wrote: On Thu, Jan 12, 2012 at 05:01:48PM -0800, Richard Elling wrote: This thread is about checksums - namely, now, what are our options when they mismatch the data? As has been reported by many blog-posts researching ZDB, there do happen cases when checksums ar

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Daniel Carosone
On Thu, Jan 12, 2012 at 05:01:48PM -0800, Richard Elling wrote: > > This thread is about checksums - namely, now, what are > > our options when they mismatch the data? As has been > > reported by many blog-posts researching ZDB, there do > > happen cases when checksums are broken (i.e. bitrot in >

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Richard Elling
On Jan 12, 2012, at 2:34 PM, Jim Klimov wrote: > I guess I have another practical rationale for a second > checksum, be it ECC or not: my scrubbing pool found some > "unrecoverable errors". Luckily, for those files I still > have external originals, so I rsynced them over. Still, > there is one fi

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Daniel Carosone
On Fri, Jan 13, 2012 at 04:48:44AM +0400, Jim Klimov wrote: > As Richard reminded me in another thread, both metadata > and DDT can contain checksums, hopefully of the same data > block. So for deduped data we may already have a means > to test whether the data or the checksum is incorrect... It's

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Jim Klimov
2012-01-13 2:34, Jim Klimov wrote: I guess I have another practical rationale for a second checksum, be it ECC or not: my scrubbing pool found some "unrecoverable errors". ...Applications need to know whether the digest has been changed. As Richard reminded me in another thread, both metadata a

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread Jim Klimov
I guess I have another practical rationale for a second checksum, be it ECC or not: my scrubbing pool found some "unrecoverable errors". Luckily, for those files I still have external originals, so I rsynced them over. Still, there is one file whose broken prehistory is referenced in snapshots, an

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-12 Thread David Magda
On Wed, January 11, 2012 11:40, Nico Williams wrote: > I don't find this terribly attractive, but maybe I'm just not looking > at it the right way. Perhaps there is a killer enterprise feature for > ECC here: stretching MTTDL in the face of a device failure in a mirror > or raid-z configuration (

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-11 Thread Jim Klimov
2012-01-11 20:40, Nico Williams wrote: On Wed, Jan 11, 2012 at 9:16 AM, Jim Klimov wrote: I've recently had a sort of an opposite thought: yes, ZFS redundancy is good - but also expensive in terms of raw disk space. This is especially bad for hardware space-constrained systems like laptops and

Re: [zfs-discuss] Idea: ZFS and on-disk ECC for blocks

2012-01-11 Thread Nico Williams
On Wed, Jan 11, 2012 at 9:16 AM, Jim Klimov wrote: > I've recently had a sort of an opposite thought: yes, > ZFS redundancy is good - but also expensive in terms > of raw disk space. This is especially bad for hardware > space-constrained systems like laptops and home-NASes, > where doubling the n