Re: [zfs-discuss] dedup and handling corruptions - impossible?

2010-08-22 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of devsk
 
 If dedup is ON and the pool develops a corruption in a file, I can
 never fix it because when I try to copy the correct file on top of the
 corrupt file,
 the block hash will match with the existing blocks and only reference
 count will be updated. The only way to fix it is to delete all
 snapshots (to remove all references) and then delete the file and then
 copy the valid file. This is a pretty high cost if it is so (empirical
 evidence so far, I don't know internal details).

Um ... If dedup is on, and a file develops corruption, the original has
developed corruption too.  It was probably corrupt before it was copied.
This is what zfs checksumming and mirrors/redundancy are for.

If you have ZFS, and redundancy, this won't happen.  (Unless you have
failing ram/cpu/etc)

If you have *any* filesystem without redundancy, and this happens, you
should stop trying to re-copy the file, and instead throw away your disk and
restore from backup.

If you run without redundancy, and without backup, you got what you asked
for.
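
With redundancy in place, the fix is a scrub rather than a restore.  Roughly
like this, with "tank" standing in for your pool name:

  # see pool health and, with -v, any files that have unrecoverable errors
  zpool status -v tank
  # walk every allocated block and rewrite anything whose checksum fails,
  # using the good side of the mirror / raidz parity
  zpool scrub tank
  # check progress and results
  zpool status tank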



Re: [zfs-discuss] dedup and handling corruptions - impossible?

2010-08-22 Thread devsk
  From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
  boun...@opensolaris.org] On Behalf Of devsk
  
  If dedup is ON and the pool develops a corruption in a file, I can
  never fix it because when I try to copy the correct file on top of the
  corrupt file, the block hash will match with the existing blocks and
  only reference count will be updated. The only way to fix it is to
  delete all snapshots (to remove all references) and then delete the
  file and then copy the valid file. This is a pretty high cost if it is
  so (empirical evidence so far, I don't know internal details).
 
 Um ... If dedup is on, and a file develops corruption, the original has
 developed corruption too.

What do you mean, "original"? Dedup creates only one copy of the file blocks. The 
file was not corrupt when it was copied 3 months ago.
I have read the file many times and scrubbed the pool many times since then. 
The file is present in many snapshots.

 It was probably corrupt before it was copied.
 This is what zfs checksumming and mirrors/redundancy are for.
 
 If you have ZFS, and redundancy, this won't happen.
 (Unless you have failing ram/cpu/etc)
 

You are saying ZFS will detect and rectify this kind of corruption in a deduped 
pool automatically if enough redundancy is present? Can that fail sometimes? 
Under what conditions?

I would hate to restore a 1.5TB pool from backup just because one 5MB file has 
gone bust. And I have a known good copy of the file.
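
Verifying which copy is good is easy enough; something like this (paths made
up), though a hard read error on the in-pool copy would itself be the answer:

  # compare the known-good copy against the one in the pool
  digest -v -a sha256 /backup/good.file /tank/data/broken.file
  # or byte-for-byte
  cmp /backup/good.file /tank/data/broken.file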

I raised a technical question and you are going all personal on me.


Re: [zfs-discuss] dedup and handling corruptions - impossible?

2010-08-22 Thread Thomas Burgess
You are saying ZFS will detect and rectify this kind of corruption in a
 deduped pool automatically if enough redundancy is present? Can that fail
 sometimes? Under what conditions?

 I would hate to restore a 1.5TB pool from backup just because one 5MB file
 is gone bust. And I have a known good copy of the file.

 I raised a technical question and you are going all personal on me.


zfs checksums every transaction.  When you access a file, it checks that the
checksums match.  If they do not (corruption) and you have redundancy, it
repairs the corruption.  It can detect and correct corruption in this way.
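
For the single file in question, just re-reading it is enough to trigger that
check and, with redundancy, the repair; something along these lines (pool and
path names are made up):

  # reading the file forces every block's checksum to be verified; on a
  # mirror/raidz, a bad block is rewritten from the good copy as it is read
  dd if=/tank/data/suspect.file of=/dev/null bs=1024k
  # see whether anything was found or repaired
  zpool status -v tank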


It didn't seem like anyone got personal with you.


Re: [zfs-discuss] dedup and handling corruptions - impossible?

2010-08-22 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of devsk
 
 What do you mean original? dedup creates only one copy of the file
 blocks. The file was not corrupt when it was copied 3 months ago.

Please describe the problem.

If you copied the file 3 months ago, and the new and old copies are both
referencing the same blocks on disk thanks to dedup, and the new copy has
become corrupt, then the original has also become corrupt.

In the OP, you seem to imply that the original is not corrupt, but the new
copy is corrupt, and you can't fix the new copy by overwriting it with a
fresh copy of the original.  This makes no sense.


  If you have ZFS, and redundancy, this won't happen.
  (Unless you have failing ram/cpu/etc)
 
 
 You are saying ZFS will detect and rectify this kind of corruption in a
 deduped pool automatically if enough redundancy is present? Can that
 fail sometimes? Under what conditions?

I'm saying ZFS checksums every block on disk, read or written, and if any
checksum mismatches, then ZFS automatically checks the other copy ... from
the other disk in the mirror, or reconstructed from the redundancy in raid,
or whatever.  By having redundancy, ZFS will automatically correct any
checksum mismatches it encounters.

If a checksum is mismatched on *both* sides of the mirror, it means either
(a) both disks went bad at the same time, which is unlikely, but nonzero
probability, or (b) there's faulty ram or cpu or some other single point of
failure in the system.
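
The per-device error counters usually tell you which case you're in (again,
"tank" is just a placeholder):

  # READ/WRITE/CKSUM counters per vdev: errors concentrated on one disk point
  # at that disk; errors spread across every device point at RAM, CPU,
  # controller or cabling
  zpool status -v tank
  # once the faulty part has been dealt with, reset the counters
  zpool clear tank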


 I raised a technical question and you are going all personal on me.

Woah.  Where did that come from???



Re: [zfs-discuss] dedup and handling corruptions - impossible?

2010-08-22 Thread Richard Elling
On Aug 21, 2010, at 9:22 PM, devsk wrote:

 If dedup is ON and the pool develops a corruption in a file, I can never fix 
 it because when I try to copy the correct file on top of the corrupt file,
 the block hash will match with the existing blocks and only reference count 
 will be updated. The only way to fix it is to delete all
 snapshots (to remove all references) and then delete the file and then copy 
 the valid file. This is a pretty high cost if it is so (empirical
 evidence so far, I don't know internal details).
 
 Has anyone else experienced this?

zfs set dedup=on,verify dataset

IMNSHO, verify should be the default.
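
For completeness, the dedup property takes on | off | verify | sha256[,verify];
plain "verify" means sha256 plus a byte-for-byte comparison of the block data
whenever two hashes match, so a colliding block is never silently referenced.
For example ("tank/data" is just an example dataset):

  # see what the dataset is using now
  zfs get dedup tank/data
  # dedup with full verification on every hash match
  zfs set dedup=verify tank/data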
 -- richard

-- 
OpenStorage Summit, October 25-27, San Francisco
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com



Re: [zfs-discuss] dedup and handling corruptions - impossible?

2010-08-22 Thread Ian Collins

On 08/23/10 10:38 AM, Richard Elling wrote:

On Aug 21, 2010, at 9:22 PM, devsk wrote:

   

If dedup is ON and the pool develops a corruption in a file, I can never fix it 
because when I try to copy the correct file on top of the corrupt file,
the block hash will match with the existing blocks and only reference count 
will be updated. The only way to fix it is to delete all
snapshots (to remove all references) and then delete the file and then copy the 
valid file. This is a pretty high cost if it is so (empirical
evidence so far, I don't know internal details).

Has anyone else experienced this?
 

zfs set dedup=on,verify dataset

IMNSHO, verify should be the default.
   


I thought it was the default for lesser checksum algorithms, given the 
long odds of an SHA-256 false positive?


--
Ian.



Re: [zfs-discuss] dedup and handling corruptions - impossible?

2010-08-22 Thread Richard Elling
On Aug 22, 2010, at 3:57 PM, Ian Collins wrote:
 On 08/23/10 10:38 AM, Richard Elling wrote:
 On Aug 21, 2010, at 9:22 PM, devsk wrote:
   
 If dedup is ON and the pool develops a corruption in a file, I can never 
 fix it because when I try to copy the correct file on top of the corrupt 
 file,
 the block hash will match with the existing blocks and only reference count 
 will be updated. The only way to fix it is to delete all
 snapshots (to remove all references) and then delete the file and then copy 
 the valid file. This is a pretty high cost if it is so (empirical
 evidence so far, I don't know internal details).
 
 Has anyone else experienced this?
 
 zfs set dedup=on,verify dataset
 
 IMNSHO, verify should be the default.
   
 
  I thought it was the default for lesser checksum algorithms, given the long 
  odds of an SHA-256 false positive?


That was the original intent; however, the only checksum algorithm used today
is SHA-256.
 -- richard

-- 
OpenStorage Summit, October 25-27, San Francisco
http://nexenta-summit2010.eventbrite.com
ZFS and performance consulting
http://www.RichardElling.com



[zfs-discuss] dedup and handling corruptions - impossible?

2010-08-21 Thread devsk
If dedup is ON and the pool develops a corruption in a file, I can never fix it 
because when I try to copy the correct file on top of the corrupt file,
the block hash will match with the existing blocks and only reference count 
will be updated. The only way to fix it is to delete all
snapshots (to remove all references) and then delete the file and then copy the 
valid file. This is a pretty high cost if it is so (empirical
evidence so far, I don't know internal details).
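
Spelled out, the workaround I mean looks something like this (names made up):

  # every snapshot referencing the file's blocks has to go first
  zfs list -t snapshot -r tank/data
  zfs destroy tank/data@snap1        # ...and so on, for each snapshot
  # only then does removing and re-copying actually allocate fresh blocks
  rm /tank/data/broken.file
  cp /backup/good.file /tank/data/broken.file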

Has anyone else experienced this?