Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-21 Thread Darren J Moffat
Daniel Carosone wrote: Your parenthetical comments here raise some concerns, or at least eyebrows, with me. Hopefully you can lower them again. compress, encrypt, checksum, dedup. (and you need to use zdb to get enough info to see the leak - and that means you have access to the raw

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-21 Thread Darren J Moffat
Kjetil Torgrim Homme wrote: Note also that the compress/encrypt/checksum and the dedup are separate pipeline stages, so while dedup is happening for block N, block N+1 can be getting transformed - so this is designed to take advantage of multiple scheduling units (threads, CPUs, cores, etc.). nice.
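The overlap described above (dedup for block N running while block N+1 is still being transformed) can be sketched as two threads joined by a bounded queue. This is only an illustration, not the ZFS I/O pipeline: zlib stands in for the real compressors, the encrypt step is left out, and the names run_pipeline, handoff, table and refs are made up for the example.

```python
import hashlib
import queue
import threading
import zlib


def run_pipeline(blocks):
    """Stage 1 compresses and checksums; stage 2 does the dedup lookup."""
    handoff = queue.Queue(maxsize=4)  # bounded, so stage 1 cannot run far ahead
    table = {}   # hypothetical dedup table: checksum -> compressed block
    refs = []    # one checksum per logical block written, in write order

    def transform():
        # Stage 1: compress, then checksum the compressed bytes.
        for block in blocks:
            compressed = zlib.compress(block)
            digest = hashlib.sha256(compressed).digest()
            handoff.put((digest, compressed))
        handoff.put(None)  # sentinel: no more blocks

    def dedup():
        # Stage 2: store a block only if its checksum has not been seen.
        while (item := handoff.get()) is not None:
            digest, compressed = item
            table.setdefault(digest, compressed)
            refs.append(digest)

    workers = [threading.Thread(target=transform),
               threading.Thread(target=dedup)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return table, refs


table, refs = run_pipeline([b"a" * 4096, b"b" * 4096, b"a" * 4096])
print(len(refs), "blocks written,", len(table), "blocks stored")
```

While the dedup thread is handling one block, the transform thread is already free to compress the next one, which is the property the message points out.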

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-18 Thread Kjetil Torgrim Homme
Darren J Moffat darr...@opensolaris.org writes: Kjetil Torgrim Homme wrote: I don't know how tightly interwoven the dedup hash tree and the block pointer hash tree are, or if it is at all possible to disentangle them. At the moment I'd say very interwoven by design. conceptually it doesn't

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Andrey Kuzmin
Downside you have described happens only when the same checksum is used for data protection and duplicate detection. This implies sha256, BTW, since fletcher-based dedupe has been dropped in recent builds. On 12/17/09, Kjetil Torgrim Homme kjeti...@linpro.no wrote: Andrey Kuzmin

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Kjetil Torgrim Homme
Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Downside you have described happens only when the same checksum is used for data protection and duplicate detection. This implies sha256, BTW, since fletcher-based dedupe has been dropped in recent builds. if the hash used for dedup is

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Darren J Moffat
Kjetil Torgrim Homme wrote: Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Downside you have described happens only when the same checksum is used for data protection and duplicate detection. This implies sha256, BTW, since fletcher-based dedupe has been dropped in recent builds. if the

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Kjetil Torgrim Homme
Darren J Moffat darr...@opensolaris.org writes: Kjetil Torgrim Homme wrote: Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Downside you have described happens only when the same checksum is used for data protection and duplicate detection. This implies sha256, BTW, since fletcher-based

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Darren J Moffat
Kjetil Torgrim Homme wrote: I don't know how tightly interwoven the dedup hash tree and the block pointer hash tree are, or if it is at all possible to disentangle them. At the moment I'd say very interwoven by design. conceptually it doesn't seem impossible, but that's easy for me to say,

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Bob Friesenhahn
On Thu, 17 Dec 2009, Kjetil Torgrim Homme wrote: compression requires CPU, actually quite a lot of it. even with the lean and mean lzjb, you will get not much more than 150 MB/s per core or something like that. so, if you're copying a 10 GB image file, it will take a minute or two, just to
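The "minute or two" figure above follows from simple arithmetic. A rough check, assuming decimal units (1 GB = 10^9 bytes) and taking the quoted 150 MB/s per core at face value rather than as a measurement; compress_seconds is a name invented for this sketch:

```python
GB = 10**9
MB = 10**6


def compress_seconds(size_bytes, rate_bytes_per_s, cores=1):
    """Time to push size_bytes through a compressor at a given per-core rate."""
    return size_bytes / (rate_bytes_per_s * cores)


# 10 GB image, 150 MB/s per core (the rough lzjb figure quoted above).
single_core = compress_seconds(10 * GB, 150 * MB)
print(f"{single_core:.0f} s on one core")
```

That works out to a little over a minute on a single core, and proportionally less if the work is spread over several cores.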

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Nicolas Williams
On Thu, Dec 17, 2009 at 03:32:21PM +0100, Kjetil Torgrim Homme wrote: if the hash used for dedup is completely separate from the hash used for data protection, I don't see any downsides to computing the dedup hash from uncompressed data. why isn't it? Hash and checksum functions are slow

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Andrey Kuzmin
On Thu, Dec 17, 2009 at 6:14 PM, Kjetil Torgrim Homme kjeti...@linpro.no wrote: Darren J Moffat darr...@opensolaris.org writes: Kjetil Torgrim Homme wrote: Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Downside you have described happens only when the same checksum is used for data

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-17 Thread Daniel Carosone
Your parenthetical comments here raise some concerns, or at least eyebrows, with me. Hopefully you can lower them again. compress, encrypt, checksum, dedup. (and you need to use zdb to get enough info to see the leak - and that means you have access to the raw devices) An attacker with

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-16 Thread Kjetil Torgrim Homme
Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Kjetil Torgrim Homme wrote: for some reason I, like Steve, thought the checksum was calculated on the uncompressed data, but a look in the source confirms you're right, of course. thinking about the consequences of changing it, RAID-Z recovery
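A small sketch of what "checksum calculated on the compressed data" implies for dedup. It assumes a deterministic compressor (zlib here, standing in for the ZFS compressors) and uses an invented helper name, on_disk_checksum; it is not how ZFS lays out or checksums blocks.

```python
import hashlib
import zlib


def on_disk_checksum(logical, level):
    """SHA-256 over the compressed form of a logical block."""
    return hashlib.sha256(zlib.compress(logical, level)).hexdigest()


block = b"database backup page " * 200

# Identical logical data, same compression setting: identical checksum,
# so duplicates are still found when hashing happens after compression.
same = on_disk_checksum(block, 9) == on_disk_checksum(block, 9)

# The checksum of the compressed block is not the checksum of the raw block.
differs = on_disk_checksum(block, 9) != hashlib.sha256(block).hexdigest()

print(same, differs)
```

The flip side is that a duplicate is only recognised after the compression work has already been done, which is the CPU cost the original question is about.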

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-16 Thread Andrey Kuzmin
Yet again, I don't see how RAID-Z reconstruction is related to the subject discussed (what data should be sha256'ed when both dedupe and compression are enabled, raw or compressed). sha256 has nothing to do with bad block detection (maybe it will when encryption is implemented, but for now

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-16 Thread Kjetil Torgrim Homme
Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Yet again, I don't see how RAID-Z reconstruction is related to the subject discussed (what data should be sha256'ed when both dedupe and compression are enabled, raw or compressed). sha256 has nothing to do with bad block detection (maybe it

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-16 Thread Andrey Kuzmin
On Wed, Dec 16, 2009 at 7:25 PM, Kjetil Torgrim Homme kjeti...@linpro.no wrote: Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Yet again, I don't see how RAID-Z reconstruction is related to the subject discussed (what data should be sha256'ed when both dedupe and compression are enabled, raw

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-16 Thread Darren J Moffat
Andrey Kuzmin wrote: On Wed, Dec 16, 2009 at 7:25 PM, Kjetil Torgrim Homme kjeti...@linpro.no wrote: Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Yet again, I don't see how RAID-Z reconstruction is related to the subject discussed (what data should be sha256'ed when both dedupe and

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-16 Thread Andrey Kuzmin
On Wed, Dec 16, 2009 at 7:46 PM, Darren J Moffat darr...@opensolaris.org wrote: Andrey Kuzmin wrote: On Wed, Dec 16, 2009 at 7:25 PM, Kjetil Torgrim Homme kjeti...@linpro.no wrote: Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Yet again, I don't see how RAID-Z reconstruction is related

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-16 Thread Kjetil Torgrim Homme
Andrey Kuzmin andrey.v.kuz...@gmail.com writes: Darren J Moffat wrote: Andrey Kuzmin wrote: Resilvering has nothing to do with sha256: one could resilver long before dedupe was introduced in zfs. SHA256 isn't just used for dedup; it is available as one of the checksum algorithms right back to

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-15 Thread Darren J Moffat
Cyril Plisko wrote: On Mon, Dec 14, 2009 at 9:32 PM, Andrey Kuzmin andrey.v.kuz...@gmail.com wrote: Right, but 'verify' seems to be 'extreme safety' and thus a rather rare use case. Hmm, dunno. I wouldn't set anything but a scratch file system to dedup=on. Anything of even slight significance is

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-15 Thread Kjetil Torgrim Homme
Robert Milkowski mi...@task.gda.pl writes: On 13/12/2009 20:51, Steve Radich, BitShop, Inc. wrote: Because if you can de-dup anyway why bother to compress THEN check? This SEEMS to be the behaviour - i.e. I would suspect many of the files I'm writing are dups - however I see high cpu use even

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-15 Thread Andrey Kuzmin
On Tue, Dec 15, 2009 at 3:06 PM, Kjetil Torgrim Homme kjeti...@linpro.no wrote: Robert Milkowski mi...@task.gda.pl writes: On 13/12/2009 20:51, Steve Radich, BitShop, Inc. wrote: Because if you can de-dup anyway why bother to compress THEN check? This SEEMS to be the behaviour - i.e. I would

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-14 Thread Robert Milkowski
On 13/12/2009 20:51, Steve Radich, BitShop, Inc. wrote: I enabled compression on a zfs filesystem with compression=gzip-9 - i.e. fairly slow compression - this stores backups of databases (which compress fairly well). The next question is: Is the CRC on the disk based on the uncompressed data

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-14 Thread Andrey Kuzmin
On Sun, Dec 13, 2009 at 11:51 PM, Steve Radich, BitShop, Inc. ste...@bitshop.com wrote: I enabled compression on a zfs filesystem with compression=gzip-9 - i.e. fairly slow compression - this stores backups of databases (which compress fairly well). The next question is: Is the CRC on the

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-14 Thread A Darren Dunham
On Mon, Dec 14, 2009 at 09:30:29PM +0300, Andrey Kuzmin wrote: ZFS deduplication is block-level, so to deduplicate one needs data broken into blocks to be written. With compression enabled, you don't have these until data is compressed. Looks like a waste of cycles indeed, but ... ZFS compression

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-14 Thread Casper . Dik
On Mon, Dec 14, 2009 at 09:30:29PM +0300, Andrey Kuzmin wrote: ZFS deduplication is block-level, so to deduplicate one needs data broken into blocks to be written. With compression enabled, you don't have these until data is compressed. Looks like a waste of cycles indeed, but ... ZFS compression

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-14 Thread Andrey Kuzmin
On Mon, Dec 14, 2009 at 9:53 PM, casper@sun.com wrote: On Mon, Dec 14, 2009 at 09:30:29PM +0300, Andrey Kuzmin wrote: ZFS deduplication is block-level, so to deduplicate one needs data broken into blocks to be written. With compression enabled, you don't have these until data is

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-14 Thread Cyril Plisko
On Mon, Dec 14, 2009 at 9:32 PM, Andrey Kuzmin andrey.v.kuz...@gmail.com wrote: Right, but 'verify' seems to be 'extreme safety' and thus a rather rare use case. Hmm, dunno. I wouldn't set anything but a scratch file system to dedup=on. Anything of even slight significance is set to dedup=verify.
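The difference between the two settings discussed above can be sketched roughly as follows: with dedup=on a matching hash is trusted, while with dedup=verify the candidate block is also compared byte for byte before being treated as a duplicate. The DedupTable class and its write method are hypothetical names for this illustration and do not mirror the ZFS dedup table code.

```python
import hashlib


class DedupTable:
    """Toy block store keyed by SHA-256, with an optional verify step."""

    def __init__(self, verify):
        self.verify = verify
        self.blocks = {}  # digest -> list of stored blocks with that digest

    def write(self, block):
        """Return True if the block was deduplicated against a stored one."""
        digest = hashlib.sha256(block).digest()
        candidates = self.blocks.setdefault(digest, [])
        for stored in candidates:
            # Without verify, a hash match alone counts as a duplicate.
            # With verify, the bytes must match as well.
            if not self.verify or stored == block:
                return True
        candidates.append(block)
        return False


store = DedupTable(verify=True)
first = store.write(b"x" * 512)    # new block, stored
second = store.write(b"x" * 512)   # same bytes, deduplicated
third = store.write(b"y" * 512)    # different bytes, stored
print(first, second, third)
```

The extra comparison costs a read of the existing block on every hash match, which is the price of the additional safety being argued over in this part of the thread.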

Re: [zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-14 Thread Andrey Kuzmin
On 12/14/09, Cyril Plisko cyril.pli...@mountall.com wrote: On Mon, Dec 14, 2009 at 9:32 PM, Andrey Kuzmin andrey.v.kuz...@gmail.com wrote: Right, but 'verify' seems to be 'extreme safety' and thus a rather rare use case. Hmm, dunno. I wouldn't set anything but a scratch file system to

[zfs-discuss] DeDup and Compression - Reverse Order?

2009-12-13 Thread Steve Radich, BitShop, Inc.
I enabled compression on a zfs filesystem with compression=gzip-9 - i.e. fairly slow compression - this stores backups of databases (which compress fairly well). The next question is: Is the CRC on the disk based on the uncompressed data (which seems more likely to be able to be recovered) or