Re: [zfs-discuss] Dedup Questions.

2010-02-09 Thread Daniel Carosone
On Tue, Feb 09, 2010 at 08:26:42AM -0800, Richard Elling wrote:
  zdb -D poolname will provide details on the DDT size.  FWIW, I have a
  pool with 52M DDT entries and the DDT is around 26GB.

I wish -D was documented; I had forgotten about it and only found the
(expensive) -S variant, which wasn't what I was looking for. 

Well, I wish zdb was documented, but in this case I wish -D was
in the usage message, which is all the documentation we get today.

$ pfexec zdb -D tank
DDT-sha256-zap-duplicate: 19725 entries, size 270 on disk, 153 in core
DDT-sha256-zap-unique: 52284055 entries, size 284 on disk, 159 in core
  
dedup = 1.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.00

What units are the "size X on disk, Y in core" figures in?  It's very
hard to make sense of them, given the vast difference in entry counts
between the two rows and the small difference in their sizes.  One can
assume that the duplicate entries carry more block addresses and are
therefore bigger, I suppose, but that isn't really enough to explain the gap.
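
If those figures are average bytes per entry (just a guess on my part;
nothing in the output says so), the obvious arithmetic for the big
unique table would be something like:

  $ awk 'BEGIN { e = 52284055                  # entries in the unique table
                 printf "%.1f GiB on disk, %.1f GiB in core\n",
                        e * 284 / 2^30, e * 159 / 2^30 }'
  13.8 GiB on disk, 7.7 GiB in core

neither of which obviously lines up with the 26 GB you measured, so the
guess may well be wrong.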

At least the on-disk / in-core values give a roughly consistent ratio,
both for these figures and for a pool I have handy here - though I still
don't know what that ratio means.

  how do you calculate the 26 GB size from this?
 
 The exact size is not accounted. I'm inferring the size by looking at the
 difference between the space used for the (simple) pool and the sum of the
 file systems under the pool, where the top-level file system (/tank) is
 empty with mount points, but no snapshots.
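
(For reference, the kind of difference being described can presumably be
read off with something like the following, assuming a pool named tank and
a simple pool layout so the two accountings are comparable:

  $ zpool list tank                  # pool-wide space, including metadata such as the DDT
  $ zfs list -r -o name,used tank    # space accounted to the filesystems

with the DDT and other pool metadata making up the gap between the two.)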

Surely there has to be a better way.  If the numbers above don't give
it, then this brings me back to the method I speculated about in a
previous question.

I presume the DDT pool object can be found and inspected with zdb, to
reveal a size.  If the ratio and the guesswork interpretation above hold
true, we might derive the in-core memory requirement from there.

I don't know how to use zdb to do that for objects in general, nor how
to find or recognise the object in question. Could someone who does
please provide some hints?
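
My untested guess at a starting place - assuming the DDT ZAP objects show
up in zdb's object dump of the pool, which I haven't verified - would be
something like (probably slow and noisy on a pool this size):

  $ pfexec zdb -dddd tank | grep -i ddt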

I will go look at zdb sources, but (without yet having done so) I
suspect that it will just be printing out figures from zfs data
structures, and I will still need help with interpretation. 

--
Dan.



Re: [zfs-discuss] Dedup Questions.

2010-02-09 Thread Matthew Ahrens

Tom Hall wrote:

Re the DDT, can someone outline its structure please? Some sort of
hash table? The blogs I have read so far don't specify.


It is stored in a ZAP object, which is an extensible hash table.  See 
zap.[ch], ddt_zap.c, ddt.h
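
If it helps for poking at those tables from userland: I believe repeating
the -D flag makes zdb print progressively more detail about them, e.g. a
histogram of blocks by reference count at -DD (not that any of this is
documented):

  $ pfexec zdb -DD tank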


--matt


[zfs-discuss] Dedup Questions.

2010-02-08 Thread Tom Hall
Hi,

I am loving the new dedup feature.


A few questions:
If you enable it after data is already on the filesystem, will it find the
dupes on read as well as on write? Would a scrub therefore make sure the
DDT is fully populated?

Re the DDT, can someone outline its structure please? Some sort of
hash table? The blogs I have read so far don't specify.

Re DDT size, is (data in use) / (avg blocksize) * 256 bits right as a worst
case (i.e. all blocks non-identical)?
What are average block sizes?

Cheers,
Tom


Re: [zfs-discuss] Dedup Questions.

2010-02-08 Thread Richard Elling
On Feb 8, 2010, at 6:04 PM, Kjetil Torgrim Homme wrote:

 Tom Hall thattommyh...@gmail.com writes:
 
 If you enable it after data is already on the filesystem, will it find the
 dupes on read as well as on write? Would a scrub therefore make sure the
 DDT is fully populated?
 
 no.  only written data is added to the DDT, so you need to copy the data
 somehow.  zfs send/recv is the most convenient, but you could even do a
 loop of commands like
 
  cp -p $file $file.tmp && mv $file.tmp $file
 
 Re the DDT, can someone outline its structure please? Some sort of
 hash table? The blogs I have read so far don't specify.
 
 I can't help here.

UTSL

 Re DDT size, is (data in use) / (avg blocksize) * 256 bits right as a worst
 case (i.e. all blocks non-identical)?
 
 the size of an entry is much larger:
 
 | From: Mertol Ozyoney mertol.ozyo...@sun.com
 | Subject: Re: Dedup memory overhead
 | Message-ID: 00cb01caa580$a3d6f110$eb84d330$%ozyo...@sun.com
 | Date: Thu, 04 Feb 2010 11:58:44 +0200
 | 
 | Approximately it's 150 bytes per individual block.
 
 What are average block sizes?
 
 as a start, look at your own data.  divide the used size from df by the
 used inodes from df -i.  example from my home directory:
 
  $ /usr/gnu/bin/df -i ~
  Filesystem            Inodes    IUsed      IFree  IUse%  Mounted on
  tank/home          223349423  3412777  219936646     2%  /volumes/home
 
  $ df -k ~
  Filesystem            kbytes       used      avail capacity  Mounted on
  tank/home          573898752  257644703  109968254      71%  /volumes/home
 
 so the average file size is 75 KiB, smaller than the recordsize of 128
 KiB.  extrapolating to a full filesystem, we'd get 4.9M files.
 unfortunately, it's more complicated than that, since a file can consist
 of many records even if the *average* is smaller than a single record.
 
 a pessimistic estimate, then, is one record for each of those 4.9M
 files, plus one record for each 128 KiB of diskspace (2.8M), for a total
 of 7.7M records.  the size of the DDT for this (quite small!) filesystem
 would be something like 1.2 GB.  perhaps a reasonable rule of thumb is 1
 GB DDT per TB of storage.
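
 spelling that arithmetic out as a quick sanity check, using the df numbers
 above (150 bytes per entry, 128 KiB records):

  $ awk 'BEGIN {
      used = 257644703; avail = 109968254; files = 3412777   # KiB and inodes from df
      avg  = used / files                       # ~75 KiB average file size
      nfil = (used + avail) / avg               # ~4.9M files if the filesystem fills up
      recs = nfil + (used + avail) / 128        # plus one record per 128 KiB of disk space
      printf "%.0f KiB avg, %.1fM files, %.1fM records, %.2f GB DDT\n",
             avg, nfil / 1e6, recs / 1e6, recs * 150 / 1e9 }'
  75 KiB avg, 4.9M files, 7.7M records, 1.16 GB DDT

 (the 1 GB per TB rule of thumb is presumably the same calculation for a
 filesystem made of full 128 KiB records: 1 TiB / 128 KiB * 150 bytes is
 about 1.26 GB.)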

zdb -D poolname will provide details on the DDT size.  FWIW, I have a
pool with 52M DDT entries and the DDT is around 26GB.

$ pfexec zdb -D tank
   
DDT-sha256-zap-duplicate: 19725 entries, size 270 on disk, 153 in core
DDT-sha256-zap-unique: 52284055 entries, size 284 on disk, 159 in core

dedup = 1.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.00

(you can tell by the stats that I'm not expecting much dedup :-)
 -- richard
