Re: [zfs-discuss] Dedup Questions.
On Tue, Feb 09, 2010 at 08:26:42AM -0800, Richard Elling wrote:
> zdb -D poolname will provide details on the DDT size. FWIW, I have a
> pool with 52M DDT entries and the DDT is around 26GB.

I wish -D were documented; I had forgotten about it and only found the
(expensive) -S variant, which wasn't what I was looking for. Well, I wish
zdb as a whole were documented, but in this case I wish -D were at least
in the usage message, which is all the documentation we get today.

$ pfexec zdb -D tank
DDT-sha256-zap-duplicate: 19725 entries, size 270 on disk, 153 in core
DDT-sha256-zap-unique: 52284055 entries, size 284 on disk, 159 in core
dedup = 1.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.00

What units are the "size X on disk, Y in core" figures? It's very hard to
make sense of them, given the vast difference in entries and the small
difference in size between the two rows. One can assume that the duplicate
entries have more block addresses in them and are bigger, I suppose, but
that isn't really enough to explain the gap. At least the on disk / in core
values give a roughly consistent ratio, both for these and for a pool I
have handy here - though I still don't know what that means.

>> how do you calculate the 26 GB size from this?
>
> The exact size is not accounted. I'm inferring the size by looking at the
> difference between the space used for the (simple) pool and the sum of
> the file systems under the pool, where the top-level file system (/tank)
> is empty with mount points, but no snapshots.

Surely there has to be a better way. If the numbers above don't give it,
then this brings me back to the method I speculated about in a previous
question: I presume the DDT pool object can be found and inspected with
zdb, to reveal a size. If the ratio and guesswork interpretation above
holds true, we might derive the in-core memory requirement from there.

I don't know how to use zdb to do that for objects in general, nor how to
find or recognise the object in question.
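[Editorial note: one hedged reading of those figures - an assumption, not
confirmed anywhere in the thread - is that "size X on disk, Y in core" are
average bytes per DDT entry. Multiplying out gives totals that do not
obviously reconcile with the quoted ~26GB, which is part of why the units
are puzzling:]

```python
# Assumes (unconfirmed) that zdb -D's "size X on disk, Y in core" figures
# are average bytes per DDT entry.
# Tuples are (entries, bytes_on_disk, bytes_in_core) from the output above.
tables = {
    "duplicate": (19725, 270, 153),
    "unique":    (52284055, 284, 159),
}

on_disk = sum(n * disk for n, disk, core in tables.values())
in_core = sum(n * core for n, disk, core in tables.values())
print(f"on disk: {on_disk / 2**30:.1f} GiB, in core: {in_core / 2**30:.1f} GiB")
# -> on disk: 13.8 GiB, in core: 7.7 GiB -- neither is 26 GB
```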
Could someone who does please provide some hints? I will go look at the
zdb sources, but (without yet having done so) I suspect that it will just
be printing out figures from ZFS data structures, and I will still need
help with interpretation.

-- Dan.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Dedup Questions.
Tom Hall wrote:
> Re the DDT, can someone outline its structure please? Some sort of hash
> table? The blogs I have read so far don't specify.

It is stored in a ZAP object, which is an extensible hash table. See
zap.[ch], ddt_zap.c, ddt.h

--matt
[zfs-discuss] Dedup Questions.
Hi, I am loving the new dedup feature. A few questions:

If you enable it after data is on the filesystem, will it find the dupes
on read as well as write? Would a scrub therefore make sure the DDT is
fully populated?

Re the DDT, can someone outline its structure please? Some sort of hash
table? The blogs I have read so far don't specify.

Re DDT size, is (data in use)/(avg blocksize) * 256 bits right as a worst
case (i.e. all blocks non-identical)? What are average block sizes?

Cheers,
Tom
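[Editorial note: Tom's worst-case formula can be sketched directly, with
one adjustment made elsewhere in the thread: a DDT entry is much bigger
than the 256-bit (32-byte) checksum alone - the figure quoted in the
thread is roughly 150 bytes per entry. A sketch under those assumptions:]

```python
# Worst case for the DDT: every block is unique, so one entry per block.
# Tom's formula counts only the 32-byte sha256 checksum; the thread quotes
# ~150 bytes for a full DDT entry, so that is used here instead.

def worst_case_ddt_bytes(bytes_in_use, avg_block_bytes=128 * 1024,
                         entry_bytes=150):
    blocks = bytes_in_use / avg_block_bytes   # one DDT entry per block
    return blocks * entry_bytes

one_tib = 2**40
print(f"{worst_case_ddt_bytes(one_tib) / 2**30:.2f} GiB of DDT per TiB "
      f"at 128 KiB blocks")
# -> 1.17 GiB, i.e. roughly 1 GB of DDT per TB of unique data
```

Smaller average block sizes raise this fast: at 8 KiB blocks the same
formula gives 16x as much DDT per TiB.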
Re: [zfs-discuss] Dedup Questions.
On Feb 8, 2010, at 6:04 PM, Kjetil Torgrim Homme wrote:
> Tom Hall thattommyh...@gmail.com writes:
>> If you enable it after data is on the filesystem, it will find the
>> dupes on read as well as write? Would a scrub therefore make sure the
>> DDT is fully populated.
>
> No. Only written data is added to the DDT, so you need to copy the data
> somehow. zfs send/recv is the most convenient, but you could even do a
> loop of commands like
>
>   cp -p $file $file.tmp
>   mv $file.tmp $file
>
>> Re the DDT, can someone outline its structure please? Some sort of
>> hash table? The blogs I have read so far don't specify.
>
> I can't help here. UTSL.
>
>> Re DDT size, is (data in use)/(av blocksize) * 256bit right as a worst
>> case (ie all blocks non identical)
>
> The size of an entry is much larger:
>
> | From: Mertol Ozyoney mertol.ozyo...@sun.com
> | Subject: Re: Dedup memory overhead
> | Message-ID: 00cb01caa580$a3d6f110$eb84d330$%ozyo...@sun.com
> | Date: Thu, 04 Feb 2010 11:58:44 +0200
> |
> | Approximately it's 150 bytes per individual block.
>
>> What are average block sizes?
>
> As a start, look at your own data: divide the used size from df by the
> used inodes from df -i. Example from my home directory:
>
>   $ /usr/gnu/bin/df -i ~
>   Filesystem     Inodes     IUsed      IFree IUse% Mounted on
>   tank/home   223349423   3412777  219936646    2% /volumes/home
>   $ df -k ~
>   Filesystem     kbytes      used      avail capacity Mounted on
>   tank/home   573898752 257644703  109968254      71% /volumes/home
>
> So the average file size is 75 KiB, smaller than the recordsize of
> 128 KiB. Extrapolating to a full filesystem, we'd get 4.9M files.
> Unfortunately, it's more complicated than that, since a file can consist
> of many records even if the *average* file is smaller than a single
> record. A pessimistic estimate, then, is one record for each of those
> 4.9M files, plus one record for each 128 KiB of diskspace (2.8M), for a
> total of 7.7M records. The size of the DDT for this (quite small!)
> filesystem would be something like 1.2 GB. Perhaps a reasonable rule of
> thumb is 1 GB of DDT per TB of storage.
zdb -D poolname will provide details on the DDT size. FWIW, I have a pool
with 52M DDT entries and the DDT is around 26GB.

$ pfexec zdb -D tank
DDT-sha256-zap-duplicate: 19725 entries, size 270 on disk, 153 in core
DDT-sha256-zap-unique: 52284055 entries, size 284 on disk, 159 in core
dedup = 1.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.00

(you can tell by the stats that I'm not expecting much dedup :-)
-- richard
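[Editorial note: Kjetil's back-of-envelope estimate quoted above can be
turned into a small script. This is a sketch under his stated assumptions
(one DDT entry per file plus one per full 128 KiB record at capacity,
~150 bytes per entry, "full" meaning used + avail from df), reading the
avail column in his df output as 109968254 KiB:]

```python
# Pessimistic DDT estimate following the method quoted above: extrapolate
# the current average file size to a full filesystem, then count one DDT
# entry per file plus one per full 128 KiB record of space.

DDT_ENTRY_BYTES = 150   # per-block figure quoted in the thread
RECORDSIZE_KIB = 128    # default ZFS recordsize

def ddt_estimate_bytes(kib_used, kib_avail, inodes_used):
    full_kib = kib_used + kib_avail           # filesystem at capacity
    avg_file_kib = kib_used / inodes_used     # df "used" / df -i "IUsed"
    files_when_full = full_kib / avg_file_kib
    records_for_space = full_kib / RECORDSIZE_KIB
    return (files_when_full + records_for_space) * DDT_ENTRY_BYTES

# Figures from the tank/home df output quoted above
est = ddt_estimate_bytes(257644703, 109968254, 3412777)
print(f"~{est / 1e9:.1f} GB")   # ~1.2 GB, matching the estimate quoted above
```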