Re: [zfs-discuss] dedupratio riddle
As noted, the ratio calculation applies only to the data ZFS attempted to dedup, not to the whole pool.

However, I saw a commit go by in the last couple of days about the dedupratio calculation being misleading, though I didn't check the details. Presumably this will be reported differently in upcoming builds.

--
Dan.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] dedupratio riddle
On 18 Mar 2010, at 18.38, Craig Alder wrote:

> I remembered reading a post about this a couple of months back. This post by
> Jeff Bonwick confirms that the dedupratio is calculated only on the data that
> you've attempted to deduplicate, i.e. only the data written whilst dedup is
> turned on -
> http://mail.opensolaris.org/pipermail/zfs-discuss/2009-December/034721.html .

Ah, so I was on the right track with the DDT then :) I guess most people keep it either on or off from the beginning, at least until BP rewrite is available, to ensure everything is deduplicated (which is probably a good idea).

Regards
Henrik
http://sparcv9.blogspot.com
Re: [zfs-discuss] dedupratio riddle
I remembered reading a post about this a couple of months back. This post by Jeff Bonwick confirms that the dedupratio is calculated only on the data that you've attempted to deduplicate, i.e. only the data written whilst dedup is turned on:
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-December/034721.html

Regards,
Craig
--
This message posted from opensolaris.org
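The behaviour described above can be illustrated with a toy model. This is a minimal sketch, not ZFS code: the class `ToyPool`, its fields, and the 84 GB / 50 GB sizes are all hypothetical, and it assumes only what the thread states, namely that the reported ratio is computed over data written while dedup was on.

```python
from dataclasses import dataclass


@dataclass
class ToyPool:
    """Hypothetical model of a pool; not the real ZFS accounting."""
    dedup_referenced_gb: float = 0.0  # logical data written while dedup was on
    dedup_allocated_gb: float = 0.0   # space those blocks actually occupy
    plain_allocated_gb: float = 0.0   # data written while dedup was off

    def write_dedup(self, logical_gb: float, unique_gb: float) -> None:
        """Write with dedup on: only the unique part consumes space."""
        self.dedup_referenced_gb += logical_gb
        self.dedup_allocated_gb += unique_gb

    def write_plain(self, logical_gb: float) -> None:
        """Write with dedup off: the dedup counters are not touched."""
        self.plain_allocated_gb += logical_gb

    def dedupratio(self) -> float:
        """Ratio over dedup-attempted data only, as the thread describes."""
        if self.dedup_allocated_gb == 0:
            return 1.0
        return self.dedup_referenced_gb / self.dedup_allocated_gb


pool = ToyPool()
pool.write_dedup(logical_gb=84, unique_gb=50)  # hypothetical sizes
before = pool.dedupratio()
pool.write_plain(logical_gb=93)                # copy made with dedup=off
after = pool.dedupratio()
print(f"{before:.2f}x -> {after:.2f}x")
```

In this model the printed ratio is identical before and after the plain write, because data written with dedup off never enters the counters the ratio is computed from.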
Re: [zfs-discuss] dedupratio riddle
On 18 Mar 2010, at 10:07, Henrik Johansson wrote:

> Hello,
>
> On 17 Mar 2010, at 16.22, Paul van der Zwan wrote:
>
>> On 16 Mar 2010, at 19:48, valrh...@gmail.com wrote:
>>
>>> Someone correct me if I'm wrong, but it could just be a coincidence. That
>>> is, perhaps the data that you copied happens to lead to the same dedup
>>> ratio as the data that's already on there. You could test this out by
>>> copying a few gigabytes of data you know is unique (like maybe a DVD video
>>> file or something), and that should change the dedup ratio.
>>
>> The first copy of that data was unique, and dedup is even switched off for
>> the entire pool, so it seems to be a bug in the calculation of the
>> dedupratio, or it uses a method that gives unexpected results.
>
> I wonder if the dedup ratio is calculated from the contents of the DDT or from
> all the data in the whole pool; I've only looked at the ratio for
> datasets which had dedup on for their whole lifetime. If the former, data added
> while it's switched off will never alter the ratio (until rewritten with
> dedup on). The source should have the answer, but I'm on mail only for a few
> weeks.
>
> It's probably for the whole dataset, that makes the most sense, just a
> thought.

It looks like the ratio only gets updated while dedup is switched on, and freezes if you switch dedup off for the entire pool, like I did. I tried to have a look at the source, but it was way too complex to figure out in the time I have had available so far.

Best regards,
Paul van der Zwan
Sun Microsystems Nederland

> Regards
>
> Henrik
> http://sparcv9.blogspot.com
Re: [zfs-discuss] dedupratio riddle
Hello,

On 17 Mar 2010, at 16.22, Paul van der Zwan wrote:

> On 16 Mar 2010, at 19:48, valrh...@gmail.com wrote:
>
>> Someone correct me if I'm wrong, but it could just be a coincidence. That
>> is, perhaps the data that you copied happens to lead to the same dedup
>> ratio as the data that's already on there. You could test this out by
>> copying a few gigabytes of data you know is unique (like maybe a DVD video
>> file or something), and that should change the dedup ratio.
>
> The first copy of that data was unique, and dedup is even switched off for
> the entire pool, so it seems to be a bug in the calculation of the
> dedupratio, or it uses a method that gives unexpected results.

I wonder if the dedup ratio is calculated from the contents of the DDT or from all the data in the whole pool; I've only looked at the ratio for datasets which had dedup on for their whole lifetime. If the former, data added while it's switched off will never alter the ratio (until rewritten with dedup on). The source should have the answer, but I'm on mail only for a few weeks.

It's probably for the whole dataset, that makes the most sense, just a thought.

Regards
Henrik
http://sparcv9.blogspot.com
Re: [zfs-discuss] dedupratio riddle
On 17 Mar 2010, at 10:56, zfs ml wrote:

> On 3/17/10 1:21 AM, Paul van der Zwan wrote:
>>
>> On 16 Mar 2010, at 19:48, valrh...@gmail.com wrote:
>>
>>> Someone correct me if I'm wrong, but it could just be a coincidence. That
>>> is, perhaps the data that you copied happens to lead to the same dedup
>>> ratio as the data that's already on there. You could test this out by
>>> copying a few gigabytes of data you know is unique (like maybe a DVD video
>>> file or something), and that should change the dedup ratio.
>>
>> The first copy of that data was unique, and dedup is even switched off for
>> the entire pool, so it seems to be a bug in the calculation of the
>> dedupratio, or it uses a method that gives unexpected results.
>>
>> Paul
>
> beadm list -a
> and/or other snapshots that were taken before turning off dedup?

Possibly, but that should not matter. If I triple the amount of data in the pool with dedup switched off, the dedupratio should IMHO change, because the amount of non-deduped data has changed.

Paul
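To make the two readings of the ratio concrete, here is a small sketch comparing a whole-pool ratio (what Paul expected) with one restricted to dedup-attempted data. The function names and the 84 GB / 50 GB split are hypothetical; only the 93 GB figure comes from the thread (ALLOC growing from 132G to 225G in the original post).

```python
def ratio_dedup_only(referenced_gb: float, allocated_gb: float) -> float:
    """Ratio over data written while dedup was on (hypothetical model)."""
    return referenced_gb / allocated_gb if allocated_gb else 1.0


def ratio_whole_pool(referenced_gb: float, allocated_gb: float,
                     plain_gb: float) -> float:
    """Ratio over everything in the pool, counting non-deduped data as 1:1."""
    total_allocated = allocated_gb + plain_gb
    if total_allocated == 0:
        return 1.0
    return (referenced_gb + plain_gb) / total_allocated


# Hypothetical: 84 GB written with dedup on, stored in 50 GB.
REFERENCED_GB, ALLOCATED_GB = 84, 50

for plain_gb in (0, 93):  # before and after the 93 GB copy with dedup=off
    only = ratio_dedup_only(REFERENCED_GB, ALLOCATED_GB)
    whole = ratio_whole_pool(REFERENCED_GB, ALLOCATED_GB, plain_gb)
    print(f"plain={plain_gb:>3} GB  dedup-only={only:.2f}x  whole-pool={whole:.2f}x")
```

Under these assumptions the dedup-only figure stays put while the whole-pool figure drifts toward 1.00x as non-deduped data is added, which is the change Paul was expecting to see.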
Re: [zfs-discuss] dedupratio riddle
On 3/17/10 1:21 AM, Paul van der Zwan wrote:

> On 16 Mar 2010, at 19:48, valrh...@gmail.com wrote:
>
>> Someone correct me if I'm wrong, but it could just be a coincidence. That
>> is, perhaps the data that you copied happens to lead to the same dedup
>> ratio as the data that's already on there. You could test this out by
>> copying a few gigabytes of data you know is unique (like maybe a DVD video
>> file or something), and that should change the dedup ratio.
>
> The first copy of that data was unique, and dedup is even switched off for
> the entire pool, so it seems to be a bug in the calculation of the
> dedupratio, or it uses a method that gives unexpected results.
>
> Paul

beadm list -a
and/or other snapshots that were taken before turning off dedup?
Re: [zfs-discuss] dedupratio riddle
On 16 Mar 2010, at 19:48, valrh...@gmail.com wrote:

> Someone correct me if I'm wrong, but it could just be a coincidence. That is,
> perhaps the data that you copied happens to lead to the same dedup ratio as
> the data that's already on there. You could test this out by copying a few
> gigabytes of data you know is unique (like maybe a DVD video file or
> something), and that should change the dedup ratio.

The first copy of that data was unique, and dedup is even switched off for the entire pool, so it seems to be a bug in the calculation of the dedupratio, or it uses a method that gives unexpected results.

Paul
Re: [zfs-discuss] dedupratio riddle
Someone correct me if I'm wrong, but it could just be a coincidence. That is, perhaps the data that you copied happens to lead to the same dedup ratio as the data that's already on there. You could test this out by copying a few gigabytes of data you know is unique (like maybe a DVD video file or something), and that should change the dedup ratio.
[zfs-discuss] dedupratio riddle
On OpenSolaris build 134, upgraded from older versions, I have an rpool on which I had switched on dedup for a few weeks. After that I switched it back off. Now it seems the dedup ratio is stuck at a value of 1.68. Even when I copy more than 90 GB of data it still remains at 1.68.

Any ideas?

Paul

Here is some evidence…

Before the copy:

$ zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool   931G   132G   799G    14%  1.68x  ONLINE  -
$

After the copy:

$ zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool   931G   225G   706G    24%  1.68x  ONLINE  -
$

It was only enabled for 11 days last month.

$ pfexec zpool history | grep dedup
2010-02-11.21:19:42 zfs set dedup=verify rpool
2010-02-22.21:38:15 zfs set dedup=off rpool

And it is off on all filesystems:

$ zfs get -r dedup rpool
NAME                                       PROPERTY  VALUE  SOURCE
rpool                                      dedup     off    local
rp...@20100227                             dedup     -      -
rpool/ROOT                                 dedup     off    inherited from rpool
rpool/r...@20100227                        dedup     -      -
rpool/ROOT/b131-zones                      dedup     off    inherited from rpool
rpool/ROOT/b131-zo...@20100227             dedup     -      -
rpool/ROOT/b132                            dedup     off    inherited from rpool
rpool/ROOT/b...@20100227                   dedup     -      -
rpool/ROOT/b133                            dedup     off    inherited from rpool
rpool/ROOT/b134                            dedup     off    inherited from rpool
rpool/ROOT/b...@install                    dedup     -      -
rpool/ROOT/b...@2010-02-07-11:19:05        dedup     -      -
rpool/ROOT/b...@2010-02-20-15:59:22        dedup     -      -
rpool/ROOT/b...@20100227                   dedup     -      -
rpool/ROOT/b...@2010-03-11-19:18:51        dedup     -      -
rpool/dump                                 dedup     off    inherited from rpool
rpool/d...@20100227                        dedup     -      -
rpool/export                               dedup     off    inherited from rpool
rpool/exp...@20100227                      dedup     -      -
rpool/export/home                          dedup     off    inherited from rpool
rpool/export/h...@20100227                 dedup     -      -
rpool/export/home/beheer                   dedup     off    inherited from rpool
rpool/export/home/beh...@20100227          dedup     -      -
rpool/export/home/paulz                    dedup     off    inherited from rpool
rpool/export/home/pa...@20100227           dedup     -      -
rpool/export/share                         dedup     off    inherited from rpool
rpool/export/sh...@20100227                dedup     -      -
rpool/local                                dedup     off    inherited from rpool
rpool/lo...@20100227                       dedup     -      -
rpool/paulzmail                            dedup     off    inherited from rpool
rpool/paulzm...@20100227                   dedup     -      -
rpool/pkg                                  dedup     off    inherited from rpool
rpool/p...@20100227                        dedup     -      -
rpool/swap                                 dedup     off    inherited from rpool
rpool/s...@20100227                        dedup     -      -
rpool/zones                                dedup     off    inherited from rpool
rpool/zo...@20100227                       dedup     -      -
rpool/zones/buildzone                      dedup     off    inherited from rpool
rpool/zones/buildz...@20100227             dedup     -      -
rpool/zones/buildzone/ROOT                 dedup     off    inherited from rpool
rpool/zones/buildzone/r...@20100227        dedup     -      -
rpool/zones/buildzone/ROOT/zbe-1           dedup     off    inherited from rpool
rpool/zones/buildzone/ROOT/zb...@20100227  dedup     -      -
rpool/zones/buildzone/ROOT/zbe-2           dedup     off    inherited from rpool
rpool/zones/buildzone/ROOT/zb...@20100227  dedup     -      -
rpool