Re: [zfs-discuss] dedupratio riddle

2010-03-18 Thread Daniel Carosone
As noted, the ratio calculation covers only the data that dedup was
attempted on, not the whole pool.  However, I saw a commit go by in the
last couple of days about the dedupratio calculation being misleading,
though I didn't check the details.  Presumably this will be reported
differently in upcoming builds.

--
Dan.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] dedupratio riddle

2010-03-18 Thread Henrik Johansson

On 18 mar 2010, at 18.38, Craig Alder wrote:

> I remembered reading a post about this a couple of months back.
> This post by Jeff Bonwick confirms that the dedupratio is calculated
> only on the data that you've attempted to deduplicate, i.e. only the
> data written whilst dedup is turned on -
> http://mail.opensolaris.org/pipermail/zfs-discuss/2009-December/034721.html.


Ah, I was on the right track with the DDT then :) I guess most
people will have it turned on or off from the beginning, until BP rewrite
arrives, to ensure everything is deduplicated (which is probably a good idea).


Regards

Henrik
http://sparcv9.blogspot.com



Re: [zfs-discuss] dedupratio riddle

2010-03-18 Thread Craig Alder
I remembered reading a post about this a couple of months back.  This post by 
Jeff Bonwick confirms that the dedupratio is calculated only on the data that 
you've attempted to deduplicate, i.e. only the data written whilst dedup is 
turned on - 
http://mail.opensolaris.org/pipermail/zfs-discuss/2009-December/034721.html.
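
A minimal sketch of that behaviour, in case it helps. This is a toy model, not the ZFS implementation: the `ToyPool` class, its fields, and the idea of computing the ratio as referenced bytes over allocated bytes in a dedup table are all assumptions made for illustration. It only shows why data written with dedup off cannot move a ratio that is computed from the dedup table alone.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPool:
    """Hypothetical model: only blocks written with dedup on enter the DDT."""
    ddt: dict = field(default_factory=dict)  # checksum -> (block size, refcount)
    other_bytes: int = 0                     # data written while dedup was off
    dedup_on: bool = False

    def write(self, checksum: str, size: int) -> None:
        if self.dedup_on:
            # A repeated checksum only bumps the reference count.
            _, refs = self.ddt.get(checksum, (size, 0))
            self.ddt[checksum] = (size, refs + 1)
        else:
            # Written outside the dedup table, so invisible to the ratio.
            self.other_bytes += size

    def dedupratio(self) -> float:
        referenced = sum(size * refs for size, refs in self.ddt.values())
        allocated = sum(size for size, _ in self.ddt.values())
        return referenced / allocated if allocated else 1.0


pool = ToyPool(dedup_on=True)
for checksum in ["a", "a", "a", "b"]:
    pool.write(checksum, 128 * 1024)
before = pool.dedupratio()

pool.dedup_on = False
for i in range(1000):
    pool.write(f"unique-{i}", 128 * 1024)
after = pool.dedupratio()

print(before, after)
```

In this model the ratio is the same before and after the second batch of writes, however much unique data is added with dedup off.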

Regards,

Craig
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] dedupratio riddle

2010-03-18 Thread Paul van der Zwan

On 18 mrt 2010, at 10:07, Henrik Johansson wrote:

> Hello,
> 
> On 17 mar 2010, at 16.22, Paul van der Zwan  wrote:
> 
>> 
>> On 16 mrt 2010, at 19:48, valrh...@gmail.com wrote:
>> 
>>> Someone correct me if I'm wrong, but it could just be a coincidence. That 
>>> is, perhaps the data that you copied happens to lead to a dedup ratio 
>>> relative to the data that's already on there. You could test this out by 
>>> copying a few gigabytes of data you know is unique (like maybe a DVD video 
>>> file or something), and that should change the dedup ratio.
>> 
>> The first copy of that data was unique and even dedup is switched off for 
>> the entire pool so it seems a bug in the calculation of the
>> dedupratio or it used a method that is giving unexpected results.
> 
> I wonder if the dedup ratio is calculated from the contents of the DDT or from 
> all the data in the whole pool; I've only looked at the ratio for 
> datasets which had dedup on for their whole lifetime. If the former, data added 
> while it's switched off will never alter the ratio (until rewritten with 
> dedup on). The source should have the answer, but I'm on mail only for a few 
> weeks.
> 
> It's probably for the whole dataset, that makes the most sense, just a 
> thought.
> 

It looks like the ratio only gets updated when dedup is switched on and freezes 
if you switch dedup off for the entire pool, like I did.

I tried to have a look at the source but it was way too complex to figure it 
out in the time I had available so far.

Best regards,
Paul van der Zwan
Sun Microsystems Nederland

> Regards
> 
> Henrik
> http://sparcv9.blogspot.com



Re: [zfs-discuss] dedupratio riddle

2010-03-18 Thread Henrik Johansson

Hello,

On 17 mar 2010, at 16.22, Paul van der Zwan wrote:

> On 16 mrt 2010, at 19:48, valrh...@gmail.com wrote:
>
>> Someone correct me if I'm wrong, but it could just be a
>> coincidence. That is, perhaps the data that you copied happens to
>> lead to a dedup ratio relative to the data that's already on there.
>> You could test this out by copying a few gigabytes of data you know
>> is unique (like maybe a DVD video file or something), and that
>> should change the dedup ratio.
>
> The first copy of that data was unique and even dedup is switched
> off for the entire pool so it seems a bug in the calculation of the
> dedupratio or it used a method that is giving unexpected results.


I wonder if the dedup ratio is calculated from the contents of the DDT
or from all the data in the whole pool; I've only looked at the
ratio for datasets which had dedup on for their whole lifetime. If the
former, data added while it's switched off will never alter the ratio
(until rewritten with dedup on). The source should have the
answer, but I'm on mail only for a few weeks.


It's probably for the whole dataset, that makes the most sense, just a
thought.


Regards

Henrik
http://sparcv9.blogspot.com


Re: [zfs-discuss] dedupratio riddle

2010-03-17 Thread Paul van der Zwan

On 17 mrt 2010, at 10:56, zfs ml wrote:

> On 3/17/10 1:21 AM, Paul van der Zwan wrote:
>> 
>> On 16 mrt 2010, at 19:48, valrh...@gmail.com wrote:
>> 
>>> Someone correct me if I'm wrong, but it could just be a coincidence. That 
>>> is, perhaps the data that you copied happens to lead to a dedup ratio 
>>> relative to the data that's already on there. You could test this out by 
>>> copying a few gigabytes of data you know is unique (like maybe a DVD video 
>>> file or something), and that should change the dedup ratio.
>> 
>> The first copy of that data was unique and even dedup is switched off for 
>> the entire pool so it seems a bug in the calculation of the
>> dedupratio or it used a method that is giving unexpected results.
>> 
>>  Paul
> 
> beadm list -a
> and/or other snapshots that were taken before turning off dedup?

Possibly, but that should not matter. If I triple the amount of data in the 
pool with dedup switched off, the dedupratio should IMHO change, because the 
amount of non-deduped data has changed.
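
To put a rough number on that expectation: the sketch below assumes, hypothetically, that the ratio were computed over the whole pool and that the roughly 93G copied were all unique. The function name is made up for illustration; the figures are the ALLOC and DEDUP values from the `zpool list` output in the original post.

```python
def whole_pool_ratio_after_copy(alloc_before_g: float,
                                ratio_before: float,
                                alloc_after_g: float) -> float:
    """Ratio expected IF it were computed over everything allocated in the pool.

    Assumes the newly allocated space holds only unique (non-deduped) data.
    """
    referenced_before = alloc_before_g * ratio_before
    added_unique = alloc_after_g - alloc_before_g
    return (referenced_before + added_unique) / alloc_after_g


# Values from the zpool list output: 132G allocated at 1.68x, then 225G.
expected = whole_pool_ratio_after_copy(132, 1.68, 225)
print(f"{expected:.2f}x")
```

Under those assumptions the reported value would have dropped to roughly 1.40x, so an unchanged 1.68x is consistent with the ratio not covering the newly copied data at all.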

Paul
 


Re: [zfs-discuss] dedupratio riddle

2010-03-17 Thread zfs ml

On 3/17/10 1:21 AM, Paul van der Zwan wrote:
>
> On 16 mrt 2010, at 19:48, valrh...@gmail.com wrote:
>
>> Someone correct me if I'm wrong, but it could just be a coincidence. That is, 
>> perhaps the data that you copied happens to lead to a dedup ratio relative to 
>> the data that's already on there. You could test this out by copying a few 
>> gigabytes of data you know is unique (like maybe a DVD video file or 
>> something), and that should change the dedup ratio.
>
> The first copy of that data was unique and even dedup is switched off for the 
> entire pool so it seems a bug in the calculation of the
> dedupratio or it used a method that is giving unexpected results.
>
> Paul

beadm list -a
and/or other snapshots that were taken before turning off dedup?


Re: [zfs-discuss] dedupratio riddle

2010-03-17 Thread Paul van der Zwan

On 16 mrt 2010, at 19:48, valrh...@gmail.com wrote:

> Someone correct me if I'm wrong, but it could just be a coincidence. That is, 
> perhaps the data that you copied happens to lead to a dedup ratio relative to 
> the data that's already on there. You could test this out by copying a few 
> gigabytes of data you know is unique (like maybe a DVD video file or 
> something), and that should change the dedup ratio.

The first copy of that data was unique, and dedup is even switched off for the 
entire pool, so it seems to be either a bug in the calculation of the 
dedupratio or a calculation method that gives unexpected results.

Paul



Re: [zfs-discuss] dedupratio riddle

2010-03-16 Thread valrh...@gmail.com
Someone correct me if I'm wrong, but it could just be a coincidence. That is, 
perhaps the data that you copied happens to yield the same dedup ratio as 
the data that's already on there. You could test this out by copying a few 
gigabytes of data you know is unique (like maybe a DVD video file or 
something), and that should change the dedup ratio.


[zfs-discuss] dedupratio riddle

2010-03-16 Thread Paul van der Zwan
On OpenSolaris build 134, upgraded from older versions, I have an rpool for 
which I had switched on dedup for a few weeks. 
After that I switched it back off.
Now it seems the dedup ratio is stuck at a value of 1.68.
Even when I copy more than 90 GB of data it still remains at 1.68.
Any ideas?

Paul
Here is some evidence…

Before the copy :
$ zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool   931G   132G   799G    14%  1.68x  ONLINE  -
$ 

After the copy :
$ zpool list
NAME    SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool   931G   225G   706G    24%  1.68x  ONLINE  -
$
It has only been enabled for 11 days last month.

$ pfexec zpool history |grep dedup
2010-02-11.21:19:42 zfs set dedup=verify rpool
2010-02-22.21:38:15 zfs set dedup=off rpool

And it is off on all filesystems:
$ zfs get -r dedup rpool
NAME                                       PROPERTY  VALUE  SOURCE
rpool                                      dedup     off    local
rp...@20100227                             dedup     -      -
rpool/ROOT                                 dedup     off    inherited from rpool
rpool/r...@20100227                        dedup     -      -
rpool/ROOT/b131-zones                      dedup     off    inherited from rpool
rpool/ROOT/b131-zo...@20100227             dedup     -      -
rpool/ROOT/b132                            dedup     off    inherited from rpool
rpool/ROOT/b...@20100227                   dedup     -      -
rpool/ROOT/b133                            dedup     off    inherited from rpool
rpool/ROOT/b134                            dedup     off    inherited from rpool
rpool/ROOT/b...@install                    dedup     -      -
rpool/ROOT/b...@2010-02-07-11:19:05        dedup     -      -
rpool/ROOT/b...@2010-02-20-15:59:22        dedup     -      -
rpool/ROOT/b...@20100227                   dedup     -      -
rpool/ROOT/b...@2010-03-11-19:18:51        dedup     -      -
rpool/dump                                 dedup     off    inherited from rpool
rpool/d...@20100227                        dedup     -      -
rpool/export                               dedup     off    inherited from rpool
rpool/exp...@20100227                      dedup     -      -
rpool/export/home                          dedup     off    inherited from rpool
rpool/export/h...@20100227                 dedup     -      -
rpool/export/home/beheer                   dedup     off    inherited from rpool
rpool/export/home/beh...@20100227          dedup     -      -
rpool/export/home/paulz                    dedup     off    inherited from rpool
rpool/export/home/pa...@20100227           dedup     -      -
rpool/export/share                         dedup     off    inherited from rpool
rpool/export/sh...@20100227                dedup     -      -
rpool/local                                dedup     off    inherited from rpool
rpool/lo...@20100227                       dedup     -      -
rpool/paulzmail                            dedup     off    inherited from rpool
rpool/paulzm...@20100227                   dedup     -      -
rpool/pkg                                  dedup     off    inherited from rpool
rpool/p...@20100227                        dedup     -      -
rpool/swap                                 dedup     off    inherited from rpool
rpool/s...@20100227                        dedup     -      -
rpool/zones                                dedup     off    inherited from rpool
rpool/zo...@20100227                       dedup     -      -
rpool/zones/buildzone                      dedup     off    inherited from rpool
rpool/zones/buildz...@20100227             dedup     -      -
rpool/zones/buildzone/ROOT                 dedup     off    inherited from rpool
rpool/zones/buildzone/r...@20100227        dedup     -      -
rpool/zones/buildzone/ROOT/zbe-1           dedup     off    inherited from rpool
rpool/zones/buildzone/ROOT/zb...@20100227  dedup     -      -
rpool/zones/buildzone/ROOT/zbe-2           dedup     off    inherited from rpool
rpool/zones/buildzone/ROOT/zb...@20100227  dedup     -      -
rpool