Re: [zfs-discuss] zfs diff performance

2012-02-28 Thread Ian Collins

On 02/28/12 12:53 PM, Ulrich Graef wrote:

Hi Ian,

On 26.02.12 23:42, Ian Collins wrote:

I had high hopes of significant performance gains using zfs diff in
Solaris 11 compared to my home-brew stat based version in Solaris 10.
However the results I have seen so far have been disappointing.

Testing on a reasonably sized filesystem (4TB), a diff that listed 41k
changes took 77 minutes. I haven't tried my old tool, but I would
expect the same diff to take a couple of hours.

Size does not matter (at least not here).
How many files do you have, and do you have enough cache in main memory
(25% of the ARC) or on a cache device (set to metadata only)?
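For reference, restricting the cache device to metadata is a one-line property change; a sketch, with `tank/fs` standing in as a placeholder for the actual pool/filesystem:

```shell
# Keep the L2ARC for metadata (dnodes etc.) only; "tank/fs" is a placeholder.
zfs set secondarycache=metadata tank/fs
# Verify the setting took effect:
zfs get secondarycache tank/fs
```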


Last time I looked, about 10 million files.

If you can manage it so that every dnode (512 bytes) is in the ARC or
the L2ARC, then your compare will fly!

If you are doing too much other work (are you doing I/O? Do you have
applications running?), it will evict dnode data from the cache, and the
compare will need to read a lot from disk.
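As a back-of-the-envelope check, using the 512-byte dnode figure above and the roughly 10 million files mentioned earlier:

```shell
# Rough memory needed to hold every dnode in cache:
# 10 million files x 512 bytes per dnode.
files=10000000
dnode_bytes=512
echo "$(( files * dnode_bytes / 1024 / 1024 )) MiB"   # prints "4882 MiB"
```

So on the order of 5 GB of dnode data, which fits in 96 GB of RAM or a 72 GB L2ARC only if little else is competing for the cache.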


There was a send running from the same pool.


You are comparing a measurement with a guess. That is not a valid test.


The guess is based on the last time I ran my old diff tool.


The box is well specified, an x4270 with 96G of RAM and a FLASH
accelerator card used for log and cache.

Number of files/size of files is missing.


As I said, about 10 million, of various sizes from bytes to gigabytes.

How much of the pool is used (in %)?


63%

Perhaps the recordsize has been lowered?
How much of it is used for the cache?
Did you set secondarycache=metadata?


No.


Also, has your burn-in been long enough that all the metadata is on the fast devices?
How large is your L2ARC?


72GB.

What is running in parallel to your test?
What is the disk configuration (you know: disks are slow)?


A stripe of five 2-way mirrors.


Do you use deduplication? (It does not directly harm performance, but it
needs memory and slows down zfs diff through that.)


No dedup!


Tell me the hit rates of the cache (metadata and data in ARC and L2ARC).
Good?
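On Solaris the ARC counters can be read with kstat; a sketch (the exact arcstats field names may vary by release):

```shell
# Dump the raw ARC hit/miss counters; hit rate = hits / (hits + misses).
kstat -p -n arcstats | egrep ':(hits|misses|l2_hits|l2_misses)$'
```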


I'll have to check next time I run a diff.

Raidz or mirror?

Are there any ways to improve diff performance?


Yes: mainly more memory. Or use fewer files.


Tell that to the users!

--
Ian.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs diff performance

2012-02-28 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Ulrich Graef
 
 Do you use de-duplication? (does not directly harm the performance, but
 needs memory
 and slows down zfs diff through that)?

Yikes.  That couldn't be more wrong.  Yes, dedup hurts performance, badly.
Yes, in theory dedup should be able to accelerate performance, but as it's
presently implemented, it is hurt too dramatically by hard disk seek/latency
access time.  As presently implemented, there is one and only one way dedup
improves performance: when you read duplicate blocks, you get about a 2-4x
read performance gain.  For all other operations (write duplicate, read
non-duplicate, write non-duplicate) performance is worse with dedup: as
little as 2x, as high as 10x or 20x even when you have sufficient memory
*and* you optimize (because the out-of-the-box configuration is very nearly
unusable), and effectively infinite if you have insufficient memory or you
fail to optimize.
