Re: [zfs-discuss] Improving snapshot write performance

2012-04-11 Thread Richard Elling
On Apr 11, 2012, at 1:34 AM, Ian Collins wrote:

 I use an application with a fairly large receive data buffer (256MB) to 
 replicate data between sites.
 
 I have noticed the buffer becoming completely full when receiving snapshots 
 for some filesystems, even over a slow (~2MB/sec) WAN connection.  I assume 
 this is due to the changes being widely scattered.

Widely scattered on the sending side; the receiving side should be mostly
contiguous... unless you are mostly full or there is some other cause of
slow writes. The usual disk-oriented performance analysis will show if this
is the case. Most likely, something else is going on here.
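For instance (a sketch only; "tank" stands in for the real pool name),
watching the receive side while a stream is coming in will show whether
the disks are actually the bottleneck:

  # per-vdev bandwidth and IOPS on the receiving pool, 5-second samples
  zpool iostat -v tank 5

  # device-level latency: look for large asvc_t or %b near 100
  iostat -xnz 5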

 Is there any way to improve this situation?

Surely there must be...
 -- richard

--
ZFS Performance and Training
richard.ell...@richardelling.com
+1-760-896-4422

Re: [zfs-discuss] Improving snapshot write performance

2012-04-11 Thread Ian Collins

On 04/12/12 04:17 AM, Richard Elling wrote:

On Apr 11, 2012, at 1:34 AM, Ian Collins wrote:

I use an application with a fairly large receive data buffer (256MB) 
to replicate data between sites.


I have noticed the buffer becoming completely full when receiving 
snapshots for some filesystems, even over a slow (~2MB/sec) WAN 
connection.  I assume this is due to the changes being widely scattered.


Widely scattered on the sending side; the receiving side should be mostly
contiguous...


That's what I originally thought.

unless you are mostly full or there is some other cause of slow writes.
The usual disk-oriented performance analysis will show if this is the
case. Most likely, something else is going on here.


Odd.  The pool is a single iSCSI volume exported from a 7320 and there
is 18TB free.


I see the same issues with local replications on our LAN.  The 
filesystems that appear to write slowly are ones containing many small 
files, such as office documents.


Over the WAN, the receive buffer high water mark is usually the TCP 
receive window size, except for the apparently slow filesystems.


I'll add some more diagnostics.
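Probably something along these lines (just a sketch; "tank" is a
placeholder for the receiving pool):

  # sample pool activity on the receiving box while a receive is running
  zpool iostat tank 1

  # default TCP receive window on the receive side
  ndd /dev/tcp tcp_recv_hiwat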

--
Ian.


Re: [zfs-discuss] Improving snapshot write performance

2012-04-11 Thread Peter Jeremy
On 2012-Apr-11 18:34:42 +1000, Ian Collins i...@ianshome.com wrote:
I use an application with a fairly large receive data buffer (256MB) to 
replicate data between sites.

I have noticed the buffer becoming completely full when receiving 
snapshots for some filesystems, even over a slow (~2MB/sec) WAN 
connection.  I assume this is due to the changes being widely scattered.

As Richard pointed out, the write side should be mostly contiguous.

Is there any way to improve this situation?

Is the target pool nearly full (so ZFS is spending lots of time searching
for free space)?

Do you have dedupe enabled on the target pool?  This would force ZFS to
search the DDT for every block written - expensive, especially if you
don't have enough RAM to hold the DDT.

Do you have a high compression level (gzip or gzip-N) on the target
filesystems, without enough CPU horsepower?

Do you have a dying (or dead) disk in the target pool?
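
A quick way to rule most of those out on the receive side (a sketch;
substitute your own pool and filesystem names for "tank"):

  # pool capacity - performance degrades as CAP climbs past ~80%
  zpool list tank

  # dedup and compression settings on the target filesystems
  zfs get dedup,compression tank

  # report only pools that are degraded or have errors
  zpool status -x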

-- 
Peter Jeremy

Re: [zfs-discuss] Improving snapshot write performance

2012-04-11 Thread Ian Collins

On 04/12/12 09:00 AM, Jim Klimov wrote:

2012-04-11 23:55, Ian Collins wrote:

Odd. The pool is a single iSCSI volume exported from a 7320 and there is
18TB free.

Lame question: is that 18TB free on the pool inside the
iSCSI volume, or on the backing pool on the 7320?

I mean that as far as the external pool is concerned,
the zvol's blocks are allocated - even if the internal
pool considers them deleted but did not zero them out
and/or TRIM them explicitly.

Thus there may be lags due to fragmentation on the backing
external pool (the physical one on the 7320), especially if it has
little free space and/or its free space is already heavily
fragmented into many small bubbles.


I'll check, but I see the same effect with local replications as well.
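
To compare the two layers, I guess something like this would do (pool
and zvol names are placeholders):

  # free space as seen inside the pool imported from the iSCSI volume
  zpool list tank

  # allocation of the backing zvol as the 7320 sees it
  zfs list -o name,used,avail,refer pool/zvol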

--
Ian.


Re: [zfs-discuss] Improving snapshot write performance

2012-04-11 Thread Ian Collins

On 04/12/12 09:51 AM, Peter Jeremy wrote:

On 2012-Apr-11 18:34:42 +1000, Ian Collins i...@ianshome.com wrote:

I use an application with a fairly large receive data buffer (256MB) to
replicate data between sites.

I have noticed the buffer becoming completely full when receiving
snapshots for some filesystems, even over a slow (~2MB/sec) WAN
connection.  I assume this is due to the changes being widely scattered.

As Richard pointed out, the write side should be mostly contiguous.


Is there any way to improve this situation?

Is the target pool nearly full (so ZFS is spending lots of time searching
for free space)?

Do you have dedupe enabled on the target pool?  This would force ZFS to
search the DDT for every block written - expensive, especially if you
don't have enough RAM to hold the DDT.

Do you have a high compression level (gzip or gzip-N) on the target
filesystems, without enough CPU horsepower?

Do you have a dying (or dead) disk in the target pool?


No to all of the above!

--
Ian.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss