comment below...

On Sep 23, 2009, at 10:00 PM, James Lever wrote:

On 08/09/2009, at 2:01 AM, Ross Walker wrote:
On Sep 7, 2009, at 1:32 AM, James Lever <j...@jamver.id.au> wrote:

Well, an MD1000 holds 15 drives, so a good compromise might be two 7-drive RAIDZ2s with a hot spare... That should provide 320 IOPS instead of 160, a big difference.

The issue is interactive responsiveness, and whether there is a way to tune the system to provide that while still having good performance for builds when they are run.

Look at the write IOPS of the pool with 'zpool iostat -v' and see how many are happening on the RAIDZ2 vdev.
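For example, something along these lines, run while the stalls are occurring (the pool name 'tank' is only a placeholder):

  # zpool iostat -v tank 5

That prints per-vdev read/write operations and bandwidth every 5 seconds, so you can compare how many write IOPS land on the RAIDZ2 vdev versus the log device.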

I was suggesting that slog writes were possibly starving reads from the L2ARC, as they were on the same device. This appears not to have been the issue, as the problem has persisted even with the L2ARC devices removed from the pool.

The SSD will handle a lot more IOPS than the pool, and the L2ARC is a lazy reader; it mostly just holds on to read cache data.

It may just be that the pool configuration can't handle the write IOPS needed and reads are starving.

Possible, but hard to tell. Have a look at the iostat results I’ve posted.

The busy times of the disks while the issue is occurring should let you know.

So it turns out that the problem is that all writes coming via NFS are going through the slog. When that happens, the transfer speed to the device drops to ~70MB/s (the write speed of this SLC SSD) and, until the load drops, all new write requests are blocked, causing a noticeable delay (observed to be up to 20s, but generally only 2-4s).

Thank you sir, can I have another?
If you add (not attach) more slogs, the workload will be spread across them. But...
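Roughly like this (device names are placeholders for whatever the extra SSDs show up as):

  # zpool add tank log c2t1d0
  # zpool add tank log c2t2d0

Each 'zpool add ... log' creates another independent slog and the ZIL traffic is spread across them; 'zpool attach' would instead mirror the existing slog, which adds redundancy but no write bandwidth.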


I can reproduce this behaviour by copying a large file (hundreds of MB in size) using 'cp src dst' on an NFS (still currently v3) client and observing that all data is pushed through the slog device (a 10GB partition of a Samsung 50GB SSD behind a PERC 6/i w/256MB battery-backed cache) rather than going directly to the primary storage disks.

On a related note, I had two of these devices (both using just 10GB partitions) connected as log devices (so the pool had two separate log devices), and the second one was consistently running significantly slower than the first. Removing the second device improved performance, but did not eliminate the occasional observed pauses.

...this is not surprising when you add a slow slog device. This is the weakest-link rule.

I was of the (mis)understanding that only metadata and writes smaller than 64k went via the slog device in the event of an O_SYNC write request?

The threshold is 32 kBytes, which is unfortunately the same as the default
NFS write size. See CR6686887
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6686887

If you have a slog and logbias=latency (default) then the writes go to the slog. So there is some interaction here that can affect NFS workloads in particular.
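You can check what a given filesystem is using with something like this (filesystem name is only an example):

  # zfs get logbias tank/export

which will report 'latency' with a source of 'default' unless it has been changed.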


The clients are (mostly) RHEL5.

Is there a way to tune this on the NFS server or clients such that when I perform a large synchronous write, the data does not go via the slog device?

You can change the IOP size on the client.
 -- richard
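On a RHEL5 client that would be roughly something like the following, with the sizes only illustrative and subject to what the server will negotiate:

  # mount -t nfs -o vers=3,rsize=65536,wsize=65536 server:/export /mnt

or the equivalent rsize/wsize options in /etc/fstab, the idea being to make individual NFS writes larger than the 32 kByte threshold mentioned above.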


I have investigated using the logbias setting, but that would also kill small-file performance on any filesystem using it and defeat the purpose of having a slog device at all.
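For reference, the change in question is along the lines of (filesystem name only illustrative):

  # zfs set logbias=throughput tank/builds

which stops synchronous writes on that filesystem from using the separate log device and sends them to the main pool disks instead; it can be set per filesystem, but any filesystem set that way also loses the slog benefit for its small synchronous writes.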

cheers,
James

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
