Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Casper . Dik
On Fri, 25 Sep 2009, James Lever wrote: NFS Version 3 introduces the concept of safe asynchronous writes. Being safe then requires a responsibility level on the client which is often not present. For example, if the server crashes, and then the client crashes, how does the client resend the

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Ross Walker
On Thu, Sep 24, 2009 at 11:29 PM, James Lever j...@jamver.id.au wrote: On 25/09/2009, at 11:49 AM, Bob Friesenhahn wrote: The commentary says that normally the COMMIT operations occur during close(2) or fsync(2) system call, or when encountering memory pressure. If the problem is slow

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Bob Friesenhahn
On Fri, 25 Sep 2009, Ross Walker wrote: As an aside, an slog device will not be too beneficial for large sequential writes, because it will be throughput bound, not latency bound. slog devices really help when you have lots of small sync writes. A RAIDZ2 with the ZIL spread across it will provide

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Richard Elling
On Sep 25, 2009, at 9:14 AM, Ross Walker wrote: On Fri, Sep 25, 2009 at 11:34 AM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Fri, 25 Sep 2009, Ross Walker wrote: As an aside, an slog device will not be too beneficial for large sequential writes, because it will be throughput bound

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread James Lever
On 26/09/2009, at 1:14 AM, Ross Walker wrote: By any chance do you have copies=2 set? No, only 1. So the double data going to the slog (as reported by iostat) is still confusing me and clearly potentially causing significant harm to my performance. Also, try setting

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Ross Walker
On Fri, Sep 25, 2009 at 5:24 PM, James Lever j...@jamver.id.au wrote: On 26/09/2009, at 1:14 AM, Ross Walker wrote: By any chance do you have copies=2 set? No, only 1. So the double data going to the slog (as reported by iostat) is still confusing me and clearly potentially causing

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Ross Walker
On Fri, Sep 25, 2009 at 1:39 PM, Richard Elling richard.ell...@gmail.com wrote: On Sep 25, 2009, at 9:14 AM, Ross Walker wrote: On Fri, Sep 25, 2009 at 11:34 AM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Fri, 25 Sep 2009, Ross Walker wrote: As an aside, an slog device will not be

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Marion Hakanson
j...@jamver.id.au said: For a predominantly NFS server purpose, it really looks like a case of the slog has to outperform your main pool for continuous write speed as well as an instant response time as the primary criterion. Which might as well be a fast (or group of fast) SSDs or 15kRPM

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Ross Walker
On Fri, Sep 25, 2009 at 5:47 PM, Marion Hakanson hakan...@ohsu.edu wrote: j...@jamver.id.au said: For a predominantly NFS server purpose, it really looks like a case of the slog has to outperform your main pool for continuous write speed as well as an instant response time as the primary

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Bob Friesenhahn
On Fri, 25 Sep 2009, Richard Elling wrote: By default, the txg commit will occur when 1/8 of memory is used for writes. For 30 GBytes, that would mean a main memory of only 240 Gbytes... feasible for modern servers. Ahem. We were advised that 7/8s of memory is currently what is allowed for
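
The arithmetic behind the quoted figure can be sketched as follows. This is only an illustration of the division involved, not ZFS code: the function name and the `write_fraction` parameter are hypothetical, and the fraction is left as a parameter because the excerpt above quotes both 1/8 and 7/8 as candidates.

```python
def memory_needed(pending_write_gbytes: float, write_fraction: float) -> float:
    """Main memory (in GBytes) needed so that `pending_write_gbytes` of
    buffered writes stays within `write_fraction` of RAM.

    Both arguments are illustrative assumptions, not ZFS tunables.
    """
    if not 0 < write_fraction <= 1:
        raise ValueError("write_fraction must be in (0, 1]")
    return pending_write_gbytes / write_fraction


if __name__ == "__main__":
    # Compare the two fractions mentioned in the thread excerpt.
    for label, fraction in (("1/8", 1 / 8), ("7/8", 7 / 8)):
        ram = memory_needed(30, fraction)
        print(f"write limit {label} of RAM: about {ram:.1f} GBytes of main memory")
```

With a fraction of 1/8, 30 GBytes of pending writes corresponds to the 240 GBytes of main memory mentioned above.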

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Bob Friesenhahn
On Fri, 25 Sep 2009, Ross Walker wrote: Problem is most SSD manufacturers list sustained throughput with large IO sizes, say 4MB, and not 128K, so it is tricky buying a good SSD that can handle the throughput. Who said that the slog SSD is written to in 128K chunks? That seems wrong to me.

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Marion Hakanson
rswwal...@gmail.com said: Yes, but if it's on NFS you can just figure out the workload in MB/s and use that as a rough guideline. I wonder if that's the case. We have an NFS server without NVRAM cache (X4500), and it gets huge MB/sec throughput on large-file writes over NFS. But it's

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Ross Walker
On Sep 25, 2009, at 6:19 PM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote: On Fri, 25 Sep 2009, Ross Walker wrote: Problem is most SSD manufacturers list sustained throughput with large IO sizes, say 4MB, and not 128K, so it is tricky buying a good SSD that can handle the throughput.

Re: [zfs-discuss] periodic slow responsiveness

2009-09-25 Thread Neil Perrin
On 09/25/09 16:19, Bob Friesenhahn wrote: On Fri, 25 Sep 2009, Ross Walker wrote: Problem is most SSD manufacturers list sustained throughput with large IO sizes, say 4MB, and not 128K, so it is tricky buying a good SSD that can handle the throughput. Who said that the slog SSD is written

Re: [zfs-discuss] periodic slow responsiveness

2009-09-24 Thread Bob Friesenhahn
On Thu, 24 Sep 2009, James Lever wrote: I was of the (mis)understanding that only metadata and writes smaller than 64k went via the slog device in the event of an O_SYNC write request? What would cause you to understand that? Is there a way to tune this on the NFS server or clients such

Re: [zfs-discuss] periodic slow responsiveness

2009-09-24 Thread Richard Elling
comment below... On Sep 23, 2009, at 10:00 PM, James Lever wrote: On 08/09/2009, at 2:01 AM, Ross Walker wrote: On Sep 7, 2009, at 1:32 AM, James Lever j...@jamver.id.au wrote: Well, an MD1000 holds 15 drives; a good compromise might be two 7-drive RAIDZ2s with a hot spare... That should provide

Re: [zfs-discuss] periodic slow responsiveness

2009-09-24 Thread James Lever
On 25/09/2009, at 2:58 AM, Richard Elling wrote: On Sep 23, 2009, at 10:00 PM, James Lever wrote: So it turns out that the problem is that all writes coming via NFS are going through the slog. When that happens, the transfer speed to the device drops to ~70MB/s (the write speed of his

Re: [zfs-discuss] periodic slow responsiveness

2009-09-24 Thread James Lever
On 25/09/2009, at 1:24 AM, Bob Friesenhahn wrote: On Thu, 24 Sep 2009, James Lever wrote: Is there a way to tune this on the NFS server or clients such that when I perform a large synchronous write, the data does not go via the slog device? Synchronous writes are needed by NFS to

Re: [zfs-discuss] periodic slow responsiveness

2009-09-24 Thread Bob Friesenhahn
On Fri, 25 Sep 2009, James Lever wrote: NFS Version 3 introduces the concept of safe asynchronous writes. Being safe then requires a responsibility level on the client which is often not present. For example, if the server crashes, and then the client crashes, how does the client resend

Re: [zfs-discuss] periodic slow responsiveness

2009-09-24 Thread James Lever
On 25/09/2009, at 11:49 AM, Bob Friesenhahn wrote: The commentary says that normally the COMMIT operations occur during close(2) or fsync(2) system call, or when encountering memory pressure. If the problem is slow copying of many small files, this COMMIT approach does not help very much

Re: [zfs-discuss] periodic slow responsiveness

2009-09-24 Thread James Lever
I thought I would try the same test using dd bs=131072 if=source of=/path/to/nfs to see what the results looked like… It is very similar to before, about 2x slog usage and same timing and write totals. Friday, 25 September 2009 1:49:48 PM EST extended device

Re: [zfs-discuss] periodic slow responsiveness

2009-09-23 Thread James Lever
On 08/09/2009, at 2:01 AM, Ross Walker wrote: On Sep 7, 2009, at 1:32 AM, James Lever j...@jamver.id.au wrote: Well, an MD1000 holds 15 drives; a good compromise might be two 7-drive RAIDZ2s with a hot spare... That should provide 320 IOPS instead of 160, big difference. The issue is
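
A rough sketch of the estimate quoted above, under two assumptions that are not stated in the excerpt: that each RAIDZ2 vdev delivers about 160 random IOPS (the figure implied by "320 instead of 160"), and that pool IOPS scale linearly with the number of vdevs. The function names are hypothetical.

```python
def pool_random_iops(vdevs: int, iops_per_vdev: int = 160) -> int:
    """Estimated random IOPS for a pool, assuming linear scaling per vdev.

    The default of 160 IOPS per vdev is an assumption inferred from the
    figures quoted in the thread, not a measured value.
    """
    return vdevs * iops_per_vdev


def drives_used(vdevs: int, drives_per_vdev: int, hot_spares: int) -> int:
    """Total drive bays consumed by the layout."""
    return vdevs * drives_per_vdev + hot_spares


if __name__ == "__main__":
    # Layout suggested in the thread: two 7-drive RAIDZ2 vdevs plus one hot spare.
    print("drive bays used:", drives_used(vdevs=2, drives_per_vdev=7, hot_spares=1))
    print("one vdev :", pool_random_iops(1), "IOPS (estimate)")
    print("two vdevs:", pool_random_iops(2), "IOPS (estimate)")
```

Under these assumptions the suggested layout fills all 15 bays of the MD1000 and doubles the estimate from 160 to 320 IOPS.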

[zfs-discuss] periodic slow responsiveness

2009-09-06 Thread James Lever
I’m experiencing occasional slow responsiveness on an OpenSolaris b118 system typically noticed when running an ‘ls’ (no extra flags, so no directory service lookups). There is a delay of between 2 and 30 seconds but no correlation has been noticed with load on the server and the slow

Re: [zfs-discuss] periodic slow responsiveness

2009-09-06 Thread Ross Walker
On Sun, Sep 6, 2009 at 9:15 AM, James Lever j...@jamver.id.au wrote: I’m experiencing occasional slow responsiveness on an OpenSolaris b118 system typically noticed when running an ‘ls’ (no extra flags, so no directory service lookups). There is a delay of between 2 and 30 seconds but no

Re: [zfs-discuss] periodic slow responsiveness

2009-09-06 Thread Richard Elling
On Sep 6, 2009, at 7:53 AM, Ross Walker wrote: On Sun, Sep 6, 2009 at 9:15 AM, James Lever j...@jamver.id.au wrote: I’m experiencing occasional slow responsiveness on an OpenSolaris b118 system typically noticed when running an ‘ls’ (no extra flags, so no directory service lookups). There is

Re: [zfs-discuss] periodic slow responsiveness

2009-09-06 Thread James Lever
On 07/09/2009, at 6:24 AM, Richard Elling wrote: On Sep 6, 2009, at 7:53 AM, Ross Walker wrote: On Sun, Sep 6, 2009 at 9:15 AM, James Lever j...@jamver.id.au wrote: I’m experiencing occasional slow responsiveness on an OpenSolaris b118 system typically noticed when running an ‘ls’ (no

Re: [zfs-discuss] periodic slow responsiveness

2009-09-06 Thread Ross Walker
Sorry for my earlier post; I responded prematurely. On Sep 6, 2009, at 9:15 AM, James Lever j...@jamver.id.au wrote: I’m experiencing occasional slow responsiveness on an OpenSolaris b118 system typically noticed when running an ‘ls’ (no extra flags, so no directory service lookups).

Re: [zfs-discuss] periodic slow responsiveness

2009-09-06 Thread Richard Elling
On Sep 6, 2009, at 5:06 PM, James Lever wrote: On 07/09/2009, at 6:24 AM, Richard Elling wrote: On Sep 6, 2009, at 7:53 AM, Ross Walker wrote: On Sun, Sep 6, 2009 at 9:15 AM, James Lever j...@jamver.id.au wrote: I’m experiencing occasional slow responsiveness on an OpenSolaris b118 system

Re: [zfs-discuss] periodic slow responsiveness

2009-09-06 Thread James Lever
On 07/09/2009, at 11:08 AM, Richard Elling wrote: Ok, just so I am clear, when you mean local automount you are on the server and using the loopback -- no NFS or network involved? Correct. And the behaviour has been seen locally as well as remotely. You are looking for I/O that takes

Re: [zfs-discuss] periodic slow responsiveness

2009-09-06 Thread James Lever
On 07/09/2009, at 10:46 AM, Ross Walker wrote: zpool is RAIDZ2 comprised of 10 * 15kRPM SAS drives behind an LSI 1078 w/ 512MB BBWC exposed as RAID0 LUNs (Dell MD1000 behind PERC 6/E) with 2x SSDs each partitioned as 10GB slog and 36GB remainder as l2arc behind another LSI 1078 w/ 256MB