Re: [zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-10 Thread Patrick Skerrett

Thanks for the explanation folks.

So if I cannot get Apache/WebDAV to write synchronously (and it does not 
look like I can), is it possible to tune the ARC to be more heavily 
weighted toward write buffering?


My biggest problem is with very quick write spikes that occur periodically 
throughout the day. If I were able to buffer these better, I would be in 
pretty good shape. The machines are already (economically) maxed out on 
RAM at 32 GB.


If I were to add the SSD L2ARC devices for read caching, can I 
configure the ARC to give up some of its read caching for more write 
buffering?
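
(For context, the only knobs I know of are the overall ARC size cap, 
e.g. in /etc/system:

set zfs:zfs_arc_max=0x600000000

which caps the ARC at 24 GB after a reboot, and adding an SSD as an 
L2ARC cache device:

zpool add webpool cache c1t2d0

where the pool and device names are just examples. But zfs_arc_max only 
caps the read cache; it doesn't turn any of it into a write buffer, 
which is why I'm asking.)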



Thanks.




Neil Perrin wrote:

Patrick,

The ZIL is only used for synchronous requests like O_DSYNC/O_SYNC and
fsync(). Your iozone command must be doing some synchronous writes.
All the other tests (dd, cat, cp, ...) do everything asynchronously.
That is, they do not require the data to be on stable storage on
return from the write. So asynchronous writes get cached in memory
(the ARC) and written out periodically (every 30 seconds or less)
when the transaction group commits.

The ZIL would be heavily used if your system were an NFS server.
Databases also do synchronous writes.

Neil.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-10 Thread Eric D. Mudama

On Fri, Apr 10 at  8:07, Patrick Skerrett wrote:

Thanks for the explanation folks.

So if I cannot get Apache/WebDAV to write synchronously (and it does  
not look like I can), is it possible to tune the ARC to be more heavily  
weighted toward write buffering?


My biggest problem is with very quick write spikes that occur periodically  
throughout the day. If I were able to buffer these better, I would be in  
pretty good shape. The machines are already (economically) maxed out on  
RAM at 32 GB.


If I were to add the SSD L2ARC devices for read caching, can I  
configure the ARC to give up some of its read caching for more write  
buffering?


I think in most cases the raw spindle throughput should be enough to
handle your load, or else you haven't sized your arrays properly.
Bursts of relatively large async writes should be headed to the media
at somewhere around 50-100 MB/s per vdev, I would think. How much
burst I/O do you have?
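
If you want to put a number on it, sampling the pool while a burst is
in flight will tell you (the pool name below is just an example):

zpool iostat -v webpool 1 60
iostat -xnz 1 60

The first shows pool/vdev bandwidth once a second for a minute; the
second shows per-disk service times and queue depths over the same
window.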

--
Eric D. Mudama
edmud...@mail.bounceswoosh.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-10 Thread Patrick Skerrett

More than that :)

It's a very short duration, but we have the potential for tens of 
thousands of clients doing writes all at the same time. I have the farm 
spread out over 16 servers, each with 2x 4Gb fiber cards into big disk 
arrays, but my reads do get slow (resulting in end user experience 
degradation) when these write bursts come in, and if I could buffer them 
even for 60 seconds, it would make everything much smoother.



Is there a way to optimize the ARC for more write buffering, and push 
more read caching off into the L2ARC?


Again, I'm only worried about short bursts that happen once or twice a 
day. The rest of the time everything runs very smoothly.



Thanks.



Eric D. Mudama wrote:

On Fri, Apr 10 at 8:07, Patrick Skerrett wrote:

Thanks for the explanation folks.

So if I cannot get Apache/WebDAV to write synchronously (and it does 
not look like I can), is it possible to tune the ARC to be more heavily 
weighted toward write buffering?


My biggest problem is with very quick write spikes that occur 
periodically throughout the day. If I were able to buffer these better, 
I would be in pretty good shape. The machines are already (economically) 
maxed out on RAM at 32 GB.


If I were to add the SSD L2ARC devices for read caching, can I 
configure the ARC to give up some of its read caching for more write 
buffering?


I think in most cases the raw spindle throughput should be enough to
handle your load, or else you haven't sized your arrays properly.
Bursts of relatively large async writes should be headed to the media
at somewhere around 50-100 MB/s per vdev, I would think. How much
burst I/O do you have?



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-10 Thread Mark J Musante

On Fri, 10 Apr 2009, Patrick Skerrett wrote:

degradation) when these write bursts come in, and if I could buffer them 
even for 60 seconds, it would make everything much smoother.


ZFS already batches up writes into a transaction group, which currently 
happens every 30 seconds. Have you tested ZFS against a real-world 
workload?



Regards,
markm
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-10 Thread Patrick Skerrett

Yes, we are currently running ZFS, just without an L2ARC or an offloaded ZIL.





Mark J Musante wrote:

On Fri, 10 Apr 2009, Patrick Skerrett wrote:

degradation) when these write bursts come in, and if I could buffer 
them even for 60 seconds, it would make everything much smoother.


ZFS already batches up writes into a transaction group, which 
currently happens every 30 seconds. Have you tested ZFS against a 
real-world workload?



Regards,
markm



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-10 Thread Toby Thain


On 10-Apr-09, at 5:05 PM, Mark J Musante wrote:


On Fri, 10 Apr 2009, Patrick Skerrett wrote:

degradation) when these write bursts come in, and if I could  
buffer them even for 60 seconds, it would make everything much  
smoother.


ZFS already batches up writes into a transaction group, which  
currently happens every 30 seconds.



Isn't that 5 seconds?

--T


  Have you tested ZFS against a real-world workload?


Regards,
markm


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-10 Thread Neil Perrin



On 04/10/09 20:15, Toby Thain wrote:


On 10-Apr-09, at 5:05 PM, Mark J Musante wrote:


On Fri, 10 Apr 2009, Patrick Skerrett wrote:

degradation) when these write bursts come in, and if I could buffer 
them even for 60 seconds, it would make everything much smoother.


ZFS already batches up writes into a transaction group, which 
currently happens every 30 seconds.


Isn't that 5 seconds?


It used to be, and it may still be for what you are running.
However, Mark is right: it is now 30 seconds, and in fact 30s is
the maximum. The actual time will depend on load; if the pool
is heavily used then the txgs fire more frequently.
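
If you want to see what your own build is doing, the tunable and the
sync routine are easy to inspect (assuming your build has the
zfs_txg_timeout variable):

echo zfs_txg_timeout/D | mdb -k
dtrace -n 'fbt::spa_sync:entry { printf("%Y  txg %d\n", walltimestamp, arg1); }'

The first prints the current timeout in seconds; the second prints a
line each time a transaction group actually syncs.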

Neil.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-09 Thread Patrick Skerrett

Hi folks,

I would appreciate it if someone could help me understand some weird 
results I'm seeing while doing performance testing with an SSD-offloaded 
ZIL.



I'm attempting to improve my infrastructure's burstable write capacity 
(ZFS-based WebDAV servers), and naturally I'm looking at implementing 
SSD-based ZIL devices.
I have a test machine with the crummiest hard drive I could find installed 
in it, a Quantum Fireball ATA-100 (4500 RPM, 128K cache), and an Intel 
X25-E 32 GB SSD.
I'm trying to do A/B comparisons and am coming up with some very odd 
results:


The first test involves doing IOzone write testing on the fireball 
standalone, the SSD standalone, and the fireball with the SSD as a log 
device.


My test command is:  time iozone -i 0 -a -y 64 -q 1024 -g 32M

Then I check the time it takes to complete this operation in each scenario:

Fireball alone - 2m15s (told you it was crappy)
SSD alone - 0m3s
Fireball + SSD ZIL - 0m28s

This looks great! Watching 'zpool iostat -v' during this test further 
confirms that the ZIL device is doing the heavy lifting. If I can get 
this kind of write performance in my prod environment, I would be one 
happy camper.




However, ANY other test I can think of to run on this test machine shows 
absolutely no performance improvement of the Fireball+SSD ZIL over the 
Fireball by itself. Watching 'zpool iostat -v' shows no activity on the 
ZIL whatsoever.

Other tests I've tried to run:

A scripted batch job of 10,000 -
dd if=/dev/urandom of=/fireball/file_$i.dat bs=1k count=1000

A scripted batch job of 10,000 -
cat /sourcedrive/$file > /fireball/$file

A scripted batch job of 10,000 -
cp /sourcedrive/$file /fireball/$file

And a scripted batch job moving 10,000 files onto the fireball using 
Apache WebDAV mounted on the fireball (similar to my prod environment):
curl -T /sourcedrive/$file http://127.0.0.1/fireball/
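
(Each of those batch jobs is just a shell loop along these lines, with
$i and $file standing in for the loop variables:

i=0
while [ $i -lt 10000 ]; do
    dd if=/dev/urandom of=/fireball/file_$i.dat bs=1k count=1000
    i=$((i + 1))
done

with the dd line swapped for the cat, cp, or curl variant as
appropriate.)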




So what is IOzone doing differently from every other write operation I 
can think of?



Thanks,

Pat S.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZIL SSD performance testing... -IOzone works great, others not so great

2009-04-09 Thread Neil Perrin

Patrick,

The ZIL is only used for synchronous requests like O_DSYNC/O_SYNC and
fsync(). Your iozone command must be doing some synchronous writes.
All the other tests (dd, cat, cp, ...) do everything asynchronously.
That is, they do not require the data to be on stable storage on
return from the write. So asynchronous writes get cached in memory
(the ARC) and written out periodically (every 30 seconds or less)
when the transaction group commits.

The ZIL would be heavily used if your system were an NFS server.
Databases also do synchronous writes.

Neil.
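
If you want iozone to exercise the log device on purpose, its sync
options will do it; the same run with -e -o should light up the slog,
while the dd/cat/cp/curl jobs will keep bypassing it:

time iozone -i 0 -a -y 64 -q 1024 -g 32M -e -o

Here -o opens the files O_SYNC and -e includes fsync/fflush in the
timing.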


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss