Re: [zfs-discuss] Bursty writes - why?

2010-10-13 Thread David Magda
On Tue, October 12, 2010 18:31, Bob Friesenhahn wrote:
 On Tue, 12 Oct 2010, Saxon, Will wrote:
 Another article concerning Sandforce performance:

 http://www.anandtech.com/show/3667/6

 [...]

 When I read this I thought that it kind of eliminated Sandforce
 drives from consideration as SLOG devices, which is a pity because
 the OCZ Vertex 2 EX or Vertex 2 Pro SAS otherwise look like good
 candidates.

 For obvious reasons, the SLOG is designed to write sequentially.
 Otherwise it would offer much less benefit.  Maybe this random-write
 issue with Sandforce would not be a problem?

The other thing is that the article talks about an SF-1200-based drive.
And an MLC one to boot.

When SandForce originally came up on this list a while ago, I got the
general impression that while SF-1200-based devices were fine for L2ARC
caches, the consensus was that you would want an SF-1500-based device for
slogs.

Not only does the SF-1500 get you better write IOPS, the devices that used
it also tended to have batteries or super-caps. This helped with
data integrity in the case of unexpected power outages. SF-1500 units were
also usually available with SLC flash, which would help with longevity
given the write-oriented nature of slogs. See:

http://www.anandtech.com/show/3661/

So while the 'dedupe article' is informative, and the conclusions about
slogs and SF-1200-based devices appear sound, it's a bit beside the point
IMHO. Sadly there don't seem to be many SSDs out there that you /really/
want to use for slogs: there are many that you can make do with
(especially in mirrored configurations), but few that are ideal.




Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Jacob Ritorto
Thanks for posting your findings.  What was incorrect about the client's
config?

On Oct 7, 2010 4:15 PM, Eff Norwood sm...@jsvp.com wrote:

Figured it out - it was the NFS client. I used snoop and then some dtrace
magic to prove that the client (which was using O_SYNC) was sending very
bursty requests to the system. I tried a number of other NFS clients with
O_SYNC as well and got excellent performance when they were configured
correctly. Just for fun I disabled the DDRdrive X1 (pair of them) that I use
for the ZIL and performance tanked across the board when using O_SYNC. I
can't recommend the DDRdrive X1 enough as a ZIL! Here is a great article on
this behavior: http://blogs.sun.com/brendan/entry/slog_screenshots

Thanks for the help all!

-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Eff Norwood
The NFS client in this case was VMware ESXi 4.1 release build. What happened is 
that the file uploader behavior was changed in 4.1 to prevent I/O contention 
with the VM guests. That means when you go to upload something to the 
datastore, it only sends chunks of the file instead of streaming it all at once 
like it did in ESXi 4.0. To end users, something appeared to be broken because 
file uploads now took 95 seconds instead of 30. Turns out that is by design in 
4.1. This is the behavior *only* for the uploader and not for the VM guests. 
Their I/O is as expected.

I have to say as a side note, the DDRdrive X1s make a night-and-day difference 
with VMware. If you use VMware via NFS, I highly recommend the X1s as the ZIL. 
Otherwise the VMware O_SYNC (Stable = FSYNC) will kill your performance dead. 
We also tried SSDs as the ZIL, which worked OK until they got full; then 
performance tanked. As I have posted before, SSDs as your ZIL - don't do it!
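
A note for anyone wanting to A/B this themselves: on pool versions that support 
log device removal, the slog can be taken out and put back without rebuilding 
the pool. A sketch; the pool and device names are placeholders:

    # remove the dedicated log device(s) to fall back to the in-pool ZIL
    zpool remove xpool c4t0d0 c4t1d0

    # add them back as two independent (striped) log devices...
    zpool add xpool log c4t0d0 c4t1d0

    # ...or, alternatively, as a mirrored log
    # zpool add xpool log mirror c4t0d0 c4t1d0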
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Ray Van Dolson
On Tue, Oct 12, 2010 at 12:09:44PM -0700, Eff Norwood wrote:
 The NFS client in this case was VMWare ESXi 4.1 release build. What
 happened is that the file uploader behavior was changed in 4.1 to
 prevent I/O contention with the VM guests. That means when you go to
 upload something to the datastore, it only sends chunks of the file
 instead of streaming it all at once like it did in ESXi 4.0. To end
 users, something appeared to be broken because file uploads now took
 95 seconds instead of 30. Turns out that is by design in 4.1. This is
 the behavior *only* for the uploader and not for the VM guests. Their
 I/O is as expected.

Interesting.

 I have to say as a side note, the DDRdrive X1s make a day and night
 difference with VMWare. If you use VMWare via NFS, I highly recommend
 the X1s as the ZIL. Otherwise the VMWare O_SYNC (Stable = FSYNC) will
 kill your performance dead. We also tried SSDs as the ZIL which
 worked ok until they got full, then performance tanked. As I have
 posted before, SSDs as your ZIL - don't do it!  -- 

We run SSDs as ZIL here exclusively on what I'd consider fairly busy
VMware datastores and have never encountered this.

How would one know how full their SSD being used as ZIL is?  I was
under the impression that even using a full 32GB X-25E was overkill
spacewise for typical ZIL functionality...
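
One quick way to check, as a sketch (the pool name 'tank' is a placeholder):
'zpool iostat -v' reports capacity per vdev, and dedicated log devices get
their own section, so you can watch how much of the slog is actually in use.

    # per-vdev statistics, refreshed every second; the "logs" section
    # shows how much of the slog device is currently allocated
    zpool iostat -v tank 1

In steady state the slog only holds transactions that have not yet been
committed in a txg, so the allocated figure should stay tiny compared to a
32GB device.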

Ray


Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Miles Nordin
 en == Eff Norwood sm...@jsvp.com writes:

en We also tried SSDs as the ZIL which worked ok until they got
en full, then performance tanked. As I have posted before, SSDs
en as your ZIL - don't do it!

yeah, iirc the thread went back and forth between you and me for a few
days, something like this,

you: SSD's work fine at first, then slow down, see this anandtech
 article.  We got bit by this.

me:  That article is two years old.  Read this other article which is
 one year old and explains the problem is fixed if you buy current gen2
 intel or sandforce-based SSD.

you: Well absent test results from you I think we will just have to
 continue believing that all SSD's gradually slow down like I
 said, though I would love to be proved wrong.

me:  You haven't provided any test results yourself nor even said what
 drive you're using.  We've both just cited anandtech, and my
 citation's newer than yours.

you: I welcome further tests that prove the DDRDrive is not the only
 suitable ZIL, but absent these tests we have to assume I'm right
 that it is.

silly!

slowdowns with age:
  http://www.pcper.com/article.php?aid=669
  http://www.anandtech.com/show/2738/15

slowdowns fixed:
  http://www.anandtech.com/show/2899/8  

  ``With the X25-M G2 Intel managed to virtually eliminate the
random-write performance penalty on a sequentially filled
drive. In other words, if you used an X25-M G2 as a normal desktop
drive, 4KB random write performance wouldn't really degrade over
time. Even without TRIM.''
(http://www.anandtech.com/show/2738/25)

  note this is not advice to buy sandforce for slog, because I don't
  know if anyone's tested whether it respects flush-cache commands, and I
  suspect it may drop them.

summary: There have probably been major, documented shifts in the industry
between when you tested and now, but no one knows, because you don't
even say what you tested or how---you just spread FUD and flog the
DDRDrive and then say ``do research to prove me wrong or else my hazy
statement stands.''  Bad science.




Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Saxon, Will
 


Another article concerning Sandforce performance:

http://www.anandtech.com/show/3667/6

Evidently, since the Sandforce controllers do deduplication to reduce writes, 
write performance with highly random data suffers relative to 'normal' data. In 
particular:

Sequential write speed actually takes the biggest hit of them all. At only 
144.4MB/s, if you're writing highly random data sequentially, you'll find that 
the SF-1200/SF-1500 performs worse than just about any other SSD controller on 
the market. Only the X25-M is slower. While the impact to read performance and 
random write performance isn't terrible, sequential performance does take a 
significant hit on these SandForce drives.

Unfortunately, this article doesn't actually compare the performance of different 
controllers when writing random data; it just says that the Sandforce's write 
performance with random data, relative to its 'normal'-data write performance, is 
worse than everything but the X25-M. Do other controllers do dedup on written 
data like Sandforce? 
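
A crude way to see the effect for yourself, as a sketch rather than a rigorous 
benchmark (the device path is a placeholder, and writing to it is destructive; 
the random data is staged to a file first so /dev/urandom's own throughput 
doesn't become the bottleneck):

    # stage incompressible, non-dedupable data in memory-backed /tmp
    dd if=/dev/urandom of=/tmp/random.bin bs=1024k count=4096

    # sequential write of the highly random data to the raw device
    ptime dd if=/tmp/random.bin of=/dev/rdsk/c5t0d0p0 bs=1024k

    # sequential write of trivially compressible/dedupable data for comparison
    ptime dd if=/dev/zero of=/dev/rdsk/c5t0d0p0 bs=1024k count=4096

A controller that compresses or dedupes internally (like the SandForce parts) 
should show a much larger gap between the two runs than one that writes the 
data through unchanged.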

When I read this I thought that it kind of eliminated Sandforce drives from 
consideration as SLOG devices, which is a pity because the OCZ Vertex 2 EX or 
Vertex 2 Pro SAS otherwise look like good candidates. 

-Will


Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Bob Friesenhahn

On Tue, 12 Oct 2010, Saxon, Will wrote:
When I read this I thought that it kind of eliminated Sandforce 
drives from consideration as SLOG devices, which is a pity because 
the OCZ Vertex 2 EX or Vertex 2 Pro SAS otherwise look like good 
candidates.


For obvious reasons, the SLOG is designed to write sequentially. 
Otherwise it would offer much less benefit.  Maybe this random-write 
issue with Sandforce would not be a problem?


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Scott Meilicke
On Oct 12, 2010, at 3:31 PM, Bob Friesenhahn wrote:
 
 For obvious reasons, the SLOG is designed to write sequentially. Otherwise it 
 would offer much less benefit.  Maybe this random-write issue with Sandforce 
 would not be a problem?


Isn't writing from cache to disk designed to be sequential, while writes to the 
ZIL/SLOG will be more random (in order to commit quickly)?

Scott Meilicke





Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Christopher George
 Maybe this random-write issue with Sandforce would not be a 
 problem?

It is most definitely a problem, as one needs to question the
conventional assertion of a sequential write pattern.  I presented 
some findings recently at the Nexenta Training Seminar in 
Rotterdam.  Here is a link to an excerpt (full presentation available 
to those interested; email cgeorge at ddrdrive dot com):

http://www.ddrdrive.com/zil_iopattern_excerpt.pdf

In summary, a sequential write pattern is found for a pool with 
only a single file system.  But as additional file systems are added, 
the resultant (or aggregate) write pattern trends toward random; it is 
over 50% random for a pool containing just 5 filesystems.  This makes 
intuitive sense knowing each filesystem has its own ZIL and they 
all share the dedicated log device (ZIL Accelerator).
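
For anyone who wants to check this on their own pool, a minimal sketch using 
the generic DTrace io provider (the device name "sd12" is a placeholder for 
your log device's stat name, e.g. as shown by iostat -x):

    # distribution of block offsets issued to the log device; a narrow,
    # steadily advancing distribution looks sequential, a wide spread
    # across the device looks random
    dtrace -n 'io:::start /args[1]->dev_statname == "sd12"/
        { @blkno = quantize(args[0]->b_blkno); }'

With one busy filesystem the offsets should cluster tightly; with several 
filesystems logging to the same dedicated device the spread widens, which 
matches the excerpt above.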

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Bursty writes - why?

2010-10-12 Thread Saxon, Will
Bob Friesenhahn wrote:

 On Tue, 12 Oct 2010, Saxon, Will wrote:
 When I read this I thought that it kind of eliminated Sandforce
 drives from consideration as SLOG devices, which is a pity because
 the OCZ Vertex 2 EX or Vertex 2 Pro SAS otherwise look like good
 candidates.

 For obvious reasons, the SLOG is designed to write sequentially.
 Otherwise it would offer much less benefit.  Maybe this random-write
 issue with Sandforce would not be a problem?

The observation was that the Sandforce controllers perform more 
poorly than others when sequentially writing highly random data, 
not with random writes of 'normal' data. 

-Will


Re: [zfs-discuss] Bursty writes - why?

2010-10-07 Thread Eff Norwood
The NFS client that we're using always uses O_SYNC, which is why it was 
critical for us to use the DDRdrive X1 as the ZIL. I wasn't clear earlier about 
the entire system we're using; my apologies. It is:

OpenSolaris SNV_134
Motherboard: SuperMicro X8DAH
RAM: 72GB
CPU: Dual Intel 5503 @ 2.0GHz
ZIL: DDRdrive X1 (two of these, independent and not mirrored)
Drives: 24 x Seagate 1TB SAS, 7200 RPM
Network connected via 3 x gigabit links as LACP + 1 gigabit backup, IPMP on top 
of those.

The output I posted is from zpool iostat and I used that because it corresponds 
to what users are seeing. Whenever zpool iostat shows write activity, the file 
copies to the system are working as expected. As soon as zpool iostat shows no 
activity, the writes all pause. The simple test case is to copy a cd-rom ISO 
image to the server while doing the zpool iostat.
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Bursty writes - why?

2010-10-07 Thread Eff Norwood
Figured it out - it was the NFS client. I used snoop and then some dtrace magic 
to prove that the client (which was using O_SYNC) was sending very bursty 
requests to the system. I tried a number of other NFS clients with O_SYNC as 
well and got excellent performance when they were configured correctly. Just 
for fun I disabled the DDRdrive X1 (pair of them) that I use for the ZIL and 
performance tanked across the board when using O_SYNC. I can't recommend the 
DDRdrive X1 enough as a ZIL! Here is a great article on this behavior: 
http://blogs.sun.com/brendan/entry/slog_screenshots
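
For anyone who wants to reproduce that kind of measurement, a rough sketch 
(the interface name and client address are placeholders, and the nfsv3 DTrace 
provider is assumed to be available on this build):

    # capture the client's NFS traffic to inspect request pacing
    snoop -d e1000g0 -o /var/tmp/nfs.cap host 192.168.1.50 and port 2049

    # count NFSv3 WRITE requests arriving per second on the server
    dtrace -n 'nfsv3:::op-write-start { @writes = count(); }
        tick-1sec { printa(@writes); trunc(@writes); }'

If the per-second counts come in clumps with idle gaps between them, the 
burstiness is being generated on the client side rather than by the pool.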

Thanks for the help all!
-- 
This message posted from opensolaris.org


[zfs-discuss] Bursty writes - why?

2010-10-06 Thread Eff Norwood
I have a 24 x 1TB system being used as an NFS file server. Seagate SAS disks 
connected via an LSI 9211-8i SAS controller, disk layout 2 x 11 disk RAIDZ2 + 2 
spares. I am using 2 x DDRdrive X1s as the ZIL. When we write anything to it, 
the writes are always very bursty like this:

             capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0    232      0  29.0M
xpool        488K  20.0T      0    101      0  12.7M
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0     50      0  6.37M
xpool        488K  20.0T      0    477      0  59.7M
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool        488K  20.0T      0      0      0      0
xpool       74.7M  20.0T      0    702      0  76.2M
xpool       74.7M  20.0T      0    577      0  72.2M
xpool       74.7M  20.0T      0    110      0  13.9M
xpool       74.7M  20.0T      0      0      0      0
xpool       74.7M  20.0T      0      0      0      0
xpool       74.7M  20.0T      0      0      0      0
xpool       74.7M  20.0T      0      0      0      0
Whenever you see 0, the write is just hanging. What I would like to see is at 
least some writing happening every second. What can I look at for this issue?
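
For reference, the numbers above came from watching the pool at a one-second 
interval while copying a large file over NFS; a sketch of the reproduction 
(paths are placeholders):

    # on the server: pool throughput sampled once per second
    zpool iostat xpool 1

    # on the NFS client: copy a large file (e.g. a CD-ROM ISO) to the
    # exported dataset and watch the write column above
    cp /var/tmp/image.iso /net/nfs-server/xpool/share/

The write column should show steady activity; in the output above it instead 
sits at zero for several seconds at a time.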

Thanks
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Bursty writes - why?

2010-10-06 Thread Marty Scholes
I think you are seeing ZFS store up the writes, coalesce them, then flush to 
disk every 30 seconds.

Unless the writes are synchronous, the ZIL won't be used, but the writes will 
be cached instead, then flushed.

If you think about it, this is far more sane than flushing to disk every time 
the write() system call is used.
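
The flush interval described above is the transaction group (txg) timeout. A 
sketch for inspecting it, assuming the zfs_txg_timeout kernel tunable present 
in OpenSolaris builds of that era (default 30 seconds):

    # print the current txg timeout, in seconds
    echo zfs_txg_timeout/D | mdb -k

    # lower it to 5 seconds on the running kernel (not persistent across
    # reboot; write the old value back to undo)
    echo zfs_txg_timeout/W 0t5 | mdb -kw

Lowering it smooths the bursts out but also gives ZFS less data to coalesce 
per txg, so it trades away some of the efficiency mentioned above.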
-- 
This message posted from opensolaris.org


Re: [zfs-discuss] Bursty writes - why?

2010-10-06 Thread Bob Friesenhahn

On Wed, 6 Oct 2010, Marty Scholes wrote:

If you think about it, this is far more sane than flushing to disk 
every time the write() system call is used.


Yes, it dramatically diminishes the number of copy-on-write writes and 
improves the pool layout efficiency.  It also saves energy.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/