Re: [zfs-discuss] Benchmarking Methodologies

2010-04-24 Thread Robert Milkowski

On 21/04/2010 18:37, Ben Rockwood wrote:

You've made an excellent case for benchmarking and where it's useful,
but what I'm asking for on this thread is for folks to share the
research they've done with as much specificity as possible, for
research purposes. :)


However, you can also find some benchmarks done with sysbench plus
MySQL or Oracle. I don't remember whether I posted my own results, but
I'm pretty sure you can find others'.


--
Robert Milkowski
http://milek.blogspot.com



Re: [zfs-discuss] Benchmarking Methodologies

2010-04-23 Thread Scott Meilicke
My use case for OpenSolaris is as a storage server for a VM environment (we 
also use EqualLogic, and soon an EMC CX4-120). To that end, I use iometer 
within a VM, simulating my VM IO activity, balanced against keeping the 
benchmark easy to run. We have about 110 VMs across eight ESX hosts. Here is what I do:

* Attach a 100G vmdk to one Windows 2003 R2 VM
* Create a 32G test file (my OpenSolaris box has 16G of RAM)
* Export/import the pool on the OpenSolaris box, and reboot my guest, to clear 
caches all around
* Run with a disk queue depth of 32 outstanding IOs
* 60% read, 65% random, 8k block size (the sketch after this list illustrates 
the access pattern)
* Spool up for five minutes, then run the test for five minutes
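For illustration, here is a minimal, hypothetical Python sketch of that access
pattern. It is not iometer, it is single-threaded rather than 32 outstanding
IOs, and the file path, size, and duration are assumptions:

#!/usr/bin/env python
# Minimal sketch of the access pattern above (not iometer): 8k
# transfers, ~60% reads, ~65% random offsets, against a pre-created
# test file. Path, duration, and the single thread are assumptions.
import os, random, time

TEST_FILE = "/tank/bench/testfile"   # hypothetical path to the test file
BLOCK = 8 * 1024                     # 8k transfer size
READ_PCT, RANDOM_PCT = 0.60, 0.65
DURATION = 300                       # five minutes, as in the run above

def run():
    blocks = os.path.getsize(TEST_FILE) // BLOCK
    fd = os.open(TEST_FILE, os.O_RDWR)
    buf = os.urandom(BLOCK)
    ops, pos = 0, 0
    deadline = time.time() + DURATION
    while time.time() < deadline:
        # 65% of operations seek to a random block; the rest continue
        # sequentially from the previous position.
        if random.random() < RANDOM_PCT:
            pos = random.randrange(blocks)
        else:
            pos = (pos + 1) % blocks
        os.lseek(fd, pos * BLOCK, os.SEEK_SET)
        if random.random() < READ_PCT:
            os.read(fd, BLOCK)
        else:
            os.write(fd, buf)
        ops += 1
    os.close(fd)
    print("%d IOs in %ds (%.0f IOPS)" % (ops, DURATION, ops / float(DURATION)))

if __name__ == "__main__":
    run()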

My actual workload is closer to 50% read, 16k block size, so I adjust my 
interpretation of the results accordingly. 

I should probably run a lot more iometer daemons.

Performance will increase as the benchmark runs because the L2ARC fills up, so 
I found that starting the measured run five minutes into the workload was a 
happy medium. Things get a bit faster the longer the benchmark runs, but this 
is good enough for benchmarking purposes.

Only occasionally do I get wacko results, which I happily toss out the window.

Scott


Re: [zfs-discuss] Benchmarking Methodologies

2010-04-21 Thread Darren J Moffat

On 21/04/2010 04:43, Ben Rockwood wrote:

I'm doing a little research study on ZFS benchmarking and performance
profiling.  Like most, I've had my favorite methods, but I'm
re-evaluating my choices and trying to be a bit more scientific than I
have in the past.


To that end, I'm curious if folks wouldn't mind sharing their work on
the subject?  What tool(s) do you prefer in what situations?  Do you
have a standard method of running them (tool args; block sizes, thread
counts, ...) or procedures between runs (zpool import/export, new
dataset creation,...)?  etc.


filebench is useful to look at.  One of the interesting things about 
filebench is that it has a filesystem-specific flush script that it 
executes between runs, the idea being to get rid of anything cached. 
For ZFS it exports and imports the pool.  filebench also has 
configurations for benchmarking particular well-known workloads (like 
OLTP, file serving, web serving etc).  Having said that, I've recently 
had some interesting results with it (where I know I'm asking ZFS to do 
more work with the data, yet the filebench results showed it was 
faster) that make me wonder a little about some of what it does.  There 
is an option in filebench to generate comparison tables between 
multiple runs, though all that really does is put the columns next to 
each other; you don't get any percentage differences or other useful info.
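As a stopgap, a few lines of Python can turn numbers from two runs into
percentage differences. This is only a hypothetical helper; the metric names
and values below are made-up examples you would copy by hand out of the
filebench summaries:

#!/usr/bin/env python
# Hypothetical helper: compare metrics from two benchmark runs and
# print the percentage difference. Nothing here parses filebench
# output; the values are copied in by hand.

def pct_diff(baseline, candidate):
    # Positive means the candidate run produced a larger number.
    return 100.0 * (candidate - baseline) / baseline

if __name__ == "__main__":
    runs = {
        # metric: (baseline run, candidate run) -- example numbers only
        "ops/s":        (1520.0, 1710.0),
        "MB/s":         (11.9,   13.4),
        "latency (ms)": (21.0,   18.6),
    }
    for metric, (base, cand) in sorted(runs.items()):
        print("%-14s %10.1f %10.1f %+7.1f%%"
              % (metric, base, cand, pct_diff(base, cand)))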


I've also been recommended vdbench and I've just started looking at it. 
Unlike filebench, though, vdbench will need support scripts to flush 
out cached data etc.  vdbench is written in Java, so it is possible to 
run it against both your local and remote storage (where the remote end 
might not be OpenSolaris), and it also has a nice GUI compare tool that 
uses colour and percentages to show the differences between runs.
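For ZFS such a flush script can be as simple as exporting and re-importing the
pool before each run. The sketch below is a hypothetical example, not anything
shipped with vdbench; the pool name and parameter file names are assumptions:

#!/usr/bin/env python
# Hypothetical support script for vdbench runs on ZFS: export and
# re-import the pool before each run so the caches start cold.
# The pool name and parameter files below are made-up examples.
import subprocess

POOL = "tank"                                     # assumed pool name
PARAM_FILES = ["seq-read.parm", "rand-rw.parm"]   # assumed vdbench parameter files

def flush_pool(pool):
    # Exporting the pool evicts its cached data from the ARC; importing
    # it again brings it back with cold caches.
    subprocess.check_call(["zpool", "export", pool])
    subprocess.check_call(["zpool", "import", pool])

def run_vdbench(param_file):
    # vdbench is normally started as "vdbench -f <parameter file>".
    subprocess.check_call(["vdbench", "-f", param_file])

if __name__ == "__main__":
    for parm in PARAM_FILES:
        flush_pool(POOL)
        run_vdbench(parm)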


--
Darren J Moffat


Re: [zfs-discuss] Benchmarking Methodologies

2010-04-21 Thread Robert Milkowski

On 21/04/2010 04:43, Ben Rockwood wrote:

I'm doing a little research study on ZFS benchmarking and performance
profiling.  Like most, I've had my favorite methods, but I'm
re-evaluating my choices and trying to be a bit more scientific than I
have in the past.


To that end, I'm curious if folks wouldn't mind sharing their work on
the subject?  What tool(s) do you prefer in what situations?  Do you
have a standard method of running them (tool args; block sizes, thread
counts, ...) or procedures between runs (zpool import/export, new
dataset creation,...)?  etc.


Any feedback is appreciated.  I want to get a good sampling of opinions.

   


I haven't heard from you in a while! Good to see you here again :)

Sorry for stating the obvious, but at the end of the day it depends on 
what your goals are.

Are you interested in micro-benchmarks and comparison to other file systems?

I think the most relevant filesystem benchmarks for users are the ones 
where you benchmark a specific application and present results from the 
application's point of view. For example, given a workload for Oracle, 
MySQL, LDAP, ... how quickly does it complete? How much benefit is 
there from using SSDs? What about other filesystems?


Micro-benchmarks are fine, but very hard for most users to interpret 
properly.


Additionally, most benchmarks are almost useless if they are not 
compared to some other configuration with only the benchmarked 
component changed. For example, knowing that some MySQL load completes 
in 1h on ZFS is basically useless on its own. But knowing that the same 
load on the same HW with Linux/ext3 completes in 2h would be 
interesting to users.


Another interesting thing would be to see the impact of different ZFS 
settings on benchmark results (recordsize aligned for a database vs. 
default, atime off vs. on, lzjb, gzip, SSD). Also a comparison of 
results with all-default ZFS settings against whatever settings gave 
you the best result.
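As an illustration only, a harness for that kind of comparison can be a few
lines of Python: apply one setting at a time with zfs set, time the same
workload under each, and report the ratio against the defaults. The dataset
name and workload script below are placeholders, not anything from this
thread, and you would still want to flush caches between runs:

#!/usr/bin/env python
# Hypothetical harness: time one workload under several ZFS property
# settings and compare against the defaults. Dataset and workload are
# placeholders; flush caches between runs (e.g. zpool export/import).
import subprocess, time

DATASET = "tank/db"                    # placeholder dataset
WORKLOAD = ["./run-db-load.sh"]        # placeholder workload script
PROPS = ["recordsize", "atime", "compression"]

SETTINGS = [
    ("defaults",         []),
    ("recordsize=8k",    ["recordsize=8k"]),
    ("atime=off",        ["atime=off"]),
    ("compression=lzjb", ["compression=lzjb"]),
    ("compression=gzip", ["compression=gzip"]),
]

def reset():
    # Return the tunables to their inherited/default values between runs.
    for prop in PROPS:
        subprocess.check_call(["zfs", "inherit", prop, DATASET])

def apply_settings(props):
    for prop in props:
        subprocess.check_call(["zfs", "set", prop, DATASET])

def timed_run():
    start = time.time()
    subprocess.check_call(WORKLOAD)
    return time.time() - start

if __name__ == "__main__":
    results = []
    for name, props in SETTINGS:
        reset()
        apply_settings(props)
        results.append((name, timed_run()))
    base = results[0][1]
    for name, secs in results:
        print("%-18s %8.1fs  (%.2fx vs defaults)" % (name, secs, base / secs))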


--
Robert Milkowski
http://milek.blogspot.com



Re: [zfs-discuss] Benchmarking Methodologies

2010-04-21 Thread Thomas Uebermeier

Ben,

never trust a benchmark you haven't faked yourself!

There are many benchmarks out there, but the question is how relevant they are
for your usage pattern. How important are single-stream benchmarks when you
are opening and closing 1000s of files per second, or when you run a DB on
top of the filesystem?
In the end there is only one benchmark that counts: the one you wrote yourself
to simulate your application.

There are some generic ones, like dd or iozone, which give you a throughput
number, and others (bonnie, etc.) which exercise other functions.
In the end you need to know what is important for your usage and whether you
care about numbers like how many snapshots you can take per second.

Writing your own benchmark in Perl or a similar scripting language is quickly
done and gives you the numbers you need. In the end, storage is a complex
system and there are many variables between the write() request and the bits
landing on a piece of hardware. I wouldn't trust numbers from generic
syscall/sec benchmarks to be relevant in my environment.
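To show how little code such a home-grown benchmark needs, here is a
hypothetical sketch (in Python rather than Perl) that measures how many small
create/write/close cycles per second a filesystem sustains. The directory and
file count are made up:

#!/usr/bin/env python
# Hypothetical home-grown micro-benchmark in the spirit described
# above: how many small create/write/close cycles per second does this
# filesystem sustain? The path and file count are made-up examples.
import os, time

TARGET_DIR = "/tank/benchdir"   # placeholder directory on the filesystem under test
NUM_FILES = 10000

def run():
    if not os.path.isdir(TARGET_DIR):
        os.makedirs(TARGET_DIR)
    start = time.time()
    for i in range(NUM_FILES):
        path = os.path.join(TARGET_DIR, "f%06d" % i)
        f = open(path, "w")     # create/open
        f.write("x")            # tiny write
        f.close()               # close
    elapsed = time.time() - start
    print("%d create/write/close cycles in %.1fs (%.0f per second)"
          % (NUM_FILES, elapsed, NUM_FILES / elapsed))
    # Clean up so repeated runs start from the same state.
    for i in range(NUM_FILES):
        os.unlink(os.path.join(TARGET_DIR, "f%06d" % i))

if __name__ == "__main__":
    run()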

Thomas


Re: [zfs-discuss] Benchmarking Methodologies

2010-04-21 Thread Ben Rockwood
On 4/21/10 2:15 AM, Robert Milkowski wrote:
 I haven't heard from you in a while! Good to see you here again :)

 Sorry for stating the obvious, but at the end of the day it depends on what
 your goals are.
 Are you interested in micro-benchmarks and comparison to other file
 systems?

 I think the most relevant filesystem benchmarks for users are the ones
 where you benchmark a specific application and present results from the
 application's point of view. For example, given a workload for Oracle,
 MySQL, LDAP, ... how quickly does it complete? How much benefit is there
 from using SSDs? What about other filesystems?

 Micro-benchmarks are fine, but very hard for most users to interpret
 properly.

 Additionally, most benchmarks are almost useless if they are not
 compared to some other configuration with only the benchmarked component
 changed. For example, knowing that some MySQL load completes in 1h on
 ZFS is basically useless on its own. But knowing that the same load on
 the same HW with Linux/ext3 completes in 2h would be interesting to
 users.

 Another interesting thing would be to see the impact of different ZFS
 settings on benchmark results (recordsize aligned for a database vs.
 default, atime off vs. on, lzjb, gzip, SSD). Also a comparison of
 results with all-default ZFS settings against whatever settings gave
 you the best result.

Hey Robert... I'm always around. :)

You've made an excellent case for benchmarking and where it's useful,
but what I'm asking for on this thread is for folks to share the
research they've done with as much specificity as possible, for
research purposes. :)

Let me illustrate:

To Darren's point on FileBench and vdbench... to date I've found these
two to be the most useful.  IOzone, while very popular, has always
given me strange, inconsistent results regardless of how large the
blocks and data are.  Given that the most important aspects of any
benchmark are repeatability and sanity of results, I no longer find any
value in IOzone.

vdbench has become my friend, particularly in the area of physical disk
profiling.  Before tuning ZFS (or any filesystem) it's important to
establish a solid performance baseline for the underlying disk
structure.  Using a variety of vdbench profiles such as the following
helps you pinpoint exactly the edges of the performance envelope:

sd=sd1,lun=/dev/rdsk/c0t1d0s0,threads=1
wd=wd1,sd=sd1,readpct=100,rhpct=0,seekpct=0
rd=run1,wd=wd1,iorate=max,elapsed=10,interval=1,forxfersize=(4k-4096k,d)

With vdbench and the workload above I can get consistent, reliable
results time after time, and the results on other systems match.
This is particularly key if you're running a hardware RAID controller
under ZFS.  There isn't anything dd can do that vdbench can't do
better.  Using a workload like the above, both at differing xfer sizes
and at differing thread counts, really helps give an accurate picture
of the disk's capabilities.

Moving up into the filesystem: I've been looking intently at improving
my FileBench profiles, starting from the supplied ones and tweaking.
I'm trying to get to a methodology that provides time-after-time
repeatable results for real comparison between systems.
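One simple way to quantify that repeatability is to run the same profile
several times and look at the spread relative to the mean. The helper below is
purely illustrative; the ops/s figures in it are invented example numbers:

#!/usr/bin/env python
# Hypothetical repeatability check: given the headline result (e.g.
# ops/s) from several identical runs, report mean, standard deviation,
# and the spread as a percentage of the mean. Numbers are examples.
import math

def spread(results):
    mean = sum(results) / float(len(results))
    var = sum((r - mean) ** 2 for r in results) / (len(results) - 1)
    stddev = math.sqrt(var)
    return mean, stddev, 100.0 * stddev / mean

if __name__ == "__main__":
    runs = [1510.0, 1555.0, 1498.0, 1530.0, 1542.0]   # example ops/s figures
    mean, stddev, cv = spread(runs)
    print("mean %.1f  stddev %.1f  (%.1f%% of the mean over %d runs)"
          % (mean, stddev, cv, len(runs)))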

I'm looking hard at vdbench file workloads, but they aren't yet nearly
as sophisticated as FileBench.  I am also looking at FIO
(http://freshmeat.net/projects/fio/), which is FileBench-esque.


At the end of the day, I agree entirely that application benchmarks are
far more effective judges... but they are also more time consuming and
less flexible than dedicated tools.   The key is honing generic
benchmarks to provide useful data that can be relied upon for making
accurate estimates of application performance.  When you
start judging filesystem performance based on something like MySQL there
are simply too many variables involved.


So, I appreciate the Benchmark 101, but I'm looking for anyone
interested in sharing meat.  Most of the existing ZFS benchmarks folks
published are several years old now, and most were using IOzone.

benr.


[zfs-discuss] Benchmarking Methodologies

2010-04-20 Thread Ben Rockwood
I'm doing a little research study on ZFS benchmarking and performance
profiling.  Like most, I've had my favorite methods, but I'm
re-evaluating my choices and trying to be a bit more scientific than I
have in the past.


To that end, I'm curious if folks wouldn't mind sharing their work on
the subject?  What tool(s) do you prefer in what situations?  Do you
have a standard method of running them (tool args; block sizes, thread
counts, ...) or procedures between runs (zpool import/export, new
dataset creation,...)?  etc.


Any feedback is appreciated.  I want to get a good sampling of opinions.

Thanks!



benr.