Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-14 Thread Richard Elling
additional clarification ...

On Jan 14, 2010, at 8:49 AM, Richard Elling wrote:

> On Jan 14, 2010, at 6:41 AM, Gary Mills wrote:
> 
>> On Thu, Jan 14, 2010 at 01:47:46AM -0800, Roch wrote:
>>> 
>>> Gary Mills writes:
 
>>>> Yes, I understand that, but do filesystems have separate queues of any
>>>> sort within the ZIL?  If not, would it help to put the database
>>>> filesystems into a separate zpool?
 
>>> 
>>> The slog device is for the pool but the ZIL is per
>>> filesystem/dataset. The logbias property can be used on a dataset to
>>> prevent that set from consuming the slog device resource  :
>>> 
>>> http://blogs.sun.com/roch/entry/synchronous_write_bias_property
>> 
>> Ah, that's what I wanted to know.  Thanks for the response.
> 
> Roch, I think this can be misinterpreted, so perhaps more clarity is needed.
> 
> If you have sync writes, they will be written to persistent storage before
> they are acknowledged. 
> 
> The only question is where they will be written: to the ZIL or pool?
> 
> By default, this preference is based on the size of each I/O, with small
> I/Os written to the ZIL and large I/Os written to the pool.
> 
> The dataset parameter logbias is used to set the ZIL vs pool preference. 
> 
> Thus, one could force all datasets, save one, to use the pool and permit
> the one, lucky dataset to use the ZIL (or vice versa) 

Should read:
Thus, one could force all datasets, save one, to use the pool and permit
the one, lucky dataset to use the ZIL (or vice versa) for large I/Os.
 -- richard

> Separate log devices is an orthogonal issue.
> -- richard
> 

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
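
A minimal sketch of the "all datasets, save one" arrangement Richard
describes, assuming a release whose zfs command supports the logbias
property (values latency, the default, and throughput). The pool and
dataset names are hypothetical, and the script only prints the commands
so they can be reviewed before anything is changed:

```shell
#!/bin/sh
# Dry run: print the zfs commands that bias every dataset except one
# toward the pool (logbias=throughput) and leave the one "lucky"
# dataset at the default (logbias=latency).
# Usage: logbias_plan <lucky-dataset> <dataset>...
logbias_plan() {
    lucky=$1; shift
    for ds in "$@"; do
        if [ "$ds" = "$lucky" ]; then
            echo "zfs set logbias=latency $ds"
        else
            echo "zfs set logbias=throughput $ds"
        fi
    done
}

# Hypothetical layout: one database filesystem, two mailbox filesystems.
logbias_plan space/imapdb space/imapdb space/mail1 space/mail2
```

Piping the output to sh would apply the plan; "zfs get logbias" on each
dataset shows the setting currently in effect.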


Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-14 Thread Richard Elling
On Jan 14, 2010, at 6:41 AM, Gary Mills wrote:

> On Thu, Jan 14, 2010 at 01:47:46AM -0800, Roch wrote:
>> 
>> Gary Mills writes:
>>> 
>>> Yes, I understand that, but do filesystems have separate queues of any
>>> sort within the ZIL?  If not, would it help to put the database
>>> filesystems into a separate zpool?
>>> 
>> 
>> The slog device is for the pool but the ZIL is per
>> filesystem/dataset. The logbias property can be used on a dataset to
>> prevent that set from consuming the slog device resource  :
>> 
>>  http://blogs.sun.com/roch/entry/synchronous_write_bias_property
> 
> Ah, that's what I wanted to know.  Thanks for the response.

Roch, I think this can be misinterpreted, so perhaps more clarity is needed.

If you have sync writes, they will be written to persistent storage before
they are acknowledged. 

The only question is where they will be written: to the ZIL or pool?

By default, this preference is based on the size of each I/O, with small
I/Os written to the ZIL and large I/Os written to the pool.

The dataset parameter logbias is used to set the ZIL vs pool preference. 

Thus, one could force all datasets, save one, to use the pool and permit
the one, lucky dataset to use the ZIL (or vice versa)

Separate log devices is an orthogonal issue.
 -- richard



Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-14 Thread Gary Mills
On Thu, Jan 14, 2010 at 01:47:46AM -0800, Roch wrote:
> 
> Gary Mills writes:
>  > 
>  > Yes, I understand that, but do filesystems have separate queues of any
>  > sort within the ZIL?  If not, would it help to put the database
>  > filesystems into a separate zpool?
>  > 
> 
> The slog device is for the pool but the ZIL is per
> filesystem/dataset. The logbias property can be used on a dataset to
> prevent that set from consuming the slog device resource  :
> 
>   http://blogs.sun.com/roch/entry/synchronous_write_bias_property

Ah, that's what I wanted to know.  Thanks for the response.

-- 
-Gary Mills--Unix Group--Computer and Network Services-


Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-14 Thread Gary Mills
On Thu, Jan 14, 2010 at 10:58:48AM +1100, Daniel Carosone wrote:
> On Wed, Jan 13, 2010 at 08:21:13AM -0600, Gary Mills wrote:
> > Yes, I understand that, but do filesystems have separate queues of any
> > sort within the ZIL?
> 
> I'm not sure. If you can experiment and measure a benefit,
> understanding the reasons is helpful but secondary.  If you can't
> experiment so easily, you're stuck asking questions, as now, to see
> whether the effort of experimenting is potentially worthwhile. 

Yes, we're stuck asking questions.  I appreciate your responses.

> Some other things to note (not necessarily arguments for or against):
> 
>  * you can have multiple slog devices, in case you're creating
>    so much ZIL traffic that ZIL queueing is a real problem, however
>    shared or structured between filesystems.

For the time being, I'd like to stay with the ZIL that's internal
to the zpool.

>  * separate filesystems can have different properties which might help
>    tuning and experiments (logbias, copies, compress, *cache), as well
>    as the recordsize.  Maybe you will find that compress on mailboxes
>    helps, as long as you're not also compressing the db's?

Yes, that's a good point in favour of a separate filesystem.

>  * separate filesystems may have different recovery requirements
>    (snapshot cycles).  Note that taking snapshots is ~free, but
>    keeping them and deleting them have costs over time.  Perhaps you
>    can save some of these costs if the db's are throwaway/rebuildable. 

Also a good point.

> > If not, would it help to put the database
> > filesystems into a separate zpool?
> 
> Maybe, if you have the extra devices - but you need to compare with
> the potential benefit of adding those devices (and their IOPS) to
> benefit all users of the existing pool.
> 
> For example, if the databases are a distinctly different enough load,
> you could compare putting them on a dedicated pool on ssd, vs using
> those ssd's as additional slog/l2arc.  Unless you can make quite
> categorical separations between the workloads, such that an unbalanced
> configuration matches an unbalanced workload, you may still be better
> with consolidated IO capacity in the one pool.

As well, I'd like to keep all of the ZFS pools on the same external
storage device.  This makes migrating to a different server quite easy.

> Note, also, you can only take recursive atomic snapshots within the
> one pool - this might be important if the db's have to match the
> mailbox state exactly, for recovery.

That's another good point.  It's certainly better to have synchronized
snapshots.

-- 
-Gary Mills--Unix Group--Computer and Network Services-


Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-14 Thread Roch


Gary Mills writes:

 > On Tue, Jan 12, 2010 at 01:56:57PM -0800, Richard Elling wrote:
 > > On Jan 12, 2010, at 12:37 PM, Gary Mills wrote:
 > > 
 > > > On Tue, Jan 12, 2010 at 11:11:36AM -0600, Bob Friesenhahn wrote:
 > > >> On Tue, 12 Jan 2010, Gary Mills wrote:
 > > >>> 
 > > >>> Is moving the databases (IMAP metadata) to a separate ZFS filesystem
 > > >>> likely to improve performance?  I've heard that this is important, but
 > > >>> I'm not clear why this is.
 > > > 
 > > > I found a couple of references that suggest just putting the databases
 > > > on their own ZFS filesystem has a great benefit.  One is an e-mail
 > > > message to a mailing list from Vincent Fox at UC Davis.  They run a
 > > > similar system to ours at that site.  He says:
 > > > 
 > > >    Particularly the database is important to get it's own filesystem so
 > > >    that it's queue/cache are separated.
 > > 
 > > Another policy you might consider is the recordsize for the 
 > > database vs the message store.  In general, databases like the
 > > recordsize to match.  Of course, recordsize is a per-dataset 
 > > parameter.
 > 
 > Unfortunately, it's not a single database.  There are many of them, of
 > different types.  One is a Berkeley DB, others are something specific
 > to the IMAP server (called skiplist), and some are small flat files
 > that are just rewritten.  All they have in common is activity and
 > frequent locking.  They can be relocated as a whole.
 > 
 > > > The second one is from:
 > > > 
 > > >    http://blogs.sun.com/roch/entry/the_dynamics_of_zfs
 > > > 
 > > > He says:
 > > > 
 > > >    For file modification that come with some immediate data integrity
 > > >    constraint (O_DSYNC, fsync etc.) ZFS manages a per-filesystem intent
 > > >    log or ZIL.
 > > > 
 > > > This sounds like the ZIL queue mentioned above.  Is I/O for each of
 > > > those handled separately?
 > > 
 > > ZIL is for the pool.
 > 
 > Yes, I understand that, but do filesystems have separate queues of any
 > sort within the ZIL?  If not, would it help to put the database
 > filesystems into a separate zpool?
 > 

The slog device is for the pool but the ZIL is per
filesystem/dataset. The logbias property can be used on a dataset to
prevent that set from consuming the slog device resource  :

http://blogs.sun.com/roch/entry/synchronous_write_bias_property

-r


 > > We did some experiments with the messaging server and a RAID
 > > array with separate logs. As expected, it didn't make much difference
 > > because of the nice, large nonvolatile write cache on the array. This
 > > reinforces the notion that Dan Carosone also recently noted: performance
 > > gains for separate logs are possible when the latency of the separate
 > > log device is much lower than the latency of the devices in the main pool,
 > > and, of course, the workload uses sync writes.
 > 
 > It certainly sounds as if latency is the key for synchronous writes.
 > 
 > -- 
 > -Gary Mills--Unix Group--Computer and Network Services-



Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-13 Thread Daniel Carosone
On Wed, Jan 13, 2010 at 08:21:13AM -0600, Gary Mills wrote:
> Yes, I understand that, but do filesystems have separate queues of any
> sort within the ZIL?

I'm not sure. If you can experiment and measure a benefit,
understanding the reasons is helpful but secondary.  If you can't
experiment so easily, you're stuck asking questions, as now, to see
whether the effort of experimenting is potentially worthwhile. 

Some other things to note (not necessarily arguments for or against):

 * you can have multiple slog devices, in case you're creating
   so much ZIL traffic that ZIL queueing is a real problem, however
   shared or structured between filesystems.
 * separate filesystems can have different properties which might help
   tuning and experiments (logbias, copies, compress, *cache), as well
   as the recordsize.  Maybe you will find that compress on mailboxes
   helps, as long as you're not also compressing the db's?
 * separate filesystems may have different recovery requirements
   (snapshot cycles).  Note that taking snapshots is ~free, but
   keeping them and deleting them have costs over time.  Perhaps you
   can save some of these costs if the db's are throwaway/rebuildable. 

> If not, would it help to put the database
> filesystems into a separate zpool?

Maybe, if you have the extra devices - but you need to compare with
the potential benefit of adding those devices (and their IOPS) to
benefit all users of the existing pool.

For example, if the databases are a distinctly different enough load,
you could compare putting them on a dedicated pool on ssd, vs using
those ssd's as additional slog/l2arc.  Unless you can make quite
categorical separations between the workloads, such that an unbalanced
configuration matches an unbalanced workload, you may still be better
with consolidated IO capacity in the one pool.

Note, also, you can only take recursive atomic snapshots within the
one pool - this might be important if the db's have to match the
mailbox state exactly, for recovery.

--
Dan.



Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-13 Thread Gary Mills
On Tue, Jan 12, 2010 at 01:56:57PM -0800, Richard Elling wrote:
> On Jan 12, 2010, at 12:37 PM, Gary Mills wrote:
> 
> > On Tue, Jan 12, 2010 at 11:11:36AM -0600, Bob Friesenhahn wrote:
> >> On Tue, 12 Jan 2010, Gary Mills wrote:
> >>> 
> >>> Is moving the databases (IMAP metadata) to a separate ZFS filesystem
> >>> likely to improve performance?  I've heard that this is important, but
> >>> I'm not clear why this is.
> > 
> > I found a couple of references that suggest just putting the databases
> > on their own ZFS filesystem has a great benefit.  One is an e-mail
> > message to a mailing list from Vincent Fox at UC Davis.  They run a
> > similar system to ours at that site.  He says:
> > 
> > >    Particularly the database is important to get it's own filesystem so
> > >    that it's queue/cache are separated.
> 
> Another policy you might consider is the recordsize for the 
> database vs the message store.  In general, databases like the
> recordsize to match.  Of course, recordsize is a per-dataset 
> parameter.

Unfortunately, it's not a single database.  There are many of them, of
different types.  One is a Berkeley DB, others are something specific
to the IMAP server (called skiplist), and some are small flat files
that are just rewritten.  All they have in common is activity and
frequent locking.  They can be relocated as a whole.

> > The second one is from:
> > 
> > >    http://blogs.sun.com/roch/entry/the_dynamics_of_zfs
> > 
> > He says:
> > 
> > >    For file modification that come with some immediate data integrity
> > >    constraint (O_DSYNC, fsync etc.) ZFS manages a per-filesystem intent
> > >    log or ZIL.
> > 
> > This sounds like the ZIL queue mentioned above.  Is I/O for each of
> > those handled separately?
> 
> ZIL is for the pool.

Yes, I understand that, but do filesystems have separate queues of any
sort within the ZIL?  If not, would it help to put the database
filesystems into a separate zpool?

> We did some experiments with the messaging server and a RAID
> array with separate logs. As expected, it didn't make much difference
> because of the nice, large nonvolatile write cache on the array. This
> reinforces the notion that Dan Carosone also recently noted: performance
> gains for separate logs are possible when the latency of the separate
> log device is much lower than the latency of the devices in the main pool,
> and, of course, the workload uses sync writes.

It certainly sounds as if latency is the key for synchronous writes.

-- 
-Gary Mills--Unix Group--Computer and Network Services-


Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-12 Thread Richard Elling
On Jan 12, 2010, at 12:37 PM, Gary Mills wrote:

> On Tue, Jan 12, 2010 at 11:11:36AM -0600, Bob Friesenhahn wrote:
>> On Tue, 12 Jan 2010, Gary Mills wrote:
>>> 
>>> Is moving the databases (IMAP metadata) to a separate ZFS filesystem
>>> likely to improve performance?  I've heard that this is important, but
>>> I'm not clear why this is.
>> 
>> There is an obvious potential benefit in that you are then able to 
>> tune filesystem parameters to best fit the needs of the application 
>> which updates the data.  For example, if the database uses a small 
>> block size, then you can set the filesystem blocksize to match.  If 
>> the database uses memory mapped files, then using a filesystem 
>> blocksize which is closest to the MMU page size may improve 
>> performance.
> 
> I found a couple of references that suggest just putting the databases
> on their own ZFS filesystem has a great benefit.  One is an e-mail
> message to a mailing list from Vincent Fox at UC Davis.  They run a
> similar system to ours at that site.  He says:
> 
>    Particularly the database is important to get it's own filesystem so
>    that it's queue/cache are separated.

Another policy you might consider is the recordsize for the 
database vs the message store.  In general, databases like the
recordsize to match.  Of course, recordsize is a per-dataset 
parameter.

> The second one is from:
> 
>    http://blogs.sun.com/roch/entry/the_dynamics_of_zfs
> 
> He says:
> 
>    For file modification that come with some immediate data integrity
>    constraint (O_DSYNC, fsync etc.) ZFS manages a per-filesystem intent
>    log or ZIL.
> 
> This sounds like the ZIL queue mentioned above.  Is I/O for each of
> those handled separately?

ZIL is for the pool.

We did some experiments with the messaging server and a RAID
array with separate logs. As expected, it didn't make much difference
because of the nice, large nonvolatile write cache on the array. This
reinforces the notion that Dan Carosone also recently noted: performance
gains for separate logs are possible when the latency of the separate
log device is much lower than the latency of the devices in the main pool,
and, of course, the workload uses sync writes.
 -- richard



Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-12 Thread Ray Van Dolson
On Tue, Jan 12, 2010 at 12:37:30PM -0800, Gary Mills wrote:
> On Tue, Jan 12, 2010 at 11:11:36AM -0600, Bob Friesenhahn wrote:
> > On Tue, 12 Jan 2010, Gary Mills wrote:
> > >
> > >Is moving the databases (IMAP metadata) to a separate ZFS filesystem
> > >likely to improve performance?  I've heard that this is important, but
> > >I'm not clear why this is.
> > 
> > There is an obvious potential benefit in that you are then able to 
> > tune filesystem parameters to best fit the needs of the application 
> > which updates the data.  For example, if the database uses a small 
> > block size, then you can set the filesystem blocksize to match.  If 
> > the database uses memory mapped files, then using a filesystem 
> > blocksize which is closest to the MMU page size may improve 
> > performance.
> 
> I found a couple of references that suggest just putting the databases
> on their own ZFS filesystem has a great benefit.  One is an e-mail
> message to a mailing list from Vincent Fox at UC Davis.  They run a
> similar system to ours at that site.  He says:
> 
> Particularly the database is important to get it's own filesystem so
> that it's queue/cache are separated.
> 
> The second one is from:
> 
> http://blogs.sun.com/roch/entry/the_dynamics_of_zfs
> 
> He says:
> 
> For file modification that come with some immediate data integrity
> constraint (O_DSYNC, fsync etc.) ZFS manages a per-filesystem intent
> log or ZIL.
> 
> This sounds like the ZIL queue mentioned above.  Is I/O for each of
> those handled separately?

That's interesting... and if so, is there a way to designate a log
device for a specific filesystem?

Ray


Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-12 Thread Gary Mills
On Tue, Jan 12, 2010 at 11:11:36AM -0600, Bob Friesenhahn wrote:
> On Tue, 12 Jan 2010, Gary Mills wrote:
> >
> >Is moving the databases (IMAP metadata) to a separate ZFS filesystem
> >likely to improve performance?  I've heard that this is important, but
> >I'm not clear why this is.
> 
> There is an obvious potential benefit in that you are then able to 
> tune filesystem parameters to best fit the needs of the application 
> which updates the data.  For example, if the database uses a small 
> block size, then you can set the filesystem blocksize to match.  If 
> the database uses memory mapped files, then using a filesystem 
> blocksize which is closest to the MMU page size may improve 
> performance.

I found a couple of references that suggest just putting the databases
on their own ZFS filesystem has a great benefit.  One is an e-mail
message to a mailing list from Vincent Fox at UC Davis.  They run a
similar system to ours at that site.  He says:

Particularly the database is important to get it's own filesystem so
that it's queue/cache are separated.

The second one is from:

http://blogs.sun.com/roch/entry/the_dynamics_of_zfs

He says:

For file modification that come with some immediate data integrity
constraint (O_DSYNC, fsync etc.) ZFS manages a per-filesystem intent
log or ZIL.

This sounds like the ZIL queue mentioned above.  Is I/O for each of
those handled separately?

-- 
-Gary Mills--Unix Group--Computer and Network Services-


Re: [zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-12 Thread Bob Friesenhahn

On Tue, 12 Jan 2010, Gary Mills wrote:


> Is moving the databases (IMAP metadata) to a separate ZFS filesystem
> likely to improve performance?  I've heard that this is important, but
> I'm not clear why this is.


There is an obvious potential benefit in that you are then able to 
tune filesystem parameters to best fit the needs of the application 
which updates the data.  For example, if the database uses a small 
block size, then you can set the filesystem blocksize to match.  If 
the database uses memory mapped files, then using a filesystem 
blocksize which is closest to the MMU page size may improve 
performance.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
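
The block-size point above can be put in rough numbers. This is a
simplified model, not a measurement: it assumes a small in-place update
rewrites the whole record that contains it, and it ignores compression,
caching and write batching. The 8K database page size is hypothetical;
128K is the default ZFS recordsize.

```shell
#!/bin/sh
# Rough write amplification for one small update under the simplified
# model described above: bytes rewritten ~= recordsize.
# Usage: amplification <update-bytes> <recordsize-bytes>
amplification() {
    update_bytes=$1      # size of one database page update (hypothetical)
    recordsize=$2        # dataset recordsize in bytes
    if [ "$recordsize" -le "$update_bytes" ]; then
        echo 1
    else
        echo $(( recordsize / update_bytes ))
    fi
}

# Compare the 128K default against smaller record sizes for 8K updates.
for rs in 131072 32768 8192; do
    echo "recordsize=$rs amplification=$(amplification 8192 "$rs")x"
done
```

Because recordsize is a per-dataset property, a filesystem holding only
the databases could be given a smaller value without affecting the
mailbox filesystems.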


[zfs-discuss] How do separate ZFS filesystems affect performance?

2010-01-12 Thread Gary Mills
I'm working with a Cyrus IMAP server running on a T2000 box under
Solaris 10 10/09 with current patches.  Mailboxes reside on six ZFS
filesystems, each containing about 200 gigabytes of data.  These are
part of a single zpool built on four iSCSI devices from our NetApp
filer.

One of these ZFS filesystems contains a number of global and per-user
databases in addition to one sixth of the mailboxes.  I'm thinking of
moving these databases to a separate ZFS filesystem.  Access to these
databases must be quick to ensure responsiveness of the server.  We
are currently experiencing a slowdown in performance when the number
of simultaneous IMAP sessions rises above 3000.  These databases are
opened and memory-mapped by all processes.  They have the usual
requirement for locking and synchronous writes whenever they are
updated.

Is moving the databases (IMAP metadata) to a separate ZFS filesystem
likely to improve performance?  I've heard that this is important, but
I'm not clear why this is.  Does each filesystem have its own queue in
the ARC or ZIL?  Here are some statistics taken while the server was
busy and access was slow:

# /usr/local/sbin/zilstat 5 5
   N-Bytes  N-Bytes/s N-Max-Rate    B-Bytes  B-Bytes/s B-Max-Rate    ops  <=4kB 4-32kB >=32kB
   1126664     225332     515872   11485184    2297036    3469312    292    163     51     79
    740536     148107     250896    9535488    1907097    4005888    198    106     24     68
    758344     151668     179104   12546048    2509209    2682880    227     93     45     89
    603304     120660     204344    9179136    1835827    2084864    179     89     23     67
    948896     189779     346520   15880192    3176038    4173824    262    108     32    123
# /usr/local/sbin/arcstat 5 5
    Time  read  miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
10:50:16  191M   31M     16   14M    8   17M   48   18M   12    30G   32G
10:50:21    1K   148     10    76    5    72   58    78   15    30G   32G
10:50:26    1K   154     12    88    7    65   72    96   18    30G   32G
10:50:31   796    61      7    54    7     6   35    25    8    30G   32G
10:50:36    1K   117      9   105    8    12   53    44   10    30G   32G

-- 
-Gary Mills--Unix Group--Computer and Network Services-
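
The zilstat samples above can be reduced to a few summary figures. The
sketch below is illustrative only: it assumes the column boundaries of
the archived output (which has columns run together) and takes N-Bytes
as the ZIL payload bytes per interval. It prints the total number of
ops, the mean N-Bytes per op, and the percentage of ops in the <=4kB
column.

```shell
#!/bin/sh
# Summarise zilstat rows read from stdin. Expected fields per row:
# N-Bytes N-Bytes/s N-Max-Rate B-Bytes B-Bytes/s B-Max-Rate ops <=4kB 4-32kB >=32kB
zil_summary() {
    awk '
        { nbytes += $1; ops += $7; small += $8 }
        END {
            printf "ops=%d avg_bytes_per_op=%d small_pct=%d\n",
                   ops, nbytes / ops, 100 * small / ops
        }'
}

# The five 5-second samples from the post; boundaries are inferred.
zil_summary <<'EOF'
1126664 225332 515872 11485184 2297036 3469312 292 163 51 79
740536 148107 250896 9535488 1907097 4005888 198 106 24 68
758344 151668 179104 12546048 2509209 2682880 227 93 45 89
603304 120660 204344 9179136 1835827 2084864 179 89 23 67
948896 189779 346520 15880192 3176038 4173824 262 108 32 123
EOF
```

If the inferred boundaries are right, roughly half of the ZIL
operations in these samples are 4 kB or smaller.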