Reducing ZFS blocksize to improve Cyrus write performance ?

2010-08-09 Thread Eric Luyten
Folks,

A question for those of you running ZFS as the filesystem architecture
for your Cyrus message store : did you consider, measure and/or carry
out a change of the default 128 KB blocksize ?
If so, what value are you using ?

Regards,
Eric Luyten, Computing Centre VUB/ULB.



Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html


Re: Reducing ZFS blocksize to improve Cyrus write performance ?

2010-08-09 Thread Pascal Gienger
On 09.08.10 17:22, Eric Luyten wrote:
 Folks,

 A question for those of you running ZFS as the filesystem architecture
 for your Cyrus message store : did you consider, measure and/or carry
 out a change of the default 128 KB blocksize ?
 If so, what value are you using ?

First:
Changing the ZFS recordsize does not change the on-disk format of your
zfs/zpool. It applies only to NEWLY created files or file parts/zfs
records (!).

Second: As said, on a ZFS volume the recordsize is NOT the block size.
The record size is the size of a single ZFS record read at once. Due to
the ZIL, changes to files get written nearly sequentially, so the
recordsize is nearly irrelevant.
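
A minimal sketch of how such a recordsize change could be tried out, under stated assumptions: the dataset name mail/test is hypothetical, and the run wrapper is only an illustration that prints the commands unless DRY_RUN=0 is set. The zfs set and zfs get subcommands are the standard ones. Existing files keep the record size they were written with, so only data written after the change is affected.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them on a host that has ZFS.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# set_recordsize DATASET SIZE
# Sets the recordsize property, then reads it back.
# Only data written after the change uses the new size.
set_recordsize() {
    run zfs set "recordsize=$2" "$1"
    run zfs get recordsize "$1"
}

# mail/test is a hypothetical scratch dataset, not one from this thread.
set_recordsize mail/test 32K
```

To make existing mail files pick up a new record size they would have to be rewritten, for example by copying them into a fresh dataset.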

A smaller record size is a good option if you notice an I/O bottleneck
on your Fibre Channel/iSCSI/SAS link. It won't bring you a performance
gain in random I/O. There is one small exception: database systems that
always write the same fixed block size. For MySQL some people advise 32k.


The ZFS record size is not the same as the block size of a zvol (ZFS
block volume). That's another story. But I assume you are not talking
about an iSCSI server exporting a ZFS block volume with a non-ZFS
filesystem written on it.

Just my $0.02,

Pascal



Re: Reducing ZFS blocksize to improve Cyrus write performance ?

2010-08-09 Thread Pascal Gienger
On 09.08.10 17:33, Pascal Gienger wrote:

 A smaller record size is a good option if you notice an i/o bottleneck
 on your fiberchannel/iSCSI/SAS link. It won't bring you a performance
 gain in random i/o. There is a small exception: Database systems writing
 always the same fixed blocksize. For MySQL some people advise 32k.

Just another note:
For us, gzip compression gave a performance gain, reducing I/O bandwidth
far more than a smaller recordsize would (gzip compression for the
mailstore, NOT (!) for the meta partition containing the cyrus.* files!).

Just for your info as a reference, we are running happily with this:

-bash-3.00$ zfs get all mail/imap
NAME       PROPERTY         VALUE                  SOURCE
mail/imap  type             filesystem             -
mail/imap  creation         Mon Aug 13 13:19 2007  -
mail/imap  used             1.58T                  -
mail/imap  available        4.96T                  -
mail/imap  referenced       1.51T                  -
mail/imap  compressratio    1.61x                  -
mail/imap  mounted          yes                    -
mail/imap  quota            none                   default
mail/imap  reservation      none                   default
mail/imap  recordsize       128K                   local
mail/imap  mountpoint       /mail/imap             default
mail/imap  sharenfs         off                    default
mail/imap  checksum         on                     default
mail/imap  compression      gzip                   local
mail/imap  atime            off                    local
mail/imap  devices          off                    local
mail/imap  exec             off                    local
mail/imap  setuid           off                    local
mail/imap  readonly         off                    default
mail/imap  zoned            off                    default
mail/imap  snapdir          hidden                 default
mail/imap  aclmode          groupmask              default
mail/imap  aclinherit       restricted             default
mail/imap  canmount         on                     default
mail/imap  shareiscsi       off                    default
mail/imap  xattr            on                     default
mail/imap  copies           1                      default
mail/imap  version          1                      -
mail/imap  utf8only         off                    -
mail/imap  normalization    none                   -
mail/imap  casesensitivity  sensitive              -
mail/imap  vscan            off                    default
mail/imap  nbmand           off                    default
mail/imap  sharesmb         off                    default
mail/imap  refquota         none                   default
mail/imap  refreservation   none                   default
mail/imap  primarycache     all                    default
mail/imap  secondarycache   all                    default
-bash-3.00$
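
As a rough worked example of what the compressratio in this output means: multiplying the used space by the ratio gives an estimate of how much uncompressed data the dataset holds. The helper name logical_size is made up for illustration, and the result is only an approximation, since the reported values are rounded.

```shell
#!/bin/sh
# logical_size USED RATIO
# Estimates the uncompressed size as USED * RATIO (same unit as USED).
logical_size() {
    awk -v used="$1" -v ratio="$2" 'BEGIN { printf "%.2f\n", used * ratio }'
}

# Values taken from the output above: used 1.58T, compressratio 1.61x.
logical_size 1.58 1.61    # prints 2.54
```

So about 2.54T of mail occupies 1.58T on disk, which is data that never has to cross the I/O channel.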




Cyrus replication and failover best practices

2010-08-09 Thread Dmitry Ivanov
Hello!
Folks, looking through the mailing list history I saw that many of you
are running Cyrus in rolling replication mode. I am interested in
configuring a Cyrus replica to use as a standby IMAP server, so that we
can switch DNS in case of problems with the primary backend. While
testing on a playground system I ran into some problems and several
questions came up; maybe you can help me resolve them.

1. Is it safe to leave the sync_host: options in imapd.conf and run
sync_server (via an entry in cyrus.conf) on both master and replica,
and start sync_client -r only on the master server? Or is it better to
have different config files for the different roles?

2. Is there any way to solve the issue where the master overwrites
messages with the same filename on the replica (messages that were not
synced before the disaster happened) while syncing back to the primary
host? guid_mode: sha1 is set.

Maybe someone can describe a method of switching between replicated
backends in production? For now I want to switch DNS and then
start/stop the sync_client daemon.

Thank you for assistance!

-- 

Dmitry S. Ivanov



Re: Reducing ZFS blocksize to improve Cyrus write performance ?

2010-08-09 Thread Vincent Fox
For what Cyrus is doing on Solaris with ZFS, the
recordsize seems nearly negligible.  What with all the
caching in the way, and how ZFS orders transactions, it's
about the last tuneable I'd worry about.

Here's what works well for us. Add this to /etc/system:

* Turn off ZFS cache flushing
set zfs:zfs_nocacheflush = 1
* Increase DNLC (Directory Name Lookup Cache)
set ncsize = 50

Turn off atime of course.

Turn on LZJB compression for metapartition but gzip for
the mail data filesystem. Our compression ratio on the mail
filesystem is showing 1.68x.
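
A small sketch for checking an existing filesystem against the atime and compression settings described above. The function name check_props is made up for illustration. It reads "property value" pairs on stdin, which is the shape produced by zfs get -H -o property,value, and prints a line for each setting that differs.

```shell
#!/bin/sh
# check_props WANTED_COMPRESSION
# Reads "property value" pairs on stdin and reports settings that differ
# from atime=off and compression=WANTED_COMPRESSION.
check_props() {
    awk -v want="$1" '
        $1 == "atime" && $2 != "off" {
            print "atime should be off, is " $2
        }
        $1 == "compression" && $2 != want {
            print "compression should be " want ", is " $2
        }
    '
}

# Sample input standing in for real zfs output.
printf 'atime on\ncompression off\n' | check_props gzip
```

On a host with ZFS the input could come from something like zfs get -H -o property,value atime,compression mail/imap, with gzip as the argument for the mail data filesystem and lzjb for the meta partition.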

Our I/O channels average only 4-5% busy with ~6,000 users
per backend mailstore.  We run nightly snapshots and then
back up every other night from the most recent snapshot, and
that is factored into the iostat number.








Re: Reducing ZFS blocksize to improve Cyrus write performance ?

2010-08-09 Thread Pascal Gienger
On 09.08.10 19:46, Vincent Fox wrote:
 * Turn off ZFS cache flushing
 set zfs:zfs_nocacheflush = 1

For hardware arrays (Fibre Channel, iSCSI, SSA, ...) with their own cache
this is a must.

 * Increase DNLC (Directory Name Lookup Cache)
 set ncsize = 50

vmstat -s | grep 'total name lookups'
135562914356 total name lookups (cache hits 96%)

:-)
Unless the hit ratio is below 90%, increasing the DNLC is not very
useful.
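
The check above can be sketched as a small script. The function names are made up for illustration, and the 90% threshold is the rule of thumb from this message, not an official limit. The sample line is the vmstat -s output quoted above.

```shell
#!/bin/sh
# dnlc_hit_pct: reads "vmstat -s" output on stdin and prints the
# DNLC cache hit percentage as a bare number.
dnlc_hit_pct() {
    sed -n 's/.*total name lookups (cache hits \([0-9][0-9]*\)%).*/\1/p'
}

# dnlc_worth_raising PCT: succeeds if the hit ratio is below 90%.
dnlc_worth_raising() {
    [ "$1" -lt 90 ]
}

# Sample line from this thread; on Solaris use: vmstat -s | dnlc_hit_pct
pct=$(echo '135562914356 total name lookups (cache hits 96%)' | dnlc_hit_pct)
if dnlc_worth_raising "$pct"; then
    echo "hit ratio ${pct}%: a larger ncsize may help"
else
    echo "hit ratio ${pct}%: DNLC is large enough"
fi
```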

 Turn off atime of course.

Sure.

 Turn on LZJB compression for metapartition but gzip for
 the mail data filesystem. Our compression ratio on the mail
 filesystem is showing 1.68x.

Yes. GZIP for Mail, LZJB for Meta. Identical configuration here.

Pascal



Re: Reducing ZFS blocksize to improve Cyrus write performance ?

2010-08-09 Thread Vincent Fox
On Mon, 2010-08-09 at 17:22 +0200, Eric Luyten wrote:
 Folks,
 
  did you consider, measure and/or carry
 out a change of the default 128 KB blocksize ?

To answer your question more directly than my last post did...

We did some testing with Bonnie++ prior to deployment,
and changing recordsize didn't reveal any particular
improvement for what we guessed was a representative simulation.

After deployment we ran into performance problems, which
turned out to be related to an fsync corner case in the
then-current release, later fixed in a patch.  We ran a performance
tool from Sun which clearly showed the problem with fsync,
but I can't recall its name right now.  We were in production
at that point, though, and not free to vary recordsizes and
see the effect with that tool.






Re: Cyrus replication and failover best practices

2010-08-09 Thread Bron Gondwana
On Mon, Aug 09, 2010 at 08:15:36PM +0400, Dmitry Ivanov wrote:
   Hello!
 Folks, looking through maillist history i saw that many of you are 
 running cyrus in rolling replication mode. I am interested in 
 configuring cyrus replica to use as a standby imap server, where we can 
 switch DNS in case of problems with primary backend. While testing on 
 playground I got some problems and several questions appeared, may be 
 you can help me to solve this.
 
 1. Is it safe to leave sync_host: options in imapd.conf and running 
 sync_server (due to record in cyrus.conf) on both master and replica, 
 and start only sync_client -r on master server? Or better to have 
 different config files for different roles?

Yeah, that's pretty safe.  We run sync_server on our masters as well
so that we can move users between machines.

I'm not such a fan of the sync_host config variables - I'd prefer to
pass the information on the sync_client command line.  Should go fix
that!
 
 2. Is there any way to solve issue when master overwrites messages with 
 the same filename on replica (messages that were not synced before 
 disaster happened) during syncing back to primary host? guid_mode: 
 sha1 set.

We have a patch at FastMail that does it.  There's one against 2.3.16,
or soon it will be the default behaviour with the new sync protocol
(I keep talking about it ...).  It's actually up and running at FastMail
now, so I'll be pushing it back to CVS soon, and we'll work on making
a release.

 May be some one can describe method of switching between replicated 
 backends in production? For now I want to switch DNS and and than 
 start/stop sync_client daemon.

We do have slightly different configurations, so we have to shut down
both ends.  In future I plan to have sync_client running at both ends,
so it's master-master, but with DNS only pointing at one end, and some
sort of barrier process where we kill off connections before switching.

The barrier is needed if you don't want to be in split-brain recovery
mode ALL the time, because some clients hold IMAP connections open for
days.

Bron.
