Re: [Dovecot] Dovecot tuning for GFS2

2013-08-25 Thread Jan-Frode Myklebust
On Thu, Aug 22, 2013 at 08:57:40PM -0500, Stan Hoeppner wrote:
 
 
 130m to 18m is 'only' a 7 fold decrease.  18m inodes is still rather
 large for any filesystem, cluster or local.  A check on an 18m inode XFS
 filesystem, even on fast storage, would take quite some time.  I'm sure
 it would take quite a bit longer to check a GFS2 with 18m inodes.  


We use GPFS, not GFS2. Luckily we've never needed to run fsck on it, but
it has support for online fsck so hopefully it would be bearable (but
please, let's not talk about such things, knock on wood).

 Any reason you didn't go a little larger with your mdbox rotation
 size?

Just that we didn't see any clear recommendation/documentation for
why one would want to switch from the default 2MB. 2 MB should already
be packing 50-100 messages/file, so why are we only seeing a 7x decrease
in the number of files? Hmm, I see the m-files aren't really filling 2 MB.
Looking at my own mdbox storage I see 59 m-files, using a total of 34 MB
(avg. 576 KB/file), with sizes ranging from ~100 KB to 2 MB.  Checking our
quarantine mailbox I see 3045 files, using 2.6 GB (avg. 850 KB/file).
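
The averages quoted here can be reproduced with a few lines of Python. This is only a sketch: the helper names `summarize_sizes` and `mfile_sizes` are made up for illustration, and the `m.*` glob assumes the mdbox storage files are named `m.<number>` inside the storage directory, which may need adjusting for a given installation.

```python
from pathlib import Path


def summarize_sizes(sizes):
    """Return (count, total_bytes, mean_bytes) for a list of file sizes."""
    count = len(sizes)
    total = sum(sizes)
    mean = total / count if count else 0.0
    return count, total, mean


def mfile_sizes(storage_dir):
    """Collect the sizes of m.* files in an mdbox storage directory.

    Assumption: storage files are named m.<number>; change the glob otherwise.
    """
    return [p.stat().st_size for p in Path(storage_dir).glob("m.*") if p.is_file()]


if __name__ == "__main__":
    # Figures from the message: 59 files / 34 MB and 3045 files / 2.6 GB
    # (decimal units assumed).
    for label, files, total_bytes in [
        ("own mailbox", 59, 34 * 1000**2),
        ("quarantine", 3045, 2.6 * 1000**3),
    ]:
        print(f"{label}: {total_bytes / files / 1000:.0f} KB/file on average")
```

On a real server one would pass the output of `mfile_sizes(...)` for a user's storage directory to `summarize_sizes(...)` to see how far below the rotation size the files actually sit.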

Guess I should look into changing to a larger rotation size.

BTW, what happens if I change mdbox_rotate_size from 2MB to 10MB?
Will all the existing 2MB m-files grow to 10MB, or is it just new
m-files that will use this new size? Can I get Dovecot to migrate out of
the 2MB files and reorganize into 10MB files?


   -jf


Re: [Dovecot] Dovecot tuning for GFS2

2013-08-25 Thread Stan Hoeppner
On 8/23/2013 3:30 AM, Jan-Frode Myklebust wrote:
 On Thu, Aug 22, 2013 at 08:57:40PM -0500, Stan Hoeppner wrote:

 130m to 18m is 'only' a 7 fold decrease.  18m inodes is still rather
 large for any filesystem, cluster or local.  A check on an 18m inode XFS
 filesystem, even on fast storage, would take quite some time.  I'm sure
 it would take quite a bit longer to check a GFS2 with 18m inodes.  
 
 We use GPFS, not GFS2. 

Understood.  But it makes little difference.  None of the cluster
filesystems perform very well with high metadata workloads or extremely
high inode counts, whether using OCFS, GFS, GPFS, CXFS, etc.

 Luckily we've never needed to run fsck on it, but
 it has support for online fsck so hopefully it would be bearable (but
 please, let's not talk about such things, knock on wood).

I'm not that familiar with the GPFS tools.  It may be able to run an
online check but I'd bet you have to unmount it to do a destructive
repair, as with most filesystems, cluster or not.

 Any reason you didn't go a little larger with your mdbox rotation
 size?
 
 Just that we didn't see any clear recommendation/documentation for
 why one would want to switch from the default 2MB. 2 MB should already
 be packing 50-100 messages/file, so why are we only seeing a 7x decrease
 in the number of files?
...
 Hmm, I see the m-files aren't really filling 2 MB.
 Looking at my own mdbox storage I see 59 m-files, using a total of 34 MB
 (avg. 576 KB/file), with sizes ranging from ~100 KB to 2 MB.  Checking our
 quarantine mailbox I see 3045 files, using 2.6 GB (avg. 850 KB/file).

Apparently 2MB is approximate.  I'd guess that if a new msg comes in that
would put the m-file over the limit, the file is closed, a new one is
started, and the new mail goes into the new file, leaving the previous
file at less than the rotate size limit.  Timo will need to give the
definitive answer.
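
The guess above can be turned into a small simulation. It models only the rule described in this message (rotate before a message would push the file over the limit); it is not taken from Dovecot's source, and the function name `pack_messages` and the sample sizes are invented for illustration.

```python
def pack_messages(message_sizes, rotate_size):
    """Pack messages into files, rotating before a file would exceed rotate_size.

    Models the guess in the message above, not Dovecot's actual code.
    Returns the final size of every file, in creation order.
    """
    files = []
    current = 0
    for size in message_sizes:
        # Close the current file if this message would push it over the limit.
        if current > 0 and current + size > rotate_size:
            files.append(current)
            current = 0
        current += size
    if current > 0:
        files.append(current)
    return files


if __name__ == "__main__":
    KB = 1024
    # Hypothetical message sizes, in bytes.
    sizes = [300 * KB, 900 * KB, 700 * KB, 1500 * KB, 400 * KB, 200 * KB]
    for rotate_mb in (2, 10):
        result = pack_messages(sizes, rotate_mb * 1024 * KB)
        print(rotate_mb, "MB limit:", [s // KB for s in result], "KB per file")
```

Under this rule every closed file ends up somewhat below the limit, and the gap grows with the size of the message that triggered the rotation. Whether this alone accounts for the averages reported in the thread is not something the sketch can settle.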

 Guess I should look into changing to a larger rotation size.
 
 BTW, what happens if I change mdbox_rotate_size from 2MB to 10MB?
 Will all the existing 2MB m-files grow to 10MB, or is it just new
 m-files that will use this new size? Can I get Dovecot to migrate out of
 the 2MB files and reorganize into 10MB files?

I'd guess existing m-files will remain as they are.  The rotation logic
acts on the currently open and not yet full file.  This is a serial
operation, only forward, not back.  Again, Timo should have a definitive
answer.

-- 
Stan




Re: [Dovecot] Dovecot tuning for GFS2

2013-08-22 Thread Alessio Cecchi

On 21/08/2013 13:57, Andrea gabellini - SC wrote:

Hello,

I'm deploying a new email cluster using Dovecot over GFS2. Currently I'm
using Courier over GFS.

Currently I'm testing Dovecot with these parameters:

mmap_disable = yes
mail_fsync = always
mail_nfs_storage = yes
mail_nfs_index = yes
lock_method = fcntl

Are they correct?

Red Hat GFS supports mmap, so is it better to enable it or leave it disabled?
The documentation suggests the use of flock. What about it?

Thanks,
Andrea





Hi Andrea,

I'm running a cluster with Maildir over NFS (and in the past over OCFS2).
With GFS2 you need to use the same options as for NFS:


http://wiki2.dovecot.org/NFS

I suggest setting mmap_disable = yes.

Ciao
--
Alessio Cecchi is:
@ ILS - http://www.linux.it/~alessice/
on LinkedIn - http://www.linkedin.com/in/alessice
GNU/Linux systems support - http://www.cecchi.biz/
@ PLUG - former president, now senator for life, http://www.prato.linux.it


Re: [Dovecot] Dovecot tuning for GFS2

2013-08-22 Thread 林 宏河
Andrea,

We tried to use GFS2 + Dovecot (mdbox), but when there are many mailboxes
and mails, delivering mail to the mailbox seems to get slow.
We tested with LeftHand storage, by the way. So we switched to NFSv4.

Also, keep in mind that director does not detect when a backend server
fails, so we use poolmon as suggested in the director wiki.
We've tested it and it seems to work fine.

Have a look at it.
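
To illustrate why an external monitor is needed when the director does not notice a failed backend, here is a minimal reachability check. It is a sketch of the general idea only and does not reproduce how poolmon works; the function names, the default port 143 (IMAP), and the timeout are assumptions.

```python
import socket


def backend_is_up(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_backends(backends, port=143, timeout=3.0):
    """Map each backend host to its reachability on the given port."""
    return {host: backend_is_up(host, port, timeout) for host in backends}
```

Calling `check_backends(["10.0.0.11", "10.0.0.12"])` (hypothetical addresses) returns a dict of host to `True`/`False`. What a monitor then does with a failed backend, such as taking it out of the director's pool, is outside this sketch.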

Kouga



Re: [Dovecot] Dovecot tuning for GFS2

2013-08-22 Thread Stan Hoeppner
On 8/21/2013 4:07 PM, Jan-Frode Myklebust wrote:

 I would strongly suggest using mdbox instead. AFAIK cluster filesystems aren't

I'd recommend mdbox as well, with a healthy rotation size.  The larger
files won't increase IMAP performance substantially but they can make
backup significantly quicker.

 very good at handling many small files. It's a worst-case random I/O
 usage pattern, with a high rate of metadata operations on top.

Just for clarification, small files and random IO patterns at the disks
are only a small fraction of the maildir problem.  The majority of it is
metadata--the create, move, rename, etc. operations.  To keep the
in-memory filesystem state consistent across all nodes, and to avoid
putting extra IOPS on the storage if on-disk data structures were to be
used for synchronization, cluster filesystems exchange all metadata
updates and synchronization data over the cluster interconnect.  This is
inherently slow.

With a local filesystem and multiple processes, this coherence dance
takes place at DRAM latencies--tens of nanoseconds, and scales well as
load increases because DRAM bandwidth is 25-100 GB/s.  With a cluster
filesystem it takes place at interconnect latency, tens to hundreds of
μs, or about 1000x higher latency.  And it doesn't scale well as
bandwidth is limited to ~100 MB/s with GbE, ~1 GB/s with 10GbE or
Myrinet.  Stepping up to Infiniband 4x DDR can get you ~2 GB/s and
slightly lower latency, but that's a lot of extra expense for a mail
cluster, given the performance won't scale with the $$ spent.  The
switch and HBAs will cost more than the COTS servers.
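
The "about 1000x" figure follows directly from the latencies quoted above. The sketch below makes the arithmetic explicit under a deliberately crude assumption: each metadata update pays exactly one latency period and updates are fully serialized. The operation count and the 50 ns / 50 µs midpoints are illustrative picks from the quoted ranges, not measurements.

```python
def total_latency_seconds(operations, latency_seconds):
    """Time spent waiting if every operation pays one serialized latency period."""
    return operations * latency_seconds


if __name__ == "__main__":
    NS = 1e-9
    US = 1e-6
    ops = 1_000_000  # hypothetical number of metadata updates

    dram = total_latency_seconds(ops, 50 * NS)          # "tens of nanoseconds"
    interconnect = total_latency_seconds(ops, 50 * US)  # "tens to hundreds of µs"

    print(f"local (DRAM):  {dram:.3f} s")
    print(f"interconnect:  {interconnect:.1f} s")
    print(f"ratio:         {interconnect / dram:.0f}x")
```

Real systems batch and overlap these operations, so the absolute numbers mean little; the ratio between the two cases is the point.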

Selecting the right mailbox format is in essence free, and mostly solves
the maildir metadata and IOPS problem.

 We use IBM GPFS for clusterfs, and have finally completed the conversion
 of a 130+ million inode maildir filesystem into an 18 million inode mdbox
 filesystem. I have no hard performance data showing the difference
 between maildir/mdbox, but at a minimum mdbox is much easier to manage.
 Backup of 130+ million files is painful, and it also feels nice to be
 able to schedule batches of mailbox purges to off-hours, instead of doing
 them at peak hours.

130m to 18m is 'only' a 7 fold decrease.  18m inodes is still rather
large for any filesystem, cluster or local.  A check on an 18m inode XFS
filesystem, even on fast storage, would take quite some time.  I'm sure
it would take quite a bit longer to check a GFS2 with 18m inodes.  Any
reason you didn't go a little larger with your mdbox rotation size?

-- 
Stan



[Dovecot] Dovecot tuning for GFS2

2013-08-21 Thread Andrea gabellini - SC
Hello,

I'm deploying a new email cluster using Dovecot over GFS2. Currently I'm
using Courier over GFS.

Currently I'm testing Dovecot with these parameters:

mmap_disable = yes
mail_fsync = always
mail_nfs_storage = yes
mail_nfs_index = yes
lock_method = fcntl

Are they correct?

Red Hat GFS supports mmap, so is it better to enable it or leave it disabled?
The documentation suggests the use of flock. What about it?

Thanks,
Andrea



-- 

Don't talk with a full mouth ... or with an empty head


Ing. *Andrea Gabellini*
Email: andrea.gabell...@telecomitalia.sm
Skype: andreagabellini
Tel: (+378) 0549 886111
Fax: (+378) 0549 886188

Telecom Italia San Marino S.p.A.
Strada degli Angariari, 3
47891 Rovereta
Republic of San Marino

http://www.telecomitalia.sm


Re: [Dovecot] Dovecot tuning for GFS2

2013-08-21 Thread Robert Schetterer
On 21.08.2013 13:57, Andrea gabellini - SC wrote:
 Hello,
 
 I'm deploying a new email cluster using Dovecot over GFS2. Currently I'm
 using Courier over GFS.
 
 Currently I'm testing Dovecot with these parameters:
 
 mmap_disable = yes
 mail_fsync = always
 mail_nfs_storage = yes
 mail_nfs_index = yes
 lock_method = fcntl
 
 Are they correct?
 
 Red Hat GFS supports mmap, so is it better to enable it or leave it disabled?
 The documentation suggests the use of flock. What about it?
 
 Thanks,
 Andrea
 
 
 

I have

mail_fsync = always
mail_nfs_storage = yes
mail_nfs_index = yes
mmap_disable = yes

with ocfs2/maildir.

However you use a cluster filesystem, if you also use load balancing
you should combine it with

http://wiki2.dovecot.org/Director

By the way, I never tested GFS2 with Dovecot myself, but others
told me it doesn't work very well.


Best Regards
MfG Robert Schetterer

-- 
[*] sys4 AG

http://sys4.de, +49 (89) 30 90 46 64
Franziskanerstraße 15, 81669 München

Registered office: München, District Court München: HRB 199263
Executive board: Patrick Ben Koetter, Axel von der Ohe, Marc Schiffbauer
Chairman of the supervisory board: Florian Kirstein


Re: [Dovecot] Dovecot tuning for GFS2

2013-08-21 Thread Andrea gabellini - SC
Robert,

So you are using the same config I'm testing. I forgot to write that I
use maildir.

The final design will be, as Red Hat suggests, that the same user always
goes to the same node (using proxy or director).

Thanks,
Andrea





Re: [Dovecot] Dovecot tuning for GFS2

2013-08-21 Thread Jan-Frode Myklebust
On Wed, Aug 21, 2013 at 02:18:52PM +0200, Andrea gabellini - SC wrote:
 
 So you are using the same config I'm testing. I forgot to write that I
 use maildir.

I would strongly suggest using mdbox instead. AFAIK cluster filesystems
aren't very good at handling many small files. It's a worst-case random I/O
usage pattern, with a high rate of metadata operations on top.

We use IBM GPFS for clusterfs, and have finally completed the conversion
of a 130+ million inode maildir filesystem into an 18 million inode mdbox
filesystem. I have no hard performance data showing the difference
between maildir/mdbox, but at a minimum mdbox is much easier to manage.
Backup of 130+ million files is painful, and it also feels nice to be
able to schedule batches of mailbox purges to off-hours, instead of doing
them at peak hours.

As for your settings, we use:

# GPFS also supports cluster-wide mmap, but for some reason we've disabled it in Dovecot.
mmap_disable = yes
mail_fsync = optimized
mail_nfs_storage = no
mail_nfs_index = no
lock_method = fcntl

and of course Dovecot Director in front of them.
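
For a quick side-by-side view of how these settings differ from the ones posted at the start of the thread, a small comparison helper can be sketched as follows. The parser handles only flat `key = value` lines with `#` comments; it is not a real Dovecot configuration parser, and the function and constant names are made up for illustration.

```python
def parse_settings(text):
    """Parse simple 'key = value' lines, ignoring blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Settings as posted in the thread: the GFS2 test setup and the GPFS setup.
GFS2_TEST = """
mmap_disable = yes
mail_fsync = always
mail_nfs_storage = yes
mail_nfs_index = yes
lock_method = fcntl
"""

GPFS_PRODUCTION = """
mmap_disable = yes  # GPFS supports cluster-wide mmap too
mail_fsync = optimized
mail_nfs_storage = no
mail_nfs_index = no
lock_method = fcntl
"""

if __name__ == "__main__":
    changes = diff_settings(parse_settings(GFS2_TEST), parse_settings(GPFS_PRODUCTION))
    for key, (gfs2_value, gpfs_value) in changes.items():
        print(f"{key}: {gfs2_value} -> {gpfs_value}")
```

On a live server, `doveconf -n` prints the non-default settings and is the more reliable source for such a comparison.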



  -jf