Re: best filesystem for imap server

2004-12-07 Thread Henrique de Moraes Holschuh
On Tue, 07 Dec 2004, Einar Indridason wrote:
> We do have some *huge* mail-folders here, running on ext3, and when a
> directory gets over a certain size, every operation on the directory
> increases in time very sharply.  (Due to the "linked list" implementation
> in ext2/ext3.)

Is that ext3 in 2.6.8.1+ with all the htrees enabled?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh
---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html


Re: best filesystem for imap server

2004-12-07 Thread Einar Indridason
On Sun, Dec 05, 2004 at 12:43:14AM -0200, Henrique de Moraes Holschuh wrote:
> On Sat, 04 Dec 2004, Einar Indridason wrote:
> > 
> > Don't forget JFS from IBM.
> 
> All I know about JFS is that it did not come out enough better than ext3
> in the few benchmarks I've seen to bother with it at the time :(
> 
> If you have first hand experience with JFS, please describe it to us.
> Especially data protection capabilities and performance in ridiculously big
> directories, as required by Cyrus spools :)

I don't have any first-hand experience with JFS.  I just found it to be
missing from the discussion.

We do have some *huge* mail-folders here, running on ext3, and when a
directory gets over a certain size, every operation on the directory
increases in time very sharply.  (Due to the "linked list" implementation
in ext2/ext3.)

We did some googling around regarding which filesystem to choose, and I'm
inclined to try JFS when we install the next mail-server.
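
One rough way to see the large-directory slowdown described above on a given filesystem is to time file creation as the directory grows. The following is only a sketch: the function name, the file counts and the Cyrus-style "<number>." file names are illustrative assumptions, and caching will hide much of the effect on a machine with plenty of RAM.

```python
import os
import tempfile
import time


def time_creates(num_files, parent=None):
    """Create num_files empty files in a fresh directory under `parent`
    (default: the system temp dir) and return the average seconds per create.

    On a directory format that is searched linearly, the per-file cost
    is expected to rise as num_files grows; with hashed or tree-based
    directories it should stay roughly flat.
    """
    with tempfile.TemporaryDirectory(dir=parent) as d:
        start = time.perf_counter()
        for i in range(1, num_files + 1):
            # Hypothetical naming that mimics a mail spool: "1.", "2.", ...
            with open(os.path.join(d, "%d." % i), "w"):
                pass
        elapsed = time.perf_counter() - start
    return elapsed / num_files


if __name__ == "__main__":
    for n in (1000, 10000):
        print("%6d files: %.1f us per create" % (n, time_creates(n) * 1e6))
```

Pointing `parent` at a directory on the filesystem under test, and using file counts close to the real spool size, gives a more meaningful comparison than the small defaults used here.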

Some URLs I stumbled upon:

http://linuxgazette.net/102/piszcz.html
http://jamesthornton.com/hotlist/linux-filesystems/

And of course:
http://www.google.com/search?q=journaled+filesystem+benchmark+linux


--
einari


Re: best filesystem for imap server

2004-12-04 Thread Henrique de Moraes Holschuh
On Sat, 04 Dec 2004, Einar Indridason wrote:
> On Wed, Dec 01, 2004 at 05:07:46PM -0600, Jim Miller wrote:
> > journaled but very slow.  ReiserFS is a better choice for a journaled file
> > system and if you can hold off until all the bugs are worked out, Reiser4FS
> > would be the best choice (IMHO).
> 
> Don't forget JFS from IBM.

All I know about JFS is that it did not come out enough better than ext3
in the few benchmarks I've seen to bother with it at the time :(

If you have first hand experience with JFS, please describe it to us.
Especially data protection capabilities and performance in ridiculously big
directories, as required by Cyrus spools :)

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: best filesystem for imap server

2004-12-04 Thread Einar Indridason
On Wed, Dec 01, 2004 at 05:07:46PM -0600, Jim Miller wrote:
> 
> 
> I feel that XFS is a bad choice since it is not a 'truly' journaled file
> system.  If you have a power failure/system crash/lockup, etc., etc. You
> could very easily end up with a corrupt file system -- XFS doesn't write out
> to the disks immediately (caching unwritten data to memory).  EXT3 is
> journaled but very slow.  ReiserFS is a better choice for a journaled file
> system and if you can hold off until all the bugs are worked out, Reiser4FS
> would be the best choice (IMHO).


Don't forget JFS from IBM.

--
[EMAIL PROTECTED]


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-03 Thread lst_hoe01
Quoting Henrique de Moraes Holschuh <[EMAIL PROTECTED]>:

> I believe the openldap rationale is that it is impossible to have good BDB
> defaults. This affects Cyrus as well, I think.
>
> However, for Cyrus, it is probably easy enough to come up with a bare
> minimum setup for a 1000 concurrent connections scenario (half IMAP, half
> LMTP).  That should cover just about everyone that doesn't either have too
> little system memory, or a big enough site that they better know how to do
> the setup in the first place...  I might even contribute with that setup
> when I switch to BDB 4.2 and Cyrus IMAPd 2.2 :-)
>

May I ask where to find good documentation about BDB settings? I have not
bothered until now because "embedded databases" should need no tuning.
From what I hear, this isn't true at all?

Thanks

Andreas



Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-03 Thread Henrique de Moraes Holschuh
On Fri, 03 Dec 2004, Andreas Hasenack wrote:
> On Thu, Dec 02, 2004 at 09:20:20PM -0200, Henrique de Moraes Holschuh wrote:
> > > subversion repository with about 50Gb of data on a single berkeley
> > > database file (version 4.2.52 + 2patches):
> > 
> > Heavy concurrent load on non-UP machines seems to be a much more common cause
> > of trouble with BDB than database size.  Index size does cause trouble (when
> 
> What I meant when I showed the database size was that we trust it enough to
> deal with it and our precious data (5 full versions of the distribution and
> all its updates).

Well, that certainly tells me I can trust subversion with it (if you tell me
exactly what BDB you're using, down to which non-sleepycat patches are
included in it...)

Which is a good thing to know. CVS is grating on my nerves, and DARCS isn't
quite there yet.  And arch is just plain obnoxious IMHO.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-03 Thread Henrique de Moraes Holschuh
On Fri, 03 Dec 2004, Andreas Hasenack wrote:
> On Thu, Dec 02, 2004 at 09:20:20PM -0200, Henrique de Moraes Holschuh wrote:
> > As a first example (and just like you said), if you don't get the DB_CONFIG
> > stuff exactly right, you can get anything from lock ups to environment
> > corruption.  This is quite easy to hit with OpenLDAP.  From what you wrote,
> 
> Indeed, openldap's defaults are wrong. In fact, it uses BDB's defaults which

I believe the openldap rationale is that it is impossible to have good BDB
defaults. This affects Cyrus as well, I think.  

However, for Cyrus, it is probably easy enough to come up with a bare
minimum setup for a 1000 concurrent connections scenario (half IMAP, half
LMTP).  That should cover just about everyone that doesn't either have too
little system memory, or a big enough site that they better know how to do
the setup in the first place...  I might even contribute with that setup
when I switch to BDB 4.2 and Cyrus IMAPd 2.2 :-)

> > Second, it is prone to behave badly in non-trivial workloads on non-trivial
> > apps on non-trivial (i.e. not UP) boxes.  Which is exactly the kind of thing
> > you have on any big Cyrus or OpenLDAP deployment.  I have some hopes that
> 
> Also in our subversion deployment, but it behaves quite nicely. Commit and
> checkout times are good.

Usually BDB either behaves very well, or crashes-and-burns :P  Not much
middle-ground there.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-03 Thread Andreas Hasenack
On Thu, Dec 02, 2004 at 09:20:20PM -0200, Henrique de Moraes Holschuh wrote:
> > subversion repository with about 50Gb of data on a single berkeley
> > database file (version 4.2.52 + 2patches):
> 
> Heavy concurrent load on non-UP machines seems to be a much more common cause
> of trouble with BDB than database size.  Index size does cause trouble (when

What I meant when I showed the database size was that we trust it enough to
deal with it and our precious data (5 full versions of the distribution and
all its updates).



Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-03 Thread Andreas Hasenack
On Thu, Dec 02, 2004 at 09:20:20PM -0200, Henrique de Moraes Holschuh wrote:
> As a first example (and just like you said), if you don't get the DB_CONFIG
> stuff exactly right, you can get anything from lock ups to environment
> corruption.  This is quite easy to hit with OpenLDAP.  From what you wrote,

Indeed, openldap's defaults are wrong. In fact, it uses BDB's defaults which
are just wrong for openldap. The openldap developers prefer to have the admin
change it.

> I guess subversion will also get hit by this one if DB_CONFIG is not
> optimal for your dataset.

Not necessarily, it may set some parameters itself. Our DB_CONFIG file, though,
was shipped with subversion at least.

> Second, it is prone to behave badly in non-trivial workloads on non-trivial
> apps on non-trivial (i.e. not UP) boxes.  Which is exactly the kind of thing
> you have on any big Cyrus or OpenLDAP deployment.  I have some hopes that

Also in our subversion deployment, but it behaves quite nicely. Commit and
checkout times are good.

> the very latest 4.2 fixes this.  I *do* know the others didn't, since I've
> experienced the crashes myself.
> 
> BDB 4.x is a complex piece of software, and it shows.

It is complex indeed. I like to say it has many buttons to turn or press.

> > It's heavily used by openldap and subversion. We, for example, have a
> 
> And at least with openldap, it causes a lot of trouble.

Without a properly tuned DB_CONFIG file, I agree. And the issue of why openldap
needs one (and doesn't set some basic values, like a log buffer bigger than a
miserable 32 kbytes) escapes me.
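
For readers who have never written a DB_CONFIG file, the sketch below renders one from a few numbers. The parameter names (set_cachesize, set_lg_bsize, set_lk_max_locks, set_lk_max_lockers, set_lk_max_objects) are Berkeley DB DB_CONFIG directives; the function name and every value used are hypothetical placeholders, not recommendations from this thread, and real values have to be sized for the dataset and load.

```python
def render_db_config(cache_bytes, log_buffer_bytes, max_locks):
    """Return the text of a DB_CONFIG file for the given sizes.

    set_cachesize takes <gbytes> <bytes> <number of cache regions>,
    so the byte count is split into whole gigabytes plus a remainder.
    """
    gbytes, rest = divmod(cache_bytes, 1024 ** 3)
    lines = [
        "set_cachesize %d %d 1" % (gbytes, rest),
        # Log buffer: the thread complains about the 32 kbyte default.
        "set_lg_bsize %d" % log_buffer_bytes,
        # Lock table sizing; using one number for all three is a
        # simplification made for this sketch only.
        "set_lk_max_locks %d" % max_locks,
        "set_lk_max_lockers %d" % max_locks,
        "set_lk_max_objects %d" % max_locks,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder sizes: 64 MB cache, 2 MB log buffer, 3000 locks.
    print(render_db_config(64 * 1024 * 1024, 2 * 1024 * 1024, 3000), end="")
```

The output is meant to be saved as DB_CONFIG in the database environment directory; existing environments generally need to be recovered or recreated before new region sizes take effect.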



Re: best filesystem for imap server

2004-12-03 Thread Hamish

> I think the performance of those disks (and the RAID you put on them) will
> be much more significant than the filesystem you use, considering the size
> of your user population.  And given that factor, I'd say that even ext3
> won't give you any problems performance-wise.  Still, reiserfs, IMO, would
> be preferable for mail files.
> John
 

Thanks for all the answers! I have decided to use reiser with raid1, as 
I have experience with reiser on samba and it works very well. Are there 
any fstab mount options that you could recommend for the raid array? I 
saw a recommendation for EXT3 to use noatime.
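
A quick way to check whether noatime is actually in effect is to scan the mount table. This is a minimal sketch that assumes the usual /proc/mounts layout (device, mount point, type, comma-separated options); the function name and the sample mount points are made up for illustration.

```python
def mounts_missing_noatime(mounts_text, fstypes=("ext3", "reiserfs")):
    """Return mount points of the given filesystem types that are
    mounted without the noatime option.

    mounts_text is expected to look like /proc/mounts:
    <device> <mountpoint> <fstype> <options> <dump> <pass>
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mountpoint, fstype, options = fields[1], fields[2], fields[3]
        if fstype in fstypes and "noatime" not in options.split(","):
            missing.append(mountpoint)
    return missing


if __name__ == "__main__":
    # Hypothetical sample; on a live Linux box read /proc/mounts instead.
    sample = (
        "/dev/md0 /var/spool/imap reiserfs rw,noatime,nodiratime 0 0\n"
        "/dev/md1 /var/lib/imap ext3 rw 0 0\n"
    )
    print(mounts_missing_noatime(sample))
```

On a running system the same function can be fed the contents of /proc/mounts.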
Thanks,
H


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-03 Thread Henrique de Moraes Holschuh
On Fri, 03 Dec 2004, Igor Brezac wrote:
> On Thu, 2 Dec 2004, Henrique de Moraes Holschuh wrote:
> >series is *not* to be trusted yet.  It is not just because of Cyrus (after
> >all, a bug in Cyrus code might cause BDB 4.x to misbehave),
> 
> This Cyrus bug has been fixed a long time ago.  I've run cyrus with BDB 
> 4.1 or higher for almost two years without any issues.

I do think I read not long ago on this ML (certainly no more than a few
months back) that there *could* be a well-hidden bug still lurking in the
BDB code.

It would be a good idea to read all the 4.2 docs and do a full functionality
audit of the code sometime.

> stories, but there are numerous folks who run OpenLDAP with great success 
> in very busy environments.

Heh, I am one of those that got burned by older BDB 4.2 buggy code when
multiple databases are in use in the same environment in a SMT or SMP box
(SMT triggered the bug sooner than SMP; SMT+SMP triggered it almost always,
as soon as writes started).

The thing has been running stable for a couple of months now, so I hope the issue
is completely fixed (at least within Debian. We *do* have patches to 4.2.52
to make it so, I don't know if these are available at the Sleepycat site).

I wonder how many of the reports of crashes and trouble with BDB are due to
people trying to use vendor-supplied BDB 4.x builds that haven't got the
latest patches, or known-bad BDB releases...

BTW, I regard BDB 4 and BDB 4.1 as certain-data-corruption-will-happen
territory.  IMHO Cyrus' autoconf script should refuse to work with anything
but BDB 3.2, 4.2 and 4.3 (when we test 4.3 enough, that is).

> >>default values for important settings, data corruption *will* happen.
> >Indeed.
> 
> A correctly configured BDB 4.x environment will behave and perform well. 
> I am yet to corrupt a BDB database to a point where the data is not 
> recoverable.

Well, same here I *think*. I recall doing some rm -rf 
type restores from LDIF to OpenLDAP, but I don't recall why I did it that
way.

> For those interested, you can find BDB docs at 
> http://www.sleepycat.com/docs/ref/toc.html.  As Henrique pointed out, BDB 
> is very complex, but it can also do a very good job.

Exactly, which is why we tolerate it :P

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-02 Thread Igor Brezac
On Thu, 2 Dec 2004, Henrique de Moraes Holschuh wrote:
> On Thu, 02 Dec 2004, Andreas Hasenack wrote:
> > On Thu, Dec 02, 2004 at 06:48:02PM -0200, Henrique de Moraes Holschuh wrote:
> > > > wouldn't be appropriate. We could have used bdb, but generally have had
> > > > lots of problems with bdb so don't entirely trust it...
> > >
> > > I don't know of anyone sane that trusts any BDB on the 4.x series.
> >
> > With cyrus-imapd, that may be so. But don't generalize, BDB is quite robust.
>
> Well, 3.2.9 as packaged by Debian IS robust.  I don't have a single
> misbehaviour (let alone one that caused data loss) reported against it, I
> think.
>
> Now, for 4.x I do not agree with you. As far as I am concerned, the 4.x
> series is *not* to be trusted yet.  It is not just because of Cyrus (after
> all, a bug in Cyrus code might cause BDB 4.x to misbehave),

This Cyrus bug has been fixed a long time ago.  I've run cyrus with BDB
4.1 or higher for almost two years without any issues.

> but also because
> of all the reports of problems with OpenLDAP.

Well, this is certainly a debatable issue.  The latest stable version of
OpenLDAP will run only with BDB 4.2.52 or higher.  (BDB 4.3.21 is the
latest and not without problems)  Sure there are folks with horror
stories, but there are numerous folks who run OpenLDAP with great success
in very busy environments.

> As a first example (and just like you said), if you don't get the DB_CONFIG
> stuff exactly right, you can get anything from lock ups to environment
> corruption.  This is quite easy to hit with OpenLDAP.  From what you wrote,
> I guess subversion will also get hit by this one if DB_CONFIG is not
> optimal for your dataset.
>
> Second, it is prone to behave badly in non-trivial workloads on non-trivial
> apps on non-trivial (i.e. not UP) boxes.  Which is exactly the kind of thing
> you have on any big Cyrus or OpenLDAP deployment.  I have some hopes that
> the very latest 4.2 fixes this.  I *do* know the others didn't, since I've
> experienced the crashes myself.
>
> BDB 4.x is a complex piece of software, and it shows.
>
> > It's heavily used by openldap and subversion. We, for example, have a
>
> And at least with openldap, it causes a lot of trouble.
>
> > subversion repository with about 50Gb of data on a single berkeley
> > database file (version 4.2.52 + 2patches):
>
> Heavy concurrent load on non-UP machines seems to be a much more common cause
> of trouble with BDB than database size.  Index size does cause trouble (when
> DB_CONFIG is not correctly sized), though.
>
> > default values for important settings, data corruption *will* happen.
>
> Indeed.

A correctly configured BDB 4.x environment will behave and perform well.
I am yet to corrupt a BDB database to a point where the data is not
recoverable.

For those interested, you can find BDB docs at 
http://www.sleepycat.com/docs/ref/toc.html.  As Henrique pointed out, BDB 
is very complex, but it can also do a very good job.

--
Igor


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-02 Thread Henrique de Moraes Holschuh
On Thu, 02 Dec 2004, Andreas Hasenack wrote:
> On Thu, Dec 02, 2004 at 06:48:02PM -0200, Henrique de Moraes Holschuh wrote:
> > > wouldn't be appropriate. We could have used bdb, but generally have had 
> > > lots of problems with bdb so don't entirely trust it...
> > 
> > I don't know of anyone sane that trusts any BDB on the 4.x series.
> 
> With cyrus-imapd, that may be so. But don't generalize, BDB is quite robust.

Well, 3.2.9 as packaged by Debian IS robust.  I don't have a single
misbehaviour (let alone one that caused data loss) reported against it, I
think.

Now, for 4.x I do not agree with you. As far as I am concerned, the 4.x
series is *not* to be trusted yet.  It is not just because of Cyrus (after
all, a bug in Cyrus code might cause BDB 4.x to misbehave), but also because
of all the reports of problems with OpenLDAP.

As a first example (and just like you said), if you don't get the DB_CONFIG
stuff exactly right, you can get anything from lock ups to environment
corruption.  This is quite easy to hit with OpenLDAP.  From what you wrote,
I guess subversion will also get hit by this one if DB_CONFIG is not
optimal for your dataset.

Second, it is prone to behave badly in non-trivial workloads on non-trivial
apps on non-trivial (i.e. not UP) boxes.  Which is exactly the kind of thing
you have on any big Cyrus or OpenLDAP deployment.  I have some hopes that
the very latest 4.2 fixes this.  I *do* know the others didn't, since I've
experienced the crashes myself.

BDB 4.x is a complex piece of software, and it shows.

> It's heavily used by openldap and subversion. We, for example, have a

And at least with openldap, it causes a lot of trouble.

> subversion repository with about 50Gb of data on a single berkeley
> database file (version 4.2.52 + 2patches):

Heavy concurrent load on non-UP machines seems to be a much more common cause
of trouble with BDB than database size.  Index size does cause trouble (when
DB_CONFIG is not correctly sized), though.

> default values for important settings, data corruption *will* happen.

Indeed.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-02 Thread Rob Mueller
FYI anyone looking for NVRAM solutions for journals/meta-data storage, I 
just found this page:

http://www.storagesearch.com/ssd-buyers-guide.html
Which looks to have lots of juicy info. If anyone knows anything about any 
of these products or has feedback, I'd love to hear about it, and I'm sure 
the list would too...

Rob


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-02 Thread Rob Mueller

> Ordered would be best for a Cyrus spool, and I guess Data would be best on
> MTAs (when they have a small enough queue lifetime for most messages, and
> the journal is large enough).

I think it's probably best just to test and find which one gives you the
better performance. We tended to find that data=journal actually gave better
performance, but didn't know exactly why. This seems to be another case of
most benchmarks != real world!

> Indeed. Although why mailboxes.db (when using the BDB backend, anyway) has
> so much IO I have no idea.  Once read, BDB should be doing IPC to fetch it
> from in-memory cache, not trashing the disk.  Unless writes to mailboxes.db
> are very common.

This was a skiplist mailboxes.db. But again, there were 3 things on the
NVRAM drive:
1. skiplist mailboxes.db
2. skiplist .seen files
3. quota files

I don't have a breakdown of which of those was causing the most IO load,
but it's quite possible (and even probable) that it wasn't the mailboxes.db;
the other 2 sets of files would get a LOT of writes (and this was even with
noatime and nodiratime as well, definitely filesystem options you should be
using).

> No doubts about that one (since we're talking about a nvram drive here).  I
> wonder if it is such a great idea on a slow device (disk), though.  Do you
> have this data?

No. I did notice once that lots of stat() calls are several times slower on
HDs with the tails option on. We thus turn it off for all HDs.

> You mean the patches on the threads, or patches available somewhere else
> (where?)

Check the kernel mailing list for the ext3 one, I think it is or will be
soon in the 2.6 mainline.

For reiserfs, Vladimir Saveliev from namesys told us. "The exampled scenario 
of deadlock happens when user buffer is prepared by mmap(2)-ing a file to 
which we are to write(2). Suggested patch in fs/reiserfs/:"

--- file.c~ 2004-10-02 12:29:33.223660850 +0400
+++ file.c  2004-10-08 10:03:03.001561661 +0400
@@ -1137,6 +1137,8 @@
   return result;
}
+return generic_file_write(file, buf, count, ppos);
+
if ( unlikely((ssize_t) count < 0 ))
return -EINVAL;
We've applied this and haven't had a problem since.

> How stable and stress-tested is data=ordered? and what about the full
> journalling (which might be a good thing on MTAs)?

Seems well tested to me, haven't seen any problems at all and we have lots
of IO...

Rob


Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-02 Thread Andreas Hasenack
On Thu, Dec 02, 2004 at 06:48:02PM -0200, Henrique de Moraes Holschuh wrote:
> > wouldn't be appropriate. We could have used bdb, but generally have had 
> > lots of problems with bdb so don't entirely trust it...
> 
> I don't know of anyone sane that trusts any BDB on the 4.x series.

With cyrus-imapd, that may be so. But don't generalize, BDB is quite robust.

It's heavily used by openldap and subversion. We, for example, have a
subversion repository with about 50Gb of data on a single berkeley database
file (version 4.2.52 + 2patches):

(...)
-rw---1 www  www  100M 2004-11-30 19:30 log.008597
-rw---1 www  www  100M 2004-12-01 14:58 log.008598
-rw---1 www  www  100M 2004-12-02 14:38 log.008599
-rw---1 www  www   85M 2004-12-02 19:33 log.008600
-rw-r--r--1 www  www   57M 2004-12-02 19:33 nodes
-rw-r--r--1 www  www   62M 2004-12-02 19:33 representations
-rw-r--r--1 www  www  1,6M 2004-12-02 19:33 revisions
-rw-r--r--1 www  www   50G 2004-12-02 19:33 strings  
<
-rw-r--r--1 www  www   24M 2004-12-02 19:33 transactions
-rw-r--r--1 www  www  8,0K 2004-12-02 19:33 uuids

This for about 2 years now (we started with subversion 0.14.3) and no data loss,
even after machine crashes due to faulty power supply, low RAM, etc. 

This obviously needs a correctly tuned DB_CONFIG file, and/or correct tuning
from within the application (subversion in this case). Otherwise, if left with
BDB's default values for important settings, data corruption *will* happen.



Re: ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-02 Thread Henrique de Moraes Holschuh
On Thu, 02 Dec 2004, Rob Mueller wrote:
> We use reiserfs for our large cyrus installation. We changed from ext3 
[...]

That was very interesting and useful data, thanks for posting it!

> Ordered = Data is written before meta-data journal is committed. This 
> avoids filesystem and data corruption. This is now the default in >= 2.6.8.1
> Data = All data and meta-data is written to the journal

Ordered would be best for a Cyrus spool, and I guess Data would be best on
MTAs (when they have a small enough queue lifetime for most messages, and
the journal is large enough).
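
The journalling modes discussed here are selected with the data= mount option. As a minimal sketch, the helper below builds an fstab entry for a chosen mode; the function name, device and mount point are hypothetical, and which modes a given kernel and filesystem accept should be checked against their own documentation before use.

```python
# Modes accepted by the data= option on ext3 (and, per this thread,
# on reiserfs in recent 2.6 kernels). "journal" is the full data
# journalling that the thread calls "Data".
VALID_DATA_MODES = ("writeback", "ordered", "journal")


def fstab_line(device, mountpoint, fstype, data_mode="ordered", noatime=True):
    """Return one /etc/fstab line using the given journalling mode."""
    if data_mode not in VALID_DATA_MODES:
        raise ValueError("unknown data mode: %r" % (data_mode,))
    options = ["data=%s" % data_mode]
    if noatime:
        # Skip access-time updates on files and directories.
        options += ["noatime", "nodiratime"]
    # Trailing "0 2": no dump, fsck after the root filesystem.
    return "%s %s %s %s 0 2" % (device, mountpoint, fstype, ",".join(options))


if __name__ == "__main__":
    print(fstab_line("/dev/md0", "/var/spool/imap", "ext3"))
    print(fstab_line("/dev/md1", "/var/spool/mqueue", "ext3", data_mode="journal"))
```

Keeping the mode as an explicit argument makes it easy to benchmark the same spool under data=ordered and data=journal, which is what the thread suggests doing instead of trusting generic benchmarks.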

> turned out that's not the major IO bottleneck. We've found that the 
> mailboxes.db, .seen and quota databases generate the most IO. Putting these 

Indeed. Although why mailboxes.db (when using the BDB backend, anyway) has
so much IO I have no idea.  Once read, BDB should be doing IPC to fetch it
from in-memory cache, not trashing the disk.  Unless writes to mailboxes.db
are very common.

> One other useful feature of reiserfs is the "tails" feature. This is on by 
> default, and it means that multiple small files can be stored in 1 disk 
> block. On a space limited nvram drive, this is very useful for the legacy 
> quota system which uses 1 small file per quota root (eg usually per 

No doubts about that one (since we're talking about a nvram drive here).  I
wonder if it is such a great idea on a slow device (disk), though.  Do you
have this data?

> wouldn't be appropriate. We could have used bdb, but generally have had 
> lots of problems with bdb so don't entirely trust it...

I don't know of anyone sane that trusts any BDB on the 4.x series.

> I should add a potential problem as well. There appears to be an issue on 
> heavily loaded linux servers with the way the cyrus skiplist db works. 
[...]
> problem existed in ext3 as well 
> (http://www.ussg.iu.edu/hypermail/linux/kernel/0409.0/0966.html). It seems 
> this is a very rare problem though since no-one else has reported it. There 
> are patches available to fix both in case anyone else has come across it.

You mean the patches on the threads, or patches available somewhere else
(where?)

> All up, we've been very happy with reiserfs and i'd recommend people use 
> it, especially in >= 2.6.8.1 kernels where data=ordered is now the default 
> option.

How stable and stress-tested is data=ordered? and what about the full
journalling (which might be a good thing on MTAs)?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: best filesystem for imap server

2004-12-02 Thread Henrique de Moraes Holschuh
On Thu, 02 Dec 2004, John Madden wrote:
> > I think they use capacitors that will hold enough charge to allow
> > flushing the buffers to disk when there's a power loss.
> 
> And another set of caps to keep the spindles spinning so that data can be
> written?  I'm not yet willing to buy the bridge you're selling. :)

They don't have to.  OTOH, if the drive hits a bad sector while on emergency
write mode... well...  I doubt it would have enough rotational speed AND
electric power to move the heads that much.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: best filesystem for imap server

2004-12-02 Thread Jules Agee
Jules Agee wrote:
> David Lang wrote:
> > also note that if you are using IDE drives you have no way of really
> > knowing when the data has hit the platter (as opposed to just being in
> > the buffer of the drive) as many of the drives will lie to you and
> > tell you the write is complete once it hits the buffers.
>
> I think they use capacitors that will hold enough charge to allow
> flushing the buffers to disk when there's a power loss.

Mea culpa, you're right, David. I was thinking of a controller which has
its own write cache and disables the write cache built into the drives.

--
Jules Agee
System Administrator
Pacific Coast Feather Co.
[EMAIL PROTECTED]  x284


Re: best filesystem for imap server

2004-12-02 Thread David Lang
On Thu, 2 Dec 2004, John Madden wrote:

> Date: Thu, 2 Dec 2004 14:53:07 -0500 (EST)
> From: John Madden <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: best filesystem for imap server
>
> > I think they use capacitors that will hold enough charge to allow
> > flushing the buffers to disk when there's a power loss.
>
> And another set of caps to keep the spindles spinning so that data can be
> written?  I'm not yet willing to buy the bridge you're selling. :)

10 or so years ago when the drives had significantly more rotating mass
and significantly lower data density there were (high-end SCSI) drives
that could use their rotational energy to power their electronics to write
the data and adjust the dataclock as the spindle slowed, but I don't think
any drive does this anymore.

David Lang
--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare


Re: best filesystem for imap server

2004-12-02 Thread David Lang
On Thu, 2 Dec 2004, Jules Agee wrote:

> Date: Thu, 02 Dec 2004 10:11:21 -0800
> From: Jules Agee <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Re: best filesystem for imap server
>
> David Lang wrote:
> > also note that if you are using IDE drives you have no way of really
> > knowing when the data has hit the platter (as opposed to just being in the
> > buffer of the drive) as many of the drives will lie to you and tell you
> > the write is complete once it hits the buffers.
>
> I think they use capacitors that will hold enough charge to allow flushing
> the buffers to disk when there's a power loss.

they used to, but nowadays when the buffer is 8M (or larger), potentially 
with many seeks, they don't have any capacitors large enough to hold that 
much power (disassemble a failed drive sometime and try to find any 
significant capacitors in it)

David Lang
--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare


ReiserFS and general cyrus filesystem usage information - was Re: best filesystem for imap server

2004-12-02 Thread Rob Mueller

> I didn't know reiser 3 would fully journal data (or that it has good enough
> write barriers and write optimization to make sure the filesystem never
> returns before a fsync really means everything including data is on disk).
> Is that correct?  If it is, then reiser might be a better choice than ext3
> with hashing (as long as you do use a fast-as-heck nvram drive for the
> journal, of course).

We use reiserfs for our large cyrus installation. We changed from ext3 
several years ago when we found the performance problems with ext3 on large 
directories, and also filesystem corruption with the htree directory hashing 
patches that were available at that time (it was early days for the htree 
patches, and unfortunately we couldn't really wait around for them to fix the 
bugs - http://www.spinics.net/lists/ext3/msg01656.html). So we tried 
reiserfs and haven't looked back since. We do tend to be a bit on the 
leading edge patch-wise, so I've been keeping track of what's been going on 
with reiserfs for around 2 years now (I'm cc'ing Chris Mason, one of the 
reiserfs developers, so he can correct/confirm the information below).

Originally reiserfs (v3) only had meta-data journaling. Sometime around 
2.4.20 Chris Mason released a bunch of patches 
(ftp://ftp.suse.com/pub/people/mason/patches/data-logging/) that introduced 
data logging to reiserfs. I'm not sure if these ever made it into the 2.4 
mainline, but I know at least suse included these patches in their kernels 
for quite a while.

A different set of patches was required for the 2.6 series. These patches 
finally made it into >= 2.6.8.1 (along with some general allocator 
improvements, I believe). So < 2.6.8.1 reiserfs only had meta-data 
journaling. In >= 2.6.8.1 there are now 3 journaling modes:

Meta-data = You can get data corruption (but not filesystem corruption) 
because meta-data changes can be committed to the journal (eg file size 
change) before data is written. This was the only mode available before 
2.6.8.1.

Ordered = Data is written before the meta-data journal is committed. This 
avoids filesystem and data corruption, and is now the default in >= 2.6.8.1.

Data = All data and meta-data is written to the journal.
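
One way to see which of these modes a mounted filesystem is running in is to look at its mount options. The following is only a minimal sketch, assuming the mode shows up as a data= option in a /proc/mounts-style line (as it does for ext3); a filesystem running in its default mode may not list the option at all, in which case the function returns None. The sample line, device name and mount point are made up for illustration.

```python
def journal_mode(mount_line):
    """Return the value of the data= option in a /proc/mounts-style line, or None."""
    fields = mount_line.split()
    if len(fields) < 4:
        return None
    # The fourth field holds the comma-separated mount options.
    for opt in fields[3].split(","):
        if opt.startswith("data="):
            return opt[len("data="):]
    return None


# Hypothetical sample line; on a real system, read the lines from /proc/mounts.
sample = "/dev/sdb1 /var/spool/imap reiserfs rw,noatime,data=ordered 0 0"
print(journal_mode(sample))
```

On a live system the same function could be applied to each line of /proc/mounts to list every journaled filesystem that states its mode explicitly.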

Reiserfs does support external journals, and we have several nvram drives in 
our systems that we've moved the journals on to. While that helped, it 
turned out that's not the major IO bottleneck. We've found that the 
mailboxes.db, .seen and quota databases generate the most IO. Putting these 
on the nvram card significantly increased our performance and reduced our IO 
wait time. Aggregating some output from iostat shows this:

Device:          tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
cyrusmeta     380.03        77.92      2963.97       9352     355736
rfsjournals   196.27         0.00      1570.13          0     188448
cyrusspool    206.36      1228.06      1206.53     147392     144808

As you can see, the cyrus "metadata" (mailboxes.db, .seen dbs, quota dbs) 
consumes more write IO than the message spool directories and journals for 
those directories combined. Something definitely to consider when rolling 
out a big cyrus installation. (As a side note... I was curious why the 
reiserfs journals had no read requests on them. I'm guessing that since 
journals are very short lived, the actual data remains in main memory before 
being actually written to disk, so really the journal only needs to be read 
on a reboot after a crash, otherwise it just ends up cached in main memory 
all the time)
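
The comparison can be checked with a little arithmetic on the Blk_wrtn/s column. This is just a sketch that re-adds the iostat figures quoted above; nothing in it is measured independently.

```python
# Blk_wrtn/s figures copied from the iostat output quoted above.
writes_per_sec = {
    "cyrusmeta": 2963.97,
    "rfsjournals": 1570.13,
    "cyrusspool": 1206.53,
}

total = sum(writes_per_sec.values())
spool_plus_journals = writes_per_sec["rfsjournals"] + writes_per_sec["cyrusspool"]

# Show each device's share of all block writes.
for device, rate in writes_per_sec.items():
    print(f"{device:12s} {rate:9.2f} blk/s  {100 * rate / total:5.1f}% of writes")

print("metadata exceeds spool + journals:",
      writes_per_sec["cyrusmeta"] > spool_plus_journals)
```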

One other useful feature of reiserfs is the "tails" feature. This is on by 
default, and it means that multiple small files can be stored in 1 disk 
block. On a space limited nvram drive, this is very useful for the legacy 
quota system which uses 1 small file per quota root (eg usually per 
user). Even with >100,000 files, we're only using about 20M of the nvram for 
them. We had thought about using the skiplist db for quotas, but having 
spoken to Ken, found that because the skiplist db uses global locking, it 
wouldn't be appropriate. We could have used bdb, but generally have had lots 
of problems with bdb so don't entirely trust it...
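
To put the space saving from tails in perspective, here is a rough back-of-the-envelope calculation. The file count and the 20M figure come from the paragraph above; the 4 KiB block size is an assumption (the post does not state it), and 100,000 is used as a lower bound for the file count.

```python
files = 100_000             # quota files (lower bound, from the figure above)
used_bytes = 20 * 1024**2   # about 20M actually used on the nvram drive
block_size = 4096           # assumption: 4 KiB blocks, not stated in the post

avg_per_file = used_bytes / files
without_tails = files * block_size   # if every file took one full block

print(f"average space per file: {avg_per_file:.0f} bytes")
print(f"one block per file would need: {without_tails / 1024**2:.0f} MiB")
```

Under those assumptions, packing small files together is the difference between a few tens of megabytes and several hundred.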

I should add a potential problem as well. There appears to be an issue on 
heavily loaded linux servers with the way the cyrus skiplist db works. 
Basically it can cause kernel deadlocks that result in unkillable processes 
stuck in D state that require a system reboot. While we observed this 
intermittently with reiserfs (http://lkml.org/lkml/2004/7/20/127) the same 
problem existed in ext3 as well 
(http://www.ussg.iu.edu/hypermail/linux/kernel/0409.0/0966.html). It seems 
this is a very rare problem though since no-one else has reported it. There 
are patches available to fix both in case anyone else has come across it.

All up, we've been very happy with reiserfs and I'd recommend people use it, 
especially in >= 2.6.8.1 kernels where data=ordered is now the default 
option.

Rob

Re: best filesystem for imap server

2004-12-02 Thread John Madden
> I think they use capacitors that will hold enough charge to allow
> flushing the buffers to disk when there's a power loss.

And another set of caps to keep the spindles spinning so that data can be
written?  I'm not yet willing to buy the bridge you're selling. :)

John





-- 
John Madden
UNIX Systems Engineer
Ivy Tech State College
[EMAIL PROTECTED]




Re: best filesystem for imap server

2004-12-02 Thread Jules Agee
David Lang wrote:
> also note that if you are using IDE drives you have no way of really
> knowing when the data has hit the platter (as opposed to just being in
> the buffer of the drive) as many of the drives will lie to you and tell
> you the write is complete once it hits the buffers.

I think they use capacitors that will hold enough charge to allow 
flushing the buffers to disk when there's a power loss.

--
Jules Agee
System Administrator
Pacific Coast Feather Co.
[EMAIL PROTECTED]  x284


RE: best filesystem for imap server

2004-12-02 Thread David Lang
On Wed, 1 Dec 2004, Jim Miller wrote:

> I feel that XFS is a bad choice since it is not a 'truly' journaled file
> system.  If you have a power failure/system crash/lockup, etc., etc. You
> could very easily end up with a corrupt file system -- XFS doesn't write out
> to the disks immediately (caching unwritten data to memory).  EXT3 is
> journaled but very slow.  ReiserFS is a better choice for a journaled file
> system and if you can hold off until all the bugs are worked out, Reiser4FS
> would be the best choice (IMHO).

note that most journaling filesystems journal the metadata, not the file 
data (and ext3 does this as well by default, but it has a mode to enable 
journaling everything)

and actually ext3 had the option to journal everything because in the 
initial implementation the people writing the code couldn't separate the 
two types of data so to simplify things they journaled everything.

the reason that not everything is journaled is a simple performance issue. 
having to write the data to the journal, read it from the journal and 
write it to the final location, then update the journal requires a LOT 
more IO bandwidth than if you just do this for the metadata.
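
As a rough illustration of that cost, the sketch below counts the bytes written for a single file write with and without data journaling. It is a simplified model, not a measurement: the message and metadata sizes are made-up numbers, and it counts only writes (it ignores the read back from the journal and the final journal update mentioned above).

```python
def bytes_written(data_bytes, metadata_bytes, journal_data):
    """Rough count of bytes written for one file write (simplified model)."""
    if journal_data:
        # Data and metadata both pass through the journal first.
        journal_writes = data_bytes + metadata_bytes
    else:
        # Only the metadata passes through the journal.
        journal_writes = metadata_bytes
    # Either way, everything is eventually written to its final location.
    final_writes = data_bytes + metadata_bytes
    return journal_writes + final_writes


# Hypothetical sizes: an 8 KiB message with 512 bytes of metadata changes.
full = bytes_written(8192, 512, journal_data=True)
meta_only = bytes_written(8192, 512, journal_data=False)
print(f"full data journaling: {full} bytes")
print(f"metadata only:        {meta_only} bytes")
print(f"ratio: {full / meta_only:.2f}x")
```

With small metadata and comparatively large file data, the ratio approaches 2x, which is the extra IO bandwidth being described.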

personally I have trouble trusting reiserfs ever since it was revealed 
that one reason that it was doing so well on benchmarks is that it delayed 
up to 30 seconds before writing anything to disk so in many cases the 
benchmark was completed before any disk activity took place. This has been 
changed, but it leaves a bad taste behind.

also note that if you are using IDE drives you have no way of really 
knowing when the data has hit the platter (as opposed to just being in the 
buffer of the drive) as many of the drives will lie to you and tell you 
the write is complete once it hits the buffers.
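
For reference, this is roughly what an application can do from its side. A minimal sketch: it flushes its own buffer and calls fsync(), which asks the kernel to push the data to the device. Whether the data has really reached the platter at that point still depends on the drive honouring the flush, which is exactly the caveat above. The file name is arbitrary.

```python
import os
import tempfile


def durable_write(path, payload):
    """Write payload and ask the OS to push it out to the storage device."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()              # move Python's buffer into the kernel
        os.fsync(f.fileno())   # ask the kernel to flush to the device


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "message")
    durable_write(target, b"test message\n")
    with open(target, "rb") as f:
        print(f.read())
```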

David Lang

--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare


Re: best filesystem for imap server

2004-12-02 Thread Henrique de Moraes Holschuh
On Wed, 01 Dec 2004, Jim Miller wrote:

> > notice is that when the mail delivery queue on the MTA gets very large,
> > which happens occassionally, the CPU load average goes way up and iowait
> > time as displayed using top can exceed 300% on a four processor box and
> > performance

Are you keeping tabs on the number of LMTP processes, and on locking
contention?  You can teach a new-enough postfix to be *very* intelligent on
what the heck it is doing when doing LMTP queueing and delivery.

> > goes all to heck.  Is switching the filesystem to XFS likely to help this
> > situation?  Since there is some 80GB of mail spool currently in use,
> > switching the filesystem to XFS is not a simple task and I don't
> > won't to do
> > it on a lark.

AFAIK the best you can go for the MTA (and probably Cyrus, but I have not
tested) using non-commercial solutions is to get a few gigabytes worth of
fast nonvolatile RAM drives, and use ext3 in data=journal mode with an
external journal (that goes in the nonvolatile RAM drive), and with all
hashing enabled. This requires a 2.6 kernel.  This is far safer than XFS,
which *will* corrupt queue and message data on crash.
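
A setup along those lines could look something like the following. This is only a sketch: the script just prints the commands rather than running them, the device names and mount point are hypothetical, and the options should be checked against the mke2fs and mount man pages for the e2fsprogs version in use (for example, the journal device and the filesystem need matching block sizes).

```python
# Hypothetical device names and mount point; substitute your own.
JOURNAL_DEV = "/dev/nvram0"
DATA_DEV = "/dev/sdb1"
MOUNT_POINT = "/var/spool/imap"

commands = [
    # Format the nonvolatile RAM drive as an external journal device.
    ["mke2fs", "-O", "journal_dev", JOURNAL_DEV],
    # Create the ext3 filesystem with hashed directories, using that journal.
    ["mke2fs", "-j", "-J", f"device={JOURNAL_DEV}", "-O", "dir_index", DATA_DEV],
    # Mount with full data journaling.
    ["mount", "-t", "ext3", "-o", "data=journal", DATA_DEV, MOUNT_POINT],
]

# Print only; nothing is executed.
for cmd in commands:
    print(" ".join(cmd))
```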

I find it kinda pathetic that we cannot tell XFS (in Linux. I don't know
about SGI) to sync for real when fsync() is called, especially since write
barriers ARE fully implemented in the SCSI path for Linux AFAIK.  Maybe it
is just a bug, but until this gets fixed...

However, you can probably fix things (at the expense of far higher latencies
on mail delivery when the queue is huge, and a deeper queue too) by doing a
good job of limiting the LMTP resource usage.  At least the system will be
responsive, even if it will take a while for new mail to get through.
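
The trade-off can be illustrated with a toy model. This is not Postfix or Cyrus code: the worker pool stands in for the MTA's own per-transport concurrency limit, the sleep stands in for disk IO, and LIMIT and QUEUED are made-up numbers. The point is only that a cap bounds the number of simultaneous deliveries, while the queue takes longer to drain.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

LIMIT = 4     # hypothetical cap on simultaneous deliveries
QUEUED = 20   # hypothetical number of queued messages

active = 0
peak = 0
lock = threading.Lock()


def deliver(msg_id):
    """Stand-in for one LMTP delivery; the sleep models disk IO."""
    global active, peak
    with lock:
        active += 1
        peak = max(peak, active)
    time.sleep(0.01)
    with lock:
        active -= 1
    return msg_id


# The pool never runs more than LIMIT deliveries at once.
with ThreadPoolExecutor(max_workers=LIMIT) as pool:
    delivered = list(pool.map(deliver, range(QUEUED)))

print(f"delivered {len(delivered)} messages, peak concurrency {peak}")
```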

> I feel that XFS is a bad choice since it is not a 'truly' journaled file
> system.  If you have a power failure/system crash/lockup, etc., etc. You
> could very easily end up with a corrupt file system -- XFS doesn't write out
> to the disks immediately (caching unwritten data to memory).  EXT3 is
> journaled but very slow.  ReiserFS is a better choice for a journaled file
> system and if you can hold off until all the bugs are worked out, Reiser4FS
> would be the best choice (IMHO).

I didn't know reiser 3 would fully journal data (or that it has good enough
write barriers and write optimization to make sure the filesystem never
returns before a fsync really means everything including data is on disk).
Is that correct?  If it is, then reiser might be a better choice than ext3
with hashing (as long as you do use a fast-as-heck nvram drive for the
journal, of course).

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


Re: best filesystem for imap server

2004-12-02 Thread Bennett Crowell
On Dec 1, 2004, at 11:15, Hamish wrote:

> Hello everyone
> I dont want to start a religious battle, but could I have some
> opinions on filesystems for a 100ish user imap server? I have 2x 250G
> western digital disks to use.

We are using JFS on a Redhat Linux machine. The mailstore consists of 
two cyrus partitions, each of which is on a mirrored pair of disks.

--
Bennett Crowell
Electrical & Computer Engineering
Duke University


Re: best filesystem for imap server

2004-12-02 Thread John Madden
> The MTA is postfix and it is on a separate spindle -- the RAID is
> exclusively for the IMAP mailstore.  My setup includes two boxes that
> are MTA only and includes antivirus scanning of email, etc. One is
> primarily internal mail and the other is the primary external gateway.
> Neither of thses machines exhibit performance problems under load
> similar to the box running the cyrus server. Are there any postfix ->
> lmtp -> imap optimizations that you know of that I might implement?

Make sure you're up to postfix 2.1.5, first off.  Secondly, you might want
to discuss this on postfix-users, as you may have something bad in your
config.

John



-- 
John Madden
UNIX Systems Engineer
Ivy Tech State College
[EMAIL PROTECTED]




Re: best filesystem for imap server

2004-12-02 Thread John Madden
> Is this strictly referencing UFS on Solaris?  Or is this also true with
> UFS on *BSD where UFS_DIRHASH is present?

I was, yes, but I have no experience with it on BSD.  "DIRHASH" sure
sounds nice. :)

John





-- 
John Madden
UNIX Systems Engineer
Ivy Tech State College
[EMAIL PROTECTED]




Re: best filesystem for imap server

2004-12-02 Thread Simon Matter
> This is interesting because I have a linux box (RedHat AS3) using RAID 10.
>  I
> have some 5000 user accounts and anywhere from 2500 to 3000 concurrent
> IMAP
> sessions -- I think the Mulberry client opens multiple sessions since it's
> only some 300 to 500 individual concurrent users.  Anyway, what I
> especially
> notice is that when the mail delivery queue on the MTA gets very large,
> which
> happens occassionally, the CPU load average goes way up and iowait time as
> displayed using top can exceed 300% on a four processor box and
> performance
> goes all to heck.  Is switching the filesystem to XFS likely to help this
> situation?  Since there is some 80GB of mail spool currently in use,
> switching the filesystem to XFS is not a simple task and I don't won't to
> do
> it on a lark.

Your only problem here is that RedHat removes XFS from its enterprise
kernels so you can't use it. I really blame RedHat for it. If you have
paid for your RedHat AS, I strongly suggest you complain there to get this
fixed. They could include the XFS kernel module in the unsupported
package.

Simon

>
> Thanks,
> Rob
>
>
> --On Wednesday, December 01, 2004 12:14:01 PM -0800 David Lang
> <[EMAIL PROTECTED]> wrote:
>
>> I've done some testing and seen a HUGE speedup when switching from
>> EXT2/3
>> to XFS. unfortunantly I haven't had a chance to do the same comparison
>> with
>> Reiserfs (I need to, but haven't had time)
>>
>> I was even able to see a dramatic difference with a single user
>> accessing a
>> fairly large mailbox (thousands of messages in the inbox)
>>
>> David Lang
>>
>
>
> --
> Rob Tanner
> UNIX Services Manager
> Linfield College, McMinnville OR
>




RE: best filesystem for imap server

2004-12-01 Thread Jim Miller
> This is interesting because I have a linux box (RedHat AS3) using
> RAID 10.  I
> have some 5000 user accounts and anywhere from 2500 to 3000
> concurrent IMAP
> sessions -- I think the Mulberry client opens multiple sessions since it's
> only some 300 to 500 individual concurrent users.  Anyway, what I
> especially
> notice is that when the mail delivery queue on the MTA gets very
> large, which
> happens occassionally, the CPU load average goes way up and iowait time as
> displayed using top can exceed 300% on a four processor box and
> performance
> goes all to heck.  Is switching the filesystem to XFS likely to help this
> situation?  Since there is some 80GB of mail spool currently in use,
> switching the filesystem to XFS is not a simple task and I don't
> won't to do
> it on a lark.
>
> Thanks,
> Rob
>



I feel that XFS is a bad choice since it is not a 'truly' journaled file
system.  If you have a power failure/system crash/lockup, etc., etc. You
could very easily end up with a corrupt file system -- XFS doesn't write out
to the disks immediately (caching unwritten data to memory).  EXT3 is
journaled but very slow.  ReiserFS is a better choice for a journaled file
system and if you can hold off until all the bugs are worked out, Reiser4FS
would be the best choice (IMHO).


Jim




Re: best filesystem for imap server

2004-12-01 Thread Rob Tanner
The MTA is postfix and it is on a separate spindle -- the RAID is exclusively
for the IMAP mailstore.  My setup includes two boxes that are MTA only and
includes antivirus scanning of email, etc. One is primarily internal mail and
the other is the primary external gateway.  Neither of these machines exhibits
performance problems under load similar to the box running the cyrus server.
Are there any postfix -> lmtp -> imap optimizations that you know of that I
might implement?

Thanks,
Rob

--On Wednesday, December 01, 2004 05:12:57 PM -0500 John Madden
<[EMAIL PROTECTED]> wrote:

>> This is interesting because I have a linux box (RedHat AS3) using RAID
>> 10.  I have some 5000 user accounts and anywhere from 2500 to 3000
>> concurrent IMAP sessions -- I think the Mulberry client opens multiple
>> sessions since it's only some 300 to 500 individual concurrent users.
>> Anyway, what I especially notice is that when the mail delivery queue on
>> the MTA gets very large, which happens occassionally, the CPU load
>> average goes way up and iowait time as displayed using top can exceed
>> 300% on a four processor box and performance goes all to heck.  Is
>> switching the filesystem to XFS likely to help this situation?  Since
>> there is some 80GB of mail spool currently in use, switching the
>> filesystem to XFS is not a simple task and I don't won't to do it on a
>> lark.
> 
> This sounds a lot like a problem with the MTA, not necessarily the
> filesystem alone.  Put your mail queue on a separate spindle if possible,
> first off, and make sure it's not doing anything "silly," as many MTA's
> have been known to do.
> 
> John
> 
> 
> PS: if($MTA ne "Postfix") { changeMTA(); }
> 
> 
> 
> 
> 
> 
> -- 
> John Madden
> UNIX Systems Engineer
> Ivy Tech State College
> [EMAIL PROTECTED]
> 
> 



-- 
Rob Tanner
UNIX Services Manager
Linfield College, McMinnville OR



Re: best filesystem for imap server

2004-12-01 Thread Jason DiCioccio
Hello

On Wed, 1 Dec 2004 16:29:16 -0500 (EST), John Madden
<[EMAIL PROTECTED]> wrote:
> > Anyone know anything about Cyrus performance on UFS or the Veritas file
> > system, VXFS?
> 
> UFS is an utter nightmare, particularly with an IMAP load.  I've never run
> cyrus in particular on it, but knowing how it handles directories with
> lots of small files... Well, let's just say it's like ext2/3 without the
> horsepower.  Veritas, I hear, is supposed to be quite good at these
> things, but again, I haven't seen it personally.

Is this strictly referencing UFS on Solaris?  Or is this also true
with UFS on *BSD where UFS_DIRHASH is present?

Regards,
-JD-


Re: best filesystem for imap server

2004-12-01 Thread John Madden
> This is interesting because I have a linux box (RedHat AS3) using RAID
> 10.  I have some 5000 user accounts and anywhere from 2500 to 3000
> concurrent IMAP sessions -- I think the Mulberry client opens multiple
> sessions since it's only some 300 to 500 individual concurrent users.
> Anyway, what I especially notice is that when the mail delivery queue on
> the MTA gets very large, which happens occassionally, the CPU load
> average goes way up and iowait time as displayed using top can exceed
> 300% on a four processor box and performance goes all to heck.  Is
> switching the filesystem to XFS likely to help this situation?  Since
> there is some 80GB of mail spool currently in use, switching the
> filesystem to XFS is not a simple task and I don't won't to do it on a
> lark.

This sounds a lot like a problem with the MTA, not necessarily the
filesystem alone.  Put your mail queue on a separate spindle if possible,
first off, and make sure it's not doing anything "silly," as many MTAs
have been known to do.

John


PS: if($MTA ne "Postfix") { changeMTA(); }






-- 
John Madden
UNIX Systems Engineer
Ivy Tech State College
[EMAIL PROTECTED]




Re: best filesystem for imap server

2004-12-01 Thread Jules Agee
John Madden wrote:
> > I dont want to start a religious battle, but could I have some opinions
> > on filesystems for a 100ish user imap server? I have 2x 250G western
> > digital disks to use.
>
> I think the performance of those disks (and the RAID you put on them) will
> be much more significant than the filesystem you use, considering the size
> of your user population.  And given that factor, I'd say that even ext3
> won't give you any problems performance-wise.  Still, reiserfs, IMO, would
> be preferable for mail files.

The problems you run into when using ext3 depend less on how many users 
you have than on how many messages those users put into individual folders. 
Even with 100 users, if they are heavy users, you can have noticeable 
slowdowns if many of them keep more than 5000 messages in their INBOX. 
Using a filesystem without indexes means each fs access will run a 
linear scan through the list of files in the directory. Of course, 
having more users compounds the problem.
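
The difference between a linear scan and an indexed lookup is easy to see in a toy model. This is a sketch only: a Python list stands in for an unindexed directory and a dict stands in for a hashed or tree-indexed one. It says nothing about the actual on-disk structures of any filesystem. The file names follow the "<uid>." pattern Cyrus uses for message files.

```python
def linear_lookup(names, target):
    """Scan a directory list entry by entry; return the number of comparisons."""
    for count, name in enumerate(names, start=1):
        if name == target:
            return count
    return len(names)


# A folder with 5000 messages, named "1.", "2.", ... "5000."
entries = [f"{uid}." for uid in range(1, 5001)]

# An index maps each name straight to its position, so no scan is needed.
index = {name: pos for pos, name in enumerate(entries)}

target = "5000."
print("linear scan comparisons:", linear_lookup(entries, target))
print("indexed lookup finds entry at position:", index[target])
```

The scan cost grows with the number of messages in the folder, and it is paid again on every access, which is why large INBOXes hurt even with few users.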

I agree with the RAID-1 suggestion. At least use the md driver to run a 
mirror. I did this on an old box a while back, mirrored two IDE drives 
using Linux's software RAID, for a box with over 100 users, and it 
worked well.

--
Jules Agee
System Administrator
Pacific Coast Feather Co.
[EMAIL PROTECTED]  x284


Re: best filesystem for imap server

2004-12-01 Thread Rob Tanner
This is interesting because I have a linux box (RedHat AS3) using RAID 10.  I
have some 5000 user accounts and anywhere from 2500 to 3000 concurrent IMAP
sessions -- I think the Mulberry client opens multiple sessions since it's
only some 300 to 500 individual concurrent users.  Anyway, what I especially
notice is that when the mail delivery queue on the MTA gets very large, which
happens occasionally, the CPU load average goes way up and iowait time as
displayed using top can exceed 300% on a four processor box and performance
goes all to heck.  Is switching the filesystem to XFS likely to help this
situation?  Since there is some 80GB of mail spool currently in use,
switching the filesystem to XFS is not a simple task and I don't want to do
it on a lark.

Thanks,
Rob


--On Wednesday, December 01, 2004 12:14:01 PM -0800 David Lang
<[EMAIL PROTECTED]> wrote:

> I've done some testing and seen a HUGE speedup when switching from EXT2/3
> to XFS. unfortunantly I haven't had a chance to do the same comparison with
> Reiserfs (I need to, but haven't had time)
> 
> I was even able to see a dramatic difference with a single user accessing a
> fairly large mailbox (thousands of messages in the inbox)
> 
> David Lang
> 


-- 
Rob Tanner
UNIX Services Manager
Linfield College, McMinnville OR



Re: best filesystem for imap server

2004-12-01 Thread John Madden
> Anyone know anything about Cyrus performance on UFS or the Veritas file
> system, VXFS?

UFS is an utter nightmare, particularly with an IMAP load.  I've never run
cyrus in particular on it, but knowing how it handles directories with
lots of small files... Well, let's just say it's like ext2/3 without the
horsepower.  Veritas, I hear, is supposed to be quite good at these
things, but again, I haven't seen it personally.

John



-- 
John Madden
UNIX Systems Engineer
Ivy Tech State College
[EMAIL PROTECTED]




Re: best filesystem for imap server

2004-12-01 Thread Chris Doten
Anyone know anything about Cyrus performance on UFS or the Veritas file 
system, VXFS?

I'm running on Solaris, so I won't be using Reiser (or ext, for that 
matter.)  I'm seeing shockingly slow performance during a mass migrate 
to RAID 10 volumes in a reasonably fast SAN. deletemailbox, too, is 
slow. I've got about a 2,000 user base with ~700 concurrent 
connections.

Thanks- interesting thread.
Chris Doten

On Dec 1, 2004, at 2:14 PM, David Lang wrote:

> I've done some testing and seen a HUGE speedup when switching from
> EXT2/3 to XFS. unfortunately I haven't had a chance to do the same
> comparison with Reiserfs (I need to, but haven't had time)
>
> I was even able to see a dramatic difference with a single user
> accessing a fairly large mailbox (thousands of messages in the inbox)
>
> David Lang
>
> On Wed, 1 Dec 2004, John Madden wrote:
>
>> Date: Wed, 1 Dec 2004 13:12:57 -0500 (EST)
>> From: John Madden <[EMAIL PROTECTED]>
>> To: [EMAIL PROTECTED]
>> Cc: [EMAIL PROTECTED]
>> Subject: Re: best filesystem for imap server
>>
>>> I dont want to start a religious battle, but could I have some opinions
>>> on filesystems for a 100ish user imap server? I have 2x 250G western
>>> digital disks to use.
>>
>> I think the performance of those disks (and the RAID you put on them) will
>> be much more significant than the filesystem you use, considering the size
>> of your user population.  And given that factor, I'd say that even ext3
>> won't give you any problems performance-wise.  Still, reiserfs, IMO, would
>> be preferable for mail files.



Re: best filesystem for imap server

2004-12-01 Thread David Lang
I've done some testing and seen a HUGE speedup when switching from EXT2/3 
to XFS. unfortunately I haven't had a chance to do the same comparison 
with Reiserfs (I need to, but haven't had time)

I was even able to see a dramatic difference with a single user accessing 
a fairly large mailbox (thousands of messages in the inbox)

David Lang
On Wed, 1 Dec 2004, John Madden wrote:

> Date: Wed, 1 Dec 2004 13:12:57 -0500 (EST)
> From: John Madden <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
> Subject: Re: best filesystem for imap server
>
> > I dont want to start a religious battle, but could I have some opinions
> > on filesystems for a 100ish user imap server? I have 2x 250G western
> > digital disks to use.
>
> I think the performance of those disks (and the RAID you put on them) will
> be much more significant than the filesystem you use, considering the size
> of your user population.  And given that factor, I'd say that even ext3
> won't give you any problems performance-wise.  Still, reiserfs, IMO, would
> be preferable for mail files.
>
> John

--
There are two ways of constructing a software design. One way is to make it so 
simple that there are obviously no deficiencies. And the other way is to make 
it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare


Re: best filesystem for imap server

2004-12-01 Thread John Madden
> Thanks for the answers, this is helpful. I use reiser for our samba
> server and it has never had problems, just wanted to check if there was
> something to bear in mind for imap. I will not be using RAID for the
> setup, I will just rsync the disks every night and in case of disaster,
> mount the "good" one. The rest of the OS is going on a 40G (couldn't
> find smaller!)

I'd reconsider skipping RAID.  Users hate losing a day of email, and those
drives will almost certainly fail eventually.  Plus, given that a lot of IMAP is
reading, RAID-1 might actually improve your performance (since reads can
be interleaved), not to mention improve reliability.
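
To illustrate the read-interleaving point, here is a toy model of a two-disk mirror. It is a sketch under simplifying assumptions, not a description of how any particular RAID implementation schedules I/O: reads alternate strictly between disks, every write goes to both, and the class and function names are invented for the example.

```python
from itertools import cycle


class Raid1Model:
    """Toy mirror: each write hits every disk, reads alternate between disks."""

    def __init__(self, disks=2):
        self.reads = [0] * disks
        self.writes = [0] * disks
        self._next_disk = cycle(range(disks))

    def read(self):
        # Assumption: strict round-robin; real implementations choose differently.
        disk = next(self._next_disk)
        self.reads[disk] += 1
        return disk

    def write(self):
        # A mirror must store every write on every member disk.
        for disk in range(len(self.writes)):
            self.writes[disk] += 1


def simulate(num_reads, num_writes):
    """Return (reads per disk, writes per disk) for the given workload."""
    mirror = Raid1Model()
    for _ in range(num_reads):
        mirror.read()
    for _ in range(num_writes):
        mirror.write()
    return mirror.reads, mirror.writes
```

With a read-heavy mix such as simulate(90, 10), each disk in this model serves half of the reads but all of the writes, which is why the benefit depends on how much of the workload is reading.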

John




-- 
John Madden
UNIX Systems Engineer
Ivy Tech State College
[EMAIL PROTECTED]




Re: best filesystem for imap server

2004-12-01 Thread Hamish
Hamish wrote:
> Hello everyone
> I don't want to start a religious battle, but could I have some
> opinions on filesystems for a 100ish user imap server? I have 2x 250G
> western digital disks to use.
> Thanks

Thanks for the answers, this is helpful. I use reiser for our samba 
server and it has never had problems, just wanted to check if there was 
something to bear in mind for imap. I will not be using RAID for the 
setup, I will just rsync the disks every night and in case of disaster, 
mount the "good" one. The rest of the OS is going on a 40G (couldn't 
find smaller!)
Thanks again!
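
A rough sketch of what the nightly copy could look like when driven from a script. The paths in the usage note are placeholders, the function names are invented for this example, and whether a plain rsync of a live mail spool is safe is a separate question that this sketch does not answer. The only rsync options used are -a (archive mode), --delete, and --dry-run.

```python
import subprocess


def build_rsync_command(source, destination, dry_run=False):
    """Build the rsync argument list for mirroring source into destination."""
    cmd = ["rsync", "-a", "--delete"]
    if dry_run:
        # Report what would change without touching the destination.
        cmd.append("--dry-run")
    # A trailing slash on the source copies its contents, not the directory itself.
    cmd.append(source.rstrip("/") + "/")
    cmd.append(destination)
    return cmd


def nightly_sync(source, destination, dry_run=True):
    """Run the copy; raises CalledProcessError if rsync exits non-zero."""
    return subprocess.run(build_rsync_command(source, destination, dry_run), check=True)
```

For example, nightly_sync("/mnt/disk1/imap", "/mnt/disk2/imap") would do a dry run between two hypothetical mount points; pass dry_run=False once the output looks right. Note that --delete also propagates accidental deletions to the copy.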


Re: best filesystem for imap server

2004-12-01 Thread John Madden
> I don't want to start a religious battle, but could I have some opinions
> on filesystems for a 100ish user imap server? I have 2x 250G western
> digital disks to use.

I think the performance of those disks (and the RAID you put on them) will
be much more significant than the filesystem you use, considering the size
of your user population.  And given that factor, I'd say that even ext3
won't give you any problems performance-wise.  Still, reiserfs, IMO, would
be preferable for mail files.

John



-- 
John Madden
UNIX Systems Engineer
Ivy Tech State College
[EMAIL PROTECTED]




Re: best filesystem for imap server

2004-12-01 Thread Jules Agee
This has been discussed on the list before; check the archives. I assume
with the hardware you mentioned, you're running Linux. For Linux, the 
consensus here seems to be XFS is the best, though I don't know what 
other filesystems these people have compared XFS to, or how detailed 
their testing was. I'm using reiserfs3 on a server that's averaging ~400 
concurrent IMAP connections with good results, so that's another good 
option.

ext3 is not recommended. Newer versions of ext3 support directory 
indexing, but from what I've read, even if you activate that feature it 
still doesn't work very well with directories that contain thousands of 
files, when compared with other available filesystems.
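
If you want to check whether directory indexing is actually switched on for an existing ext3 filesystem, `tune2fs -l <device>` prints a "Filesystem features:" line, and indexing shows up there as dir_index. Below is a small sketch that parses that output; the function name has_dir_index is made up, and the sample text in the usage note is abbreviated, not real output from any particular machine.

```python
def has_dir_index(tune2fs_output):
    """Return True if dir_index is listed on the 'Filesystem features:' line.

    tune2fs_output is the text printed by `tune2fs -l <device>`.
    """
    for line in tune2fs_output.splitlines():
        if line.startswith("Filesystem features:"):
            # Everything after the first colon is a whitespace-separated feature list.
            features = line.split(":", 1)[1].split()
            return "dir_index" in features
    # No features line at all: treat the feature as absent.
    return False
```

Feeding it a line such as "Filesystem features:      has_journal dir_index filetype" returns True. As far as I know, the feature is enabled with `tune2fs -O dir_index <device>`, and directories that already exist are only indexed after an `e2fsck -fD` on the unmounted filesystem, so check the man pages for your e2fsprogs version before relying on that.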

-Jules

Hamish wrote:
> Hello everyone
> I don't want to start a religious battle, but could I have some opinions
> on filesystems for a 100ish user imap server? I have 2x 250G western
> digital disks to use.
> Thanks

--
Jules Agee
System Administrator
Pacific Coast Feather Co.
[EMAIL PROTECTED]  x284