Re: Large mailboxes (was: rfc2505)

2001-12-28 Thread Benjamin Scott

On Wed, 26 Dec 2001, Derek D. Martin wrote:
> UW seems to like to write temporary copies of mailboxes to /tmp.  We ran
> into a problem where it would fill up /tmp and barf, causing a variety
> of weird problems.

  I am aware of that particular issue, but this does not appear to be it.
/tmp has plenty of space, and it has not been full for any of our checks.

  There are some large mailboxes, in the 100+ MB range, with thousands of
messages in them.  As we know, UW-IMAP likes to do full-rewrites of the
mailbox data.  It also has to read through and index the folder on every
open.  Combine this with MS Outlook, which appears to close and re-open the
IMAP session more often than the UW people expect.  Additionally, the
customer runs the MS-Windows equivalent of an IMAP "xbiff" on each PC, which
appears to cause file lock contention/breaking issues.

  Moving to Cyrus IMAP (or other, similar systems) sounds good for a number
of reasons.  Deletes/expunges are much more efficient.  Indexes are kept on
disk.  One-file-per-message drastically reduces the chances of lock
contention.

  At least, that is the theory.  It remains to be seen how well it will work
in practice.  :-)
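
To make the delete/expunge difference concrete, here is a minimal sketch using Python's standard `mailbox` module. It is not UW-IMAP or Cyrus code: Maildir merely stands in for the one-file-per-message idea (Cyrus has its own store format), and the helper names, addresses, and message count are made up for illustration.

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def make_message(n):
    """Build a small synthetic message (addresses and body are made up)."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "user@example.com"
    msg["Subject"] = f"test message {n}"
    msg.set_content("filler text\n" * 100)
    return msg


def fill(box, count):
    """Add `count` synthetic messages and return their keys."""
    keys = [box.add(make_message(n)) for n in range(count)]
    box.flush()
    return keys


def delete_first(box, keys):
    """Delete one message; return how many messages remain."""
    box.remove(keys[0])
    box.flush()  # for mbox this writes out every surviving message again
    return len(box)


with tempfile.TemporaryDirectory() as tmp:
    # mbox: all messages live in one file, so a delete rewrites that file.
    # Real code should call box.lock() / box.unlock() around changes.
    mbox_path = os.path.join(tmp, "folder.mbox")
    mbox = mailbox.mbox(mbox_path)
    mbox_left = delete_first(mbox, fill(mbox, 50))
    mbox_bytes = os.path.getsize(mbox_path)  # size of the rewritten file
    mbox.close()

    # Maildir: one file per message, so a delete unlinks a single file.
    md_path = os.path.join(tmp, "folder.maildir")
    md = mailbox.Maildir(md_path, create=True)
    md_keys = fill(md, 50)
    before = set(os.listdir(os.path.join(md_path, "new")))
    md_left = delete_first(md, md_keys)
    after = set(os.listdir(os.path.join(md_path, "new")))
    md.close()

removed_files = before - after
print(mbox_left, mbox_bytes, md_left, len(removed_files))
```

In the mbox case the cost of one delete grows with the size of the folder; in the Maildir case exactly one file goes away and the rest are untouched.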

-- 
Ben Scott <[EMAIL PROTECTED]>
| The opinions expressed in this message are those of the author and do not |
| necessarily represent the views or policy of any other person, entity or  |
| organization.  All information is provided without warranty of any kind.  |


*
To unsubscribe from this list, send mail to [EMAIL PROTECTED]
with the text 'unsubscribe gnhlug' in the message body.
*



Re: Large mailboxes (was: rfc2505)

2001-12-26 Thread Paul Iadonisi

On Wed, Dec 26, 2001 at 12:35:33PM -0500, Kenneth E. Lussier wrote:

[snip]

> locally so the server doesn't need to process them every time. You
> also don't have to use the entire Courier system. You can individually
> get the IMAP server, the webmail system, and the filtering system
> (maildrop). Or, you can get the whole package.

That's actually one of the cool things I like about Courier.  It's not all
or nothing and you don't have to try to fit a square peg into a round hole
(or a Maildir peg into a sendmail hole ;-)).  I'm actually working on a project
to ditch qmail and use *cough* sendmail (please direct security flames about
sendmail elsewhere -- sendmail hasn't had a *remotely* exploitable hole
for a *long* time other than one that wasn't sendmail's fault, but the 2.2
Linux kernel's fault.  And these are sealed servers -- no shell access.).
For the sake of transition, I was trying to find a way to have sendmail
deliver to Maildir format.  Enter maildrop.  Maildrop dropped right into
sendmail as a local mailer just fine.

When I get a chance, I'm going to investigate the Courier system further.
Especially since it looks like it's licensed under the GPL (I'm a GPL
bigot and proud of it), which makes it easier to tie into mysql without
having to worry about licensing conflicts.  Especially with Mysql AB's
slightly whacked view of the GPL -- linking is a nebulous concept, but
communicating over a network socket is NOT linking -- not that *I* really
care, it just makes it difficult to use it as a backend unless you make
sure you work with some other database as well or plan to release everything
under the GPL.  The (off-topic?) point is that pursuing my user-configurable
SMTP rejection might not work too well license-wise if I use sendmail and
mysql.  I could be wrong, though; having it *all* be GPLed is probably
safer.

I know, there's postgresql as well.  But since I'd prefer it be GPLed,
and it seems that Courier generates a lot less groans when it comes to
security, that may be the route I take.  The 'pick and choose what you want'
concept of Courier is a great added bonus.

-- 
-Paul Iadonisi
 Senior Systems Administrator
 Red Hat Certified Engineer / Local Linux Lobbyist
 Ever see a penguin fly?  --  Try Linux.
 GPL all the way: Sell services, don't lease secrets




[pri.gnhlug@iadonisi.to: Re: Large mailboxes (was: rfc2505)]

2001-12-26 Thread Paul Iadonisi

Harumph!  I'm used to a different mailing list which has the Reply-To:
field set to the list.

- Forwarded message from Paul Iadonisi <[EMAIL PROTECTED]> -

Date: Wed, 26 Dec 2001 11:57:31 -0500
From: Paul Iadonisi <[EMAIL PROTECTED]>
To: Benjamin Scott <[EMAIL PROTECTED]>
Subject: Re: Large mailboxes (was: rfc2505)
User-Agent: Mutt/1.2.5i

On Wed, Dec 26, 2001 at 11:55:46AM -0500, Benjamin Scott wrote:
> On Wed, 26 Dec 2001, mike ledoux wrote:

[snip]

> > I don't know if ReiserFS solves the 'many small files' problem, but I'm
> > pretty sure I saw people on the mutt list complaining about
> > Maildir/ReiserFS being a painful combination in terms of speed.
> 
>   I suspect we will have to do some real-world testing, and likely try more
> than one solution, before we finally find what works best.  I've found it
> damn hard to simulate mail usage accurately.

  Oh, so true.  This is going to be one of the hardest things in testing
a user-configurable SMTP rejection system.  How do I simulate email traffic
for a large organization?

  For the record, I have seen the 'large number of small files' problems
with ext2/ext3, but not with my mail folders.  UW-IMAP fell over with
large files a *lot* sooner than Cyrus-IMAPD with lots of files in one
directory.  I haven't seen a problem yet (I think my largest folder
is around 9000 messages -- my GNHLUG folder, of course).  With UW-IMAP,
it took a looong time to do basic operations on my GNHLUG folder.  It's
much zippier with Cyrus' one message per file approach.

-- 
-Paul Iadonisi
 Senior Systems Administrator
 Red Hat Certified Engineer / Local Linux Lobbyist
 Ever see a penguin fly?  --  Try Linux.
 GPL all the way: Sell services, don't lease secrets

- End forwarded message -

-- 
-Paul Iadonisi
 Senior Systems Administrator
 Red Hat Certified Engineer / Local Linux Lobbyist
 Ever see a penguin fly?  --  Try Linux.
 GPL all the way: Sell services, don't lease secrets




Re: Large mailboxes (was: rfc2505)

2001-12-26 Thread Kenneth E. Lussier

Benjamin Scott wrote:
>
>   We've been considering Cyrus IMAP, maildir format, and possibly a better
> filesystem (ReiserFS, most likely).

I've been using the Courier mail system for a few months now with
Maildir format, and it has solved a lot of performance issues. The
nice thing is that it doesn't have to load a single >100MB file into
memory, cache, or /tmp unless you have one single e-mail that is that
large (if you do, then you need to smack the person that sent it to
you ;-)). Also, depending on the client that is most commonly used in
your environment, it may not even have to load the individual files. A
lot of IMAP clients will now cache (they call it download) messages
locally so the server doesn't need to process them every time. You
also don't have to use the entire Courier system. You can individually
get the IMAP server, the webmail system, and the filtering system
(maildrop). Or, you can get the whole package.

C-Ya,
Kenny
-- 
---
 Kenneth E. Lussier
 Geek by nature, Linux by choice
 PGP KeyID C0D2BA57 
 Public key
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xC0D2BA57




Re: Large mailboxes (was: rfc2505)

2001-12-26 Thread Benjamin Scott

On Wed, 26 Dec 2001, mike ledoux wrote:
> I'm not using IMAP, and that INBOX is one of the main reasons why I'm
> not--I've yet to find an IMAP server that is capable of dealing with the
> way I use mail.

  s/use/abuse/  ;-)  I question whether any mailbox with that many messages
in it can really be useful.  (As an active mailbox -- as an archive, yes.)

> The UW-IMAP server is really bad with large (>50MB) mailboxes,
> independent of client software.

  The UW-IMAP server is really bad, period.  Rather like Sendmail, it
appears to be the popular implementation more by accident than anything
else.  And being popular does mean implementing something else is harder.

> The server is a PIII-700 with 1GB of ECC RAM that runs mail, news, and
> web services for Suzanne and me, so it is likely that most of my
> mailboxes live in cache most of the time.

  Okay then.  I think that answers the question of why performance is not an
issue!  :-)

> As I type this message the RSS of my mutt process is 3580, as reported
> by 'top'.

  Not that it matters, but whether or not the mailbox is in the filesystem
cache should have no impact on the RSS of your MUA.  :-)

>> We've been considering Cyrus IMAP, maildir format, and possibly a better
>> filesystem (ReiserFS, most likely).
>
> I don't know if ReiserFS solves the 'many small files' problem, but I'm
> pretty sure I saw people on the mutt list complaining about
> Maildir/ReiserFS being a painful combination in terms of speed.

  I suspect we will have to do some real-world testing, and likely try more
than one solution, before we finally find what works best.  I've found it
damn hard to simulate mail usage accurately.
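
As a starting point for that kind of testing, here is a rough sketch that builds a synthetic mbox with Python's standard `mailbox` module and times a full scan of it. The function names, addresses, message count, and body sizes are all made up; it only approximates the "index the folder on every open" cost and says nothing about how real clients such as Outlook behave.

```python
import mailbox
import os
import random
import tempfile
import time
from email.message import EmailMessage


def build_test_mbox(path, count, seed=0):
    """Write `count` synthetic messages of varying size; return the file size."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    box = mailbox.mbox(path)
    for n in range(count):
        msg = EmailMessage()
        msg["From"] = f"user{rng.randrange(100)}@example.com"
        msg["To"] = "list@example.com"
        msg["Subject"] = f"synthetic message {n}"
        msg.set_content("filler line\n" * rng.randint(10, 200))
        box.add(msg)
    box.flush()
    box.close()
    return os.path.getsize(path)


def time_full_scan(path):
    """Open the mbox and count messages, which forces a scan of the whole file."""
    start = time.perf_counter()
    box = mailbox.mbox(path, create=False)
    total = len(box)
    box.close()
    return total, time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "synthetic.mbox")
    size = build_test_mbox(path, count=200)
    total, elapsed = time_full_scan(path)

print(f"{total} messages, {size} bytes, scanned in {elapsed:.4f}s")
```

Raising `count` toward the tens of thousands of messages mentioned in this thread gives a feel for how scan time scales, but a mix of message sizes taken from a real mailbox would be a better model than random filler.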

-- 
Ben Scott <[EMAIL PROTECTED]>
| The opinions expressed in this message are those of the author and do not |
| necessarily represent the views or policy of any other person, entity or  |
| organization.  All information is provided without warranty of any kind.  |





Re: Large mailboxes (was: rfc2505)

2001-12-26 Thread Derek D. Martin

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

At some point hitherto, Benjamin Scott hath spake thusly:
> On Wed, 26 Dec 2001, mike ledoux wrote:
> > my inbox has 23481 messages in it right now in a 495166727 byte
> > mbox--performance is fine with mbox ...
> 
>   Really?  Wow.  Is that a single-user system where the whole mailbox can
> fit into RAM, or something?  One of our customers has been hitting numbers
> like that, and performance sucks.  It appears to be a combination of mailbox
> format (mbox), size of mailboxes (multiple mailboxes in the 100+ MB range),
> IMAP server implementation (UW), and client software behavior (MS Outlook
> and friends), but whatever the exact cause, it appears to be pathological.

Ran into this problem, or something like it.  Try making sure /tmp is
large enough to contain multiple copies of your larger mailboxes.  Or
if the system is busy, make sure it's large enough to hold numerous
copies of larger mailboxes.  I'd suggest 1GB or larger.  UW seems to
like to write temporary copies of mailboxes to /tmp.  We ran into a
problem where it would fill up /tmp and barf, causing a variety of
weird problems.
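
A quick way to check that headroom is sketched below in Python. The function names and the default of four copies are assumptions picked for illustration, not anything UW-IMAP documents; on a real server you would point `spool_dir` at wherever the mbox files live.

```python
import os
import shutil
import tempfile


def largest_mailbox_size(spool_dir):
    """Return the size in bytes of the largest regular file in spool_dir."""
    sizes = [
        entry.stat().st_size
        for entry in os.scandir(spool_dir)
        if entry.is_file(follow_symlinks=False)
    ]
    return max(sizes, default=0)


def tmp_headroom(spool_dir, tmp_dir, copies=4):
    """Free bytes in tmp_dir minus room for `copies` of the largest mailbox.

    A negative result means tmp_dir could fill up under that assumption.
    """
    free = shutil.disk_usage(tmp_dir).free
    return free - copies * largest_mailbox_size(spool_dir)


# Demo against a throwaway directory holding two fake "mailboxes".
with tempfile.TemporaryDirectory() as demo_spool:
    for name, size in (("alice", 10), ("bob", 300)):
        with open(os.path.join(demo_spool, name), "wb") as fh:
            fh.write(b"x" * size)
    largest = largest_mailbox_size(demo_spool)
    headroom = tmp_headroom(demo_spool, tempfile.gettempdir())

print(largest, headroom)
```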

HTH

- -- 
Derek Martin   [EMAIL PROTECTED]
- -
I prefer mail encrypted with PGP/GPG!
GnuPG Key ID: 0x81CFE75D
Retrieve my public key at http://pgp.mit.edu
Learn more about it at http://www.gnupg.org
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8KfTGdjdlQoHP510RAjFRAJ9MoKlPldJmYzjkrSZZBbyXGyHr8ACgook1
LFK1XxrRwk1ZunzNn+uKp9M=
=o/AV
-END PGP SIGNATURE-




Large mailboxes (was: rfc2505)

2001-12-26 Thread Benjamin Scott

On Wed, 26 Dec 2001, mike ledoux wrote:
> my inbox has 23481 messages in it right now in a 495166727 byte
> mbox--performance is fine with mbox ...

  Really?  Wow.  Is that a single-user system where the whole mailbox can
fit into RAM, or something?  One of our customers has been hitting numbers
like that, and performance sucks.  It appears to be a combination of mailbox
format (mbox), size of mailboxes (multiple mailboxes in the 100+ MB range),
IMAP server implementation (UW), and client software behavior (MS Outlook
and friends), but whatever the exact cause, it appears to be pathological.

  We've been considering Cyrus IMAP, maildir format, and possibly a better
filesystem (ReiserFS, most likely).
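
For trying the maildir format on a copy of an existing folder, a conversion can be sketched with Python's standard `mailbox` module as below. The function name and paths are hypothetical, and this only covers the on-disk format; it is not a Cyrus or Courier migration procedure.

```python
import mailbox
import os
import tempfile


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count."""
    src = mailbox.mbox(mbox_path, create=False)
    dst = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for message in src:
            # Wrapping in MaildirMessage carries over read/flagged state
            # from the mbox Status headers where present.
            dst.add(mailbox.MaildirMessage(message))
            copied += 1
    finally:
        src.close()
        dst.close()
    return copied


# Demo: build a three-message mbox in a scratch directory and convert it.
with tempfile.TemporaryDirectory() as tmp:
    mbox_path = os.path.join(tmp, "inbox.mbox")
    box = mailbox.mbox(mbox_path)
    for n in range(3):
        box.add(f"From: a@example.com\nSubject: msg {n}\n\nbody {n}\n")
    box.flush()
    box.close()

    maildir_path = os.path.join(tmp, "inbox.maildir")
    copied = mbox_to_maildir(mbox_path, maildir_path)
    stored = len(mailbox.Maildir(maildir_path, create=False))

print(copied, stored)
```

The source mbox is left untouched, so the two formats can be compared side by side before committing to either.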

> How does the quote go?  'There are seldom good technological solutions
> to social problems.'

"There are seldom good technological solutions to behavioral problems."
  -- Ed Crowley

-- 
Ben Scott <[EMAIL PROTECTED]>
| The opinions expressed in this message are those of the author and do not |
| necessarily represent the views or policy of any other person, entity or  |
| organization.  All information is provided without warranty of any kind.  |

