Re: [Dovecot] IMP reading entire mailbox

2009-06-15 Thread Timo Sirainen
On Tue, 2009-06-09 at 17:49 -0700, Kenneth Porter wrote:

 How about a configuration option that's a regex or glob of files to ignore 
 in the mail directories?

A plugin could do that. ACL plugin could be used as an example.



signature.asc
Description: This is a digitally signed message part


Re: [Dovecot] IMP reading entire mailbox

2009-06-11 Thread Chris O'Regan
 [...] My hunch
 is that IMP became very confused when we switched IMAP servers, and fell
 back to re-reading the entire mailbox each time it connected to the
 server. If I am correct, then we should see normal usage when he next
 accesses IMP with a fresh session.

It appears that my hunch is correct. Access via webmail for this user is
back to normal. Thank you for all the help!


-- 
Chris O'Regan ch...@encs.concordia.ca
Senior Unix Systems Administrator, Academic IT Services
Faculty of Engineering and Computer Science
Concordia University, Montreal, Canada



Re: [Dovecot] IMP reading entire mailbox

2009-06-10 Thread Chris O'Regan
  DUH! Please ignore the Corrupted index cache file; the mailbox in
  question is a file named delivery.log (i.e. procmail log file) that is
  in the user's ~/mail directory. Of course there will be problems
  trying
  to index this file, it is not an mbox! *sigh*
 
  Hmm. It still seems like there's something Dovecot should do better here.
  I'll see what I can do about it with my procmail log file :)
 
 How about a configuration option that's a regex or glob of files to ignore 
 in the mail directories?

I second the motion! :-) We have a number users with procmail recipes
that put deliver.log in ~/mail. This was not a problem with uw-imap
since most of our users would subscribe to their folders, and so they
would never see it. Now that Dovecot roots itself in ~/mail we are
recommending to our users not to bother with subscribing but instead
show all folders, so it is appearing in their folder list by default. A
few people have already complained, which means that there are a zillion
others who are just muttering to themselves. :-/


Chris



Re: [Dovecot] IMP reading entire mailbox

2009-06-10 Thread Chris O'Regan
 I have another hunch. Time to speak to the user...

He is now using a different client to access his mail and the problem
has gone away. I have asked him to try IMP again as I wish to confirm my
hunch.

Evidence suggests that he was away the day we migrated from uw-imap to
Dovecot but left himself logged in to IMP. He did not return until
Thursday, at which point he resumed using IMP. I suspect that he did not
actually start a new session, he just revived the existing one. My hunch
is that IMP became very confused when we switched IMAP servers, and fell
back to re-reading the entire mailbox each time it connected to the
server. If I am correct, then we should see normal usage when he next
accesses IMP with a fresh session.

If I can confirm this, then I don't think there is much that can be done
to fix the problem in either IMP or Dovecot as it is a very unique
case that will rarely be encountered. And we did warn people to close
their mail before the maintenance


Thanks everyone!

-- 
Chris O'Regan ch...@encs.concordia.ca
Senior Unix Systems Administrator, Academic IT Services
Faculty of Engineering and Computer Science
Concordia University, Montreal, Canada



[Dovecot] IMP reading entire mailbox

2009-06-09 Thread Chris O'Regan
We have been noticing a constant hum of traffic between our IMAP server
(Dovecot v1.1.14) and our filers over the past several days. I have
narrowed it down to one particular user who has been logged in to
IMP/Horde since the problem began (he hadn't logged since we upgraded to
Dovecot a week before).

IMP is periodically checking this user's mail every five minutes. Each
time it does, the his entire mailbox is being read. I initially thought
it might be an indexing problem since I am also seeing messages such as
this in the logs:

Corrupted index cache file 
/local/data/dovecot/indexes/mail/X//.imap/delivery.log/dovecot.index.cache:
 Broken MIME parts for mail UID 30880

But the in/out bytes being logged by Dovecot indicate that the entire
contents of his mailbox are being sent to the client. That is, this user
has a particularly large mailbox (~500M) and about 500M are being sent
to IMP every five minutes. (Actually, what is the meaning of the above
message? He's the only user logging it.)

The user does have some filters defined in IMP, but I can't imagine that
it would try to apply them to the entire mailbox each time it is
accessed. Surely it would only apply the filters to new messages, right?
But I can't think of any other reason why we would be seeing this
behaviour, and only for this user.

I have not yet contacted the user as I need something to tell him other
than you're a bandwidth hog. I think the next step is to ask him to
disable his filters, but I would prefer not shooting in the dark.

This user alone is adding about 50Mbps of bandwidth; between the time we
upgraded to Dovecot (from uw-imap) and he logged in, we were witnessing
a huge drop in traffic. He's ruining our graphs! ;-)

Seriously, if anyone has some ideas, please let me know.


Thanks!

-- 
Chris O'Regan ch...@encs.concordia.ca
Senior Unix Systems Administrator, Academic IT Services
Faculty of Engineering and Computer Science
Concordia University, Montreal, Canada



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Timo Sirainen

On Jun 9, 2009, at 9:18 AM, Chris O'Regan wrote:


IMP is periodically checking this user's mail every five minutes. Each
time it does, the his entire mailbox is being read. I initially  
thought
it might be an indexing problem since I am also seeing messages such  
as

this in the logs:

Corrupted index cache file /local/data/dovecot/indexes/mail/X/ 
/.imap/delivery.log/dovecot.index.cache: Broken MIME parts  
for mail UID 30880


Is it always this same UID? How often does it happen? Every 5 minutes?  
Can you reproduce this manually somehow? If there's a way to reproduce  
this, I'd like to get a copy of the mailbox. You can put the mailbox  
through http://dovecot.org/tools/mbox-anonymize.pl and hopefully it  
will then still break.


Dovecot anyway detects that data in cache file is corrupted, so it  
deletes the entire cache file. If IMP actually fetches something like  
BODY or BODYSTRUCTURE for all messages, that causes all the messages  
to be read since the cache was just deleted.




Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Rick Romero

Quoting Timo Sirainen t...@iki.fi:


On Jun 9, 2009, at 9:18 AM, Chris O'Regan wrote:


IMP is periodically checking this user's mail every five minutes. Each
time it does, the his entire mailbox is being read. I initially thought
it might be an indexing problem since I am also seeing messages such as
this in the logs:

Corrupted index cache file  
/local/data/dovecot/indexes/mail/X//.imap/delivery.log/dovecot.index.cache: Broken MIME parts for mail UID  
30880


Is it always this same UID? How often does it happen? Every 5  
minutes? Can you reproduce this manually somehow? If there's a way  
to reproduce this, I'd like to get a copy of the mailbox. You can  
put the mailbox through http://dovecot.org/tools/mbox-anonymize.pl  
and hopefully it will then still break.


Dovecot anyway detects that data in cache file is corrupted, so it  
deletes the entire cache file. If IMP actually fetches something  
like BODY or BODYSTRUCTURE for all messages, that causes all the  
messages to be read since the cache was just deleted.


IMP also has an option to preview messages.  It reads the entire  
contents of the messages (should only be 20 or so for the 'current'  
INBOX window), and will display the first 200 characters or so of the  
email.  Is it possible the numbers you're seeing reflect the total  
size of his most current 20 emails?


Rick




Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Michael M. Slusarz

Quoting Rick Romero r...@havokmon.com:


Quoting Timo Sirainen t...@iki.fi:


On Jun 9, 2009, at 9:18 AM, Chris O'Regan wrote:


IMP is periodically checking this user's mail every five minutes. Each
time it does, the his entire mailbox is being read. I initially thought
it might be an indexing problem since I am also seeing messages such as
this in the logs:

Corrupted index cache file  
/local/data/dovecot/indexes/mail/X//.imap/delivery.log/dovecot.index.cache: Broken MIME parts for mail UID  
30880


Is it always this same UID? How often does it happen? Every 5  
minutes? Can you reproduce this manually somehow? If there's a way  
to reproduce this, I'd like to get a copy of the mailbox. You can  
put the mailbox through http://dovecot.org/tools/mbox-anonymize.pl  
and hopefully it will then still break.


Dovecot anyway detects that data in cache file is corrupted, so it  
deletes the entire cache file. If IMP actually fetches something  
like BODY or BODYSTRUCTURE for all messages, that causes all the  
messages to be read since the cache was just deleted.


IMP also has an option to preview messages.  It reads the entire  
contents of the messages (should only be 20 or so for the 'current'  
INBOX window), and will display the first 200 characters or so of  
the email.  Is it possible the numbers you're seeing reflect the  
total size of his most current 20 emails?


Unfortunately, with IMP 4.x we are limited to the API provided by  
c-client/PHP imap module.  This API is *horribly* inefficient and does  
all sorts of terrible IMAP things that are  
unnecessary/inefficient/embarassing.  So it is conceivable that the  
entire mailbox is scanned when a mailbox refresh occurs.


Good news is that IMP 5 fixes all this with a new IMAP library  
(created, in part, with advice directly from Timo himself :)).   
Although none of this explains why the dovecot cache is apparently  
being corrupted.


michael



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Chris O'Regan
  Corrupted index cache file /local/data/dovecot/indexes/mail/X/ 
  /.imap/delivery.log/dovecot.index.cache: Broken MIME parts  
  for mail UID 30880
 
 Is it always this same UID? How often does it happen? Every 5 minutes?

The UID is increasing by one each time. There does not seem to be a
pattern. Here's a sample from the past couple of days:

Jun 8 05:56:57 ... UID 30859
Jun 8 06:14:44 ... UID 30860
Jun 8 09:03:01 ... UID 30861
Jun 8 09:44:27 ... UID 30862
Jun 8 10:25:48 ... UID 30863
Jun 8 11:30:46 ... UID 30864
Jun 8 12:00:49 ... UID 30865
Jun 8 12:48:12 ... UID 30866
Jun 8 13:53:12 ... UID 30867
Jun 8 14:17:04 ... UID 30868
Jun 8 14:29:44 ... UID 30869
Jun 8 17:35:01 ... UID 30870
Jun 8 18:34:49 ... UID 30871
Jun 8 18:41:14 ... UID 30872
Jun 8 22:25:33 ... UID 30873
Jun 8 23:36:45 ... UID 30874
Jun 9 01:28:36 ... UID 30875
Jun 9 03:25:57 ... UID 30876
Jun 9 06:33:33 ... UID 30877
Jun 9 10:11:08 ... UID 30878
Jun 9 10:23:47 ... UID 30879
Jun 9 10:29:56 ... UID 30880
Jun 9 12:40:48 ... UID 30881
Jun 9 13:40:08 ... UID 30882
Jun 9 13:58:18 ... UID 30883
Jun 9 14:28:23 ... UID 30884

 Can you reproduce this manually somehow? If there's a way to reproduce  
 this, I'd like to get a copy of the mailbox. You can put the mailbox  
 through http://dovecot.org/tools/mbox-anonymize.pl and hopefully it  
 will then still break.

It is doubtful that I would be permitted to send a copy of the mailbox,
anonymized or not. It is questionable whether I would even be permitted
to make a copy of it as an experiment.

 Dovecot anyway detects that data in cache file is corrupted, so it  
 deletes the entire cache file. If IMP actually fetches something like  
 BODY or BODYSTRUCTURE for all messages, that causes all the messages  
 to be read since the cache was just deleted.

Is there an easy way to tell if the cache file has been deleted and
recreated? For example, will the inode number change? I enabled
mail_debug for a short while this morning and did not notice anything
unusual with the logs for this user: connect, set up namespace,
disconnect just like everyone else.

At any rate, the Dovecot stats report bytes out of about 500MB every
five minutes (when the user's session closes); would this include the
data read to build the index? I suspect not.


-- 
Chris O'Regan ch...@encs.concordia.ca
Senior Unix Systems Administrator, Academic IT Services
Faculty of Engineering and Computer Science
Concordia University, Montreal, Canada



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Chris O'Regan
 IMP also has an option to preview messages.  It reads the entire  
 contents of the messages (should only be 20 or so for the 'current'  
 INBOX window), and will display the first 200 characters or so of the  
 email.  Is it possible the numbers you're seeing reflect the total  
 size of his most current 20 emails?

I am fairly confident that IMP is reading the entire mailbox. The burst
of traffic caused by this user's imap process was about 500MB in size
(as reported by iptraf), same as the size of the user's mailbox, same as
the bytes out reported by Dovecot. The user has close to 4000 messages
in his mailbox. I just verified that the maximum messages per page that
can be set is 999.


Thanks.

-- 
Chris O'Regan ch...@encs.concordia.ca
Senior Unix Systems Administrator, Academic IT Services
Faculty of Engineering and Computer Science
Concordia University, Montreal, Canada



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Timo Sirainen

On Jun 9, 2009, at 11:44 AM, Chris O'Regan wrote:


Corrupted index cache file /local/data/dovecot/indexes/mail/X/
/.imap/delivery.log/dovecot.index.cache: Broken MIME parts
for mail UID 30880


Is it always this same UID? How often does it happen? Every 5  
minutes?


The UID is increasing by one each time. There does not seem to be a
pattern. Here's a sample from the past couple of days:

Jun 8 05:56:57 ... UID 30859
Jun 8 06:14:44 ... UID 30860
Jun 8 09:03:01 ... UID 30861


That's kind of weird. So it doesn't seem like the contents of the  
mailbox actually matter.



Dovecot anyway detects that data in cache file is corrupted, so it
deletes the entire cache file. If IMP actually fetches something like
BODY or BODYSTRUCTURE for all messages, that causes all the messages
to be read since the cache was just deleted.


Is there an easy way to tell if the cache file has been deleted and
recreated? For example, will the inode number change?


The inode might change, but then again after the next recreation it  
could get the original inode back. But you could run idxview program  
and it'll show the cache file's file_seq which always increases:


idxview ~/Maildir
..
file_seq . = 1234639580 (2009-02-14 11:26:20) (9  
compressions)



I enabled
mail_debug for a short while this morning and did not notice anything
unusual with the logs for this user: connect, set up namespace,
disconnect just like everyone else.


mail_debug isn't really helpful in figuring out bugs.


At any rate, the Dovecot stats report bytes out of about 500MB every
five minutes (when the user's session closes); would this include the
data read to build the index? I suspect not.


Hmm. Right, no, it wouldn't. The indexing issue sounds like a bug I  
should fix though. As for sending the 500 MB it sounds like an IMP  
issue. You could enable rawlog to see what commands exactly IMP sends: http://wiki.dovecot.org/Debugging/Rawlog


And to avoid it writing 500 MB log files every time, you can use  
rawlog -o parameter to log only the client - server traffic (looks  
like I've messed up the -i and -o parameter order, guess I should  
switch them some day..)


Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Chris O'Regan
 Unfortunately, with IMP 4.x we are limited to the API provided by  
 c-client/PHP imap module.  This API is *horribly* inefficient and does  
 all sorts of terrible IMAP things that are  
 unnecessary/inefficient/embarassing.  So it is conceivable that the  
 entire mailbox is scanned when a mailbox refresh occurs.

I discussed this with our network administrator and he is not convinced
since we weren't seeing this problem prior to the migration to Dovecot a
couple of weeks ago (and the last time the user was seen logged in prior
to last Thursday when he reappeared and the problem started).

It's just peculiar that we are seeing a problem with only one user. I
have a feeling that the corrupt cache files are related since no other
user has triggered these messages.


Thanks.

-- 
Chris O'Regan ch...@encs.concordia.ca
Senior Unix Systems Administrator, Academic IT Services
Faculty of Engineering and Computer Science
Concordia University, Montreal, Canada



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Michael M. Slusarz

Quoting Chris O'Regan ch...@encs.concordia.ca:


Unfortunately, with IMP 4.x we are limited to the API provided by
c-client/PHP imap module.  This API is *horribly* inefficient and does
all sorts of terrible IMAP things that are
unnecessary/inefficient/embarassing.  So it is conceivable that the
entire mailbox is scanned when a mailbox refresh occurs.


I discussed this with our network administrator and he is not convinced
since we weren't seeing this problem prior to the migration to Dovecot a
couple of weeks ago (and the last time the user was seen logged in prior
to last Thursday when he reappeared and the problem started).

It's just peculiar that we are seeing a problem with only one user. I
have a feeling that the corrupt cache files are related since no other
user has triggered these messages.


I would agree with the statement that this may be an issue due to a  
certain user/mailbox - I was not suggesting that IMP requires a full  
mailbox rebuild on every access for every user (yuck).  But just in  
case you started directly debugging the IMAP traffic, I just wanted to  
throw out a warning that you will definitiely see some queries that  
are extremely inefficient (e.g. c-client has a weird lookahead  
configuration that will grab information for the 20 messages  
immediately following the queried message, even though none of this  
information is needed).  To help in debugging, didn't want you/anyone  
else to think this was an issue with either Dovecot or IMP.


michael



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Chris O'Regan
 Corrupted index cache file 
 /local/data/dovecot/indexes/mail/X//.imap/delivery.log/dovecot.index.cache:
  Broken MIME parts for mail UID 30880

DUH! Please ignore the Corrupted index cache file; the mailbox in
question is a file named delivery.log (i.e. procmail log file) that is
in the user's ~/mail directory. Of course there will be problems trying
to index this file, it is not an mbox! *sigh*

Okay, I think this rules out an indexing problem. But the main problem
persists, why has IMP all of a sudden begun reading the entire contents
of this user's mailbox each time it connects, and why has the started
since upgrading to Dovecot?

I have another hunch. Time to speak to the user...


Many thanks for the help so far!

-- 
Chris O'Regan ch...@encs.concordia.ca
Senior Unix Systems Administrator, Academic IT Services
Faculty of Engineering and Computer Science
Concordia University, Montreal, Canada



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Chris O'Regan
 The inode might change, but then again after the next recreation it  
 could get the original inode back. But you could run idxview program  
 and it'll show the cache file's file_seq which always increases:

Here is the file sequence for the cache file that corresponds to the
problematic mailbox:

file_seq . = 1243426033 (2009-05-27 08:07:13) (13 compressions)

The timestamp corresponds with the time we switched over to Dovecot.

 Hmm. Right, no, it wouldn't. The indexing issue sounds like a bug I  
 should fix though. As for sending the 500 MB it sounds like an IMP  
 issue. You could enable rawlog to see what commands exactly IMP sends: 
 http://wiki.dovecot.org/Debugging/Rawlog
 
 And to avoid it writing 500 MB log files every time, you can use  
 rawlog -o parameter to log only the client - server traffic (looks  
 like I've messed up the -i and -o parameter order, guess I should  
 switch them some day..)

This looks interesting, but I may want to ask the user to try this on
our development server as I am reluctant to turn this on in production.
All of our IMP connections are localhost - localhost. :-)


-- 
Chris O'Regan ch...@encs.concordia.ca
Senior Unix Systems Administrator, Academic IT Services
Faculty of Engineering and Computer Science
Concordia University, Montreal, Canada



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Timo Sirainen

On Jun 9, 2009, at 1:22 PM, Chris O'Regan wrote:


The inode might change, but then again after the next recreation it
could get the original inode back. But you could run idxview program
and it'll show the cache file's file_seq which always increases:


Here is the file sequence for the cache file that corresponds to the
problematic mailbox:

file_seq . = 1243426033 (2009-05-27 08:07:13) (13  
compressions)


So it looks like the cache file doesn't actually get deleted? Make a  
copy of the dovecot.index* files and delete the cache file. Perhaps  
the error goes away then? If that helps, I could then ask you to  
copypaste some of the idxview results from the copy so I could figure  
out why it wasn't really deleting the file.


Or are you sure you're looking at the right index files, in the right  
mailbox? :)




Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Timo Sirainen

On Jun 9, 2009, at 1:01 PM, Chris O'Regan wrote:

Corrupted index cache file /local/data/dovecot/indexes/mail/X/ 
/.imap/delivery.log/dovecot.index.cache: Broken MIME parts  
for mail UID 30880


DUH! Please ignore the Corrupted index cache file; the mailbox in
question is a file named delivery.log (i.e. procmail log file) that is
in the user's ~/mail directory. Of course there will be problems  
trying

to index this file, it is not an mbox! *sigh*


Hmm. It still seems like there's something Dovecot should do better  
here. I'll see what I can do about it with my procmail log file :)



Okay, I think this rules out an indexing problem. But the main problem
persists, why has IMP all of a sudden begun reading the entire  
contents

of this user's mailbox each time it connects, and why has the started
since upgrading to Dovecot?


I've no idea without rawlogs.



Re: [Dovecot] IMP reading entire mailbox

2009-06-09 Thread Kenneth Porter

--On Tuesday, June 09, 2009 2:49 PM -0700 Timo Sirainen t...@iki.fi wrote:


DUH! Please ignore the Corrupted index cache file; the mailbox in
question is a file named delivery.log (i.e. procmail log file) that is
in the user's ~/mail directory. Of course there will be problems
trying
to index this file, it is not an mbox! *sigh*


Hmm. It still seems like there's something Dovecot should do better here.
I'll see what I can do about it with my procmail log file :)


How about a configuration option that's a regex or glob of files to ignore 
in the mail directories?