Hi,
I've run courier-imap with this setup for many years. 4.1.1 has been in
use for a long time on OpenBSD 4.6, and I also ran 4.8.0 on OpenBSD 4.8
for some time with no issues. Mailboxes are on NFS.
Now I'm trying courier-imap-4.9.3 on OpenBSD 5.0, with the same NFS mount
options and server. Very frequently, imapd goes into an apparent loop
(sanitized kdump output):
25204 imapd EMUL "native"
25204 imapd RET nanosleep 0
25204 imapd CALL getpid()
25204 imapd RET getpid 25204/0x6274
25204 imapd CALL __sysctl(1.10,0x7f7ffffe5666,0x7f7ffffe5618,0,0)
25204 imapd RET __sysctl 0
25204 imapd CALL open(0x207329f00,0x202<O_RDWR|O_CREAT>,0x1a4<S_IRUSR|S_IWUSR|S_IRGRP|S_IROTH>)
25204 imapd NAMI "./tmp/1234567890.M12345P25204_courierlock.thehost.invalid"
25204 imapd RET open 4
25204 imapd CALL write(0x4,0x7f7ffffe5660,0x19)
25204 imapd GIO fd 4 wrote 25 bytes
"25204:thehost.invalid"
25204 imapd RET write 25/0x19
25204 imapd CALL close(0x4)
25204 imapd RET close 0
25204 imapd CALL link(0x207329f00,0x20c321b00)
25204 imapd NAMI "./tmp/1234567890.M12345P25204_courierlock.thehost.invalid"
25204 imapd NAMI "./tmp/courier.lock"
25204 imapd RET link -1 errno 17 File exists
25204 imapd CALL unlink(0x207329f00)
25204 imapd NAMI "./tmp/1234567890.M12345P25204_courierlock.thehost.invalid"
25204 imapd RET unlink 0
25204 imapd CALL open(0x20c321b00,0<O_RDONLY>,<unused>0x78)
25204 imapd NAMI "./tmp/courier.lock"
25204 imapd RET open 4
25204 imapd CALL read(0x4,0x7f7ffffe5410,0x1ff)
25204 imapd GIO fd 4 read 25 bytes
"25204:thehost.invalid"
The pid in the lock file matches the running process ID.
It seems to be locking against itself, and it never breaks free. I've
let it run for hours this way. The clocks are synchronized with NTP.
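For anyone who wants to reproduce the pattern outside imapd, here is a
minimal Python sketch of one pass of the sequence in the trace (create a
temp file, write "pid:host", link() it to courier.lock, and on EEXIST read
back the holder). It is not Courier's code: the function names, the temp
file naming, and the self-held check are my own assumptions, written only
to illustrate the symptom.

```python
import os
import socket
import time


def try_lock(tmpdir, host=None):
    """One attempt at a link()-based lock, mirroring the traced syscalls.

    Returns (acquired, holder). holder is the text read from the existing
    lock file when link() fails with EEXIST, otherwise None.
    """
    host = host or socket.gethostname()
    pid = os.getpid()
    lock = os.path.join(tmpdir, "courier.lock")
    # Illustrative temp name only; not Courier's exact naming scheme.
    tmp = os.path.join(
        tmpdir, "%d.P%d_courierlock.%s" % (int(time.time()), pid, host)
    )

    fd = os.open(tmp, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.write(fd, ("%d:%s\n" % (pid, host)).encode())
    finally:
        os.close(fd)

    try:
        os.link(tmp, lock)  # atomic: fails if the lock already exists
        return True, None
    except FileExistsError:
        with open(lock) as f:
            return False, f.read(511).strip()
    finally:
        os.unlink(tmp)  # the temp name is removed either way


def held_by_self(holder, host=None):
    """True if the lock contents name this very process on this host."""
    host = host or socket.gethostname()
    return holder == "%d:%s" % (os.getpid(), host)
```

Calling try_lock() twice in the same directory without removing
courier.lock gives the situation in the trace: the second call fails and
held_by_self() reports that the holder is the caller itself.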
Going through the code, I found no real changes to this section
between 4.1.1 and 4.9.3. (I don't use FAM or GAMIN at all.)
Unsurprisingly, rm Maildir/tmp/courier.lock immediately allows imapd
to proceed.
Remote IMAP clients just keep reconnecting, leaving hundreds of imapd
processes running that never exit on their own. Local command-line clients
don't like it very much either :)
There doesn't seem to be anything specific to cause it to happen -- I've
seen it when switching folders, or more recently by simply leaving it
idle for a few hours and then returning... finding it having been stuck
for hours this way, by the timestamp on the lockfile.
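To check how long a lock has been held and by whom, something like the
following Python sketch could be used. It assumes the lock file contains
"pid:host" as seen in the trace above; lock_info is a hypothetical helper
name, not part of courier-imap.

```python
import os
import time


def lock_info(lock_path, now=None):
    """Return (pid, host, age_seconds) for a "pid:host" style lock file.

    The file format is inferred from the trace, not from Courier's docs.
    Age is measured from the lock file's modification time.
    """
    st = os.stat(lock_path)
    with open(lock_path) as f:
        text = f.read(511).strip()
    pid_text, _, host = text.partition(":")
    if now is None:
        now = time.time()
    return int(pid_text), host, now - st.st_mtime
```

Running it against Maildir/tmp/courier.lock for a stuck mailbox shows the
holder's pid and how many seconds the lock has been sitting there.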
I've followed OpenBSD development closely and didn't notice anything
recent that should have affected this. This same NFS mount handles
tens of thousands of postfix deliveries a day to these same mailboxes,
so far without any weirdness.
The config files are stock except for increased MAXPERIP and MAXDAEMONS.
In running the testsuite I found some failures; rfc2045/reformime segfaults
because in /usr/share/locale, "en_US.utf-8" doesn't exist, while
"en_US.UTF-8" (caps) does. OS bug?
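A quick way to confirm the case mismatch is to list the locale directory
and compare names case-insensitively. This is only a sketch; the function
name is made up, and the default path is the one mentioned above.

```python
import os


def locale_name_variants(name, locale_dir="/usr/share/locale"):
    """Return entries in locale_dir that match name ignoring case."""
    try:
        entries = os.listdir(locale_dir)
    except OSError:
        return []  # directory missing or unreadable
    return sorted(e for e in entries if e.lower() == name.lower())
```

If locale_name_variants("en_US.utf-8") returns only "en_US.UTF-8", then
the lowercase spelling does not exist on disk and only the uppercase one
does.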
If I cd into the imap subdirectory and run 'gmake testsuite-imap' on
the NFS mounted area, it proceeds fine until test T014:
001313 T013 OK LOGOUT completed
001314 * PREAUTH Ready.
001315 * BYE [ALERT] Fatal error: Invalid argument
001316 * PREAUTH Ready.
001317 * BYE [ALERT] Fatal error: Invalid argument
001318 * PREAUTH Ready.
001319 * BYE [ALERT] Fatal error: Invalid argument
001320 * PREAUTH Ready.
001321 * BYE [ALERT] Fatal error: Invalid argument
001322 * PREAUTH Ready.
001323 * BYE [ALERT] Fatal error: Invalid argument
Is this in any way related?
Thanks in advance for any help in resolving this, I've done everything
I can think of...