I've been seeing some weird hangs with Mailman in the last few days. We have a large number of lists, running 2.1.4 w/ htdig patches. This is a system that's been running with no problems for quite a while (without any major changes that I'm aware of).
In most cases, removing all lockfiles other than the master qrunner's seems to restore normal operation. None of the lists that seem to be causing the hangs are large at all (most have about 5-6 members). The hung processes need a SIGKILL to die; sending them a SIGTERM does nothing.

There's nothing unusual in $MAILMAN_DIR/logs/error (or in any log other than "locks") AFAICT. The "locks" log has a bunch of these:

    locks:Mar 03 11:23:52 2004 (21323) xxxx.lock lifetime has expired, breaking

Mailman is running on local disk (not over NFS). This is on a Debian Linux 3.0 system, with Python 2.1.3 and Mailman built from source (Python 1.5 and 2.2 are installed as well, but 2.1.3 is the default and the one that Mailman is probably using). The clock is synced using NTP, and nothing appears to be wrong with NTP at the moment.

I've also had some problems with the Mailman master qrunner process itself not dying when stopped via mailmanctl; I've had to send a SIGKILL to the mailman user's processes, remove all lockfiles, and then restart Mailman.

For now, I've just been doing a "watch ls" in /home/mailman/locks, keeping an eye out for locks that sit there way too long or for a total lack of activity.

I'd be happy to send more information (either on or off list) if there's anything else that would be helpful. I'm not an expert on Python (or programming in general), and while I've administered Mailman lists, I'm not the person who normally deals with this particular installation.
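The manual "watch ls" check described above could probably be scripted. The following is only a rough sketch, not anything that ships with Mailman: the function names (update_seen, lingering, watch), the 600-second threshold, and the 10-second polling interval are all made up for illustration, and it assumes a modern Python 3 interpreter rather than the 2.1.3 install mentioned above. It tracks when each file name was first observed instead of trusting file timestamps, which may not reflect how long a lock has actually existed.

```python
import os
import time


def update_seen(lock_dir, first_seen, now=None):
    """Record when each file in lock_dir was first observed.

    first_seen maps file name -> time of first observation. Names that
    have disappeared from the directory are forgotten.
    """
    if now is None:
        now = time.time()
    current = set(os.listdir(lock_dir))
    # Forget lock files that have gone away since the last poll.
    for name in list(first_seen):
        if name not in current:
            del first_seen[name]
    # Remember newly appeared lock files; keep the original time for old ones.
    for name in current:
        first_seen.setdefault(name, now)
    return first_seen


def lingering(first_seen, max_age, now=None):
    """Return sorted names that have been present for more than max_age seconds."""
    if now is None:
        now = time.time()
    return sorted(n for n, t in first_seen.items() if now - t > max_age)


def watch(lock_dir, max_age=600, interval=10):
    """Poll lock_dir forever, printing lock files that linger too long.

    max_age and interval are assumed values; tune them for the site.
    """
    seen = {}
    while True:
        update_seen(lock_dir, seen)
        for name in lingering(seen, max_age):
            print("lingering lock:", name)
        time.sleep(interval)
```

Something like watch("/home/mailman/locks") would then stand in for the "watch ls" session. The master qrunner's own lock is expected to stay around, so it would show up in the output and need to be ignored by eye or filtered out.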
Running strace on the pid of one of these long-running processes usually shows something repetitive like this:

    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40304000
    munmap(0x40341000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40341000
    munmap(0x40304000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40304000
    munmap(0x40341000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40341000
    munmap(0x40304000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40304000
    munmap(0x40341000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40341000
    munmap(0x40304000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40304000
    munmap(0x40341000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40341000
    munmap(0x40304000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40304000
    munmap(0x40341000, 249856) = 0
    old_mmap(NULL, 249856, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40341000
    munmap(0x40304000, 249856) = 0

or this:

    brk(0x8340000) = 0x8340000
    brk(0x830a000) = 0x830a000
    brk(0x8325000) = 0x8325000
    brk(0x8340000) = 0x8340000
    brk(0x830a000) = 0x830a000
    brk(0x8325000) = 0x8325000
    brk(0x8340000) = 0x8340000
    brk(0x830a000) = 0x830a000
    brk(0x8325000) = 0x8325000
    brk(0x8340000) = 0x8340000
    brk(0x830a000) = 0x830a000
    brk(0x8325000) = 0x8325000
    brk(0x8340000) = 0x8340000
    brk(0x830a000) = 0x830a000
    brk(0x8325000) = 0x8325000
    brk(0x8340000) = 0x8340000

-- 
"Since when is skepticism un-American? Dissent's not treason but they talk like it's the same..."
(Sleater-Kinney - "Combat Rock")