Mark, I referred to https://wiki.list.org/x/17891756 before I even contacted the list.
All of the cgi wrappers are suid. check_perms run as root finds no problems.

One thing I noticed is that there was no locks directory anywhere in the installation. Is this normal? (Places I looked: /var/lib/mailman /usr/lib/mailman and /etc/mailman.)

Also the /var/log/mailman/lock was empty but now shows:

Sep 27 10:51:45 2016 (12700) fttc.lock lifetime has expired, breaking
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/bin/qrunner", line 278, in <module>
Sep 27 10:51:45 2016 (12700)     main()
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/bin/qrunner", line 238, in main
Sep 27 10:51:45 2016 (12700)     qrunner.run()
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 70, in run
Sep 27 10:51:45 2016 (12700)     filecnt = self._oneloop()
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 119, in _oneloop
Sep 27 10:51:45 2016 (12700)     self._onefile(msg, msgdata)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 190, in _onefile
Sep 27 10:51:45 2016 (12700)     keepqueued = self._dispose(mlist, msg, msgdata)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/Queue/IncomingRunner.py", line 115, in _dispose
Sep 27 10:51:45 2016 (12700)     mlist.Lock(timeout=mm_cfg.LIST_LOCK_TIMEOUT)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/MailList.py", line 161, in Lock
Sep 27 10:51:45 2016 (12700)     self.__lock.lock(timeout)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/LockFile.py", line 306, in lock
Sep 27 10:51:45 2016 (12700)     important=True)
Sep 27 10:51:45 2016 (12700)   File "/usr/lib/mailman/Mailman/LockFile.py", line 416, in __writelog
Sep 27 10:51:45 2016 (12700)     traceback.print_stack(file=logf)

(Which is referencing the list in question.)

Thanks again,
Chuck

> On Sep 27, 2016, at 10:29 AM, Mark Sapiro <[email protected]> wrote:
>
> On 09/27/2016 06:55 AM, Chuck Weinstock wrote:
>> Whoops.
>> The reinstalled Mailman stopped working with the same problem
>> overnight. Two of the eight qrunners crashed.
>>
>> I have 3-4 lists and one of them will not open in the web admin
>> interface. It times out as per the apache log:
>>
>> [Tue Sep 27 09:45:53.591373 2016] [cgi:warn] [pid 2483] [client
>> 128.237.211.152:49581] AH01220: Timeout waiting for output from CGI
>> script /usr/lib/mailman/cgi-bin/admin, referer:
>> http://www.conjel.co/mailman/admin/fttc
>> [Tue Sep 27 09:45:53.592426 2016] [cgi:error] [pid 2483] [client
>> 128.237.211.152:49581] Script timed out before returning headers: admin,
>> referer: http://www.conjel.co/mailman/admin/fttc
>> [Tue Sep 27 09:46:53.639699 2016] [cgi:warn] [pid 2483] [client
>> 128.237.211.152:49581] AH01220: Timeout waiting for output from CGI
>> script /usr/lib/mailman/cgi-bin/admin, referer:
>> http://www.conjel.co/mailman/admin/fttc
>> [Tue Sep 27 09:46:53.640524 2016] [reqtimeout:info] [pid 2483] [client
>> 128.237.211.152:49581] AH01382: Request body read timeout
>
>
> The CGIs are timing out. This is normally caused by a locked list.
>
>
>> Here is the access log from the same time frame:
>>
>> 128.237.211.152 - - [27/Sep/2016:09:44:51 -0400] "GET
>> /mailman/admin/fttc HTTP/1.1" 200 2078
>> 128.237.211.152 - - [27/Sep/2016:09:44:53 -0400] "POST
>> /mailman/admin/fttc HTTP/1.1" 504 247
>>
>> Here is the qrunner log (from earlier when the two qrunners stopped):
>>
>> Sep 27 06:09:59 2016 (7136) Master qrunner detected subprocess exit
>> (pid: 1194, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
>
> sig: 9 is a SIGKILL. This seems to say that something external is
> killing the runner.
>
> This is likely the same or a similar underlying cause as the CGI
> timeouts, but is different as the CGIs are independent of the qrunners.
>
>
>>
>> Finally this is the only error in the Mailman error file since the
>> reinstall last night.
>>
>> Sep 26 20:59:51 2016 admin(8885):
>> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> admin(8885): [----- Mailman Version: 2.1.15 -----]
>> admin(8885): [----- Traceback ------]
>> admin(8885): Traceback (most recent call last):
>> admin(8885):   File "/usr/lib/mailman/scripts/driver", line 112, in run_main
>> admin(8885):     main()
>> admin(8885):   File "/usr/lib/mailman/Mailman/Cgi/admindb.py", line 198,
>> in main
>> admin(8885):     mlist.Save()
>> admin(8885):   File "/usr/lib/mailman/Mailman/MailList.py", line 578, in
>> Save
>> admin(8885):     self.__save(dict)
>> admin(8885):   File "/usr/lib/mailman/Mailman/MailList.py", line 555, in
>> __save
>> admin(8885):     os.link(fname, fname_last)
>> admin(8885): OSError: [Errno 1] Operation not permitted
>
>
> This is a permission or security manager (SELinux, apparmor, ?) issue.
>
> First try running Mailman's 'bin/check_perms -f` as root. If that fixes
> things, it may help. Also, see <https://wiki.list.org/x/17891756>.
>
> Note that Mailman's CGI wrappers must be group mailman and SETGID. In
> particular, these files must not be on a file system mounted with 'nosuid'.
>
> If none of this helps, try disabling SELinux.
>
> The qrunners being SIGKILLed is still a bit mysterious, but that could
> be related to a permissions or SELinux issue.
>
> --
> Mark Sapiro <[email protected]>        The highway is for gamblers,
> San Francisco Bay Area, California    better use your sense - B. Dylan

------------------------------------------------------
Mailman-Users mailing list [email protected]
https://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: https://mail.python.org/mailman/options/mailman-users/archive%40jab.org
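
For anyone following this thread: the requirement quoted above, that the CGI wrappers be group mailman and SETGID (which is not the same bit as suid), can be double-checked independently of check_perms with a few lines of Python. This is only a rough sketch. The names wrapper_problems and check_directory are made up for illustration and are not part of Mailman, and the expected group name "mailman" is taken from the message above and may differ on other installs. It does not detect a 'nosuid' mount; that would still need to be checked separately.

```python
import grp
import os
import stat


def wrapper_problems(mode, group_name, expected_group="mailman"):
    """Return a list of problems for one CGI wrapper.

    mode is the file's st_mode, group_name the name of its owning group.
    (Hypothetical helper, not a Mailman API.)
    """
    problems = []
    # The setgid bit (S_ISGID) is distinct from the setuid bit (S_ISUID).
    if not mode & stat.S_ISGID:
        problems.append("setgid bit is not set")
    if group_name != expected_group:
        problems.append("group is %r, expected %r" % (group_name, expected_group))
    return problems


def check_directory(cgi_dir):
    """Print any problems found for each regular file in cgi_dir."""
    for name in sorted(os.listdir(cgi_dir)):
        path = os.path.join(cgi_dir, name)
        st = os.stat(path)
        if not stat.S_ISREG(st.st_mode):
            continue
        try:
            group_name = grp.getgrgid(st.st_gid).gr_name
        except KeyError:
            # The gid has no entry in the group database; show the number.
            group_name = str(st.st_gid)
        for problem in wrapper_problems(st.st_mode, group_name):
            print("%s: %s" % (path, problem))
```

Calling check_directory("/usr/lib/mailman/cgi-bin") (the path that appears in the Apache log above) should print nothing if every wrapper has the setgid bit and the expected group.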
