Whoops. The reinstalled Mailman stopped working with the same problem overnight. Two of the eight qrunners crashed.
I have 3-4 lists and one of them will not open in the web admin interface. It times out as per the apache log:

[Tue Sep 27 09:45:53.591373 2016] [cgi:warn] [pid 2483] [client 128.237.211.152:49581] AH01220: Timeout waiting for output from CGI script /usr/lib/mailman/cgi-bin/admin, referer: http://www.conjel.co/mailman/admin/fttc
[Tue Sep 27 09:45:53.592426 2016] [cgi:error] [pid 2483] [client 128.237.211.152:49581] Script timed out before returning headers: admin, referer: http://www.conjel.co/mailman/admin/fttc
[Tue Sep 27 09:46:53.639699 2016] [cgi:warn] [pid 2483] [client 128.237.211.152:49581] AH01220: Timeout waiting for output from CGI script /usr/lib/mailman/cgi-bin/admin, referer: http://www.conjel.co/mailman/admin/fttc
[Tue Sep 27 09:46:53.640524 2016] [reqtimeout:info] [pid 2483] [client 128.237.211.152:49581] AH01382: Request body read timeout

Here is the access log from the same time frame:

128.237.211.152 - - [27/Sep/2016:09:44:51 -0400] "GET /mailman/admin/fttc HTTP/1.1" 200 2078
128.237.211.152 - - [27/Sep/2016:09:44:53 -0400] "POST /mailman/admin/fttc HTTP/1.1" 504 247

Here is the qrunner log (from earlier when the two qrunners stopped):

Sep 27 06:09:59 2016 (7136) Master qrunner detected subprocess exit (pid: 1194, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Sep 27 06:09:59 2016 (1439) VirginRunner qrunner started.
Sep 27 06:13:22 2016 (7136) Master qrunner detected subprocess exit (pid: 1246, sig: 9, sts: None, class: IncomingRunner, slice: 1/1) [restarting]
Sep 27 06:13:23 2016 (1564) IncomingRunner qrunner started.
Sep 27 06:15:09 2016 (7136) Master qrunner detected subprocess exit (pid: 1439, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Sep 27 06:15:09 2016 (1679) VirginRunner qrunner started.
Sep 27 06:18:00 2016 (7136) Master qrunner detected subprocess exit (pid: 1564, sig: 9, sts: None, class: IncomingRunner, slice: 1/1) [restarting]
Sep 27 06:18:00 2016 (1786) IncomingRunner qrunner started.
Sep 27 06:20:30 2016 (7136) Master qrunner detected subprocess exit (pid: 1679, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Sep 27 06:20:31 2016 (1917) VirginRunner qrunner started.
Sep 27 06:21:56 2016 (7136) Master qrunner detected subprocess exit (pid: 1786, sig: 9, sts: None, class: IncomingRunner, slice: 1/1) [restarting]
Sep 27 06:21:56 2016 (1980) IncomingRunner qrunner started.
Sep 27 06:24:28 2016 (7136) Master qrunner detected subprocess exit (pid: 1917, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Sep 27 06:24:29 2016 (2048) VirginRunner qrunner started.
Sep 27 06:25:55 2016 (7136) Master qrunner detected subprocess exit (pid: 1980, sig: 9, sts: None, class: IncomingRunner, slice: 1/1) [restarting]
Sep 27 06:25:56 2016 (2160) IncomingRunner qrunner started.
Sep 27 06:28:06 2016 (7136) Master qrunner detected subprocess exit (pid: 2048, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Sep 27 06:28:06 2016 (2223) VirginRunner qrunner started.
Sep 27 06:30:03 2016 (7136) Master qrunner detected subprocess exit (pid: 2160, sig: 9, sts: None, class: IncomingRunner, slice: 1/1) [restarting]
Sep 27 06:30:03 2016 (2317) IncomingRunner qrunner started.
Sep 27 06:32:36 2016 (7136) Master qrunner detected subprocess exit (pid: 2223, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Sep 27 06:32:37 2016 (2443) VirginRunner qrunner started.
Sep 27 06:34:03 2016 (7136) Master qrunner detected subprocess exit (pid: 2317, sig: 9, sts: None, class: IncomingRunner, slice: 1/1) [restarting]
Sep 27 06:34:04 2016 (2494) IncomingRunner qrunner started.
Sep 27 06:36:44 2016 (7136) Master qrunner detected subprocess exit (pid: 2443, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Sep 27 06:36:44 2016 (7136) Qrunner VirginRunner reached maximum restart limit of 10, not restarting.
Sep 27 06:45:04 2016 (7136) Master qrunner detected subprocess exit (pid: 2494, sig: 9, sts: None, class: IncomingRunner, slice: 1/1) [restarting]
Sep 27 06:45:04 2016 (7136) Qrunner IncomingRunner reached maximum restart limit of 10, not restarting.

Finally this is the only error in the Mailman error file since the reinstall last night.

Sep 26 20:59:51 2016 admin(8885): @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
admin(8885): [----- Mailman Version: 2.1.15 -----]
admin(8885): [----- Traceback ------]
admin(8885): Traceback (most recent call last):
admin(8885):   File "/usr/lib/mailman/scripts/driver", line 112, in run_main
admin(8885):     main()
admin(8885):   File "/usr/lib/mailman/Mailman/Cgi/admindb.py", line 198, in main
admin(8885):     mlist.Save()
admin(8885):   File "/usr/lib/mailman/Mailman/MailList.py", line 578, in Save
admin(8885):     self.__save(dict)
admin(8885):   File "/usr/lib/mailman/Mailman/MailList.py", line 555, in __save
admin(8885):     os.link(fname, fname_last)
admin(8885): OSError: [Errno 1] Operation not permitted
admin(8885): [----- Python Information -----]
admin(8885): sys.version = 2.7.5 (default, Sep 15 2016, 22:37:39) [GCC 4.8.5 20150623 (Red Hat 4.8.5-4)]
admin(8885): sys.executable = /usr/bin/python
admin(8885): sys.prefix = /usr
admin(8885): sys.exec_prefix = /usr
admin(8885): sys.path = ['/usr/lib/mailman/pythonlib', '/usr/lib/mailman', '/usr/lib/mailman/scripts', '/usr/lib/mailman', '/usr/lib64/python27.zip', '/usr/lib64/python2.7/', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload', '/usr/lib/python2.7/site-packages']
admin(8885): sys.platform = linux2
admin(8885): [----- Environment Variables -----]
admin(8885): HTTP_REFERER: http://conjel.co/mailman/admindb/dsn
admin(8885): CONTEXT_DOCUMENT_ROOT: /usr/lib/mailman/cgi-bin/
admin(8885): SERVER_SOFTWARE: Apache/2.4.6 (CentOS) OpenSSL/1.0.1e-fips PHP/5.4.16
admin(8885): CONTEXT_PREFIX: /mailman/
admin(8885): SERVER_SIGNATURE:
admin(8885): REQUEST_METHOD: POST
admin(8885): PATH_INFO: /dsn
admin(8885): HTTP_ORIGIN: http://conjel.co
admin(8885): SERVER_PROTOCOL: HTTP/1.1
admin(8885): QUERY_STRING:
admin(8885): CONTENT_LENGTH: 39
admin(8885): HTTP_USER_AGENT: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.116 Safari/537.36
admin(8885): HTTP_CONNECTION: keep-alive
admin(8885): HTTP_COOKIE: mailman+admin=280200000069afc0e957732800000036653332316135386362353838333766613833316665656432653339616530633133366130663062
admin(8885): SERVER_NAME: conjel.co
admin(8885): REMOTE_ADDR: 2601:547:f00:cf2c:8c4a:63df:fcba:58e9
admin(8885): PATH_TRANSLATED: /home/personal/htdocs/dsn
admin(8885): SERVER_PORT: 80
admin(8885): SERVER_ADDR: 2001:4800:7818:103:be76:4eff:fe04:5321
admin(8885): DOCUMENT_ROOT: /home/personal/htdocs
admin(8885): PYTHONPATH: /usr/lib/mailman
admin(8885): SCRIPT_FILENAME: /usr/lib/mailman/cgi-bin/admindb
admin(8885): SERVER_ADMIN: root@localhost
admin(8885): HTTP_HOST: conjel.co
admin(8885): SCRIPT_NAME: /mailman/admindb
admin(8885): HTTP_UPGRADE_INSECURE_REQUESTS: 1
admin(8885): HTTP_CACHE_CONTROL: max-age=0
admin(8885): REQUEST_URI: /mailman/admindb/dsn
admin(8885): HTTP_ACCEPT: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
admin(8885): GATEWAY_INTERFACE: CGI/1.1
admin(8885): REMOTE_PORT: 63197
admin(8885): HTTP_ACCEPT_LANGUAGE: en-US,en;q=0.8
admin(8885): REQUEST_SCHEME: http
admin(8885): CONTENT_TYPE: application/x-www-form-urlencoded
admin(8885): HTTP_ACCEPT_ENCODING: gzip, deflate
admin(8885): UNIQUE_ID: V@nEh3AeyVpBSf2Pn@BbogAAAAI

> On Sep 26, 2016, at 9:04 PM, Mark Sapiro <[email protected]> wrote:
>
> On 09/26/2016 06:27 AM, Chuck Weinstock wrote:
>> Not sure this is relevant but I see this in the qrunner log:
>>
>> Sep 26 01:03:28 2016 (12454) Qrunner VirginRunner reached maximum restart
>> limit of 10, not restarting.
>>
>> (And a bunch of similar messages.)
>
> It is absolutely relevant, but it contradicts your prior "All of the
> qrunners etc. are running." statement.
>
> It says that VirginRunner encountered a fatal error, died and was
> restarted 10 times and the master (mailmanctl) has given up on it.
>
> What is the sig and sts from messages in the qrunner log like
>
> Master qrunner detected subprocess exit
> (pid: 5651, sig: None, sts: 15, class: RetryRunner, slice: 1/1)
>
> and what's in Mailman's error log from the same times that qrunners are
> dying.
>
> --
> Mark Sapiro <[email protected]>        The highway is for gamblers,
> San Francisco Bay Area, California    better use your sense - B. Dylan
> ------------------------------------------------------
> Mailman-Users mailing list [email protected]
> https://mail.python.org/mailman/listinfo/mailman-users
> Mailman FAQ: http://wiki.list.org/x/AgA3
> Security Policy: http://wiki.list.org/x/QIA9
> Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
> Unsubscribe:
> https://mail.python.org/mailman/options/mailman-users/weinstock%40conjelco.com
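
For anyone wanting to pull the sig and sts values asked about out of a long qrunner log, here is a minimal sketch in Python. It is not part of Mailman: the function name summarize_exits, the regular expression, and the SAMPLE text (three exit lines copied from the log above) are assumptions made for illustration, and the pattern only matches "Master qrunner detected subprocess exit" lines in the format shown in this thread.

```python
import re
from collections import Counter

# Matches lines such as:
#   Master qrunner detected subprocess exit (pid: 1194, sig: 9, sts: None,
#   class: VirginRunner, slice: 1/1)
# \s* after "exit" tolerates the line being wrapped, as in the quoted reply.
EXIT_RE = re.compile(
    r"Master qrunner detected subprocess exit\s*"
    r"\(pid: (?P<pid>\d+), sig: (?P<sig>\w+), sts: (?P<sts>\w+), "
    r"class: (?P<cls>\w+), slice: (?P<slice>\d+/\d+)\)"
)


def summarize_exits(log_text):
    """Count qrunner exits, keyed by (class, sig, sts) as raw strings."""
    counts = Counter()
    for match in EXIT_RE.finditer(log_text):
        key = (match.group("cls"), match.group("sig"), match.group("sts"))
        counts[key] += 1
    return counts


# Illustrative input only: a few lines copied from the qrunner log above.
SAMPLE = """\
Sep 27 06:09:59 2016 (7136) Master qrunner detected subprocess exit (pid: 1194, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
Sep 27 06:09:59 2016 (1439) VirginRunner qrunner started.
Sep 27 06:13:22 2016 (7136) Master qrunner detected subprocess exit (pid: 1246, sig: 9, sts: None, class: IncomingRunner, slice: 1/1) [restarting]
Sep 27 06:15:09 2016 (7136) Master qrunner detected subprocess exit (pid: 1439, sig: 9, sts: None, class: VirginRunner, slice: 1/1) [restarting]
"""

if __name__ == "__main__":
    for (cls, sig, sts), n in sorted(summarize_exits(SAMPLE).items()):
        print("%s: sig=%s sts=%s count=%d" % (cls, sig, sts, n))
```

To run it against a real log, read the file contents and pass them to summarize_exits in place of SAMPLE. In the log above every exit shows sig: 9 (SIGKILL on Linux) with sts: None, so the runners were killed by a signal rather than exiting with a status code of their own.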
