Don't ask me why, but /var/log/messages recorded the problem during this most recent crash. Does this mean anything to anyone?
Oct 11 14:47:24 RTServer RT:  Argument "myork" isn't numeric in numeric ne (!=) at /usr/share/request-tracker4/lib/RT/Interface/Web.pm line 2949.
Oct 11 14:47:41 RTServer RT:  Argument "myork" isn't numeric in numeric ne (!=) at /usr/share/request-tracker4/lib/RT/Interface/Web.pm line 2949.

I didn't paste that twice; it appeared twice in the log. This is still 4.2.8.

On Tue, Oct 11, 2016 at 1:59 PM, Kenneth Marshall <k...@rice.edu> wrote:

> On Tue, Oct 11, 2016 at 12:55:13PM -0400, Alex Hall wrote:
> > Hello list,
> > This may be off-topic, but I'm serving RT with Nginx and FCGI. Randomly, it
> > seems, the FCGI server is failing. Nginx works, but users see "error 502:
> > bad gateway". I see the same in the logs, with connect() failing. All I
> > have to do is run the spawn-fcgi command to get things back.
> >
> > Why this is happening, with some frequency, is the question. My Nginx, RT,
> > and system logs all show nothing, and to my knowledge, there are no FCGI
> > logs at all. The first error for today in Nginx is when a client failed to
> > connect after the server went down; there's nothing that says what the
> > actual problem was. This happened Saturday, then again today.
> >
> > The server has the latest updates for Debian 8.6, and has 4GB of RAM. It's
> > serving a few dozen users at most, so the load can't be the problem. I'm
> > using Nginx 1.6.2 with four workers and 768 threads per worker. Users see
> > nothing unusual before this happens, just a 502 instead of the page they
> > expected.
> >
> > If anyone else is using Nginx and has ever seen this, I'd love some input.
> > As this could be considered off topic, feel free to respond directly to
> > ah...@autodist.com. If I need to provide more details, please let me know.
> > Thank you.
> >
> > --
> > Alex Hall
> > Automatic Distributors, IT department
> > ah...@autodist.com
>
> Hi Alex,
>
> You will get the 502 error when there are no more RT backends running. I
> tracked down various errors in the RT logs that resulted in backend
> exits. Most were of the 'cannot believe I did that' type by people
> setting up the system, i.e. not really fixable with a distributed
> management environment. We ended up using 'multiwatch' in RHEL6
> and systemd in RHEL7 to keep an appropriate number of backends always
> available.
>
> Regards,
> Ken

--
Alex Hall
Automatic Distributors, IT department
ah...@autodist.com
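
For anyone reading this thread later: the "keep a backend always available" idea Ken describes can be roughed out in a few lines of Python. This is only a sketch under assumptions, not what multiwatch or systemd actually do internally. It assumes the FCGI backend listens on a TCP host and port (a Unix socket setup would need a different check), and the function names and the respawn command are hypothetical placeholders for whatever spawn-fcgi invocation is used locally.

```python
import socket
import subprocess
import time


def backend_is_up(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def ensure_backend(host, port, respawn_cmd, runner=subprocess.call):
    """Run respawn_cmd if the backend is not accepting connections.

    Returns True if a respawn was attempted, False if the backend was up.
    respawn_cmd is a hypothetical argument list, e.g. a local wrapper
    script around the spawn-fcgi command already in use.
    """
    if backend_is_up(host, port):
        return False
    runner(respawn_cmd)
    return True


def watch(host, port, respawn_cmd, interval=30):
    """Poll forever; a real supervisor would replace this loop."""
    while True:
        if ensure_backend(host, port, respawn_cmd):
            print("backend was down, ran:", " ".join(respawn_cmd))
        time.sleep(interval)
```

Something like `watch("127.0.0.1", 9000, ["/path/to/respawn-script"])` would drive it, where both the port and the script path are made up for illustration. A proper supervisor is still the better choice, since it restarts the process as soon as it exits instead of waiting for the next poll.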
--------- RT 4.4 and RTIR training sessions, and a new workshop day! https://bestpractical.com/training * Boston - October 24-26 * Los Angeles - Q1 2017