I've opened https://phabricator.wikimedia.org/T133090 to track this issue.
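
For anyone who wants to repeat the check described in the thread below (request the page from lighttpd directly, then through the proxy, and compare), here is a minimal Python sketch. The helper names probe and diagnose are made up for illustration, and the backend host, port and path in the usage note are assumptions: the host and port are the ones Merlijn reported and will change whenever the webservice job restarts, and I have not confirmed the path that lighttpd expects when it is contacted directly.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no HTTP response came back."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise HTTPError but still carry a status code.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout and similar.
        return None


def diagnose(backend: Optional[int], proxy: Optional[int]) -> str:
    """Compare the status seen directly at the backend with the one seen via the proxy."""
    if backend is None:
        return "backend unreachable"
    if proxy == 503 and backend != 503:
        return "backend answers, proxy returns 503: suspect the proxy"
    if proxy == backend:
        return "proxy and backend agree"
    return "proxy and backend differ"
```

Run from tools-dev, usage would look something like diagnose(probe("http://tools-webgrid-lighttpd-1415:37083/suggestbot/hw.html"), probe("http://tools.wmflabs.org/suggestbot/hw.html")), with the first URL adjusted to whatever /var/run/lighttpd/suggestbot.conf currently says.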

Cheers,
Morten

On 18 April 2016 at 20:22, Morten Wang <[email protected]> wrote:

> I dug a little further by ssh'ing to the exec host where lighttpd is
> running and sending HTTP requests directly using the port defined in
> /var/run/lighttpd/suggestbot.conf, and lighttpd is running and happy to
> serve up the pages locally (I don't actually have to ssh to the exec host,
> I can just as easily send the requests from tools-dev). To me this suggests
> the problem is somewhere in the proxy.
>
>
> Cheers,
> Morten
>
>
> On 18 April 2016 at 14:27, Morten Wang <[email protected]> wrote:
>
>> Hi Merlijn,
>>
>> Thanks for looking into that. Yes, I've noticed that restarting the web
>> service seems to fix the problem, but only intermittently. If I now, about
>> 25 mins after you emailed, go to
>> http://tools.wmflabs.org/suggestbot/hw.html I get a 503 service
>> unavailable. This has appeared to be a consistent pattern lately, I restart
>> and shortly thereafter it is unavailable again. Which is also why it's
>> difficult to debug what's going on.
>>
>>
>> Cheers,
>> Morten
>>
>>
>> On 18 April 2016 at 14:00, Merlijn van Deen (valhallasw) <
>> [email protected]> wrote:
>>
>>> Hi Morten,
>>>
>>> The 503 suggests the web proxy hit a hitch -- the web server is running
>>> on tools-webgrid-lighttpd-1415:37083 (qstat -xml, ssh to that host, ps aux
>>> | grep suggestbot; vim /var/run/lighttpd/suggestbot.conf), but the proxy is
>>> unaware of that.
>>>
>>> I have restarted your webservice job (qmod -rj 5458226), and this seems
>>> to have resolved the issue (https://tools.wmflabs.org/suggestbot/ now
>>> 404s, but that's your lighttpd webservice and not the proxy).
>>>
>>> Best,
>>> Merlijn
>>>
>>> On 18 April 2016 at 19:00, Morten Wang <[email protected]> wrote:
>>>
>>>> I'm currently having an issue with SuggestBot's (tools.suggestbot) web
>>>> services failing after a while, resulting in a 503 (service unavailable).
>>>> The web service process on the grid appears to be running, but it also
>>>> appears to be unreachable.
>>>>
>>>> There is nothing in the error.log nor the access.log that suggests
>>>> anything happened to lighttpd, and I am unsure about what debug settings to
>>>> add to make it possible to log any errors. The fastcgi.debug setting
>>>> doesn't appear to reveal anything. In other words, I've reached the end of
>>>> my abilities to figure this out, and thus I wonder:
>>>>
>>>> What are some best practices when it comes to debugging what is going
>>>> on?
>>>>
>>>>
>>>> Cheers,
>>>> Morten

_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
