Also, sometimes the webservice seems "half dead" (Schrödinger's webservice?) and needs a restart. I have even had "restart" not work, so I needed to stop it and then start it again manually.

On Thu, Jul 10, 2014 at 11:26 PM, Hasteur Wikipedia <[email protected]> wrote:
> Um... That's a very very bad idea. A crontab entry like that will fire
> multiple times a minute. What's the largest downtime that the service can
> tolerate?
>
> Sent from my iPhone
>
> > On Jul 10, 2014, at 4:52 PM, Petr Bena <[email protected]> wrote:
> >
> > what about appending this to crontab:
> >
> > * * * * * webservice start
> >
> >> On Thu, Jul 10, 2014 at 5:34 PM, Tim Landscheidt <[email protected]> wrote:
> >> Magnus Manske <[email protected]> wrote:
> >>
> >>> I've been manually restarting about a dozen webservices for my tools in the
> >>> last 24h.
> >>
> >>> And before you say it, some of those were Hedonil's hand-rolled webservice.
> >>
> >>> Could we PLEASE either have a Labs-official, auto- and self-restarting
> >>> webservice, or something a little more stable than lighttpd (or a more
> >>> stable way to run it)?
> >>
> >> I looked at all the tools you are a developer of and I assume
> >> you speak about wikidata-todo. This has some logs that
> >> appear to have indications of OOM shutdowns.
> >>
> >> You use a custom lighttpd configuration, and I'm not sure if
> >> the decision to have two PHP FCGIs doubles the memory
> >> requirements, at the moment using 6 GBytes out of 7 GBytes
> >> requested.
> >>
> >> What is clear however is that your PHP script:
> >>
> >> | 2014-07-10 14:11:39: (mod_fastcgi.c.2701) FastCGI-stderr: PHP Fatal error: Allowed memory size of 2621440000 bytes exhausted (tried to allocate 71 bytes) in /data/project/wikidata-todo/public_html/autolist2.php on line 201
> >>
> >> uses almost 2.5 GByte of memory -- if I don't misread the
> >> documentation -- per /request/.
> >>
> >> Memory is cheap and we could just increase the requested
> >> limit, but I assume there are some PHP developers around who
> >> might want to have a poke at optimizing
> >> <https://bitbucket.org/magnusmanske/wikidata-todo/src/master/public_html/autolist2.php>.
> >>
> >> Regarding self-restarting web services, with continuous jobs
> >> we have a "while ! $JOB; do sleep 5; done" loop that ensures
> >> that the job is restarted if it aborts. This however does
> >> not work on OOMs that are the predominant cause of webservice
> >> shutdowns, as the grid engine will kill the loop as
> >> well :-). So we will probably have to start the webservice
> >> and then start a watchdog job with the webservice's job number
> >> as its parameter that periodically checks that the webservice
> >> is still running and, in case, restarts the webservice.
> >> But to do that, jobs on execution nodes need to be
> >> able to submit jobs, and this is still pending
> >> (cf. https://bugzilla.wikimedia.org/54786).
> >>
> >> Tim
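
For reference, the watchdog Tim describes could look roughly like the sketch below. It is only an illustration, not what Labs actually runs. The function names (job_is_running, restart_service, watch_once, watchdog) are made up here, and it assumes that "qstat -j <job number>" exits non-zero once the job is gone and that "webservice start" (as in the crontab suggestion) is the right way to bring the service back. It would also still depend on execution nodes being allowed to submit jobs, which is the pending bug mentioned above.

```shell
#!/bin/sh
# Watchdog sketch: poll a grid engine job and restart the webservice
# once that job has disappeared. All function names are hypothetical.

job_is_running() {
    # $1 = grid engine job number.
    # Assumption: qstat -j exits non-zero when the job no longer exists.
    qstat -j "$1" >/dev/null 2>&1
}

restart_service() {
    # Assumption: "webservice start" is enough to bring the service back.
    webservice start
}

watch_once() {
    # Returns 0 if the job is still there.
    # Otherwise calls restart_service and returns 1.
    if job_is_running "$1"; then
        return 0
    fi
    restart_service
    return 1
}

watchdog() {
    # $1 = job number, $2 = seconds between checks (default 60).
    # Loops while the job is alive and stops after the first restart.
    while watch_once "$1"; do
        sleep "${2:-60}"
    done
}
```

It would be run from a separate job as, for example, "watchdog 1234567 60", where 1234567 is a placeholder job number. A restarted webservice presumably gets a new job number, so the loop ends after one restart and a fresh watchdog would have to be started for the new job.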
_______________________________________________
Labs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/labs-l
