Can you link us to some manual page for that utility? Thanks

On Sat, Jul 12, 2014 at 8:45 PM, Matanya <[email protected]> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> Hi All,
> The best tool to make sure your webserver is running is monit. It will monitor
> the service and bring it back up if it dies. Furthermore, it can mail you on
> every action taken, and its config file even has an option to declare how
> many restarts it will attempt before giving up. Other features are available as well.
>
> On 12 July 2014 15:00:37 GMT+03:00, [email protected] wrote:
>>Today's Topics:
>>
>>   1. On disk use (Marc A. Pelletier)
>>   2. Re: Webservice (Petr Bena)
>>   3. Re: Webservice (Marc-André Pelletier)
>>
>>
>>----------------------------------------------------------------------
>>
>>Message: 1
>>Date: Fri, 11 Jul 2014 12:23:53 -0400
>>From: "Marc A. Pelletier" <[email protected]>
>>To: Wikimedia Labs <[email protected]>
>>Subject: [Labs-l] On disk use
>>Message-ID: <[email protected]>
>>Content-Type: text/plain; charset=ISO-8859-1
>>
>>Hey all.
>>
>>So, a quick reminder to every labs user: project space (/data/project)
>>is on a networked drive.  While it provides a lot of space and is
>>conveniently accessible to all instances of a project, using it /does/
>>incur a performance cost.
>>
>>Whenever a service you are running on a labs instance needs to have
>>/local/ storage (that is, does not need to share the data with other
>>instances), it is generally preferable to _not_ use /data/project for
>>it.  Being careful about when you use this filesystem means improved
>>performance and reliability for everyone.
>>
>>On Tool Labs, where your tools can be moved arbitrarily from one node to
>>another, this mostly does not apply - you should be storing any
>>persistent data in your tools' homes (which are on /data/project).  It
>>*is* possible to store data locally to the instance where the tool is
>>running (for temporary data that does not need to persist from one run
>>to another), provided you are careful about cleaning up after yourself.
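A minimal sketch of the "local scratch, then clean up" pattern described above. The function name `run_with_local_scratch`, the default scratch base `/tmp`, and the placeholder work step are assumptions made for illustration; only the idea of keeping temporary data off /data/project and removing it afterwards comes from the message.

```shell
#!/bin/bash
# run_with_local_scratch RESULT_DIR [SCRATCH_BASE]
# Does temporary work on instance-local disk and copies only the final
# result to RESULT_DIR (which could live on the shared project space).
# SCRATCH_BASE defaults to /tmp, an assumed local location.
run_with_local_scratch() {
    local result_dir="$1"
    local scratch_base="${2:-/tmp}"
    (
        # Private scratch directory on local disk.
        scratch="$(mktemp -d "${scratch_base}/scratch.XXXXXX")" || exit 1
        # Remove the scratch directory when this subshell exits,
        # whether the work succeeded or failed.
        trap 'rm -rf "$scratch"' EXIT

        # Placeholder for the real work: write intermediate data locally.
        printf 'intermediate data\n' > "$scratch/work.tmp"
        tr 'a-z' 'A-Z' < "$scratch/work.tmp" > "$scratch/result.txt"

        # Persist only the final result.
        mkdir -p "$result_dir" && cp "$scratch/result.txt" "$result_dir/"
    )
}
```

Running the work in a subshell keeps the `trap` local to that run, so the cleanup does not interfere with any traps set by the calling script.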
>>
>>In any case, if you have any questions about your disk usage or ways in
>>which you can improve performance, don't hesitate to ask on-list or
>>communicate with me by email or on IRC.
>>
>>-- Marc
>>
>>
>>
>>------------------------------
>>
>>Message: 2
>>Date: Fri, 11 Jul 2014 20:30:25 +0200
>>From: Petr Bena <[email protected]>
>>To: Wikimedia Labs <[email protected]>
>>Subject: Re: [Labs-l] Webservice
>>Message-ID:
>>       <ca+4eq5ftpwksqmrf7n8ix03zoh4pazgcobx_y0qhhmig5pe...@mail.gmail.com>
>>Content-Type: text/plain; charset=UTF-8
>>
>>nope, just once a minute; crontab doesn't handle seconds, and it wouldn't
>>fire up anything but the check of whether it's running
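The "check first, start only if needed" behaviour described here could be made explicit in a small wrapper that cron calls once a minute. This is only a sketch: `ensure_running` is a made-up helper, and the status check passed to it is an assumption, since the thread does not say how to test whether a webservice is up.

```shell
#!/bin/bash
# ensure_running CHECK_CMD START_CMD
# Runs START_CMD only when CHECK_CMD fails, so a once-a-minute cron
# entry is a cheap no-op while the service is up.
# Both arguments are command strings supplied by the caller.
ensure_running() {
    local check_cmd="$1" start_cmd="$2"
    if sh -c "$check_cmd" >/dev/null 2>&1; then
        return 0    # already running, nothing to do
    fi
    sh -c "$start_cmd"
}

# Hypothetical use from a script invoked by the crontab line in this thread:
#   ensure_running "<some status check>" "webservice start"
```

A wrapper like this does nothing to limit how often a repeatedly crashing service gets restarted, which is the concern raised elsewhere in the thread.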
>>
>>On Fri, Jul 11, 2014 at 12:26 AM, Hasteur Wikipedia <[email protected]> wrote:
>>> Um... That's a very very bad idea. A crontab entry like that will
>>> fire multiple times a minute. What's the largest downtime that the
>>> service can tolerate?
>>>
>>> Sent from my iPhone
>>>
>>>> On Jul 10, 2014, at 4:52 PM, Petr Bena <[email protected]> wrote:
>>>>
>>>> what about appending this to crontab:
>>>>
>>>> * * * * * webservice start
>>>>
>>>>> On Thu, Jul 10, 2014 at 5:34 PM, Tim Landscheidt <[email protected]> wrote:
>>>>> Magnus Manske <[email protected]> wrote:
>>>>>
>>>>>> I've been manually restarting about a dozen webservices for my tools in the
>>>>>> last 24h.
>>>>>
>>>>>> And before you say it, some of those were Hedonil's hand-rolled webservice.
>>>>>
>>>>>> Could we PLEASE either have a Labs-official, auto- and self-restarting
>>>>>> webservice, or something a little more stable than lighttpd (or a more
>>>>>> stable way to run it)?
>>>>>
>>>>> I looked at all the tools you are a developer of and I assume
>>>>> you speak about wikidata-todo.  This has some logs that
>>>>> appear to have indications of OOM shutdowns.
>>>>>
>>>>> You use a custom lighttpd configuration, and I'm not sure if
>>>>> the decision to have two PHP FCGIs doubles the memory
>>>>> requirements, at the moment using 6 GBytes out of 7 GBytes
>>>>> requested.
>>>>>
>>>>> What is clear however is that your PHP script:
>>>>>
>>>>> | 2014-07-10 14:11:39: (mod_fastcgi.c.2701) FastCGI-stderr: PHP Fatal error:  Allowed memory size of 2621440000 bytes exhausted (tried to allocate 71 bytes) in /data/project/wikidata-todo/public_html/autolist2.php on line 201
>>>>>
>>>>> uses almost 2.5 GByte of memory -- if I don't misread the
>>>>> documentation -- per /request/.
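As a quick check of the figure above, the limit from the log line can be converted to binary units. The helper name `bytes_to_mib` is made up for this illustration.

```shell
#!/bin/sh
# bytes_to_mib BYTES
# Converts a byte count to whole MiB using integer division.
# Illustrative helper only; not part of any tool mentioned in the thread.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# The memory limit reported in the PHP fatal error quoted above.
bytes_to_mib 2621440000
```

This prints 2500, so the limit is 2500 MiB (about 2.44 GiB), which matches the "almost 2.5 GByte" reading.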
>>>>>
>>>>> Memory is cheap and we could just increase the requested
>>>>> limit, but I assume there are some PHP developers around who
>>>>> might want to have a poke at optimizing
>>>>>
>>>>> <https://bitbucket.org/magnusmanske/wikidata-todo/src/master/public_html/autolist2.php>.
>>>>>
>>>>> Regarding self-restarting web services, with continuous jobs
>>>>> we have a "while ! $JOB; do sleep 5; done" loop that ensures
>>>>> that the job is restarted if it aborts.  This however does
>>>>> not work on OOMs that are the predominant cause of webservice
>>>>> shutdowns, as the grid engine will kill the loop as
>>>>> well :-).  So we will probably have to start the webservice
>>>>> and then start a watchdog job with the webservice's job number
>>>>> as its parameter that periodically checks that the webservice
>>>>> is still running and, if it is not, restarts the webservice.
>>>>> But to do that, jobs on execution nodes need to be
>>>>> able to submit jobs, and this is still pending
>>>>> (cf. https://bugzilla.wikimedia.org/54786).
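The watchdog idea above might look roughly like the following. This is a sketch under assumptions, not the Labs implementation: `watch_job` is a made-up name, and it assumes that `qstat -j JOB_ID` exits non-zero once the grid job no longer exists. The status command can be overridden, which also makes the loop testable without a grid.

```shell
#!/bin/bash
# watch_job JOB_ID INTERVAL RESTART_CMD [STATUS_CMD]
# Polls STATUS_CMD JOB_ID every INTERVAL seconds. When the status
# command fails (job assumed gone), runs RESTART_CMD once and returns.
# STATUS_CMD defaults to "qstat -j"; its exit-status behaviour is an
# assumption of this sketch.
watch_job() {
    local job_id="$1" interval="$2" restart_cmd="$3"
    local status_cmd="${4:-qstat -j}"
    # Intentionally unquoted so "qstat -j" splits into command + option.
    while $status_cmd "$job_id" >/dev/null 2>&1; do
        sleep "$interval"
    done
    sh -c "$restart_cmd"
}

# Hypothetical use, with the webservice's job number as the parameter:
#   watch_job 1234567 60 "webservice start"
```

The function returns after a single restart because the restarted webservice would get a new job number, so a fresh watchdog would have to be started for it. Unlike the in-job `while` loop, the watchdog runs as a separate job and is not killed together with the webservice.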
>>>>>
>>>>> Tim
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Labs-l mailing list
>>>>> [email protected]
>>>>> https://lists.wikimedia.org/mailman/listinfo/labs-l
>>
>>
>>
>>------------------------------
>>
>>Message: 3
>>Date: Fri, 11 Jul 2014 15:12:46 -0400
>>From: Marc-André Pelletier <[email protected]>
>>To: [email protected]
>>Subject: Re: [Labs-l] Webservice
>>Message-ID: <[email protected]>
>>Content-Type: text/plain; charset=UTF-8
>>
>>On 07/11/2014 02:30 PM, Petr Bena wrote:
>>> nope, just once a minute; crontab doesn't handle seconds, and it wouldn't
>>> fire up anything but the check of whether it's running
>>
>>I'm currently looking at some system by which tool maintainers may
>>specify automatic restart scripts that are sufficiently robust for
>>general use.
>>
>>Amongst other requirements, it will send an obligatory email on every
>>restart and will refuse to restart more than X times in a window of Y
>>(where X and Y are still undecided but likely to be 3 and 24h).  This is
>>to prevent problematic tools from hammering on resources unattended.
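The "no more than X restarts in a window of Y" rule can be sketched with a file of restart timestamps. Everything here is an assumption for illustration (the name `restart_allowed`, the state-file format, and passing the current time as an optional argument); the eventual Labs mechanism may work differently.

```shell
#!/bin/bash
# restart_allowed STATE_FILE MAX_RESTARTS WINDOW_SECONDS [NOW]
# Returns 0 and records a restart if fewer than MAX_RESTARTS restarts
# were recorded in the last WINDOW_SECONDS; returns 1 otherwise.
# STATE_FILE holds one Unix timestamp per line. NOW defaults to the
# current time and is overridable for testing.
restart_allowed() {
    local state_file="$1" max="$2" window="$3"
    local now="${4:-$(date +%s)}"
    local cutoff=$((now - window))
    local recent=0 ts
    touch "$state_file"
    # Count restarts that fall inside the window.
    while read -r ts; do
        if [ -n "$ts" ] && [ "$ts" -gt "$cutoff" ]; then
            recent=$((recent + 1))
        fi
    done < "$state_file"
    [ "$recent" -lt "$max" ] || return 1
    # Record this restart; refused attempts are not recorded.
    echo "$now" >> "$state_file"
}

# Hypothetical use with X = 3 and Y = 24h (86400 seconds):
#   if restart_allowed "$HOME/.restarts" 3 86400; then
#       webservice start   # a notification mail would also be sent here
#   fi
```

Keeping the decision separate from the restart itself means the same check can guard any restart script, and the mandatory notification can be added next to the restart command.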
>>
>>-- Marc
>>
>>
>>
>>
>>------------------------------
>>
>>
>>
>>End of Labs-l Digest, Vol 31, Issue 11
>>**************************************
> -----BEGIN PGP SIGNATURE-----
> Version: APG v1.1.1
>
> iQJABAEBCgAqBQJTwYI6IxxNYXRhbnlhIE1vc2VzIDxtYXRhbnlhQGZvc3MuY28u
> aWw+AAoJEKzSGXfsOI0veOEP/jXtutHDwYHzIzU/zQ6tHV89xufuD0+2pKVPzqmh
> VyGclpkelV/JVjHvygnovJcluqfPA/smUTBn7YgwBtaT2ElEUoeCpim/ljOdxqLE
> dAAgEt9JtoBytqJxZ5z6uQkoMK1k92xjUP8U9wp9ZYqrn7i89MkuxmdaUhXp0KlM
> DmoqC+Cg1XxBk6Zq7wOQYLv3Lr5uSvUynvd3rCQI0wPlsWMr+B0r5nGV1zb+DWKZ
> qzFYvUW7FONdxglde3vgrMhxcl2zWEtcHz0uh/ucMcSPcoUbiUi2Oy6tPZOwscEp
> nCp9Yq0nBvD//GsA/hnKXp0sLK1VtFIv0cpufevZhdoV8/4+cy36bMToRx3KBiJg
> WGxfjq85vEgC9yl2+2D9Pyuz2mK+bUrKcbVyQm0FVlOyIQAKQvJ4pqff8VzlLFsR
> nYLRdE+RwjsG8k4hYDVqlln0KxlNT3ZbNErNJ84ndbEXVlWreM0tTWREcof3nn/x
> 22ijokyz8F4AifUAo7AyYk7elJeSQVdizxptQSj0CtQXLa34R/wgwgHgxavhsPiH
> Bwma4RHnVoVSZhpd1kgSOMy2lPTRK/Ww+FfwvJyZ9m18pGg0W7cm81JHDpC5BmEM
> 58HXmhxvcCvq1snpmcdNrac30FzotYscryC/Fed6dP2lOLiXZsGGxTdKvuAb95Qa
> zPlH
> =YX2r
> -----END PGP SIGNATURE-----