[Supervisor-users] group of groups?
i have a set of services, each of which consists of several programs. the individual sets of programs for each of these services are grouped already:

    [group:den]
    programs = read_remotes, read_remotes_relay, run_lircd

    [group:study]
    programs = read_remotes, read_remotes_relay, run_lircd

    [group:shop]
    programs = read_remotes, read_remotes_relay, run_lircd

    [group:sunny]
    programs = read_remotes, read_remotes_relay, run_lircd

    [program:run_lircd]
    command = /usr/local/bin/run_lircd %(group_name)s
    autorestart = true
    stdout_logfile=/var/log/supervisor/run_lircd-%(group_name)s.log

    [program:read_remotes]
    command = bash -c "sleep 3; exec /usr/local/bin/read_remotes %(group_name)s no-hotkeys"
    autorestart = true
    stdout_logfile=/var/log/supervisor/read_remotes-%(group_name)s.log

    [program:read_remotes_relay]
    command = bash -c "sleep 6; exec /usr/local/bin/read_remotes_relay %(group_name)s"
    autorestart = true
    stdout_logfile=/var/log/supervisor/read_remotes_relay-%(group_name)s.log

i'd like to be able to manage the groups all at once -- i.e., i'd like to say this, so that "supervisorctl stop remotes" would work:

    [group:remotes]
    groups = den,study,shop,sunny

is something like this possible? or is the group hierarchy just one level deep?

paul
=--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma)
___
Supervisor-users mailing list
Supervisor-users@lists.supervisord.org
https://lists.supervisord.org/mailman/listinfo/supervisor-users
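As far as the documented [group:x] options go, a group lists programs rather than other groups, so one workaround is to keep the "group of groups" outside supervisor. The following is only a minimal Python sketch of that idea, assuming supervisorctl accepts several "group:*" names in one invocation. The META_GROUPS mapping and the function names are hypothetical helpers, not supervisor features.

```python
import subprocess

# Hypothetical mapping of a "meta group" to real supervisor groups.
# supervisor itself knows nothing about "remotes".
META_GROUPS = {"remotes": ["den", "study", "shop", "sunny"]}


def build_command(action, meta_group, meta_groups=META_GROUPS):
    """Return the supervisorctl argv that applies `action` to every member group."""
    groups = meta_groups[meta_group]
    # "name:*" addresses all processes in the group called "name".
    return ["supervisorctl", action] + [f"{group}:*" for group in groups]


def run(action, meta_group):
    """Run supervisorctl and return its exit status."""
    return subprocess.call(build_command(action, meta_group))
```

Calling run("stop", "remotes") would then stand in for the wished-for "supervisorctl stop remotes".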
Re: [Supervisor-users] infinite startup retries?
On Dec 2, 2016, rod wrote:
> Afaik infinite retries is not supported, but using a very high startretries
> (eg. ) gives you a decent amount of infinite...

i'm reviving an old thread here. i'm interested in enhancing the algorithm supervisord uses when starting processes that experience errors.

the documentation (http://supervisord.org/subprocess.html?highlight=retry#process-states) says:

    "When an autorestarting process is in the BACKOFF state, it will be
    automatically restarted by supervisord. It will switch between STARTING
    and BACKOFF states until it becomes evident that it cannot be started
    because the number of startretries has exceeded the maximum, at which
    point it will transition to the FATAL state. Each start retry will take
    progressively more time."

looking at the source, it seems that the time delay for each backoff is equal to the number of backoffs so far -- so the first backoff is for 1 second, the second for 2, the third for 3, etc. (this should really be documented.) this means that setting startretries to a high value will likely not be very useful, since at some point the retry latency becomes too long to be practical.

it seems like having a configurable cap on the backoff delay would make the startretries parameter much more useful -- small values would continue to be useful for catching quick unforeseen startup failures, while high values, along with a cap on the retry delay, would be useful for processes that might sometimes be expected to fail to start, which should be retried forever, and which should recover relatively quickly when their failure conditions are fixed.

would anyone else find such a configuration option useful? i'm picturing a new "maxbackoffsecs" parameter to specify the maximum retry backoff. at the same time, it might also be useful to allow specifying a "startretries" value of -1, to signify "try forever", rather than having to rely on enough digits in 999. (i have an initial patch, which works.)

paul

> On Fri, 2 Dec 2016 at 16:10, Paul Fox <p...@foxharp.boston.ma.us> wrote:
>
> > is there a way to get supervisor to attempt restarting a process
> > forever, at some low rate?
> >
> > i have some services that rely on USB hardware. when they start, they
> > detect the hardware and continue, else they exit. i'd like them to retry
> > once in a while, forever, in case the hardware has been inserted.
> >
> > as far as i can tell from the man page, once "startretries" (which is
> > undefined in the man page, but appears to be "4") have been attempted,
> > the process is never tried again.
> >
> > is my only recourse to run a cron job to do a "supervisorctl start jobname"
> > every minute or two?
> >
> > paul

=--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 32.9 degrees)
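To put numbers on the backoff argument above, here is a small Python sketch. It assumes the behaviour described in the message (the n-th backoff lasts n seconds) and models the proposed cap. "maxbackoffsecs" is the parameter suggested in this thread, not an existing supervisor option, and the function name is made up for illustration.

```python
def backoff_delays(startretries, maxbackoffsecs=None):
    """Return the per-retry delays in seconds under a linear backoff.

    Assumes the n-th backoff lasts n seconds, optionally capped at
    `maxbackoffsecs` (a proposed, not an existing, setting).
    """
    delays = []
    for attempt in range(1, startretries + 1):
        delay = attempt
        if maxbackoffsecs is not None:
            delay = min(delay, maxbackoffsecs)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    for retries in (4, 100, 999):
        uncapped = backoff_delays(retries)
        capped = backoff_delays(retries, maxbackoffsecs=60)
        print(retries, "retries: last delay", uncapped[-1], "s uncapped,",
              capped[-1], "s capped; total", sum(uncapped), "vs", sum(capped), "s")
```

Under these assumptions the uncapped delay keeps growing with every retry, while the capped variant settles at a fixed retry interval, which is the property wanted for hardware that may show up later.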
Re: [Supervisor-users] infinite startup retries?
thanks to you both. i've clearly been missing out on most of the documentation for supervisord. perhaps even _mentioning_ http://supervisord.org in the man page would help! honestly -- it's bad enough that the man page doesn't offer complete documentation, but to not even mention the site where the documentation does exist is a huge omission.

paul

paul lockaby wrote:
> If you go this route, I did write a program in Perl that does this that you
> could work with:
>
> https://github.com/plockaby/supercron
>
> > On Dec 2, 2016, at 11:28 AM, skee...@skeeved.org wrote:
> >
> > On 12/02/2016 06:08 PM, Paul Fox wrote:
> >> rod wrote:
> >> > Afaik infinite retries is not supported, but using a very high
> >> > startretries
> >> > (eg. ) gives you a decent amount of infinite...
> >>
> >> oh -- startretries is something i can set in the conf file? i thought
> >> it was an internal parameter of some sort. where can i find a full
> >> list of the settable values? i would have expected it to be in the
> >> man page. the backoff sequence is also not specified -- will the
> >> delay interval eventually stop growing?
> >>
> >> thanks,
> >> paul
> >
> > You might also want to look at the event subsystem available in
> > supervisord.
> >
> > http://supervisord.org/events.html
> >
> > You could then have your code respond to, for instance, TICK_60 events and
> > decide whether it's the appropriate time to do whatever work they are
> > designed to do.

=--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 43.7 degrees)
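The TICK_60 suggestion quoted above could look roughly like the Python sketch below. It follows the event listener protocol as described on the events page (a READY token, a header line of key:value pairs, a payload of "len" characters, then a RESULT reply). The function names and the on_tick callback are illustrative assumptions, and the listener would still need its own [eventlistener:x] section subscribed to TICK_60.

```python
import sys


def parse_tokens(line):
    """Turn 'key:value key:value' into a dict (used for header and payload)."""
    return dict(token.split(":", 1) for token in line.split())


def run_listener(on_tick, stdin=sys.stdin, stdout=sys.stdout):
    """Serve events until stdin closes, calling on_tick(when) for each TICK_60."""
    while True:
        # Tell supervisord we are ready for the next event.
        stdout.write("READY\n")
        stdout.flush()

        header_line = stdin.readline()
        if not header_line:
            break  # supervisord closed our stdin
        headers = parse_tokens(header_line)
        payload = stdin.read(int(headers["len"]))

        if headers["eventname"] == "TICK_60":
            # TICK payloads look like "when:<unix timestamp>".
            on_tick(int(parse_tokens(payload)["when"]))

        # Acknowledge the event: "RESULT <length>\n" followed by the body.
        stdout.write("RESULT 2\nOK")
        stdout.flush()


if __name__ == "__main__":
    # stdout is the protocol channel, so diagnostics go to stderr.
    run_listener(lambda when: print("tick at", when, file=sys.stderr))
```

The on_tick callback is where a periodic "try to start it again" action would go, for example once every few ticks.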
[Supervisor-users] infinite startup retries?
is there a way to get supervisor to attempt restarting a process forever, at some low rate?

i have some services that rely on USB hardware. when they start, they detect the hardware and continue, else they exit. i'd like them to retry once in a while, forever, in case the hardware has been inserted.

as far as i can tell from the man page, once "startretries" (which is undefined in the man page, but appears to be "4") have been attempted, the process is never tried again.

is my only recourse to run a cron job to do a "supervisorctl start jobname" every minute or two?

paul
=--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 44.6 degrees)
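A slightly gentler variant of the cron idea is to start the job only when supervisord has actually given up on it. The Python sketch below assumes supervisor's XML-RPC interface is reachable (for instance through an [inet_http_server] section) and uses its getProcessInfo and startProcess methods. The function name and the "jobname" argument are placeholders.

```python
import xmlrpc.client  # used to build the ServerProxy shown in the usage note


def start_if_fatal(server, name):
    """Start `name` if supervisord has stopped retrying it (FATAL state).

    `server` is any object exposing supervisor's XML-RPC namespace, such as
    xmlrpc.client.ServerProxy("http://127.0.0.1:9001/RPC2").
    Returns True if a start was requested, False otherwise.
    """
    info = server.supervisor.getProcessInfo(name)
    if info["statename"] != "FATAL":
        return False
    server.supervisor.startProcess(name)
    return True
```

From cron this might be invoked as start_if_fatal(xmlrpc.client.ServerProxy("http://127.0.0.1:9001/RPC2"), "jobname"), where the URL and port are assumptions about the local configuration.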
Re: [Supervisor-users] unix:///tmp/supervisor.sock no such file
joan perez esteban wrote:
> Hi, not sure if I am sending this twice, my mail server returned me an error.
>
> I did try restart the service but the error persist, I don’t have the
> required file so it returns an error,
>
> [root@nimbusNode ~]# service supervisor stop
> [root@nimbusNode ~]# service supervisor start
> [root@nimbusNode ~]# supervisorctl reread
> error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
> [root@nimbusNode ~]# supervisorctl update
> error: <class 'socket.error'>, [Errno 2] No such file or directory: file: <string> line: 1
> [root@nimbusNode ~]# supervisorctl
> unix:///tmp/supervisor.sock no such file
> supervisor> exit
> [root@nimbusNode ~]#
>
> supervisor.sock does not exist, so getting an error. Any other ideas?

i see. your symptom is different than mine. in my case, when i restarted supervisord, the socket came back, and remained there for days or weeks, and then it would disappear again. in your case, the socket is missing right away.

i don't have any more to say -- sorry.

paul

> many thanks in advance.
> Juan

=--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 7.9 degrees)
Re: [Supervisor-users] unix:///tmp/supervisor.sock no such file
joan perez esteban wrote:
> Hi all,
>
> I am trying to start supervisorctl but all the time I get the same error:
> unix:///tmp/supervisor.sock no such file.
>
> I haven’t found anything out there that can cause that error as I am
> running last version. I am running supervisor in a centos server over a vm.

i reported this exact symptom (running on ubuntu) back in may of last year. i assume my report is in the archives. glad i wasn't imagining it. :-)

it happened to me several times, and since i had no real means of debugging it, i switched to "serverurl=http://127.0.0.1:9001" in my config, to use a non-unix socket.

paul

> Thanks,
> Juan

=--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 26.4 degrees)
Re: [Supervisor-users] unix:///tmp/supervisor.sock no such file
hussain wrote:
> Hello
>
> I guess the error occurs because supervisor service is not running.

i can't speak for juan, but that was not true in my case. the web interface worked fine -- all managed processes were running, and i could interact with supervisord from a browser. it was only the unix domain socket that had somehow been removed, which prevented supervisorctl from working.

paul

> Do -
> sudo service supervisor start
> or
> sudo /etc/supervisor start
>
> Then run supervisorctl. I haven't tested this.
>
> Regards,
> Hussain

=--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 24.1 degrees)
Re: [Supervisor-users] lost supervisor.sock
sergey wrote:
> See if any of your periodic clean-up jobs could wipe out the entry from /tmp.
> If a directory entry was created and opened by a process, then the removal
> of the entry from the directory does not affect the process that holds the
> socket open. But any new attempt to find or open the deleted entry results
> in error. This is just one possibility of what could be happening in your
> case.

yeah, i wondered about that too. but supervisord was running for over a month successfully -- cleanup jobs would typically run more often. and i'm the only person who logs in (it's a server), and i hadn't done so during the time period when the socket got removed.

i'll look at the cron jobs, but i'm not hopeful. there aren't many other possibilities though!

thanks,
paul

> /Sergey

--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 48.4 degrees)
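The behaviour sergey describes can be reproduced in a few lines. This Python sketch (Linux or another Unix assumed; the function name and temporary paths are purely illustrative) binds a unix socket, removes its directory entry the way a clean-up job might, and checks what still works.

```python
import os
import socket
import tempfile


def demo_unlinked_socket():
    """Bind a unix socket, unlink its path, and report what still works.

    Returns (connect_before, server_still_open, connect_after).
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)

    # While the directory entry exists, a client can connect.
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.connect(path)
    connect_before = True
    client.close()

    # Simulate a clean-up job removing the entry.
    os.unlink(path)

    # The server process still holds a valid listening descriptor...
    server_still_open = server.fileno() >= 0

    # ...but a new client can no longer find the path.
    late_client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        late_client.connect(path)
        connect_after = True
    except FileNotFoundError:
        connect_after = False
    finally:
        late_client.close()
        server.close()
        os.rmdir(tmpdir)

    return connect_before, server_still_open, connect_after


if __name__ == "__main__":
    print(demo_unlinked_socket())
```

This matches the reported symptom: the daemon keeps running with its socket open, while a freshly started client reports that the file does not exist.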
[Supervisor-users] lost supervisor.sock
i'm running a relatively old version of supervisor, so if this has been fixed in a later release, i'll live with it -- upgrading this particular system is difficult at the moment.

i have a supervisord process that's been running for over 32 days, since april 6th. it's still running, and will happily restart processes if i kill them manually. the web interface (on port 9001) is also working fine. but if i try and talk to it with supervisorctl, i get:

    # supervisorctl status
    unix:///var/tmp/supervisor.sock no such file

the socket file is indeed not present in /var/tmp.

this happened sometime between 3:01am yesterday, and 3:01am today. i know this because there's one process i restart every night at that time via cron, and this morning i got a failure message, whereas yesterday i did not.

is this a familiar, known bug? since everything's kind of working right now, i'm happy to leave the system in this state in case someone has a debugging idea -- i'm not particularly skilled with python, but can follow directions well. :-) on the other hand, if the answer is "upgrade", then i'll just restart it and see if it happens again.

paul

p.s. here's the fedora version info:

    # yum info supervisor
    Installed Packages
    Name        : supervisor
    Arch        : noarch
    Version     : 3.0
    Release     : 1.fc18
    Size        : 2.5 M
    Repo        : installed
    From repo   : updates
    Summary     : A System for Allowing the Control of Process State on UNIX
    URL         : http://supervisord.org/
    License     : ZPLv2.1 and BSD and MIT
    Description : The supervisor is a client/server system that allows its users to
                : control a number of processes on UNIX-like operating systems.

--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 56.7 degrees)
Re: [Supervisor-users] lost supervisor.sock
matt wrote:
> I’d suggest checking your /etc/supervisord.conf to be sure that the
> following two config entries match up:
>
>     [unix_http_server]
>     file=/tmp/supervisord.sock ; (the path to the socket file)
>
>     [supervisorctl]
>     serverurl=unix:///tmp/supervisord.sock

they match.

    [unix_http_server]
    file=/var/tmp/supervisor.sock ; (the path to the socket file)

    [supervisorctl]
    serverurl=unix:///var/tmp/supervisor.sock ; use a unix:// URL for a unix socket

paul

> Matt

--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 55.8 degrees)
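The check matt suggests can also be done mechanically. The Python sketch below parses the config text with the standard configparser module and compares the two paths. It assumes both values are literal paths (no supervisor-side expansions such as %(here)s), and the function name is made up for illustration.

```python
import configparser


def socket_paths_match(conf_text):
    """Return True if [supervisorctl] serverurl points at the [unix_http_server] file."""
    parser = configparser.ConfigParser(
        interpolation=None,              # leave %(...)s expressions untouched
        inline_comment_prefixes=(";",),  # supervisor-style trailing comments
    )
    parser.read_string(conf_text)

    server_path = parser.get("unix_http_server", "file")
    client_url = parser.get("supervisorctl", "serverurl")

    prefix = "unix://"
    if not client_url.startswith(prefix):
        return False  # e.g. an http:// URL, which does not use the socket file
    return client_url[len(prefix):] == server_path


if __name__ == "__main__":
    with open("/etc/supervisord.conf") as handle:
        print(socket_paths_match(handle.read()))
```

A True result only confirms that the configuration is consistent. It says nothing about whether the socket file still exists on disk, which is the actual problem in this thread.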
Re: [Supervisor-users] lost supervisor.sock
matt's mail made me start thinking about pathnames, so i did some more groveling.

it seems that netstat thinks the socket is being listened on:

    # netstat -an | grep supervisor
    unix  2      [ ACC ]     STREAM     LISTENING     8404     /var/tmp/supervisor.sock.567

and it seems that supervisord still thinks it has the socket open:

    # ps axf | grep '[p]ython.*supervisord'
      723 ?        Ss    83:07 /usr/bin/python /usr/bin/supervisord
    # ls -l /proc/723/fd | grep sock
    lrwx------ 1 root root 64 May  8 19:26 4 -> socket:[8247]
    lrwx------ 1 root root 64 May  8 19:26 5 -> socket:[8404]

so i guess it's possible that some unrelated process removed the socket from /var/tmp. i guess i'll restart supervisord, and see if it happens again.

paul
--
 paul fox, p...@foxharp.boston.ma.us (arlington, ma, where it's 54.9 degrees)