On Fri, Jul 25, 2014 at 10:32 PM, James Powell <[email protected]> wrote:
> Another thing could be that the service may not need a log. I've directed a 
> lot of unwanted output to /dev/null.

My service needs a log. I need to know the things these programs send
to stderr/stdout.

> Can you post one of your run files as an example?

Sure.

run:

#!/bin/sh

mkdir -p /mnt/log/baz
chown -R user1 /mnt/log/baz
cd /opt/baz/current
exec chpst -u user1 ./run_baz 2>&1

log/run:

#!/bin/sh

# The main run script takes care of ensuring the log dir exists.
exec svlogd -ttt /mnt/log/baz/


>
> Sent from my Windows Phone
> ________________________________
> From: James Powell<mailto:[email protected]>
> Sent: 7/25/2014 9:35 PM
> To: Caleb Spare<mailto:[email protected]>; 
> [email protected]<mailto:[email protected]>
> Subject: RE: Rare runsv logging problem
>
> My question is: why are you running Upstart? Runit has its own init, so
> Upstart is pointless. Runit's binary should maintain runsv. The problem
> could also come from the run script handling something improperly.
>
> ________________________________
> From: Caleb Spare<mailto:[email protected]>
> Sent: 7/25/2014 5:16 PM
> To: [email protected]<mailto:[email protected]>
> Subject: Rare runsv logging problem
>
> Hi,
>
> I've been using runit for a while now and it has been mostly
> wonderful. I'm noticing a persistent issue and I'm not sure how to
> debug it.
>
> On the servers we're running Ubuntu and we use runit 2.1.1 via the
> default package that comes with the distro. Upstart runs runsvdir and
> we use runit to manage all of our application processes. Each
> application has a simple ./run and ./log/run; the latter execs svlogd
> (this is all a typical configuration, as I understand it).
>
> The problem I'm seeing is that, very occasionally, runsv will get into
> a bad state where svlogd is not running. (I'm not sure if it fails to
> start svlogd or if this happens later on after it has been running
> properly.) When the problem occurs, pstree shows something like this:
>
> runsvdir-+-runsv-+-foo---5*[{foo}]
>          |       `-svlogd
>          |-runsv-+-bar---21*[{bar}]
>          |       `-svlogd
>          `-runsv---baz---250*[{baz}]
>
> Here you can see that the baz process does not have an associated
> svlogd process. Further:
>
> $ sudo sv s foo
> run: foo: (pid 4885) 526260s; run: log: (pid 875) 526517s
> $ sudo sv s baz
> run: baz: (pid 2337) 2983swarning: baz: unable to open supervise/ok:
> file does not exist
> ; run: log: (pid 2337) 2983s
>
> Two strange things there: the warning about supervise/ok and also that
> the pid for 'log' is the same as for 'baz'.
>
> When runsv is in this bad state, the output from baz goes right to
> runsvdir and ends up in /var/log/upstart/runsvdir.log.
>
> The fix I've been using is to 'sv d baz' and then kill the offending
> runsv process. Runsvdir will quickly restart it and then everything
> will be working:
>
> runsvdir-+-runsv-+-foo---5*[{foo}]
>          |       `-svlogd
>          |-runsv-+-baz---25*[{baz}]
>          |       `-svlogd
>          `-runsv-+-bar---20*[{bar}]
>                  `-svlogd
>
> I'm unsure what causes this rare problem. We only do simple things
> with runit: sv {t,d,u}. When we deploy services, we rsync a
> directory from elsewhere on the box into /etc/services/<name> and then
> 'sv t <name>'. That source dir only has ./run, ./finish, and
> ./log/run.
>
> Any ideas of what we might be doing wrong, or how to otherwise avoid
> this issue? Or if not, what I could do to further debug?
>
> Sorry for the long email; I wanted to be thorough in my description
> and avoid making assumptions about what could be causing this problem.
>
> Thanks,
> Caleb Spare

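Since the visible symptom in the quoted 'sv s' output is that the log pid
equals the service pid, a small check could flag a service that has ended up
in this state. This is only a rough sketch: it assumes the status text looks
like the two examples quoted above, and the function name
log_pid_matches_main is made up for illustration (it is not part of runit).

```shell
#!/bin/sh

# Hypothetical helper: succeeds (exit 0) when the status text passed as $1
# reports the same pid for the service and for its log, which is the odd
# state described in the quoted message. Fails (exit 1) otherwise.
log_pid_matches_main() {
    # Pid of the main service: taken from a line that starts with "run: <name>: (pid N)".
    main_pid=$(printf '%s\n' "$1" |
        sed -n 's/^run: [^:]*: (pid \([0-9]*\)).*/\1/p' | head -n 1)

    # Pid of the logger: taken from the "run: log: (pid N)" part, wherever it appears.
    log_pid=$(printf '%s\n' "$1" |
        sed -n 's/.*run: log: (pid \([0-9]*\)).*/\1/p' | head -n 1)

    # Only report a match when a main pid was actually found.
    [ -n "$main_pid" ] && [ "$main_pid" = "$log_pid" ]
}

# Example use (commented out; needs sv and a service named baz):
# if log_pid_matches_main "$(sudo sv s baz 2>&1)"; then
#     echo "baz: log pid equals service pid, svlogd is probably missing"
# fi
```

The 2>&1 in the example is there because the supervise/ok warning may be
printed on stderr; the helper ignores it and only compares the two pids.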