Hi,
On 01/21/2011 07:18 PM, Jonathan Nieder wrote:
>
>> +start_trace(pid_t pid)
>> +{
>> + wait(NULL);
>> +
>> + int retries;
>> + const int max_retries = 10;
>> +
>> + for (retries = 0; retries < max_retries; ++retries) {
>> + if (ptrace (PTRACE_SETOPTIONS, pid, NULL,
>> + PTRACE_O_TRACEFORK) >= 0) {
>> + ptrace(PTRACE_CONT, pid, NULL, 0);
>> + return;
>>
> Yikes. upstart allows something similar (though it uses the simpler
> TRACEME and only tries once). For reference, this is imitating the
> EXPECT_DAEMON and EXPECT_FORK cases in init/job_process.c.
>
Your observation is absolutely correct, I wrote this as a replacement
for what upstart does, and obviously, the code there served as an example.
And the "yikes" is also totally deserved, sleeping and retrying the
SETOPTIONS or detaching is not the most beautiful code I've ever
written. For me, however, while being extremely far from being an expert
of the ptrace interface, it seems like this thing is inherently race-y
-- even upstart code contains error handling for such situations, like
this in job_process.c:
/* We may have already had the wait notification for the new child
* waiting at SIGSTOP, in which case a ptrace() call will succeed
* for it.
*/
if (ptrace (PTRACE_SETOPTIONS, job->pid[process], NULL, 0) < 0) {
nih_debug ("Failed to set options for new %s %s process
(%d), "
"probably not yet forked: %s",
job_name (job), process_name (process),
job->pid[process], strerror (errno));
return;
}
If you can refer me to some good documentation or examples on how to do
this in a better way, I'd be happy to read them and make this code more
robust.
But I do not understand your other remark: I am indeeed using TRACEME,
all other ptrace calls are just to limit the tracing to only forks and
to get the PID we need.
> I guess start-stop-daemon is the right place to implement this kind of
> hack, but I'd much rather see daemons providing either a --no-detach
> or a --pidfile option. A serious problem with the ptrace approach is
> making it impossible to debug the daemon with "gdb -p" (and that
> downside ought to be mentioned in the documentation if this is
> adopted).
>
I cannot confirm your fears here: I'm happily able to "gdb -p" the
daemons started this way. And it's not just me: upstart does exactly the
same thing for a large part of the services it starts on a standard
Lucid install and I'm pretty sure someone would've recognized if they
couldn't "gdb -p", for example, sshd :) Also, see the comment section in
the blog post about this feature in upstart at
http://netsplit.com/2007/12/07/how-to-and-why-supervise-forking-processes/
, where the author addresses exactly the same issue:
> Note that the trace is removed as soon as we've seen the two (or one)
> forks, so it's only there for a very short amount of time. As soon as
> Upstart believes the service is actually running, you can attach
> strace or gdb to it as normal.
>
> (If you run the service under strace or gdb, you don't need the fork
> following since strace/gdb stay in the foreground and follow the forks
> of the daemon themselves.)
>
But I also agree that it's better if the daemons themselves can create
their own pidfiles or stay in foreground: if it's added, maybe a WARNING
should be included for this option just as there is one for
--background, telling the user to only use this as a last resort in
special cases.
> For your particular application: is it possible to run each service in
> a separate cgroup (like systemd does), maybe? (See
> http://0pointer.de/blog/projects/systemd-for-admins-4.html for what
> that looks like from the enduser pov.) What services use "expect =
> fork/daemon"? Maybe they can be fixed (so strace and gdb -p can use
> if for no other reason)?
>
Hmm, it seems like a viable option, I'll play around with it when I'll
have the time. However, in our particular application the goal was to
make things work with as little modification to the boot process or the
init scripts as possible, as we're using this in a customized distro
using lots of packages and the burden of forking/rewriting them wouldn't
have worth it. But since then, upstart seems to start to get the
functionality we needed (that is, working properly in chroots), so maybe
that'll be the way to go.
What services use expect = fork/daemon? Quite a lot of them, for
example, on my workstation:
gyp@chuck:/etc/init$ egrep "expect (fork|daemon)" *
acpid.conf:expect fork
anacron.conf:expect fork
atd.conf:expect fork
avahi-daemon.conf:expect daemon
cron.conf:expect fork
dbus.conf:expect fork
mountall.conf:expect daemon
network-manager.conf:expect fork
nmbd.conf:expect fork
plymouth.conf:expect fork
rsyslog.conf:expect fork
squid.conf:expect fork
ssh.conf:expect fork
udev.conf:expect fork
upstart-udev-bridge.conf:expect daemon
ureadahead.conf:expect fork
ureadahead-other.conf:expect fork
usplash.conf:expect fork
Most of these services are able to write their own pidfiles, it's just
that upstart does not want to use this feature, probably because it
wants to work independently of the capabilities of the actual service.
>
>> + // NOTE: we don't treat it as a fatal error as it could happen in some
>> race-y situations
>> + warning("Error detaching from process: %s (pid: %d)\n",
>> strerror(errno), pid);
>>
> Hmm...
>
I agree, but see my remarks about ptrace being a strange animal above :)
>
>> +static void
>> +trace_pid(char *startas, char **argv)
>> +{
>> + pid_t pid, child_pid;
>> + int num_of_forks = 0;
>> +
>> + pid = fork();
>> + if (pid < 0) {
>> + fatal("Unable to fork!\n");
>> + }
>> + if(pid == 0) {
>> + ptrace(PTRACE_TRACEME, 0, NULL, NULL);
>> + execvp(startas, argv);
>> + }
>> + else {
>>
> The nested ifs here are hard to read. It should be safe to drop the
> else after handling execvp failure.
>
Absolutely right, will do that.
In general, what do you think, is there a point in adding this feature
to upstream start-stop-daemon? We needed this, I wrote this, it's been
tested and seem working fine in our setup, so I thought I'd send it in.
I can see its value outside our specific situation, but I admit these
are very special cases when this might be needed. If yes, what changes
would it need to be accepted? I'm willing to fix some things up and test
the changes, though my resources are quite limited, so it might take a
while...
Thanks for your input,
Peter