On further investigation, this does *not* affect jobs that block the
'stopping' event. So I believe this may be related more to the way
job_process_run treates pre-stop. According to comments in
init/job_process.c:


        /* We don't change the state if we're in post-start and there's
         * a post-start process running, or if we're in pre-stop and
         * there's a pre-stop process running; we wait for those to
         * finish instead.
         */
        if ((job->state == JOB_POST_START)
            && job->class->process[PROCESS_POST_START]
            && (job->pid[PROCESS_POST_START] > 0)) {
            state = FALSE;
        } else if ((job->state == JOB_PRE_STOP)
            && job->class->process[PROCESS_PRE_STOP]
            && (job->pid[PROCESS_PRE_STOP] > 0)) {
            state = FALSE;
        }

Later on, if (!state) calls job_change_goal(job, JOB_RESPAWN) ..

Digging into job_next_state a bit, I found that a goal of JOB_RESPAWN
causes job_next_state to change the goal only if state is JOB_POST_START
or JOB_PRE_STOP. JOB_POST_START changes the goal to JOB_START, which
makes sense. But JOB_PRE_STOP changes the goal to JOB_START as well!
This doesn't make much sense to me, and I thought the obvious fix would
be to change it to be JOB_STOP, however this caused test_job to fail
here:


...with post-start job and a goal of respawn
BAD: wrong value for job->goal, expected 1 got 0
        at tests/test_job.c:4129 (test_next_state).
/bin/bash: line 5: 29784 Aborted                 ${dir}$tst
FAIL: test_job

This was confusing until I realized that the test code is wrong:

    /* Check that the next state if we're respawning after a pre-stop
     * job is stopping with the goal changed.
     */
    TEST_FEATURE ("with post-start job and a goal of respawn");
    job->goal = JOB_RESPAWN;
    job->state = JOB_PRE_STOP;

    TEST_EQ (job_next_state (job), JOB_STOPPING);
    TEST_EQ (job->goal, JOB_START);

That should read 'with pre-stop job and a goal of respawn'.

So, this brings to  the edge case that I think is causing this problem,
and my perceived fix.

If you want a running job to be restarted, it should move from wherever
it was, through stopping->pre-stop->killed->post-stop->starting-> ...

However, this doesn't happen. Instead the goal is changed to STOP, but
job_change_goal() does not wait for that to happen, which its docblock
would suggest.

Instead, it returns with the state set to JOB_PRE_STOP ..

This is my proposed fix.. though I'm not entirely confident this is the
right way to handle it, and I haven't had a chance to test this yet.


=== modified file 'init/job.c'
--- init/job.c  2010-12-14 15:32:41 +0000
+++ init/job.c  2011-01-17 07:56:24 +0000
@@ -1265,6 +1265,12 @@
                nih_list_add (&job->blocking, &blocked->entry);
 
        job_change_goal (job, JOB_STOP);
+    if (job->state == JOB_PRE_STOP)
+    {
+      /* hack: should not happen but does */
+      job->goal = JOB_RESPAWN;
+      job_change_goal (job, JOB_STOP);
+    }
        job_change_goal (job, JOB_START);
 
        if (! wait)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/703800

Title:
  restart command fails to restart main process when pre-stop stanza
  exists

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to