Public bug reported:
Upstart sometimes aborts on a stateful re-execution
triggered by "telinit u":
job.c:1977: Assertion failed in job_deserialise: job->kill_process
Caught abort, core dumped
init:job.c:1977: Assertion failed in job_deserialise: job->kill_process
[ 69.668199] Kernel panic - not syncing: Attempted to kill init!
exitcode=0x00000600
The attached file (sessions.json) is a salvaged dump of the Upstart state
that triggers the assertion failure; the problem evidently occurs while
processing the following piece:
[...]
"name": "",
"path": "\/com\/ubuntu\/Upstart\/jobs\/ureadahead\/_",
"goal": "JOB_STOP",
"state": "JOB_KILLED",
[...]
"kill_timer": {
"timeout": 180,
"due": 245
},
"kill_process": "PROCESS_MAIN",
[...]
The issue was observed with the package ubuntu-1.12.1 (Ubuntu 14.04)
and is caused by the following code:
[init/job.c]
1954 json_kill_timer = json_object_object_get (json, "kill_timer");
1955
1956 if (json_kill_timer) {
[...]
1973 nih_local NihTimer *kill_timer =
job_deserialise_kill_timer (json_kill_timer);
1974 if (! kill_timer)
1975 goto error;
1976
1977 nih_assert (job->kill_process);
1978 job_process_set_kill_timer (job, job->kill_process,
1979 kill_timer->timeout);
1980 job_process_adj_kill_timer (job, kill_timer->due);
1981 }
The assertion (job->kill_process) fails in the routine job_deserialise()
if the deserialised job has an associated kill timer and
the field kill_process == PROCESS_MAIN, presumably because PROCESS_MAIN
has the value 0 and therefore evaluates as false.
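The failure mode can be illustrated in isolation. The sketch below assumes
an enum shaped like Upstart's ProcessType, with PROCESS_INVALID as -1 and
PROCESS_MAIN as 0; this layout is an assumption and is not copied from the
1.12.1 source, and passes_truthiness_check() is a hypothetical helper that
only mirrors the condition tested by the assertion.

```c
/* Assumed layout, modelled on Upstart's ProcessType (not copied from source). */
typedef enum {
    PROCESS_INVALID = -1,
    PROCESS_MAIN = 0,
    PROCESS_PRE_START
} ProcessType;

/* Hypothetical helper: evaluates the same condition as
 * nih_assert (job->kill_process), i.e. plain truthiness of the enum value.
 * Returns 1 if the assertion would pass, 0 if it would abort. */
static int passes_truthiness_check (ProcessType kill_process)
{
    return kill_process ? 1 : 0;
}
```

Under this assumed layout, passes_truthiness_check (PROCESS_MAIN) returns 0,
while PROCESS_INVALID (-1) is non-zero and would pass, so a truthiness test
rejects the valid value and accepts the invalid one.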
The issue might still affect trunk as well: there are no corresponding
checks in the routines job_process_kill() and job_serialise(). If the
Upstart state is serialised after job_process_kill() has run but before
the job kill timer fires, the resulting state representation cannot be
restored, since job->kill_timer is non-NULL and job->kill_process
is not PROCESS_INVALID (for example PROCESS_MAIN), which is the state
left behind by job_process_set_kill_timer().
Probably the assertion in question should read
(job->kill_process != PROCESS_INVALID)
if job_process_set_kill_timer() is assumed to operate correctly.
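A minimal sketch of the suggested check, applied to the "kill_process"
string from the serialised state. It assumes the same enum layout as
Upstart's ProcessType (PROCESS_INVALID = -1, PROCESS_MAIN = 0); the helpers
kill_process_from_str() and can_restore_kill_timer() are hypothetical
stand-ins, not functions from init/job.c.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout, modelled on Upstart's ProcessType (not copied from source). */
typedef enum {
    PROCESS_INVALID = -1,
    PROCESS_MAIN = 0,
    PROCESS_PRE_START
} ProcessType;

/* Hypothetical stand-in for the string-to-enum lookup performed while
 * deserialising; unknown or missing names map to PROCESS_INVALID. */
static ProcessType kill_process_from_str (const char *name)
{
    if (name == NULL)
        return PROCESS_INVALID;
    if (strcmp (name, "PROCESS_MAIN") == 0)
        return PROCESS_MAIN;
    if (strcmp (name, "PROCESS_PRE_START") == 0)
        return PROCESS_PRE_START;
    return PROCESS_INVALID;
}

/* Returns 1 when the kill timer may be re-armed for the named process,
 * 0 when the serialised state is inconsistent. Compares explicitly
 * against PROCESS_INVALID instead of relying on truthiness. */
static int can_restore_kill_timer (const char *kill_process_name)
{
    return kill_process_from_str (kill_process_name) != PROCESS_INVALID;
}
```

With an explicit comparison, the value "PROCESS_MAIN" from the dump above
is accepted, while a missing or unrecognised name is still rejected.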
Unfortunately the issue is extremely difficult to reproduce,
so additional diagnostics might be difficult to perform
and might themselves perturb the race that triggers the issue.
** Affects: upstart (Ubuntu)
Importance: Undecided
Status: New
** Attachment added: "Serialised Upstart state dump"
https://bugs.launchpad.net/bugs/1514609/+attachment/4515781/+files/sessions.json
--
https://bugs.launchpad.net/bugs/1514609
Title:
Deserialising a job with the attribute "kill_timer" and
"kill_process"="PROCESS_MAIN" results in abort