On Fri, Aug 14, 2015 at 1:30 PM, Lennart Poettering
<mzerq...@0pointer.de> wrote:
> On Mon, 10.08.15 08:03, Rich Freeman (r-syst...@thefreemanclan.net) wrote:
>
> We have watchdog (see WatchdogSec= documentation in
> systemd.service(5)) support in all our long-running daemons, and PID 1
> will kill the service and generate a backtrace for them if they don't
> send a watchdog message often enough. So actually we should be pretty
> good here...

Thanks.  In this case I'm not sure if it is needed more for nspawn
itself, or for systemd (which probably won't work unless nspawn
supports watchdog), or for journald/etc.
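
For reference, the watchdog protocol mentioned above amounts to the
service periodically sending "WATCHDOG=1" as a datagram to the socket
named in $NOTIFY_SOCKET. Below is a minimal Python sketch of that
keep-alive, assuming the unit sets WatchdogSec= so that systemd exports
$WATCHDOG_USEC. The helper names sd_notify and watchdog_interval are my
own, not an existing Python API, and the real daemons use the C
sd_notify(3) call instead.

```python
import os
import socket
from typing import Optional


def sd_notify(message: str) -> bool:
    """Send one notification datagram to the socket in $NOTIFY_SOCKET.

    Returns False when the variable is unset, i.e. the process is not
    running under a service manager that expects notifications.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading '@' denotes a Linux abstract-namespace socket.
    if path.startswith("@"):
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), path)
    return True


def watchdog_interval() -> Optional[float]:
    """Return a ping interval in seconds (half of $WATCHDOG_USEC), or None."""
    usec = os.environ.get("WATCHDOG_USEC")
    return int(usec) / 2_000_000 if usec else None
```

A service's main loop would call sd_notify("WATCHDOG=1") at least once
per watchdog_interval() seconds; if the pings stop, PID 1 treats the
service as hung.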

>
>> Example of a frozen container:
>>
>> systemctl status mariadb-contain
>> ● mariadb-contain.service - mariadb container
>>    Loaded: loaded (/etc/systemd/system/mariadb-contain.service;
>> enabled; vendor preset: enabled)
>>    Active: active (running) since Mon 2015-08-10 07:21:48 EDT; 37min ago
>>      Docs: man:systemd-nspawn(1)
>>  Main PID: 1033 (systemd-nspawn)
>>    Status: "Container running."
>>    CGroup: /system.slice/mariadb-contain.service
>>            ├─1033 /usr/bin/systemd-nspawn --quiet --keep-unit --boot
>> --link-journal=guest --directory=/sstorage3/cont...
>>            ├─1044 /usr/lib/systemd/systemd
>>            └─system.slice
>>              ├─systemd-journald.service
>>              │ └─1407 /usr/lib/systemd/systemd-journald
>>              └─systemd-journal-flush.service
>>                └─1340 /usr/bin/journalctl --flush
>
> Hmm, this is really weird... Would be good to get a backtrace of both
> journald and journalctl here. Note that journald has a much higher PID
> than journalctl though, which indicates that it might have gotten
> restarted by systemd already...

I'll look to get one.

>
> journalctl --flush actually pretty much only sends SIGUSR1 to
> journald, but does this through PID1's bus APIs... It then waits for a
> file in /run/systemd/journal/flushed to appear... For some reason that
> doesn't work here... Weird...

I'm actually wondering if it is some kind of dbus api issue.  I don't
have anything in this email but I seem to recall seeing some error in
a situation like this that mentioned dbus.
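
To check by hand whether the flag file ever shows up, the "wait for
/run/systemd/journal/flushed to appear" step can be approximated with a
bounded wait. This is only a diagnostic sketch in Python: the function
name wait_for_file and the polling approach are my assumptions, and
journalctl itself may well wait differently (and without a timeout).

```python
import os
import time


def wait_for_file(path: str, timeout: float, poll: float = 0.1) -> bool:
    """Poll until `path` exists; return False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


if __name__ == "__main__":
    # Path taken from the description of journalctl --flush quoted above.
    flag = "/run/systemd/journal/flushed"
    print("flushed" if wait_for_file(flag, timeout=30.0) else "timed out")
```

Run inside the container, a "timed out" result would suggest journald
never completed the flush, as opposed to journalctl missing the file.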

>
> Anyway, before tracking this down further, could you update to a more
> recent systemd version?

That's fair to ask.  I'll see about doing just that.  Perhaps it will
resolve the issue as a bonus.  I've been seeing this for a while
though.

The other issue I see sometimes is restarting an nspawn container with
bridged ethernet and having it fail with an error that the interface
is already in use.  After I update I'll see if I can get more info on
that (though in that case everything terminates).

--
Rich
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
