Re: [systemd-devel] How do I easily resolve conflicting jobs in custom units?

2017-03-22 Thread Andrei Borzenkov
22.03.2017 23:47, John Florian wrote:
> I build a mostly-stateless appliance OS derived from Fedora (25 ATM)
> and have several custom units to make it all possible.  My units had
> worked great with F21, but are now giving me problems with F25.  One
> pair of the custom units do some trickery to relocate sshd host keys
> from /etc/ssh to an alternate location that provides persistence:
> 
> $ systemctl cat sshd-persist-keys.service
> # /usr/lib/systemd/system/sshd-persist-keys.service
> [Unit]
> Description=OpenSSH server - persist volatile keys for the AOS
> After=sshd-keygen.target
> Before=sshd.service
> Wants=sshd-keygen.target
> 
> [Service]
> ExecStart=/usr/sbin/sshd-persist-keys
> Type=oneshot
> RemainAfterExit=yes
> 
> [Install]
> WantedBy=multi-user.target
> 
> 
> $ systemctl cat sshd-restore-keys.service
> # /usr/lib/systemd/system/sshd-restore-keys.service
> [Unit]
> Description=OpenSSH server - restore persisted keys for the AOS
> After=aos-storage-init.service
> Before=sshd-keygen@rsa.service sshd-keygen@ecdsa.service sshd-keygen@ed25519.service
> 
> [Service]
> ExecStart=/usr/sbin/sshd-restore-keys
> Type=oneshot
> RemainAfterExit=yes
> 
> [Install]
> WantedBy=multi-user.target
> 
> 
> I found that on some boots, sshd wasn't getting started.  With the help
> of booting with systemd.log_level=debug I learned:
> 
> $ sudo journalctl | grep conflict

Please make full log available as well as actual unit definitions that
are not started.

> Mar 22 16:11:42 localhost systemd[1]: sshd.service: Looking at job
> sshd.service/start conflicted_by=no
> Mar 22 16:11:42 localhost systemd[1]: sshd.service: Looking at job
> sshd.service/stop conflicted_by=yes
> Mar 22 16:11:42 localhost systemd[1]: sshd.service: Fixing conflicting
> jobs sshd.service/start,sshd.service/stop by deleting job
> sshd.service/start

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


[systemd-devel] How do I easily resolve conflicting jobs in custom units?

2017-03-22 Thread John Florian
I build a mostly-stateless appliance OS derived from Fedora (25 ATM)
and have several custom units to make it all possible.  My units had
worked great with F21, but are now giving me problems with F25.  One
pair of the custom units do some trickery to relocate sshd host keys
from /etc/ssh to an alternate location that provides persistence:

$ systemctl cat sshd-persist-keys.service
# /usr/lib/systemd/system/sshd-persist-keys.service
[Unit]
Description=OpenSSH server - persist volatile keys for the AOS
After=sshd-keygen.target
Before=sshd.service
Wants=sshd-keygen.target

[Service]
ExecStart=/usr/sbin/sshd-persist-keys
Type=oneshot
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target


$ systemctl cat sshd-restore-keys.service
# /usr/lib/systemd/system/sshd-restore-keys.service
[Unit]
Description=OpenSSH server - restore persisted keys for the AOS
After=aos-storage-init.service
Before=sshd-keygen@rsa.service sshd-keygen@ecdsa.service sshd-keygen@ed25519.service

[Service]
ExecStart=/usr/sbin/sshd-restore-keys
Type=oneshot
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target


I found that on some boots, sshd wasn't getting started.  With the help
of booting with systemd.log_level=debug I learned:

$ sudo journalctl | grep conflict
Mar 22 16:11:42 localhost systemd[1]: sshd.service: Looking at job
sshd.service/start conflicted_by=no
Mar 22 16:11:42 localhost systemd[1]: sshd.service: Looking at job
sshd.service/stop conflicted_by=yes
Mar 22 16:11:42 localhost systemd[1]: sshd.service: Fixing conflicting
jobs sshd.service/start,sshd.service/stop by deleting job
sshd.service/start
Mar 22 16:11:42 localhost systemd[1]: sshd.socket: Looking at job
sshd.socket/stop conflicted_by=no
Mar 22 16:11:42 localhost systemd[1]: sshd.socket: Looking at job
sshd.socket/start conflicted_by=no
Mar 22 16:11:42 localhost systemd[1]: sshd.socket: Fixing conflicting
jobs sshd.socket/stop,sshd.socket/start by deleting job
sshd.socket/stop
Mar 22 16:11:42 localhost systemd[1]: sshd-keygen.target: Looking at
job sshd-keygen.target/start conflicted_by=no
Mar 22 16:11:42 localhost systemd[1]: sshd-keygen.target: Looking at
job sshd-keygen.target/stop conflicted_by=no
Mar 22 16:11:42 localhost systemd[1]: sshd-keygen.target: Fixing
conflicting jobs sshd-keygen.target/start,sshd-keygen.target/stop by
deleting job sshd-keygen.target/stop


I'm sure systemd-analyze is my friend here, but I've not found an
effective way to grok the problem.  systemd-analyze dot either gives me
too much to comprehend or not enough to make the problem evident.  I've
tried feeding it different PATTERNS: none at all, sshd.service,
ssh-keygen.target, sshd-{persist,restore}-keys.service and more.  I
either see a whole forest or just a few trees that seem to be getting
along happily together.

What advice can you suggest?
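The "Fixing conflicting jobs" messages in the log excerpt above can be
condensed to one line per dropped job, which makes it easier to spot a
start job that was thrown away.  A minimal sketch, assuming the debug log
arrives on standard input with each journal message on a single line (the
excerpt in the mail is wrapped); the function name summarize_conflicts is
hypothetical and not part of systemd:

```shell
#!/bin/sh
# summarize_conflicts (hypothetical helper): read journal text on stdin
# and print one line per job that systemd deleted while resolving a
# conflict, in the form "dropped=<job> conflict=<job,job>".
summarize_conflicts() {
    sed -n 's/.*Fixing conflicting jobs \([^ ]*\) by deleting job \([^ ]*\).*/dropped=\2 conflict=\1/p'
}

# Possible usage on a machine booted with systemd.log_level=debug:
#   journalctl -b -o cat | summarize_conflicts
```

In the log above, only the sshd.service stop job is marked
conflicted_by=yes, which suggests it was pulled in through a Conflicts=
dependency of some other unit.  If that reading is right,
"systemctl show -p ConflictedBy sshd.service" should name the units that
declare such a conflict.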


Re: [systemd-devel] more verbose debug info than systemd.log_level=debug?

2017-03-22 Thread Chris Murphy
On Tue, Mar 21, 2017 at 9:48 PM, Andrei Borzenkov wrote:
> 22.03.2017 00:10, Chris Murphy wrote:
>> OK so I had the idea to uninstall plymouth, since that's ostensibly
>> what's holding up the remount read-only. But it's not true.
>>
>> Sending SIGTERM to remaining processes...
>> Sending SIGKILL to remaining processes...
>> Unmounting file systems.
>> Remounting '/tmp' read-only with options 'seclabel'.
>> Unmounting /tmp.
>> Remounting '/' read-only with options 'seclabel,attr2,inode64,noquota'.
>> Remounting '/' read-only with options 'seclabel,attr2,inode64,noquota'.
>> Remounting '/' read-only with options 'seclabel,attr2,inode64,noquota'.
>> All filesystems unmounted.
>
> Could you show your /proc/self/mountinfo before starting shutdown (or
> ideally just before systemd goes into umount all)? This suggests that "/"
> appears there three times.

I'm too stupid to figure out how to get virsh console to attach to
tty9/early debug shell but here's a screen shot right as
pk-offline-update is done, maybe 2 seconds before the remounting and
reboot.
https://drive.google.com/open?id=0B_2Asp8DGjJ9NXRGTTFjSlVPSU0
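
The check Andrei asks for, whether "/" is listed more than once in
/proc/self/mountinfo, can be sketched from a shell as follows.  This is
only an illustration: the function name count_mounts is hypothetical, and
it relies on the mount point being the fifth whitespace-separated field
of each mountinfo line.

```shell
#!/bin/sh
# count_mounts (hypothetical helper): print "<count> <mount point>" for
# every mount point found in a mountinfo-style file.
# Defaults to /proc/self/mountinfo when no file is given.
count_mounts() {
    # Field 5 of each mountinfo line is the mount point.
    awk '{ n[$5]++ } END { for (m in n) print n[m], m }' \
        "${1:-/proc/self/mountinfo}" | sort -k2
}

# Possible usage just before shutdown:
#   count_mounts | grep ' /$'
```

A count above 1 for "/" would be consistent with the three "Remounting
'/' read-only" lines in the shutdown output.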

>
> Result code of "remount ro" is not evaluated or logged. systemd does
>
> (void) mount(NULL, m->path, NULL, MS_REMOUNT|MS_RDONLY, options);
>
> where "options" are those from /proc/self/mountinfo sans ro|rw.
>
> Probably it should log it at least with debug level.

So I've asked about this over on the XFS list, and they suggest all of
this is expected behavior under the circumstances. The sync only means
data is committed to disk with an appropriate journal entry; it doesn't
mean fs metadata is up to date. It's the fs metadata that GRUB depends
on, and that isn't up to date yet. So the suggestion is that if
remount-ro fails, use freeze/unfreeze and then reboot. The difference
between freeze/unfreeze and remount-ro is that freeze/unfreeze will
update fs metadata even if there's something preventing remount-ro.

If it's useful, I'll file an issue with systemd on GitHub to get a
freeze/unfreeze inserted. remount-ro isn't always successful, and
clearly it's not OK to reboot anyway if remount-ro fails.
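
The fallback described above (try remount-ro, and freeze/unfreeze if it
fails) might look like this when done by hand from a shell.  This is a
sketch of the idea only, not what systemd's shutdown code does; the
function name flush_fs is hypothetical, fsfreeze is the util-linux tool,
and both commands need root.

```shell
#!/bin/sh
# flush_fs (hypothetical helper): try to remount MOUNTPOINT read-only;
# if that fails, freeze and immediately thaw the filesystem so that its
# metadata is written out before a reboot.
flush_fs() {
    mp=$1
    if mount -o remount,ro "$mp"; then
        echo "remounted $mp read-only"
        return 0
    fi
    # Unlike the (void) mount() call quoted above, report the failure.
    echo "remount-ro of $mp failed, falling back to freeze/unfreeze" >&2
    fsfreeze --freeze "$mp" && fsfreeze --unfreeze "$mp"
}

# Possible usage (as root):
#   flush_fs /
```

Whether freezing "/" this late in shutdown is safe in every setup is an
open question here, so treat it as something to test in a VM first.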



-- 
Chris Murphy