Bug#808151: systemd: failed to start remount root and kernel file system

2016-01-17 Thread Michael Biebl
On 17.01.2016 at 16:02, Frank B. Brokken wrote:
> Dear Michael Biebl, you wrote:
> 
>> After further consideration, I'm downgrading this bug to important.
>>
>> It only happens for split-usr setups and is apparently hard to trigger,
>> i.e. happens rarely due to race conditions.
> 
> Thanks for the update. I can live with that: after all 32 bit computers are
> more and more turning into things of the past

If I read your bug report correctly, this issue happens for you on 1 out
of 3 systems, right? The old Acer laptop.

My guess is that this is not 32-bit related, but rather that the 32-bit
system where you encounter this issue is rather old and likely slower
than your other systems.

As this looks like a race condition, you don't hit the problem on your
newer systems.

See the logs I posted. I can partially reproduce the problem: systemd
attempts to stop usr.mount, but that request is cancelled quickly enough
that the actual umount never happens.


Michael

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2016-01-16 Thread Michael Biebl
Control: severity -1 important

After further consideration, I'm downgrading this bug to important.

It only happens for split-usr setups and is apparently hard to trigger,
i.e. happens rarely due to race conditions.

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2016-01-04 Thread Martin Pitt
Hello Frank,

Frank B. Brokken [2016-01-04 12:06 +0100]:
> However, if there is/are .deb packages available that I can use to test the
> new, hopefully fixed, version I'm of course more than willing to give it a
> try: in that case all dependencies and the installation/removal of files are
> handled by the package installer, and I don't have to be afraid of possibly
> breaking the operational integrity of my computer.

You reported the bug on i386, so I built packages with that patch for
that arch, and put them on

 https://people.debian.org/~mpitt/tmp/systemd-808151/

You can use that as an apt source by adding

  deb [trusted=yes] http://people.debian.org/~mpitt/tmp/systemd-808151/ /

to /etc/apt/sources.list (or sources.list.d/pitti.list or similar),
then running "apt update" and "apt upgrade".
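
The steps above could be scripted roughly as follows. This is only a
sketch: the helper name add_test_repo and its optional file argument are
made up for illustration, and the default path simply uses the
"pitti.list" example name mentioned above.

```shell
#!/bin/sh
# Sketch only: write the apt source line from this message into a list file.
# add_test_repo and its optional argument are illustrative, not an apt tool.
add_test_repo() {
    list_file="${1:-/etc/apt/sources.list.d/pitti.list}"
    printf '%s\n' \
        'deb [trusted=yes] http://people.debian.org/~mpitt/tmp/systemd-808151/ /' \
        > "$list_file"
}

# Intended use, as root:
#   add_test_repo
#   apt update
#   apt upgrade
```

Keeping the entry in its own file under sources.list.d should make it easy
to remove again once testing is done.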

Thanks,

Martin
-- 
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)




Bug#808151: systemd: failed to start remount root and kernel file system

2016-01-04 Thread Michael Biebl
On 05.01.2016 at 00:58, Michael Biebl wrote:
> On 03.01.2016 at 16:21, Martin Pitt wrote:

>>   https://github.com/systemd/systemd/commit/9d06297e26
>>
>> Frank, Michael, if possible, it would be nice if you could confirm
>> that this really fixes your case as well?
> 
> I tried the patch, but I still get odd behaviour here.
> 
> Systemd still tries to stop usr.mount during boot (full log attached).

The behaviour did change a bit though. I don't see the

> Dez 19 23:18:46 debian systemd[1]: usr.mount: Unit is bound to inactive unit 
> dev-sda5.device. Stopping, too.

log message anymore.
So something changed, but systemd still behaves incorrectly I think, as
it tries to run umount on sda5.

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2016-01-03 Thread Martin Pitt
Hello Frank,

Michael Biebl [2015-12-20  2:08 +0100]:
> So the best guess is that the timing in v228 changed a little (some code
> paths became faster). This would confirm Frank's findings that enabling
> verbose output (which slows down boot a bit) made it less likely to hit.

This also came up recently in a different context, and this
auto-cleanup was massively dialed back in upstream master:

  https://github.com/systemd/systemd/commit/9d06297e26

Frank, Michael, if possible, it would be nice if you could confirm
that this really fixes your case as well?

Thanks,

Martin
-- 
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer  (www.debian.org)




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-20 Thread Frank B. Brokken
Dear Michael Biebl, you wrote:
> Here is a more complete log excerpt for v228 (full log attached)
> 
> > Dez 20 01:27:42 debian systemd[1]: -.mount: Changed dead -> mounted
...
> 
> So the best guess is that the timing in v228 changed a little (some code
> paths became faster). This would confirm Frank's findings that enabling
> verbose output (which slows down boot a bit) made it less likely to hit.
> 
> Martin has been working/debugging the tentative stuff in the past, so
> maybe he has some insights here.
> 
> We should probably also involve upstream at some point.

OK, thanks for the help and (for me at least) final conclusion. For me
personally the problem has been solved: for the time being I'm happy with 227,
and I'm sure that the problem will soon be fixed.

Thanks again for helping along!

Cheers,

-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Frank B. Brokken
Dear Michael Biebl, you wrote:

> > This information is available at https://www.icce.rug.nl/systemd in the 
> > files 
> > initramfs.debug and alb.
> 
> Hm, unfortunately the journal dump is incomplete again. I have no idea why

Remarkable. I made it available the way I got it, so that's apparently what
there is.

> > booting procedure. You're sure it can't be some timing problem? 
> 
> Well, what kind of timing problem do you have in mind?

Don't know: I didn't design systemd. But if it's doing things in parallel then
maybe on newer, faster, computers things might have completed, like remounting
/usr rw before it's actually used. A race condition might then explain why the
problem doesn't always show itself, and why chances of failure are reduced
when more time is spent writing debug/verbose messages.

 
> So far, the only thing I can say for sure looking at the initramfs log,
> is that /usr has been mounted successfully in the initramfs.
> 
> "Something" apparently causes /usr to be unmounted later on. Which part
> and why that is, is not clear yet.
> 
> Do you have any (custom) init scripts in /etc/rcS.d/ which fiddle around
> with mount settings, run telinit or stuff like that?

Nope.

> I'm running out of ideas, tbh.

Well, that's already a *lot* more than I could offer myself :-) But
fortunately (for me, but hard to fix, I realize), the problem doesn't emerge
all the time. If rebooting every now and then gets me a running system, then
so be it. The FailureAction=reboot-force entry in systemd-remount-fs.service
already has proven to be my friend :-)


> If you suspect the remount service to be the cause for this, let's
> output the mounts before and after
> For that run
> $ systemctl edit systemd-remount-fs.service

When I issue that command I get the reply

Warning: systemd-remount-fs.service changed on disk. Run 'systemctl
daemon-reload' to reload units.

I guess the warning is obvious as I edited the file 

/lib/systemd/system/local-fs.target.wants/systemd-remount-fs.service

to prevent the reboot action. So I did as advised and reran the command,
but got an empty file in my editor, while the following message was shown
after ending the editor:

Editing "/etc/systemd/system/systemd-remount-fs.service.d/override.conf"
canceled: temporary file is empty.

> Then add
> [Service]
> ExecStartPre=/bin/sh -c 'echo "before rootfs remount"; findmnt'
> ExecStartPost=/bin/sh -c 'echo "after rootfs remount"; findmnt'
> 
> Reboot and attach the journal log again.

Instead of running the systemctl command I edited the file
/lib/systemd/system/local-fs.target.wants/systemd-remount-fs.service and added
the lines you suggested. My next e-mail is about the contents of journal log.

Thereafter I'll try to downgrade to the previous version to see what
happens then.


-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Frank B. Brokken
Hi Michael,

The journalctl -alb output after adding:

> Then add
> [Service]
> ExecStartPre=/bin/sh -c 'echo "before rootfs remount"; findmnt'
> ExecStartPost=/bin/sh -c 'echo "after rootfs remount"; findmnt'

and rebooting (until failure) is at https://www.icce.rug.nl/systemd/alb-1650

It does contain the 'before rootfs' line, but the 'after rootfs' line isn't
there:

$ grep 'before rootfs' *1650
Dec 19 16:45:18 localhost.localdomain sh[430]: before rootfs remount
Dec 19 16:45:20 localhost.localdomain sh[487]: before rootfs remount
Dec 19 16:45:21 localhost.localdomain sh[516]: before rootfs remount
Dec 19 16:45:24 localhost.localdomain sh[620]: before rootfs remount
$ grep 'after rootfs' *1650
$

Next thing I'll try is to downgrade to 227-2.

-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Frank B. Brokken
Hi Michael,

I downgraded to the following versions of the following packages:

libpam-systemd_227-2_i386.deb  libudev1_227-2_i386.deb
libsystemd-dev_227-2_i386.deb  systemd-sysv_227-2_i386.deb
libsystemd0_227-2_i386.deb     systemd_227-2_i386.deb
libudev-dev_227-2_i386.deb     udev_227-2_i386.deb

Thereafter I rebooted several times without encountering any problems. Also
with reduced output (grub's option 'quiet') no problems were encountered.

Cheers,

-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Michael Biebl
On 19.12.2015 at 16:27, Frank B. Brokken wrote:
> Dear Michael Biebl, you wrote:
> 
>>> This information is available at https://www.icce.rug.nl/systemd in the 
>>> files 
>>> initramfs.debug and alb.
>>
>> Hm, unfortunately the journal dump is incomplete again. I have no idea why
> 
> Remarkable. I made it available the way I got it, so that's apparently what
> there is.

I've set up a test VM with a split /usr.
While I can't quite reproduce the problem you have, I found this in the
logs (unfortunately, these are exactly the early lines that are missing
from yours):

> Dez 19 23:18:46 debian systemd[1]: systemd 228 running in system mode. (+PAM 
> +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT 
> +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN)
> Dez 19 23:18:46 debian systemd[1]: Detected virtualization kvm.
> Dez 19 23:18:46 debian systemd[1]: Detected architecture x86-64.
> Dez 19 23:18:46 debian systemd[1]: Set hostname to .
> Dez 19 23:18:46 debian systemd[1]: usr.mount: Unit is bound to inactive unit 
> dev-sda5.device. Stopping, too.

Apparently systemd considers the /dev/sda5 device to be inactive and so tries
to stop usr.mount, which might result in /usr being unmounted.

I guess that's also the root cause of your problem.
I think I have enough information for now to further investigate this on
my own.

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Frank B. Brokken
Dear Michael Biebl, you wrote:
> On 18.12.2015 at 15:59, Frank B. Brokken wrote:
> > Is there a way to determine that? What I do to upgrade the system is run
> > 'aptitude update' and then 'aptitude upgrade'. Is there a log somewhere that
> > tells me what packages and versions were updated at what moments in time?
> 
> /var/log/dpkg.log is a low-level log.
> 
> and then there is one for aptitude at /var/log/aptitude

Thanks: I made the logs available at https://www.icce.rug.nl/systemd


> ...
> Btw, you mentioned that this happened after an upgrade. Which previous
> version did you run which worked fine? Which other packages were
> upgraded along with it?

Tue, Dec  1 2015 14:07:23 +0100: 
the aptitude log shows an upgrade from systemd 227-2 to 228-2 

dpkg log: 2015-12-01 14:08:42 upgrade systemd:i386 227-2 228-2

dpkg log: 2015-12-03 08:30:01 upgrade systemd:i386 228-2 215-17+deb8u2

Thu, Dec  3 2015 08:31:37 +0100
the aptitude log shows an upgrade from systemd 215-17+deb8u2 -> 228-2

dpkg log: 2015-12-03 08:31:40 upgrade systemd:i386 215-17+deb8u2 228-2

and then, recently, by me trying to downgrade:

dpkg log: 2015-12-17 12:59:12 upgrade systemd:i386 228-2 228-2
dpkg log: 2015-12-18 16:15:37 upgrade systemd:i386 228-2 215-17+deb8u2
dpkg log: 2015-12-18 16:17:11 upgrade systemd:i386 215-17+deb8u2 228-2

Before Dec 1 no updates were recorded for systemd or udev, and until the
upgrades early December everything ran fine.




> If you downgrade systemd/udev, does the problem go away?

As I feared, downgrading is difficult because of the many reverse
dependencies. 

I looked at 

ftp://ftp.de.debian.org/debian/pool/main/s/systemd/

which is the mirror I usually use for earlier .deb files, but the one before
228-2 is 215-17, and 227-2 is only available as source archives and not AFAICS
as .deb packages.

I'll add the debug entry next (cf. your mail from Date: Fri, 18 Dec 2015
03:15:15 +0100) and let you know the results.


-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Frank B. Brokken
Hi Michael,

As announced in my previous e-mail:

> I'll add the debug entry next (cf. your mail from Date: Fri, 18 Dec 2015
> 03:15:15 +0100) and let you know the results.

This information is available at https://www.icce.rug.nl/systemd in the files 
initramfs.debug and alb.

Maybe useful to note: it took like four or five reboot attempts before the
booting process eventually failed. This time even more output than with using
'verbose' flashes by during the booting process, which somewhat slows down the
booting procedure. You're sure it can't be some timing problem? 

-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Michael Biebl
Here is a more complete log excerpt for v228 (full log attached)

> Dez 20 01:27:42 debian systemd[1]: -.mount: Changed dead -> mounted
> Dez 20 01:27:42 debian systemd[1]: usr.mount: Changed dead -> mounted
> Dez 20 01:27:42 debian systemd[1]: usr.mount: Unit is bound to inactive unit 
> dev-sda5.device. Stopping, too.
> Dez 20 01:27:42 debian systemd[1]: usr.mount: Trying to enqueue job 
> usr.mount/stop/fail
> Dez 20 01:27:42 debian systemd[1]: usr.mount: Installed new job 
> usr.mount/stop as 1
> Dez 20 01:27:42 debian systemd[1]: usr.mount: Enqueued job usr.mount/stop as 1
> Dez 20 01:27:42 debian systemd[1]: dev-sda1.device: Changed dead -> tentative
> Dez 20 01:27:42 debian systemd[1]: -.slice changed dead -> active
> Dez 20 01:27:42 debian systemd[1]: dev-sda5.device: Changed dead -> tentative
> Dez 20 01:27:42 debian systemd[1]: init.scope changed dead -> running
> Dez 20 01:27:42 debian systemd[1]: Activating default unit: default.target
> Dez 20 01:27:42 debian systemd[1]: graphical.target: Trying to enqueue job 
> graphical.target/start/isolate
> Dez 20 01:27:42 debian systemd[1]: display-manager.service: Cannot add 
> dependency job, ignoring: Unit display-manager.service failed to load: No 
> such file or directory.
> Dez 20 01:27:42 debian systemd[1]: usr.mount: Job usr.mount/stop finished, 
> result=canceled

So in my case the stop request is cancelled, probably because the state
has changed from dead to tentative quickly enough. So this looks like a
race indeed.
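
To check a saved journal dump for the same pattern, something along these
lines might help. It is only a sketch: summarize_usr_mount_stop is a
made-up helper name, and it assumes the dump is a plain text file with one
log message per line (for example from "journalctl -alb > journal.txt"),
not the wrapped form quoted above.

```shell
#!/bin/sh
# Sketch only: classify what happened to the usr.mount stop job in a saved
# journal dump. summarize_usr_mount_stop is an illustrative helper, not a
# systemd tool; it assumes one log message per line in the dump.
summarize_usr_mount_stop() {
    log="$1"
    if ! grep -q 'usr\.mount: Unit is bound to inactive unit' "$log"; then
        # systemd never decided to stop usr.mount in this boot
        echo "no stop request logged"
    elif grep -q 'usr\.mount/stop finished, result=canceled' "$log"; then
        # the stop job was enqueued but cancelled before it ran
        echo "stop request canceled"
    else
        # the stop job was enqueued and no cancellation was logged
        echo "stop request not canceled"
    fi
}
```

A dump that reports "stop request not canceled" would be the interesting
case, since that is where the umount may actually have been attempted.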

I also tested v227 and it shows exactly the same behaviour. So it seems this
issue has been lurking for a while (see attached journal).
My guess is that it goes way back to [1] or even [2].

So the best guess is that the timing in v228 changed a little (some code
paths became faster). This would confirm Frank's findings that enabling
verbose output (which slows down boot a bit) made it less likely to hit.

Martin has been working/debugging the tentative stuff in the past, so
maybe he has some insights here.

We should probably also involve upstream at some point.


[1] https://github.com/systemd/systemd/commit/f620094
[2] https://github.com/systemd/systemd/commit/628c89c

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?


journal-228.xz
Description: application/xz


journal-227.xz
Description: application/xz




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Michael Biebl
On 19.12.2015 at 12:40, Frank B. Brokken wrote:
> Dear Michael Biebl, you wrote:

>> If you downgrade systemd/udev, does the problem go away?
> 
> As I feared, downgrading is difficult because of the many reverse
> dependencies. 
> 
> I looked at 
> 
> ftp://ftp.de.debian.org/debian/pool/main/s/systemd/
> 
> which is the mirror I usually use for earlier .deb files, but the one before
> 228-2 is 215-17, and 227-2 is only available as source archives and not AFAICS
> as .deb packages.
> 

See http://snapshot.debian.org/
It contains all uploaded versions, including 227-2

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-19 Thread Michael Biebl
On 19.12.2015 at 13:06, Frank B. Brokken wrote:
> Hi Michael,
> 
> As announced in my previous e-mail:
> 
>> I'll add the debug entry next (cf. your mail from Date: Fri, 18 Dec 2015
>> 03:15:15 +0100) and let you know the results.
> 
> This information is available at https://www.icce.rug.nl/systemd in the files 
> initramfs.debug and alb.

Hm, unfortunately the journal dump is incomplete again. I have no idea why.

> Maybe useful to note: it took like four or five reboot attempts before the
> booting process eventually failed. This time even more output than with using
> 'verbose' flashes by during the booting process, which somewhat slows down the
> booting procedure. You're sure it can't be some timing problem? 

Well, what kind of timing problem do you have in mind?

So far, the only thing I can say for sure looking at the initramfs log,
is that /usr has been mounted successfully in the initramfs.

"Something" apparently causes /usr to be unmounted later on. Which part
and why that is, is not clear yet.

Do you have any (custom) init scripts in /etc/rcS.d/ which fiddle around
with mount settings, run telinit or stuff like that?

I'm running out of ideas, tbh.

If you suspect the remount service to be the cause for this, let's
output the mounts before and after it runs.
For that, run
$ systemctl edit systemd-remount-fs.service

Then add
[Service]
ExecStartPre=/bin/sh -c 'echo "before rootfs remount"; findmnt'
ExecStartPost=/bin/sh -c 'echo "after rootfs remount"; findmnt'

Reboot and attach the journal log again.
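
If "systemctl edit" is inconvenient, the same drop-in could presumably be
written by hand. The following is a sketch under assumptions:
write_remount_dropin and its optional root-prefix argument are made up for
illustration, and the path is the override.conf location that
"systemctl edit" itself uses for this unit.

```shell
#!/bin/sh
# Sketch only: create the drop-in by hand instead of via "systemctl edit".
# write_remount_dropin and its optional root-prefix argument are illustrative.
write_remount_dropin() {
    dir="${1:-}/etc/systemd/system/systemd-remount-fs.service.d"
    mkdir -p "$dir" || return 1
    # Quoted heredoc delimiter: the content is written verbatim.
    cat > "$dir/override.conf" <<'EOF'
[Service]
ExecStartPre=/bin/sh -c 'echo "before rootfs remount"; findmnt'
ExecStartPost=/bin/sh -c 'echo "after rootfs remount"; findmnt'
EOF
}

# Intended use, as root:
#   write_remount_dropin
#   systemctl daemon-reload
```

A drop-in under /etc leaves the packaged unit file in /lib untouched, so
removing the override.conf file later undoes the change.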

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-18 Thread Frank B. Brokken
Dear Michael Biebl, you wrote:

> Well, /usr is mounted by the initramfs these days. So it should already
> be available when systemd is started. If that fails, this is a bug which
> needs to be addressed by initramfs-tools (or one of the hook scripts).
> It wasn't clear so far that /usr hasn't been mounted at all.
> 
> Is /usr on LVM, RAID, etc?

No, nothing like that. And for what it's worth: the problem only appeared
after I upgraded systemd last week. The laptop has nothing special in its
setup, and has been working perfectly for years, until last week when systemd
was renewed. I think I mentioned in my bug report that /usr wasn't
mounted.

In your next reply you wrote:

> I'm a bit confused by those logs. They show that sda5 has been mounted.
> 
> Dec 17 15:44:29 localhost.localdomain kernel: EXT4-fs (sda5): mounting
> ext3 file system using the ext4 subsystem
> Dec 17 15:44:29 localhost.localdomain kernel: EXT4-fs (sda5): mounted
> filesystem with ordered data mode. Opts: (null)
> 
> I figure /dev/sda5 is your /usr partition? Just to be sure, please
> attach ls -la /dev/disk/by-uuid/

I seem to remember that message, in particular the Opts: (null) remark, and I
think at that point /usr was mounted by me from the systemd shell. Also,
couldn't it be that initramfs *did* do the mount, but that remounting it rw,
as reported in the error message, is the problem? Also, to me it appears
remarkable that by removing the 'quiet' from the kernel parameters, so that
things go a bit slower because of the extra messages that are displayed, the
frequency of failing boot procedures is greatly diminished. I'm considering
trying to add 'verbose' to grub's parameters to see if that produces more
output and maybe further reduces the frequency, but I haven't had the time to
do that yet. Something on the TODO list :-)

Anyway, here's the ls -la output:

total 0
drwxr-xr-x 2 root root 200 Dec 18 13:05 ./
drwxr-xr-x 5 root root 100 Dec 18 13:02 ../
lrwxrwxrwx 1 root root  10 Dec 18 13:02 04b82e8b-f871-4abb-978a-44ae44c5d1f7 -> ../../sda1
lrwxrwxrwx 1 root root  10 Dec 18 13:02 595bcdbf-6436-45a7-99d2-297a3dd85930 -> ../../sda6
lrwxrwxrwx 1 root root  10 Dec 18 13:02 693c71eb-d411-4ee0-a1b3-c577df02e01b -> ../../sda9
lrwxrwxrwx 1 root root  10 Dec 18 13:02 6bcb2a05-33c9-402b-8093-e6a35ffd7aa1 -> ../../sda8
lrwxrwxrwx 1 root root  11 Dec 18 13:05 82e52787-6072-4af9-a5e6-2d88c365e62b -> ../../loop0
lrwxrwxrwx 1 root root  10 Dec 18 13:02 c5591eff-0a6c-4310-bb11-7d5535f7da7b -> ../../sda7
lrwxrwxrwx 1 root root  10 Dec 18 13:05 e289e4ad-be1d-42a8-9b38-f4dad9473520 -> ../../dm-0
lrwxrwxrwx 1 root root  10 Dec 18 13:02 ea8202e7-4564-424c-af70-a6a640fafb65 -> ../../sda5
~

I'll do the 'debug' addition later this weekend, like you requested.

Finally, you asked:

> Do you have any custom udev rules in /etc/udev/rules.d?

I don't think so, looking at the time stamps nothing has been changed there for 
years:

total 10
drwxr-xr-x 2 root root 3072 Dec  6  2014 ./
drwxr-xr-x 4 root root 1024 Dec  3 08:34 ../
-rw-r--r-- 1 root root  115 Dec  6  2014 70-automount.rules
-rw-r--r-- 1 root root 3841 Dec  6  2014 70-persistent-cd.rules
-rw-r--r-- 1 root root  895 Feb 26  2013 70-persistent-net.rules

And I definitely didn't change any files there recently, so again: the problem
appeared out of the blue since last week's upgrade.

I hope the above gives you at least some additional info. As I wrote: I'll do
the 'debug' addition tomorrow.

Cheers,

-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA



Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-18 Thread Michael Biebl
On 18.12.2015 at 13:34, Frank B. Brokken wrote:
>> I'm a bit confused by those logs. They show that sda5 has been mounted.
>>
>> Dec 17 15:44:29 localhost.localdomain kernel: EXT4-fs (sda5): mounting
>> ext3 file system using the ext4 subsystem
>> Dec 17 15:44:29 localhost.localdomain kernel: EXT4-fs (sda5): mounted
>> filesystem with ordered data mode. Opts: (null)
> 
> I seem to remember that message, in particular the Opts: (null) remark, and I
> think at that point /usr was mounted by me from the systemd shell. Also,
> couldn't it be that initramfs *did* do the mount, but that remounting it rw,
> as reported in the error message is the problem? Also, to me it appears

The verbose debug log from the initramfs and systemd can maybe tell us
more. But looking at https://www.icce.rug.nl/systemd/journalctl, the
sda5 mount happens at line 773, the first errors start at line 785 and
the remount is at line 802.
So it looks like /usr is not mounted at the time
systemd-remount-fs.service is run.

What's also curious is that the log doesn't seem to be complete.
Usually systemd's first log line is something like

> Dez 18 07:03:47 pluto systemd[1]: systemd 228 running in system mode. (+PAM 
> +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT 
> +GNUTLS +ACL +X
> Dez 18 07:03:47 pluto systemd[1]: Detected architecture x86-64.

Those early boot messages seem to be missing completely.


Btw, you mentioned that this happened after an upgrade. Which previous
version did you run which worked fine? Which other packages were
upgraded along with it?
If you downgrade systemd/udev, does the problem go away?


-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?



Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-18 Thread Frank B. Brokken
Dear Michael Biebl, you wrote:

> The verbose debug log from the initramfs and systemd can maybe tell us
> more. But looking at https://www.icce.rug.nl/systemd/journalctl, the
> sda5 mount happens at line 773, the first errors start at line 785 and
> the remount is at line 802.
> So it looks like /usr is not mounted at the time
> systemd-remount-fs.service is run.

Right. That's consistent with the impression I got from the error messages.
*Why* that is true, however, eludes me.

> 
> What's also curious is, that the log doesn't seem to be complete.
> Usually systemd's first log line is something like
> 
> > Dez 18 07:03:47 pluto systemd[1]: systemd 228 running in system mode. (+PAM 
> > +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP 
> > +GCRYPT +GNUTLS +ACL +X
> > Dez 18 07:03:47 pluto systemd[1]: Detected architecture x86-64.
> 
> Those early boot messages seem to be missing completely.

Well, I didn't edit anything. The information I generated is passed to you the
way it was made available by the various programs/commands.


> Btw, you mentioned that this happened after an upgrade. Which previous
> version did you run which worked fine? Which other packages were
> upgraded along with it?

Is there a way to determine that? What I do to upgrade the system is run
'aptitude update' and then 'aptitude upgrade'. Is there a log somewhere that
tells me what packages and versions were updated at what moments in time?


> If you downgrade systemd/udev, does the problem go away?

I thought about doing that, but was afraid of an avalanche of forced
downgrades of packages that might now depend on the most recent udev and
systemd versions. But I'll give it a try asap and let you know the results.

-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA



Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-18 Thread Michael Biebl
On 18.12.2015 at 15:59, Frank B. Brokken wrote:
> Is there a way to determine that? What I do to upgrade the system is run
> 'aptitude update' and then 'aptitude upgrade'. Is there a log somewhere that
> tells me what packages and versions were updated at what moments in time?

/var/log/dpkg.log is a low-level log.

and then there is one for aptitude at /var/log/aptitude
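
To pull the upgrade history of a single package out of the dpkg log, a
small filter along these lines may be enough. It is a sketch:
package_upgrade_history is a made-up helper name, and it assumes the usual
"DATE TIME upgrade PACKAGE:ARCH OLD NEW" line layout of dpkg.log.

```shell
#!/bin/sh
# Sketch only: list the "upgrade" entries for one package from a dpkg log.
# package_upgrade_history is an illustrative helper, not part of dpkg.
# The package name is used inside a regular expression, so names containing
# regex metacharacters (such as "+") would need escaping first.
package_upgrade_history() {
    pkg="$1"
    log="${2:-/var/log/dpkg.log}"
    # Match " upgrade <pkg> " with an optional ":arch" suffix on the name,
    # so "systemd" does not also match "systemd-sysv" or "libsystemd0".
    grep -E " upgrade ${pkg}(:[^ ]+)? " "$log"
}

# Intended use:
#   package_upgrade_history systemd
#   package_upgrade_history udev
```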

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-17 Thread Michael Biebl
On 17.12.2015 at 13:46, Frank B. Brokken wrote:
> Dear Michael Biebl, you wrote:
> 
>> What happens afterwards? Are you dropped into the rescue shell?
> 
> Afterwards (i.e., after the initial failure message) the system tries to
> continue booting, but shows lots of failure messages, eventually grinding to a
> halt. No reboot (e.g. ctrl-alt-del) is possible and there's no rescue shell.
> 
>> If not, try to enable the debug shell by adding "systemd.debug-shell" to
>> the kernel command line. This will give you a root shell on tty9.
> 
> Unfortunately, it doesn't, since the system halts (I first removed the
> automatic reboot on failure).

What exactly do you mean by halt? Does the system lock up completely, so
that you can't use the keyboard to switch to tty9?

That would sound like a kernel problem.

> However, during the process I observed that setting systemd.debug-shell and
> removing the default 'quiet' specification from grub's /etc/default/grub (so,
> now it specifies:
> 
> GRUB_CMDLINE_LINUX_DEFAULT="systemd.debug-shell" 
> 
> greatly reduces the chances of a failing boot. Not completely, but
> greatly. Still, when rebooting fails there's just the plain halt, w/o a debug
> shell. Since removing the quiet also produces a lot more output on the screen,
> might my problem not simply be some timing problem?

Can you make a screenshot or a video of the boot process with "quiet"
removed from the kernel command line?


-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-17 Thread Frank B. Brokken
Dear Michael Biebl, you wrote:

> What happens afterwards? Are you dropped into the rescue shell?

Afterwards (i.e., after the initial failure message) the system tries to
continue booting, but shows lots of failure messages, eventually grinding to a
halt. No reboot (e.g. ctrl-alt-del) is possible and there's no rescue shell.

> If not, try to enable the debug shell by adding "systemd.debug-shell" to
> the kernel command line. This will give you a root shell on tty9.

Unfortunately, it doesn't, since the system halts (I first removed the
automatic reboot on failure).

However, during the process I observed that setting systemd.debug-shell and
removing the default 'quiet' specification from grub's /etc/default/grub, so
that it now specifies:

GRUB_CMDLINE_LINUX_DEFAULT="systemd.debug-shell" 

greatly reduces the chances of a failing boot. Not completely, but
greatly. Still, when rebooting fails there's just the plain halt, w/o a debug
shell. Since removing the quiet also produces a lot more output on the screen,
might my problem not simply be some timing problem?
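
For reference, the change to /etc/default/grub described above can be sketched
as a small shell function. This is only a sketch: the function name
set_grub_cmdline and the ".bak" backup suffix are made up for illustration,
and the value must not contain "|" or "&", which sed would interpret.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the GRUB_CMDLINE_LINUX_DEFAULT assignment.
# $1 = new value, $2 = file to edit (defaults to /etc/default/grub).
set_grub_cmdline() {
    value="$1"
    file="${2:-/etc/default/grub}"
    # Keep a backup copy next to the file before rewriting it.
    cp "$file" "$file.bak" || return 1
    # Replace the whole assignment line; all other lines are copied as-is.
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$value\"|" \
        "$file.bak" >"$file"
}
```

Calling set_grub_cmdline "systemd.debug-shell" as root and then running
update-grub regenerates the boot configuration with 'quiet' removed.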


-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-17 Thread Frank B. Brokken
Dear Michael Biebl, you wrote:
> On 17.12.2015 at 13:46, Frank B. Brokken wrote:

> > halt. No reboot (e.g. ctrl-alt-del) is possible and there's no rescue
> > shell.
>
> What exactly do you mean by halt? Does the system lock up completely, so
> that you can't use the keyboard and switch to tty9?

No, that's not what happens. I mean that doing a reboot using ctrl-alt-del
isn't possible. Switching VTs is no problem, but except for VT1 nothing was
being shown. But maybe I overlooked things when I sent you the previous reply:
see below.

> That would sound like a kernel problem.

> > might my problem not simply be some timing problem?
> 
> Can you make a screenshot or a video from the boot process with "quiet"
> removed from the kernel command line.

I did. Not only that, I also tried to reboot again and this time I was able to
run the commands you asked for earlier from tty9:

systemctl status
systemd-analyze dump
journalctl -alb

This time the debug shell prompt was available at tty9, although booting
failed. And in line with my previous findings, systemd-analyze and journalctl
weren't available, as they live in /usr/bin, and /usr hadn't been mounted. But
after mounting /usr from tty9 with the mount command, systemd-analyze
and journalctl were available, so I also have the output from those commands
for you. The output, and the mp4 movie I made during the booting process are a
bit too large for the e-mail, but they are available for download/inspection
at https://www.icce.rug.nl/systemd/

Cheers,

-- 
Frank B. Brokken
Center for Information Technology, University of Groningen
(+31) 50 363 9281 
Public PGP key: http://pgp.surfnet.nl
Key Fingerprint: DF32 13DE B156 7732 E65E  3B4D 7DB2 A8BE EAE4 D8AA




Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-17 Thread Michael Biebl
On 17.12.2015 at 16:24, Frank B. Brokken wrote:
> This time the debug shell prompt was available at tty9, although booting
> failed. And in line with my previous findings, systemd-analyze and journalctl
> weren't available, as they live in /usr/bin, and /usr hadn't been mounted. But

Well, /usr is mounted by the initramfs these days. So it should already
be available when systemd is started. If that fails, this is a bug which
needs to be addressed by initramfs-tools (or one of the hook scripts).
Until now it wasn't clear that /usr hadn't been mounted at all.

Is /usr on LVM, RAID, etc?

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?





Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-17 Thread Michael Biebl
On 17.12.2015 at 22:38, Michael Biebl wrote:
> On 17.12.2015 at 16:24, Frank B. Brokken wrote:
>> This time the debug shell prompt was available at tty9, although booting
>> failed. And in line with my previous findings, systemd-analyze and journalctl
>> weren't available, as they live in /usr/bin, and /usr hadn't been mounted. 
>> But
> 
> Well, /usr is mounted by the initramfs these days. So it should already
> be available when systemd is started. If that fails, this is a bug which
> needs to be addressed by initramfs-tools (or one of the hook scripts).
> It wasn't clear so far that /usr hasn't been mounted at all.
> 
> Is /usr on LVM, RAID, etc?

I'm a bit confused by those logs. They show that sda5 has been mounted.

Dec 17 15:44:29 localhost.localdomain kernel: EXT4-fs (sda5): mounting ext3 file system using the ext4 subsystem
Dec 17 15:44:29 localhost.localdomain kernel: EXT4-fs (sda5): mounted filesystem with ordered data mode. Opts: (null)

I figure /dev/sda5 is your /usr partition? Just to be sure, please
attach ls -la /dev/disk/by-uuid/

If so, I'm honestly puzzled why later on the binaries from /usr/bin are
not found.
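
To pull such lines out of a longer log, a filter along these lines might help.
It is a rough sketch: the function name mounted_devices is invented here, and
it only recognises kernel messages of the "EXT4-fs (dev): mounted" form quoted
above. It uses shell builtins only, so it does not depend on /usr.

```shell
#!/bin/sh
# Hypothetical filter: read log lines on stdin and print the device name
# from every "EXT4-fs (<dev>): mounted ..." kernel message.
mounted_devices() {
    while IFS= read -r line; do
        case "$line" in
            *"EXT4-fs ("*"): mounted"*)
                # Strip everything up to and including "EXT4-fs (".
                dev=${line#*"EXT4-fs ("}
                # Strip everything from the closing parenthesis onwards.
                dev=${dev%%")"*}
                printf '%s\n' "$dev"
                ;;
        esac
    done
}
```

It could be fed from the journal, e.g. journalctl -alb | mounted_devices, once
journalctl is reachable.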

Could you try again with "debug" added to the kernel command line?
Then attach /run/initramfs/initramfs.debug and journalctl -alb (and
avoid mounting /usr manually on tty9).
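
To see from tty9 whether /usr is mounted without mounting it by hand, something
like the following could be used. It is a sketch under assumptions: the
function name mount_source is made up, and it simply reads the kernel's mount
table (/proc/self/mounts by default). Only shell builtins are used, so it also
works while the tools in /usr/bin are unavailable.

```shell
#!/bin/sh
# Hypothetical helper: print "<device> <fstype>" for a mount point, or
# return non-zero if nothing is mounted there.
# $1 = mount point, $2 = mount table to read (defaults to /proc/self/mounts).
mount_source() {
    target="$1"
    mounts="${2:-/proc/self/mounts}"
    result=""
    # Fields per line: device, mount point, filesystem type, options, ...
    while read -r dev mnt fstype rest; do
        if [ "$mnt" = "$target" ]; then
            # Later entries override earlier ones (stacked mounts).
            result="$dev $fstype"
        fi
    done <"$mounts"
    if [ -n "$result" ]; then
        printf '%s\n' "$result"
    else
        return 1
    fi
}
```

For example, mount_source /usr || echo "/usr is not mounted" gives a quick
answer before anything is changed on the system.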

Do you have any custom udev rules in /etc/udev/rules.d?


Michael

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?



Bug#808151: systemd: failed to start remount root and kernel file system

2015-12-16 Thread Michael Biebl
On 16.12.2015 at 15:43, Frank B. Brokken wrote:
> Package: systemd
> Version: 228-2
> Severity: critical
> Justification: breaks the whole system
> 
> Dear Maintainer,
> 
>* What led up to the situation?
> 
> Last week, Tuesday Dec. 8 I upgraded this system. One of the packages that
> were upgraded was systemd. After the upgrade rebooting the system failed, with
> the error mentioned in the subject.

What happens afterwards? Are you dropped into the rescue shell?
If not, try to enable the debug shell by adding "systemd.debug-shell" to
the kernel command line. This will give you a root shell on tty9.

There you should be able to inspect the system. The output of
systemctl status
systemd-analyze dump
journalctl -alb
would be helpful.
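
A minimal sketch of saving the three outputs to files from the debug shell, so
they can be attached later. The function name collect_diagnostics and the
default directory /run/debug-report are assumptions for illustration, not
anything systemd provides.

```shell
#!/bin/sh
# Hypothetical helper: run each requested command and store its output.
# $1 = output directory (defaults to the made-up path /run/debug-report).
collect_diagnostics() {
    outdir="${1:-/run/debug-report}"
    mkdir -p "$outdir" || return 1
    for spec in "systemctl status" "systemd-analyze dump" "journalctl -alb"; do
        cmd="${spec%% *}"            # first word is the program name
        outfile="$outdir/$cmd.txt"
        if command -v "$cmd" >/dev/null 2>&1; then
            # Word splitting of $spec is intentional: command plus arguments.
            $spec </dev/null >"$outfile" 2>&1 \
                || echo "exit status $? from: $spec" >>"$outfile"
        else
            # Record missing tools instead of failing silently.
            echo "$cmd not found in PATH" >"$outfile"
        fi
    done
}
```

Running collect_diagnostics with no argument leaves systemctl.txt,
systemd-analyze.txt and journalctl.txt in the output directory.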

-- 
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?


