Re: [gentoo-user] problems debugging a systemd problem
On 29.05.2015 01:57, cov...@ccs.covici.com wrote: OK, thanks much. so, does it boot now?
Re: [gentoo-user] problems debugging a systemd problem
Stefan G. Weichinger li...@xunil.at wrote: On 29.05.2015 01:57, cov...@ccs.covici.com wrote: OK, thanks much. so, does it boot now? Sure, once I found the typo and fixed it, it booted, I was just trying to save time in the future and I did learn a few things in the process. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 2:11 PM, cov...@ccs.covici.com wrote: Also, as Rich said, if you wait it's possible that systemd (and/or dracut) will drop you into a rescue shell anyway. Unfortunately, thanks to very slow hardware in the wild, the timeout has been increased to three minutes, and I believe those are *per hardware unit*. So if you have five disks, in theory it could take fifteen minutes to get you to a rescue shell. Thanks much. Does the rescue target try to mount all the disks? Also, I would still like to get in touch with the dracut devs -- although I may never make that particular mistake again, but maybe other things will happen. As I said in my previous mail: emergency mounts the root filesystem read-only; rescue mounts all the filesystems read/write. If dracut cannot mount the root filesystem, it *WILL* drop you to a shell, but it will take some time while all the timeouts expire. This could be *several* minutes depending on hardware. The dracut mailing list is in [1]. Regards. [1] http://vger.kernel.org/vger-lists.html#initramfs -- Canek Peláez Valdés Profesor de asignatura, Facultad de Ciencias Universidad Nacional Autónoma de México
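The "five disks, fifteen minutes" estimate quoted above is a simple product. A minimal sketch of that arithmetic follows; it assumes the timeout applies once per device and that the waits happen one after another, which is what the thread supposes rather than something measured. The function name is made up for illustration.

```shell
#!/bin/sh
# Upper bound on the wait before a rescue shell appears.
# Assumption (from the discussion, not verified): the timeout is applied
# per device, and the devices are waited on sequentially.
worst_case_wait_minutes() {
    devices=$1
    timeout_minutes=$2
    echo $(( devices * timeout_minutes ))
}

# Five disks at three minutes each, as in the example above.
worst_case_wait_minutes 5 3
```

If the waits actually overlap, the real delay would be closer to a single timeout, so treat the result as a ceiling.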
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 5:32 PM, Canek Peláez Valdés can...@gmail.com wrote: As I said, I did the following tests: 1. Adding emergency to the kernel command line, with a valid root=. 2. Adding rescue to the kernel command line, with a valid root=. 3. Leaving root= invalid without adding either emergency or rescue. If root= is valid, with emergency systemd drops you to a shell with your root filesystem mounted read-only. With rescue, systemd drops you to a shell with all your filesystems mounted read-write. If root= is invalid, it doesn't matter if you use emergency, rescue, or neither, *dracut* drops you to a shell, still inside the initramfs obviously. It takes a while; I didn't time it, but I think it was 3 minutes. Inside this shell, you can use systemd normally, and if you manage to mount the root filesystem, I'm sure you could continue the normal boot process. You'll have to pivot root manually, though. That was basically my understanding of how dracut behaved. I think we're just having a communication gap or something, because you seem to be disagreeing with me when I'm basically trying to describe the behavior you just listed above. -- Rich
Re: [gentoo-user] problems debugging a systemd problem
Canek Peláez Valdés can...@gmail.com wrote: [...] If root= is invalid, it doesn't matter if you use emergency, rescue, or neither, *dracut* drops you to a shell, still inside the initramfs obviously. It takes a while; I didn't time it, but I think it was 3 minutes. Inside this shell, you can use systemd normally, and if you manage to mount the root filesystem, I'm sure you could continue the normal boot process. You'll have to pivot root manually, though. Hope that makes it clear. How do you pivot root manually? -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 5:55 PM, cov...@ccs.covici.com wrote: How do you pivot root manually? With dracut I'm pretty sure that as long as you mount root under /sysroot you can just type exit and it will take care of the rest. I suspect it will even remount it for you if you don't use the same options as it finds in /sysroot/etc/fstab. -- Rich
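As a rough illustration of the /sysroot approach described above, the sketch below only prints the commands one might type in the dracut shell; it does not run them. The device path is a hypothetical placeholder, the helper name is invented, and the `lvm vgchange -ay` step assumes root lives on LVM and that the initramfs ships the `lvm` tool.

```shell
#!/bin/sh
# Print (do not execute) the steps for mounting the real root by hand
# from dracut's emergency shell. The device path passed in is whatever
# your root= should have pointed at; the one used below is hypothetical.
print_sysroot_steps() {
    root_dev=$1
    cat <<EOF
lvm vgchange -ay
mount -o ro $root_dev /sysroot
exit
EOF
}

print_sysroot_steps /dev/mapper/vg0-root
```

Typing `exit` last follows the suggestion above that dracut then carries on with the rest of the boot; whether it does so in every configuration is not confirmed in this thread.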
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 1:36 PM, Rich Freeman ri...@gentoo.org wrote: On Thu, May 28, 2015 at 2:11 PM, Canek Peláez Valdés can...@gmail.com wrote: Actually, it does work (see attached screenshot). I set my root= kernel command line parameter wrong on purpose, and systemd (inside dracut) dropped me inside a rescue shell. Interesting. Perhaps it just enables shell access. There is a separate option that configures whether dracut drops to a shell at all, or if it just hangs on failure. The latter might be desirable for security purposes in some cases. Are you sure that you don't get a shell if you don't pass emergency on the command line, but still have an invalid root=? I wasn't sure, so I did a couple more tests; I comment on them below. Usually when somebody wants a rescue shell, they want it in their root filesystem, and not in their initramfs before it has pivoted. That is why dracut has options like rd.break. But that doesn't help you at all when the problem is exactly that you cannot mount your root filesystem. With the rescue shell of systemd (inside dracut), you can analyze the problem, or perhaps even mount your root filesystem and continue the boot process; the initramfs should have all the necessary tools to do that. rd.break DOES give you a shell before root is mounted, if you tell it to. rd.shell tells dracut to give you a shell if something fails. rd.break forces a shell at the specified point, whether something fails or not. The official docs do not list emergency as a valid dracut option. Obviously systemd uses it, but again the fact that you had to mangle your root= option suggests that systemd within dracut ignores it if it can mount your root. No, if you set emergency or rescue, systemd will go to emergency.target and rescue.target, respectively. If the problem were with systemd/services/etc in the actual root filesystem (once the actual distro has started booting), then putting emergency on the command line should get you a rescue shell. 
Again, what if the problem is before *that*? Then you tell dracut to drop to a shell. I wasn't aware that the emergency option actually made a difference, though I'm still not 100% sure that was what did it. I'm now pretty sure it DOESN'T make a difference when the problem is before you can mount root. The same generally applies to openrc - if the initramfs isn't mounting your root filesystem, then passing instructions to openrc won't do anything since in that case openrc isn't even running. But in this case, systemd *is* inside the initramfs: # ls usr/lib/systemd/ network systemd-cgroups-agent systemd-journald systemd-shutdown systemd-vconsole-setup system systemd-fsck systemd-modules-load systemd-sysctl system-generators systemd systemd-hibernate-resume systemd-reply-password systemd-udevd That's my initramfs. With dracut, systemd *is* the initramfs init system. Sure, and that is how mine works as well. But, obviously systemd in dracut is configured to ignore that parameter when root= is valid, No, it doesn't ignore it, even if root= is valid. otherwise you'd get a shell every time. I'd have to check the docs, but I suspect that the behavior is configurable, and systemd within the initramfs is configured differently. If nothing else they could just make the rescue target launch the default target/etc. As I said, I did the following tests: 1. Adding emergency to the kernel command line, with a valid root=. 2. Adding rescue to the kernel command line, with a valid root=. 3. Leaving root= invalid without adding either emergency or rescue. If root= is valid, with emergency systemd drops you to a shell with your root filesystem mounted read-only. With rescue, systemd drops you to a shell with all your filesystems mounted read-write. If root= is invalid, it doesn't matter if you use emergency, rescue, or neither, *dracut* drops you to a shell, still inside the initramfs obviously. It takes a while; I didn't time it, but I think it was 3 minutes. 
Inside this shell, you can use systemd normally, and if you manage to mount the root filesystem, I'm sure you could continue the normal boot process. You'll have to pivot root manually, though. Hope that makes it clear. Regards. -- Canek Peláez Valdés Profesor de asignatura, Facultad de Ciencias Universidad Nacional Autónoma de México
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 5:23 PM, Rich Freeman ri...@gentoo.org wrote: On Thu, May 28, 2015 at 5:32 PM, Canek Peláez Valdés can...@gmail.com wrote: As I said, I did the following tests: 1. Adding emergency to the kernel command line, with a valid root=. 2. Adding rescue to the kernel command line, with a valid root=. 3. Leaving root= invalid without adding either emergency or rescue. If root= is valid, with emergency systemd drops you to a shell with your root filesystem mounted read-only. With rescue, systemd drops you to a shell with all your filesystems mounted read-write. If root= is invalid, it doesn't matter if you use emergency, rescue, or neither, *dracut* drops you to a shell, still inside the initramfs obviously. It takes a while; I didn't time it, but I think it was 3 minutes. Inside this shell, you can use systemd normally, and if you manage to mount the root filesystem, I'm sure you could continue the normal boot process. You'll have to pivot root manually, though. That was basically my understanding of how dracut behaved. I think we're just having a communication gap or something, because you seem to be disagreeing with me when I'm basically trying to describe the behavior you just listed above. It's possible; I was wrong about emergency doing anything when root= is invalid, but I did not understand the above behavior from your previous mails. Anyway, if dracut cannot mount the root filesystem, it will drop you into a shell with a functional systemd. Eventually. Regards. -- Canek Peláez Valdés Profesor de asignatura, Facultad de Ciencias Universidad Nacional Autónoma de México
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 4:55 PM, cov...@ccs.covici.com wrote: Canek Peláez Valdés can...@gmail.com wrote: [...] As I said, I did the following tests: 1. Adding emergency to the kernel command line, with a valid root=. 2. Adding rescue to the kernel command line, with a valid root=. 3. Leaving root= invalid without adding either emergency or rescue. If root= is valid, with emergency systemd drops you to a shell with your root filesystem mounted read-only. With rescue, systemd drops you to a shell with all your filesystems mounted read-write. If root= is invalid, it doesn't matter if you use emergency, rescue, or neither, *dracut* drops you to a shell, still inside the initramfs obviously. It takes a while; I didn't time it, but I think it was 3 minutes. Inside this shell, you can use systemd normally, and if you manage to mount the root filesystem, I'm sure you could continue the normal boot process. You'll have to pivot root manually, though. Hope that makes it clear. How do you pivot root manually? Basically, with pivot_root(8) [1]. Be aware that systemd does some things before and after pivot_root'ing; in particular, it switches from the instance running inside the initramfs to an instance running in the real filesystem. I'm not sure how it does it, but the switching code is relatively simple [2]. Regards. [1] http://linux.die.net/man/8/pivot_root [2] http://cgit.freedesktop.org/systemd/systemd/tree/src/shared/switch-root.c -- Canek Peláez Valdés Profesor de asignatura, Facultad de Ciencias Universidad Nacional Autónoma de México
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 2:11 PM, Canek Peláez Valdés can...@gmail.com wrote: Actually, it does work (see attached screenshot). I set my root= kernel command line parameter wrong on purpose, and systemd (inside dracut) dropped me inside a rescue shell. Interesting. Perhaps it just enables shell access. There is a separate option that configures whether dracut drops to a shell at all, or if it just hangs on failure. The latter might be desirable for security purposes in some cases. Are you sure that you don't get a shell if you don't pass emergency on the command line, but still have an invalid root=? Usually when somebody wants a rescue shell, they want it in their root filesystem, and not in their initramfs before it has pivoted. That is why dracut has options like rd.break. But that doesn't help you at all when the problem is exactly that you cannot mount your root filesystem. With the rescue shell of systemd (inside dracut), you can analyze the problem, or perhaps even mount your root filesystem and continue the boot process; the initramfs should have all the necessary tools to do that. rd.break DOES give you a shell before root is mounted, if you tell it to. rd.shell tells dracut to give you a shell if something fails. rd.break forces a shell at the specified point, whether something fails or not. The official docs do not list emergency as a valid dracut option. Obviously systemd uses it, but again the fact that you had to mangle your root= option suggests that systemd within dracut ignores it if it can mount your root. If the problem were with systemd/services/etc in the actual root filesystem (once the actual distro has started booting), then putting emergency on the command line should get you a rescue shell. Again, what if the problem is before *that*? Then you tell dracut to drop to a shell. I wasn't aware that the emergency option actually made a difference, though I'm still not 100% sure that was what did it. 
The same generally applies to openrc - if the initramfs isn't mounting your root filesystem, then passing instructions to openrc won't do anything since in that case openrc isn't even running. But in this case, systemd *is* inside the initramfs: # ls usr/lib/systemd/ network systemd-cgroups-agent systemd-journald systemd-shutdown systemd-vconsole-setup system systemd-fsck systemd-modules-load systemd-sysctl system-generators systemd systemd-hibernate-resume systemd-reply-password systemd-udevd That's my initramfs. With dracut, systemd *is* the initramfs init system. Sure, and that is how mine works as well. But, obviously systemd in dracut is configured to ignore that parameter when root= is valid, otherwise you'd get a shell every time. I'd have to check the docs, but I suspect that the behavior is configurable, and systemd within the initramfs is configured differently. If nothing else they could just make the rescue target launch the default target/etc. -- Rich
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 11:57 AM, Canek Peláez Valdés can...@gmail.com wrote: Others have already answered, but I will add that if you put emergency anywhere in the kernel command line, then systemd will boot to the rescue target; that's why I suggested to do it in my first answer. I'm pretty sure that won't work for an initramfs - they're almost certainly designed to ignore that instruction. Usually when somebody wants a rescue shell, they want it in their root filesystem, and not in their initramfs before it has pivoted. That is why dracut has options like rd.break. If the problem were with systemd/services/etc in the actual root filesystem (once the actual distro has started booting), then putting emergency on the command line should get you a rescue shell. The same generally applies to openrc - if the initramfs isn't mounting your root filesystem, then passing instructions to openrc won't do anything since in that case openrc isn't even running. -- Rich
Re: [gentoo-user] problems debugging a systemd problem
Canek Peláez Valdés can...@gmail.com wrote: On Thu, May 28, 2015 at 3:30 AM, cov...@ccs.covici.com wrote: Stefan G. Weichinger li...@xunil.at wrote: On 28.05.2015 09:39, cov...@ccs.covici.com wrote: No, the journal is gone, it was only in /run which is on a tmpfs file system. I can boot from a cd all day long, but it would not help one bit. Hm, I think it could help for sure as you could chroot in and do something. For example build a new kernel or initrd or ... You removed openrc? Otherwise boot via openrc and (try to) fix stuff. You could even reinstall openrc from within chroot ... just to get bootin again etc etc I still have openrc, but Dracut won't work with it, at least maybe because I have systemd use flag enabled. Also, in retrospect, that would not have solved my specific problems, because it was related to an rd.lv command which is specific to dracut. But thanks for your suggestion. I wonder what the rescue target is -- I have never seen that before -- maybe I could configure it so I could boot into a shell and fix things and it would be sort of like a little system of its own. Others have already answered, but I will add that if you put emergency anywhere in the kernel command line, then systemd will boot to the rescue target; that's why I suggested to do it in my first answer. Also, as Rich said, if you wait it's possible that systemd (and/or dracut) will drop you into a rescue shell anyway. Unfortunately, thanks to very slow hardware in the wild, the timeout has been increased to three minutes, and I believe those are *per hardware unit*. So if you have five disks, in theory it could take fifteen minutes to get you to a rescue shell. Thanks much. Does the rescue target try to mount all the disks? Also, I would still like to get in touch with the dracut devs -- although I may never make that particular mistake again, but maybe other things will happen. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? 
John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
Canek Peláez Valdés can...@gmail.com wrote: On Thu, May 28, 2015 at 2:11 PM, cov...@ccs.covici.com wrote: Also, as Rich said, if you wait it's possible that systemd (and/or dracut) will drop you into a rescue shell anyway. Unfortunately, thanks to very slow hardware in the wild, the timeout has been increased to three minutes, and I believe those are *per hardware unit*. So if you have five disks, in theory it could take fifteen minutes to get you to a rescue shell. Thanks much. Does the rescue target try to mount all the disks? Also, I would still like to get in touch with the dracut devs -- although I may never make that particular mistake again, but maybe other things will happen. As I said in my previous mail: emergency mounts the root filesystem read-only; rescue mounts all the filesystems read/write. If dracut cannot mount the root filesystem, it *WILL* drop you to a shell, but it will take some time while all the timeouts expire. This could be *several* minutes depending on hardware. The dracut mailing list is in [1]. Regards. OK, thanks much. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
On 2015-05-28 at 08:15, cov...@ccs.covici.com wrote: Thanks for your quick reply, but I do have rd.shell=1, but it did not drop to a shell, it just hung, so I could not do journalctl or anything -- the nearest break point was pre-initqueue which was maybe too early and the next one is pre-mount which it never got to. Unfortunately, I was in a position where I could not use an older kernel, because the older ones didn't have the configs to read gui type partitions -- I always keep several kernels around normally, but this was one of those transitional times when I was stuck. So do I need emergency as well as rd.shell, and is there any way to get a shell when the system appears to be in some kind of a loop, like calling setl over and over again? http://freedesktop.org/wiki/Software/systemd/Debugging/ ? Did you try to boot into rescue.target? Why not use a live-cd, boot, mount, chroot and get on there? You can read the journal of your failing installation via the journalctl binary of your booted live-system (I assume fedora live-media boots with systemd ... dunno ad hoc which one to use)
Re: [gentoo-user] problems debugging a systemd problem
Stefan G. Weichinger li...@xunil.at wrote: On 2015-05-28 at 08:15, cov...@ccs.covici.com wrote: Thanks for your quick reply, but I do have rd.shell=1, but it did not drop to a shell, it just hung, so I could not do journalctl or anything -- the nearest break point was pre-initqueue which was maybe too early and the next one is pre-mount which it never got to. Unfortunately, I was in a position where I could not use an older kernel, because the older ones didn't have the configs to read gui type partitions -- I always keep several kernels around normally, but this was one of those transitional times when I was stuck. So do I need emergency as well as rd.shell, and is there any way to get a shell when the system appears to be in some kind of a loop, like calling setl over and over again? http://freedesktop.org/wiki/Software/systemd/Debugging/ ? Did you try to boot into rescue.target? Why not use a live-cd, boot, mount, chroot and get on there? You can read the journal of your failing installation via the journalctl binary of your booted live-system (I assume fedora live-media boots with systemd ... dunno ad hoc which one to use) No, the journal is gone, it was only in /run which is on a tmpfs file system. I can boot from a cd all day long, but it would not help one bit. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
On 28.05.2015 09:39, cov...@ccs.covici.com wrote: No, the journal is gone, it was only in /run which is on a tmpfs file system. I can boot from a cd all day long, but it would not help one bit. Hm, I think it could help for sure as you could chroot in and do something. For example build a new kernel or initrd or ... You removed openrc? Otherwise boot via openrc and (try to) fix stuff. You could even reinstall openrc from within chroot ... just to get booting again etc etc
Re: [gentoo-user] problems debugging a systemd problem
Stefan G. Weichinger li...@xunil.at wrote: On 28.05.2015 09:39, cov...@ccs.covici.com wrote: No, the journal is gone, it was only in /run which is on a tmpfs file system. I can boot from a cd all day long, but it would not help one bit. Hm, I think it could help for sure as you could chroot in and do something. For example build a new kernel or initrd or ... You removed openrc? Otherwise boot via openrc and (try to) fix stuff. You could even reinstall openrc from within chroot ... just to get booting again etc etc I still have openrc, but Dracut won't work with it, at least maybe because I have systemd use flag enabled. Also, in retrospect, that would not have solved my specific problems, because it was related to an rd.lv command which is specific to dracut. But thanks for your suggestion. I wonder what the rescue target is -- I have never seen that before -- maybe I could configure it so I could boot into a shell and fix things and it would be sort of like a little system of its own. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
Canek Peláez Valdés can...@gmail.com wrote: On Thu, May 28, 2015 at 12:09 AM, cov...@ccs.covici.com wrote: Hi folks. I spent a very frustrating time last night trying to figure out why my system would not boot using systemd. I am using dracut and its version is 041r2. Now what was happening is that the system would get to the pre-init-queue -- and I even set the rd.break there, but after that the system would not boot -- when I used debug it endlessly said calling setl forever. Now it turned out that the problem was that I had mistyped an rd.lv= line -- instead of ssd-files/usr I had ssd-files/-usr . Now, what I would like to know is how could I tell that it was trying to look for a non-existent lv? At the point of the break, no lvm volumes were active, although strangely enough I saw an e2fsck for the real root file system which was an lvm volume. I am finding it's generally hard to debug systemd problems; several other times the system just sat there till I figured it out some other way. Any observations on this would be appreciated, but I don't want to get into a flame war, I just want to minimize the down time. Usually if you can get an emergency shell by adding emergency to the kernel command line (both GRUB and Gummiboot allow you to edit the kernel command line), then it is easy to see what the problem is. My experience with LVM has been consistently pretty awful, which is why I don't use it in any of my machines, but I suppose a systemctl --all --full will tell you what unit files have failed, and then you can journalctl -b -u them. Also journalctl -b by itself would tell you many times what the problem is. The only problem with the emergency shell is that sometimes it is too early in the boot process for the keyboard drivers to have been loaded, but that is easily solved by adding a drivers+= line to a conf file in /etc/dracut.conf.d. Also, and I cannot stress this enough, you never delete your old (and working) kernel+initramfs until you have tested the new one. 
I would also recommend leaving the entries for the old kernel+initramfs in the GRUB/Gummiboot menu, but you can manage without them. Finally, and this is tooting my own horn, maybe you could try kerninst[1]? It's a little script I started a couple of years ago to automatically compile and install my kernels and generate my initramfs'. I use it in all my machines, and now my kernel update is just a matter of eselecting the new version, and running kerninst. I follow ~amd64 vanilla-sources, so this is roughly every week or two. Beware, though, that I don't use LVM nor RAID nor LUKS, but in theory if you have a working kernel+dracut+[grub|gummiboot] configuration, it should also work with them. Thanks for your quick reply, but I do have rd.shell=1, but it did not drop to a shell, it just hung, so I could not do journalctl or anything -- the nearest break point was pre-initqueue which was maybe too early and the next one is pre-mount which it never got to. Unfortunately, I was in a position where I could not use an older kernel, because the older ones didn't have the configs to read gui type partitions -- I always keep several kernels around normally, but this was one of those transitional times when I was stuck. So do I need emergency as well as rd.shell, and is there any way to get a shell when the system appears to be in some kind of a loop, like calling setl over and over again? -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
On 28.05.2015 10:30, cov...@ccs.covici.com wrote: I still have openrc, but Dracut won't work with it, at least maybe because I have systemd use flag enabled. Also, in retrospect, that would not have solved my specific problems, because it was related to an rd.lv command which is specific to dracut. You could rebuild the initrd from live-cd maybe But thanks for your suggestion. I wonder what the rescue target is -- I have never seen that before -- maybe I could configure it so I could boot into a shell and fix things and it would be sort of like a little system of its own. It's a target of systemd, kind of a runlevel. see http://freedesktop.org/wiki/Software/systemd/Debugging/#index1h1 quote: To boot directly into rescue target add systemd.unit=rescue.target or just 1 to the kernel command line.
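To make the quoted rule concrete, here is a small sketch that reads a kernel command line string and reports which target it asks systemd for. The words it recognises (`systemd.unit=`, `emergency`, `-b`, `rescue`, `single`, `s`, `S`, `1`) are the ones documented in systemd's kernel-command-line(7); the function name is made up, and "the last matching word wins" is a simplifying assumption of this sketch rather than a statement about how systemd resolves conflicts.

```shell
#!/bin/sh
# Report which systemd target a kernel command line requests.
# Runs in a subshell so that `set -f` does not leak to the caller.
requested_target() (
    set -f                      # no glob expansion while splitting words
    target="default.target"     # what systemd boots when nothing is requested
    for word in $1; do
        case "$word" in
            systemd.unit=*)        target=${word#systemd.unit=} ;;
            emergency|-b)          target="emergency.target" ;;
            rescue|single|s|S|1)   target="rescue.target" ;;
        esac
    done
    printf '%s\n' "$target"
)

# The root= values below are placeholders.
requested_target "root=/dev/sda2 ro rescue"
requested_target "root=/dev/sda2 ro systemd.unit=rescue.target"
requested_target "root=/dev/sda2 ro emergency"
```

On a running system the same function can be fed the live command line with `requested_target "$(cat /proc/cmdline)"`.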
Re: [gentoo-user] problems debugging a systemd problem
Stefan G. Weichinger li...@xunil.at wrote: On 28.05.2015 10:30, cov...@ccs.covici.com wrote: I still have openrc, but Dracut won't work with it, at least maybe because I have systemd use flag enabled. Also, in retrospect, that would not have solved my specific problems, because it was related to an rd.lv command which is specific to dracut. You could rebuild the initrd from live-cd maybe But thanks for your suggestion. I wonder what the rescue target is -- I have never seen that before -- maybe I could configure it so I could boot into a shell and fix things and it would be sort of like a little system of its own. It's a target of systemd, kind of a runlevel. see http://freedesktop.org/wiki/Software/systemd/Debugging/#index1h1 quote: To boot directly into rescue target add systemd.unit=rescue.target or just 1 to the kernel command line. OK, thanks, I will definitely check that one out. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 2:15 AM, cov...@ccs.covici.com wrote: Thanks for your quick reply, but I do have rd.shell=1, and it did not drop to a shell; it just hung, so I could not do journalctl or anything -- the nearest break point was pre-initqueue, which was maybe too early, and the next one is pre-mount, which it never got to. It could very well be a dracut bug (consider bringing the issue to them), but how long did you wait while it was looping? I've seen cases where dracut looped for a few minutes before dropping to a shell. There are a few loops where dracut is waiting for udev to detect all devices, and if it is looking for a device that will never appear, then it will potentially loop forever. There should be a timeout, but I don't know what it is set to by default. Once you get a shell you should be able to inspect the journal/dmesg/etc. and try to see what is going on. Sure, you could have booted a rescue CD, but I don't see what it would have gained you as far as troubleshooting the problem with your initramfs (though it would have allowed you to rebuild it if the initramfs itself was broken, or try out a different version, etc.). As you point out, any logs it creates are stored in tmpfs or ramfs -- that is true of just about any initramfs, since it won't have any place to store them until it mounts root. I don't know if setting the rescue target would have helped. I think that the behavior of that option is to still boot to your root filesystem and THEN drop to a shell. If you want to force a rescue shell within the initramfs you need to use rd.break or such, and as you point out you need to find the right breakpoint for this. I'd suggest talking to the dracut devs about how it should have behaved in those circumstances. At the very least they might be able to improve the error reporting. -- Rich
Re: [gentoo-user] problems debugging a systemd problem
Rich Freeman ri...@gentoo.org wrote: On Thu, May 28, 2015 at 2:15 AM, cov...@ccs.covici.com wrote: Thanks for your quick reply, but I do have rd.shell=1, and it did not drop to a shell; it just hung, so I could not do journalctl or anything -- the nearest break point was pre-initqueue, which was maybe too early, and the next one is pre-mount, which it never got to. It could very well be a dracut bug (consider bringing the issue to them), but how long did you wait while it was looping? I've seen cases where dracut looped for a few minutes before dropping to a shell. There are a few loops where dracut is waiting for udev to detect all devices, and if it is looking for a device that will never appear, then it will potentially loop forever. There should be a timeout, but I don't know what it is set to by default. Once you get a shell you should be able to inspect the journal/dmesg/etc. and try to see what is going on. Sure, you could have booted a rescue CD, but I don't see what it would have gained you as far as troubleshooting the problem with your initramfs (though it would have allowed you to rebuild it if the initramfs itself was broken, or try out a different version, etc.). As you point out, any logs it creates are stored in tmpfs or ramfs -- that is true of just about any initramfs, since it won't have any place to store them until it mounts root. I don't know if setting the rescue target would have helped. I think that the behavior of that option is to still boot to your root filesystem and THEN drop to a shell. If you want to force a rescue shell within the initramfs you need to use rd.break or such, and as you point out you need to find the right breakpoint for this. I'd suggest talking to the dracut devs about how it should have behaved in those circumstances. At the very least they might be able to improve the error reporting. Thanks, so how would I get in touch with the dracut devs? -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 3:30 AM, cov...@ccs.covici.com wrote: Stefan G. Weichinger li...@xunil.at wrote: On 28.05.2015 09:39, cov...@ccs.covici.com wrote: No, the journal is gone, it was only in /run which is on a tmpfs file system. I can boot from a CD all day long, but it would not help one bit. Hm, I think it could help for sure, as you could chroot in and do something. For example build a new kernel or initrd or ... You removed openrc? Otherwise boot via openrc and (try to) fix stuff. You could even reinstall openrc from within the chroot ... just to get booting again, etc. I still have openrc, but Dracut won't work with it, at least maybe because I have the systemd USE flag enabled. Also, in retrospect, that would not have solved my specific problems, because it was related to an rd.lv command which is specific to dracut. But thanks for your suggestion. I wonder what the rescue target is -- I have never seen that before -- maybe I could configure it so I could boot into a shell and fix things and it would be sort of like a little system of its own. Others have already answered, but I will add that if you put emergency anywhere in the kernel command line, then systemd will boot to the emergency target; that's why I suggested doing it in my first answer. Also, as Rich said, if you wait it's possible that systemd (and/or dracut) will drop you into a rescue shell anyway. Unfortunately, thanks to very slow hardware in the wild, the timeout has been increased to three minutes, and I believe those are *per hardware unit*. So if you have five disks, in theory it could take fifteen minutes to get you to a rescue shell. Regards. -- Canek Peláez Valdés Profesor de asignatura, Facultad de Ciencias Universidad Nacional Autónoma de México
[gentoo-user] problems debugging a systemd problem
Hi folks. I spent a very frustrating time last night trying to figure out why my system would not boot using systemd. I am using dracut and its version is 041r2. Now what was happening is that the system would get to the pre-init-queue -- and I even set the rd.break there, but after that the system would not boot -- when I used debug it endlessly said calling setl forever. Now it turned out that the problem was that I had mistyped an rd.lv= line -- instead of ssd-files/usr I had ssd-files/-usr. Now, what I would like to know is how could I tell that it was trying to look for a non-existent LV? At the point of the break, no LVM volumes were active, although strangely enough I saw an e2fsck for the real root file system, which was an LVM volume. I am finding it's generally hard to debug systemd problems; several other times the system just sat there till I figured it out some other way. Any observations on this would be appreciated, but I don't want to get into a flame war, I just want to minimize the down time. -- Your life is like a penny. You're going to lose it. The question is: How do you spend it? John Covici cov...@ccs.covici.com
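A typo like ssd-files/-usr can in principle be caught before rebooting, since LVM does not allow volume group or logical volume names that begin with a hyphen. The Python sketch below scans a kernel command line for rd.lvm.lv= arguments (the parameter name that dracut documents; the rd.lv in the message above is assumed to be shorthand for it) and reports values that are not well-formed <vg>/<lv> names or that are missing from an optional set of known volumes. The function name check_rd_lvm_lv and the known parameter are hypothetical, and the name check covers only LVM's character rules, not its reserved names.

```python
import re

# LVM names may contain a-z A-Z 0-9 + _ . - and must not begin with a hyphen.
# This pattern is a partial check: reserved names are not handled.
NAME = re.compile(r"[A-Za-z0-9+_.][A-Za-z0-9+_.-]*")


def check_rd_lvm_lv(cmdline, known=None):
    """Return a list of problems found in rd.lvm.lv= arguments.

    cmdline: the kernel command line as one string.
    known:   optional set of existing "vg/lv" names (hypothetical input,
             gathered separately on a running system).
    """
    problems = []
    for word in cmdline.split():
        if not word.startswith("rd.lvm.lv="):
            continue
        value = word.split("=", 1)[1]
        vg, sep, lv = value.partition("/")
        if not sep or not NAME.fullmatch(vg) or not NAME.fullmatch(lv):
            problems.append(f"{value}: not a valid <vg>/<lv> name")
        elif known is not None and value not in known:
            problems.append(f"{value}: no such logical volume")
    return problems
```

Run against the mistyped value from this thread, check_rd_lvm_lv("rd.lvm.lv=ssd-files/-usr") returns ["ssd-files/-usr: not a valid <vg>/<lv> name"]. The known set could be built on a working system, for example from the output of lvs, which is left out here to keep the sketch self-contained.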
Re: [gentoo-user] problems debugging a systemd problem
On Thu, May 28, 2015 at 12:09 AM, cov...@ccs.covici.com wrote: Hi folks. I spent a very frustrating time last night trying to figure out why my system would not boot using systemd. I am using dracut and its version is 041r2. Now what was happening is that the system would get to the pre-init-queue -- and I even set the rd.break there, but after that the system would not boot -- when I used debug it endlessly said calling setl forever. Now it turned out that the problem was that I had mistyped an rd.lv= line -- instead of ssd-files/usr I had ssd-files/-usr. Now, what I would like to know is how could I tell that it was trying to look for a non-existent LV? At the point of the break, no LVM volumes were active, although strangely enough I saw an e2fsck for the real root file system, which was an LVM volume. I am finding it's generally hard to debug systemd problems; several other times the system just sat there till I figured it out some other way. Any observations on this would be appreciated, but I don't want to get into a flame war, I just want to minimize the down time. Usually if you can get an emergency shell by adding emergency to the kernel command line (both GRUB and Gummiboot allow you to edit the kernel command line), then it is easy to see what the problem is. My experience with LVM has been consistently pretty awful, which is why I don't use it on any of my machines, but I suppose a systemctl --all --full will tell you what unit files have failed, and then you can journalctl -b -u them. Also, journalctl -b by itself will often tell you what the problem is. The only problem with the emergency shell is that sometimes it is too early in the boot process for the keyboard drivers to have been loaded, but that is easily solved by adding a drivers+= line to a conf file in /etc/dracut.conf.d. Also, and I cannot stress this enough, you never delete your old (and working) kernel+initramfs until you have tested the new one.
I would also recommend leaving the entries for the old kernel+initramfs in the GRUB/Gummiboot menu, but you can manage without them. Finally, and this is tooting my own horn, maybe you could try kerninst[1]? It's a little script I started a couple of years ago to automatically compile and install my kernels and generate my initramfs'. I use it in all my machines, and now my kernel update is just a matter of eselecting the new version, and running kerninst. I follow ~amd64 vanilla-sources, so this is roughly every week or two. Beware, though, that I don't use LVM nor RAID nor LUKS, but in theory if you have a working kernel+dracut+[grub|gummiboot] configuration, it should also work with them. Regards. [1] https://github.com/canek-pelaez/kerninst -- Canek Peláez Valdés Profesor de asignatura, Facultad de Ciencias Universidad Nacional Autónoma de México