Re: [systemd-devel] handling mount failure in initramfs context
On Mon, May 26, 2014 at 04:12:56PM +0800, WANG Chao wrote:

> Hi, all
>
> In a pure initramfs environment, I want to mount a filesystem, so I put a mount entry in /etc/fstab so that fstab-generator can generate a mount unit and systemd will mount it at some point. I have a question about mount failure in this case: how can I make sure that, upon a mount failure, systemd stops booting and switches to emergency handling?

I will give a little more context on the problem at hand. kdump uses the --mount option of dracut to mount filesystems in the initramfs context. dracut puts the right values in /etc/fstab of the initramfs, which in turn is parsed by systemd, and mount units are generated.

Now the question is: what happens if one of the mounts fails? I think currently systemd does not drop us into the emergency shell and instead continues to boot. We are trying to figure out how to change this behavior so that we can tell systemd to drop into an emergency shell instead.

I think Chao used x-initrd.mount as a mount option, and that changes the behavior. With this option the mount unit becomes required by initrd-root-fs.target rather than local-fs.target as it used to be. And the way these targets are configured, we drop into the emergency shell with other units isolated.

The point being that using x-initrd.mount to achieve the desired behavior sounded hackish to me. Nowhere does systemd guarantee that using this option will ensure certain dependencies. I think this option just means that the filesystem will be mounted in the initramfs. So, is there a better way to achieve what we are looking for?

Thanks
Vivek

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
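For readers following along, the kind of fstab entry that dracut's --mount handling produces looks roughly like this (the device path and mount point below are made up for illustration):

```ini
# /etc/fstab inside the kdump initramfs (illustrative; the device
# path and mount point are hypothetical)
/dev/vg0/dumpdisk  /kdumproot  ext4  defaults,x-initrd.mount  0  0
```

With x-initrd.mount present, fstab-generator hooks the generated mount unit into initrd-root-fs.target rather than local-fs.target, which is what changes the failure handling described above.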
Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts
On Thu, Apr 10, 2014 at 06:38:59AM +0400, Andrey Borzenkov wrote:

[..]

So with the nofail option for the rootfs we should have the following situation:

- sysroot.mount: Before=initrd-root-fs.target
- initrd-root-fs.target: Requires=sysroot.mount, OnFailure=emergency.target
- initrd.target: Wants=initrd-root-fs.target, OnFailure=emergency.target
- dracut-pre-pivot.service: After=initrd.target sysroot.mount

Now let us say sysroot.mount failed activation because the root device did not show up. We waited for a certain time interval, then timed out. Now what will happen to the initrd-root-fs.target and initrd.target states?

Assuming initrd-root-fs.target Requires sysroot.mount, it enters the failed state and systemd effectively executes the analog of "systemctl start emergency.target". What happens after that is defined entirely by what emergency.target pulls in. initrd.target in your example does not depend on sysroot.mount in any way, so unless there are further indirect dependencies it actually should be reached at this point.

initrd.target Wants initrd-root-fs.target, which in turn depends on sysroot.mount. systemd automatically generates a Requires=sysroot.mount in initrd-root-fs.target. So if sysroot.mount fails, that should start emergency.target, as initrd-root-fs.target will fail.

As initrd.target has Wants=initrd-root-fs.target, and initrd-root-fs.target activation has failed, does that mean that initrd.target will reach the failed state too and we will try to launch emergency.target?

What will happen to dracut-pre-pivot.service? It is supposed to run after initrd.target has been reached. Now initrd.target has failed activation. Will dracut-pre-pivot.service be activated?

Thanks
Vivek
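The dependency chain discussed above, sketched as simplified unit-file fragments (not the complete stock units, just the directives relevant to this failure scenario):

```ini
# sysroot.mount (generated from root= by fstab-generator)
[Unit]
Before=initrd-root-fs.target

# initrd-root-fs.target
[Unit]
Requires=sysroot.mount
OnFailure=emergency.target

# initrd.target
[Unit]
Wants=initrd-root-fs.target
OnFailure=emergency.target

# dracut-pre-pivot.service
[Unit]
After=initrd.target sysroot.mount
```

Since Wants= is a non-fatal dependency, a failed initrd-root-fs.target does not by itself fail initrd.target; but After=initrd.target sysroot.mount still orders dracut-pre-pivot.service behind those jobs, which is where the question about its activation comes from.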
Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts
On Wed, Apr 09, 2014 at 05:36:13PM +0800, WANG Chao wrote:
On 04/08/14 at 06:02pm, Vivek Goyal wrote:
On Tue, Apr 08, 2014 at 02:14:33AM +0200, Zbigniew Jędrzejewski-Szmek wrote:

[..]

Defining a new target which by default waits for all the local fs mounts sounds interesting. Again, I have the question: what will happen to local-fs-all.target if some device does not show up and, say, one of the mounts specified in /etc/fstab fails?

The result is different for Requires= and for Wants=. Iff there's a chain of Requires= from the failing unit (a .device in this case) to the target unit, it will fail. Otherwise, it'll just be delayed. If, as I suggested above, local-fs-all.target would have Requires= on the .mount units, then your unit could still have Wants=/After=local-fs-all.target, and it'll be started even if some mounts fail.

Thanks, now I understand the difference between Requires= and Wants= better. What we want is:

- Wait for all devices specified in /etc/fstab to show up. Run fsck on the devices. Mount devices at the specified mount points.
- If everything is successful, things are fine and local-fs-all.target will be reached.
- If some device does not show up, or if fsck fails or a mount fails, local-fs-all.target should still be reached so that the kdump module can detect that a failure happened and take alternative action.

Alternatively, you can specify a soft dependency on local-fs-all.target by using Wants=local-fs-all.target. I think this is preferable, because we want local-fs-all.target to be as similar as possible to local-fs.target, which has Requires= on the mount points. With this caveat, this should all be satisfied with my proposal.

Agreed. We could define Wants=local-fs-all.target and that would make sure that our unit will be started even if local-fs-all.target fails.

You can use OnFailure= to define unit(s) started when local-fs-all.target fails.
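A sketch of what a consumer of the proposed target could look like, assuming local-fs-all.target existed (all unit names here are hypothetical):

```ini
# kdump-capture.service (hypothetical consumer unit)
[Unit]
# Soft dependency: ordered after the target, but started even if some
# of the mounts the target pulls in have failed
Wants=local-fs-all.target
After=local-fs-all.target

# local-fs-all.target drop-in (hypothetical): run a fallback handler
# if the target itself fails
[Unit]
OnFailure=kdump-error-handler.service
```

Here kdump-error-handler.service stands in for whatever "alternative action" the kdump module would take on failure.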
But it sounds like you are not really interested in *all* filesystems, but in the specific filesystems defined in the kdump configuration.

The kdump script registers with dracut as a pre-pivot hook. And I believe that in initramfs environments /etc/fstab does not contain all filesystems. It primarily contains root and any filesystem specified on the dracut command line using the --mount option during initramfs generation. So my understanding is that, given that /etc/fstab is minimal in the initramfs, we should be fine waiting for all the filesystems specified. Given that we run under the dracut pre-pivot hook callback, I think dracut-pre-pivot.service will have to gain a dependency to run after local-fs-all.target is reached.

Hm, maybe. It would be good to get some input from Harald here. This is pretty specialized, so maybe it'd be better to have a separate unit positioned before or after or parallel to dracut-pre-pivot.service. I am just thinking out loud now.

Taking a step back: why did we introduce nofail to begin with? If I go through the kexec-tools logs, they say nofail was introduced because otherwise we never reach initrd.target. I am wondering why that's the case. The current initrd.target seems to have the following:

    [Unit]
    Description=Initrd Target
    Requires=basic.target
    Conflicts=rescue.service rescue.target
    After=basic.target rescue.service rescue.target
    AllowIsolate=yes
    OnFailure=emergency.target
    OnFailureIsolate=yes
    ConditionPathExists=/etc/initrd-release

dracut doesn't use this initrd.target.
It uses the stock one from systemd:

    [Unit]
    Description=Initrd Default Target
    Documentation=man:systemd.special(7)
    OnFailure=emergency.target
    OnFailureIsolate=yes
    ConditionPathExists=/etc/initrd-release
    Requires=basic.target
    Wants=initrd-root-fs.target initrd-fs.target initrd-parse-etc.service
    After=initrd-root-fs.target initrd-fs.target basic.target rescue.service rescue.target
    AllowIsolate=yes

In the sysroot.mount context, if we don't use nofail, then in case of a root disk failure we will never reach initrd-root-fs.target, hence we never reach initrd.target, and dracut-pre-pivot.service never gets a chance to start.

OK, I want to understand what "never reach a target" means. So with the nofail option for the rootfs we should have the following situation:

- sysroot.mount: Before=initrd-root-fs.target
- initrd-root-fs.target: Requires=sysroot.mount, OnFailure=emergency.target
- initrd.target: Wants=initrd-root-fs.target, OnFailure=emergency.target
- dracut-pre-pivot.service: After=initrd.target sysroot.mount

Now let us say sysroot.mount failed activation because the root device did not show up. We waited for a certain time interval, then timed out. Now what will happen to initrd-root-fs.target and initrd.target?
Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts
On Tue, Apr 08, 2014 at 02:14:33AM +0200, Zbigniew Jędrzejewski-Szmek wrote:

[..]

Defining a new target which by default waits for all the local fs mounts sounds interesting. Again, I have the question: what will happen to local-fs-all.target if some device does not show up and, say, one of the mounts specified in /etc/fstab fails?

The result is different for Requires= and for Wants=. Iff there's a chain of Requires= from the failing unit (a .device in this case) to the target unit, it will fail. Otherwise, it'll just be delayed. If, as I suggested above, local-fs-all.target would have Requires= on the .mount units, then your unit could still have Wants=/After=local-fs-all.target, and it'll be started even if some mounts fail.

Thanks, now I understand the difference between Requires= and Wants= better. What we want is:

- Wait for all devices specified in /etc/fstab to show up. Run fsck on the devices. Mount devices at the specified mount points.
- If everything is successful, things are fine and local-fs-all.target will be reached.
- If some device does not show up, or if fsck fails or a mount fails, local-fs-all.target should still be reached so that the kdump module can detect that a failure happened and take alternative action.

Alternatively, you can specify a soft dependency on local-fs-all.target by using Wants=local-fs-all.target. I think this is preferable, because we want local-fs-all.target to be as similar as possible to local-fs.target, which has Requires= on the mount points. With this caveat, this should all be satisfied with my proposal.

Agreed. We could define Wants=local-fs-all.target and that would make sure that our unit will be started even if local-fs-all.target fails.

You can use OnFailure= to define unit(s) started when local-fs-all.target fails. But it sounds like you are not really interested in *all* filesystems, but in the specific filesystems defined in the kdump configuration.

The kdump script registers with dracut as a pre-pivot hook.
And I believe that in initramfs environments /etc/fstab does not contain all filesystems. It primarily contains root and any filesystem specified on the dracut command line using the --mount option during initramfs generation. So my understanding is that, given that /etc/fstab is minimal in the initramfs, we should be fine waiting for all the filesystems specified. Given that we run under the dracut pre-pivot hook callback, I think dracut-pre-pivot.service will have to gain a dependency to run after local-fs-all.target is reached.

Hm, maybe. It would be good to get some input from Harald here. This is pretty specialized, so maybe it'd be better to have a separate unit positioned before or after or parallel to dracut-pre-pivot.service. I am just thinking out loud now.

Taking a step back: why did we introduce nofail to begin with? If I go through the kexec-tools logs, they say nofail was introduced because otherwise we never reach initrd.target. I am wondering why that's the case. The current initrd.target seems to have the following:

    [Unit]
    Description=Initrd Target
    Requires=basic.target
    Conflicts=rescue.service rescue.target
    After=basic.target rescue.service rescue.target
    AllowIsolate=yes
    OnFailure=emergency.target
    OnFailureIsolate=yes
    ConditionPathExists=/etc/initrd-release

So it Requires=basic.target. Now let us say basic.target fails; then I am assuming emergency.target will be activated. And if we hook into the emergency-shell binary and make it run a registered error handler if one is available, then kdump can drop in its handler and take action on failure.

IOW, what if we stop passing nofail? Then local-fs.target practically becomes local-fs-all.target. Either services will start just fine (after a wait for devices to show up), or units will start failing, and if boot can't continue then somewhere we will fall into the emergency shell, and the emergency shell will call into the kdump handler.
This is assuming that we have designed the boot path in such a way that most of the time we will not wait indefinitely (unless the user asked us to, with x-systemd.device-timeout=0). So either we will wait for a finite amount of time and then fail some services but continue to boot, or, if along the path we can't make progress, we will drop into the emergency shell.

If the above assumption is right, then hooking into the emergency shell logic in the initramfs should provide what I am looking for.

Thanks
Vivek
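The timeout behavior referred to above is tunable per fstab entry; for example (the device and mount point below are illustrative):

```ini
# Fail the mount if the device has not shown up within 10 seconds
/dev/disk/by-label/crashdump  /kdumproot  ext4  defaults,x-systemd.device-timeout=10s  0  0

# x-systemd.device-timeout=0 disables the timeout, i.e. wait forever
/dev/disk/by-label/crashdump  /kdumproot  ext4  defaults,x-systemd.device-timeout=0  0  0
```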
Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts
On Sat, Apr 05, 2014 at 04:24:01AM +0200, Zbigniew Jędrzejewski-Szmek wrote:
On Fri, Apr 04, 2014 at 05:30:03PM -0400, Vivek Goyal wrote:
On Fri, Apr 04, 2014 at 02:44:50PM +0800, WANG Chao wrote:

In the kdump kernel, we need to mount certain filesystems, and we use nofail for all mounts specified in /etc/fstab, because we don't want any mount failure to interrupt the boot process before it arrives at dracut-pre-pivot.service (the point just before we switch root). At dracut-pre-pivot, we run our kernel dump capture script (called kdump.sh).

Our kdump.sh needs every .mount to be mounted (or started, in systemd terms) before it gets run by dracut-pre-pivot.service. And dracut-pre-pivot.service is configured to run after local-fs.target. So what we expect is that no matter whether nofail is configured or not, local-fs.target should be delayed until all mount units are started. The same goes for remote nofail mounts and remote-fs.target.

Chao, will this change not force the boot to stop if fsck on said device failed? In that case we will not get a chance to run the default action in kdump. I think there is a conflict between the definition of nofail as defined by fstab/fsck and as interpreted by the systemd fstab generator.

The current behaviour has an important reason: it is not possible to implement wait mode without heavily penalising the case of missing devices. Since systemd boots rather quickly, it has no way of knowing whether the device in question is genuinely missing or just slow to be detected, and it has to wait out the device timeout (3 min, iirc) before continuing. In this light, the current behaviour seems to be a reasonable reinterpretation of nofail for an event-based boot system.

I have a couple of questions:

- Assume nofail is not specified and a device specified in /etc/fstab does not show up. How long will we wait before we give up? It looks like you think it is 3 mins? So by default it is a time-bound wait and not an infinite wait.
- Say we timed out and the device was not found.
I think foo.mount will fail. Now what will happen to all dependent services and targets? For example, will initrd.target be reached or not? In the past it looks like we faced the problem that initrd.target was not reached because the device could not be found, and kdump never got to run.

Nevertheless, I understand the motivation for this patch, and this is something that has been discussed before. What about adding a local-fs-all.target, something like:

    [Unit]
    Description=All local mount points configured in /etc/fstab

    [Install]
    WantedBy=multi-user.target

and having fstab-generator add Before=local-fs-all.target, RequiredBy=local-fs-all.target to the units it generates? Then if someone wants to wait for all local mounts, they can use Requires=/After=local-fs-all.target. And thanks to the [Install] section, a user can do 'systemctl enable local-fs-all.target' to wait for nofail devices.

Defining a new target which by default waits for all the local fs mounts sounds interesting. Again, I have the question: what will happen to local-fs-all.target if some device does not show up and, say, one of the mounts specified in /etc/fstab fails? What we want is:

- Wait for all devices specified in /etc/fstab to show up. Run fsck on the devices. Mount devices at the specified mount points.
- If everything is successful, things are fine and local-fs-all.target will be reached.
- If some device does not show up, or if fsck fails or a mount fails, local-fs-all.target should still be reached so that the kdump module can detect that a failure happened and take alternative action.

For example, assume a user wants to save the vmcore to an NFS destination. Now, for whatever reason, the NFS target could not be mounted. In that case kdump would still like to get control and alternatively save the dump to the local root fs. If systemd just hangs because NFS mounting failed and local-fs-all.target is never reached, then we can't take the backup action.
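Under that proposal, each generated mount unit would carry something like the following (a sketch; local-fs-all.target does not exist in systemd as of this thread, and the unit name below is an example):

```ini
# Generated mnt-data.mount (hypothetical example of generator output)
[Unit]
Before=local-fs-all.target
```

One wrinkle: generator-produced units cannot use [Install] sections, so the RequiredBy=local-fs-all.target half would instead be expressed by the generator dropping a symlink into a local-fs-all.target.requires/ directory in its output tree.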
Thanks
Vivek
Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts
On Mon, Apr 07, 2014 at 10:07:20PM +0400, Andrey Borzenkov wrote:
On Mon, 7 Apr 2014 13:40:17 -0400, Vivek Goyal vgo...@redhat.com wrote:

Defining a new target which by default waits for all the local fs mounts sounds interesting. Again, I have the question: what will happen to local-fs-all.target if some device does not show up and, say, one of the mounts specified in /etc/fstab fails? What we want is:

- Wait for all devices specified in /etc/fstab to show up. Run fsck on the devices. Mount devices at the specified mount points.
- If everything is successful, things are fine and local-fs-all.target will be reached.
- If some device does not show up, or if fsck fails or a mount fails, local-fs-all.target should still be reached so that the kdump module can detect that a failure happened and take alternative action.

You can use OnFailure= to define unit(s) started when local-fs-all.target fails. But it sounds like you are not really interested in *all* filesystems, but in the specific filesystems defined in the kdump configuration.

The kdump script registers with dracut as a pre-pivot hook. And I believe that in initramfs environments /etc/fstab does not contain all filesystems. It primarily contains root and any filesystem specified on the dracut command line using the --mount option during initramfs generation. So my understanding is that, given that /etc/fstab is minimal in the initramfs, we should be fine waiting for all the filesystems specified. Given that we run under the dracut pre-pivot hook callback, I think dracut-pre-pivot.service will have to gain a dependency to run after local-fs-all.target is reached.

Now I am not sure who will generate local-fs-all.target. If dracut generates it, then dracut will also specify OnFailure=. The question still remains how dracut modules will communicate to dracut what to run after local-fs-all.target fails. In fact, if dracut is doing all this, we don't have to create a separate target.
Right now we force nofail so that if a mount fails, initrd.target is still reached. If we can create a separate service just to handle failures, then we should be able to specify OnFailure=dracut-failure-handler.service in the right file and ask modules to register their failure handler hooks there. Something like: create a new hook called pre-pivot-failure and have modules register hooks to handle pre-pivot failures. Then kdump can get control and handle the failure. And this should allow dracut-pre-pivot.service to specify that dracut-failure-handler.service be launched upon failure.

For example, assume a user wants to save the vmcore to an NFS destination. Now, for whatever reason, the NFS target could not be mounted. In that case kdump would still like to get control and alternatively save the dump to the local root fs.

Without knowing the details, it sounds like RequiresMountsFor= is more appropriate (and can be set by a generator based on the actual kdump configuration).

I am not sure how that is useful for this case. dracut already generates all dependencies and puts them in /etc/fstab. And the only entries in /etc/fstab should be the ones dracut wants. So I guess we should be fine and not need RequiresMountsFor=.

Thanks
Vivek
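The failure-handler idea above, sketched as units (every name here is hypothetical; nothing like this exists in dracut as of this thread):

```ini
# Drop-in for dracut-pre-pivot.service (hypothetical)
[Unit]
OnFailure=dracut-failure-handler.service

# dracut-failure-handler.service (hypothetical)
[Unit]
Description=Run registered pre-pivot-failure hooks
DefaultDependencies=no

[Service]
Type=oneshot
# Source every hook script a module registered for the proposed
# pre-pivot-failure hook point (path is an assumption)
ExecStart=/bin/sh -c 'for f in /lib/dracut/hooks/pre-pivot-failure/*.sh; do [ -e "$f" ] && . "$f"; done'
```

kdump would then register its default action (reboot, save to a fallback target, etc.) as one of those hook scripts.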
Re: [systemd-devel] [PATCH] fstab-generator: local-fs.target waits for nofail mounts
On Fri, Apr 04, 2014 at 02:44:50PM +0800, WANG Chao wrote:

In the kdump kernel, we need to mount certain filesystems, and we use nofail for all mounts specified in /etc/fstab, because we don't want any mount failure to interrupt the boot process before it arrives at dracut-pre-pivot.service (the point just before we switch root). At dracut-pre-pivot, we run our kernel dump capture script (called kdump.sh). Our kdump.sh needs every .mount to be mounted (or started, in systemd terms) before it gets run by dracut-pre-pivot.service. And dracut-pre-pivot.service is configured to run after local-fs.target. So what we expect is that no matter whether nofail is configured or not, local-fs.target should be delayed until all mount units are started. The same goes for remote nofail mounts and remote-fs.target.

Chao, will this change not force the boot to stop if fsck on said device failed? In that case we will not get a chance to run the default action in kdump. I think there is a conflict between the definition of nofail as defined by fstab/fsck and as interpreted by the systemd fstab generator.

man fstab says the following:

    nofail    do not report errors for this device if it does not exist.

man fsck says the following:

    fsck normally does not check whether the device actually exists before
    calling a filesystem specific checker. Therefore non-existing devices may
    cause the system to enter filesystem repair mode during boot if the
    filesystem specific checker returns a fatal error. The /etc/fstab mount
    option nofail may be used to have fsck skip non-existing devices. fsck
    also skips non-existing devices that have the special filesystem type auto.

To me, that means one will still try to run fsck on the device and continue if the device is not available, instead of forcing an error. I am not sure what error code fsck returns in this case.
But systemd seems to be implementing "do not wait for the device to mount, and do not wait for fsck results":

http://www.freedesktop.org/software/systemd/man/systemd.mount.html

    If nofail is given, this mount will be only wanted, not required, by
    local-fs.target. This means that the boot will continue even if this
    mount point is not mounted successfully. Option fail has the opposite
    meaning and is the default.

To me, systemd seems to be implementing mountall's "nobootwait", which implies that boot can continue without this mount being successful.

If I go by the fsck definition, nofail implies: start the unit, but if the mount fails, still continue with the rest of the units. It does not mean that other dependent units can be started before starting this unit. So I think nofail should still force Before=.

What happens if nofail is specified, the device is present, and there are filesystem errors? Will fsck continue with boot, or drop the user into a shell during boot and force them to fix the filesystem failures?

I think we also need an option which says: continue to start dependent units even if you failed to mount a filesystem. If nofail semantics imply that, fine; otherwise we will have to create a systemd-specific semantic to meet our needs.

Thanks
Vivek

---
 src/fstab-generator/fstab-generator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/fstab-generator/fstab-generator.c b/src/fstab-generator/fstab-generator.c
index a9a5c02..55938b5 100644
--- a/src/fstab-generator/fstab-generator.c
+++ b/src/fstab-generator/fstab-generator.c
@@ -225,7 +225,7 @@ static int add_mount(
                 "Documentation=man:fstab(5) man:systemd-fstab-generator(8)\n",
                 source);

-        if (post && !noauto && !nofail && !automount)
+        if (post && !noauto && !automount)
                 fprintf(f, "Before=%s\n", post);

         if (passno != 0) {
--
1.8.5.3

___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
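To make the two readings concrete, consider a hypothetical fstab entry:

```ini
# /etc/fstab (illustrative entry; device and mount point are made up)
/dev/sdb1  /mnt/data  ext4  defaults,nofail  0  2
```

Before the patch above, fstab-generator would emit a mnt-data.mount that local-fs.target merely Wants= and that is not ordered Before= it, so the boot neither fails nor waits on it. The patch keeps the Wants= (boot still survives a failure) but restores the Before= ordering, so local-fs.target is delayed until the mount attempt completes.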
Re: [systemd-devel] [PATCH] fstab-generator: respect noauto/nofail when adding sysroot mount
On Thu, Aug 08, 2013 at 03:18:11PM +0800, WANG Chao wrote:

Currently we don't respect the noauto/nofail root mount options (from the rootflags kernel cmdline). We should map these two flags to the corresponding boolean variables noauto and nofail when calling add_mount().

Signed-off-by: WANG Chao chaow...@redhat.com

Chao, will this work for other mount points as specified by the dracut command line option --mount? IOW, if I specify a mount point using --mount with nofail in the filesystem options, and that filesystem can't be mounted, what will happen? Will we continue to run the pre-pivot service?

Thanks
Vivek

---
Days ago, I sent a patch to add rd.weak_sysroot. It seems you guys didn't like it, and neither did I :( So I came up with this update, which looks more reasonable. With this patch, I can set rootflags=nofail to bypass a sysroot failure blocking initrd-root-fs.target. Please put in your comments. Thanks!

 src/fstab-generator/fstab-generator.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/fstab-generator/fstab-generator.c b/src/fstab-generator/fstab-generator.c
index c17299f..87b17cd 100644
--- a/src/fstab-generator/fstab-generator.c
+++ b/src/fstab-generator/fstab-generator.c
@@ -492,6 +492,7 @@ static int parse_new_root_from_proc_cmdline(void) {
         char *w, *state;
         int r;
         size_t l;
+        bool noauto, nofail;

         r = read_one_line_file("/proc/cmdline", &line);
         if (r < 0) {
@@ -547,6 +548,9 @@ static int parse_new_root_from_proc_cmdline(void) {
                 }
         }

+        noauto = !!strstr(opts, "noauto");
+        nofail = !!strstr(opts, "nofail");
+
         if (!what) {
                 log_debug("Could not find a root= entry on the kernel commandline.");
                 return 0;
         }
@@ -558,7 +562,7 @@ static int parse_new_root_from_proc_cmdline(void) {

         log_debug("Found entry what=%s where=/sysroot type=%s", what, type);
-        r = add_mount(what, "/sysroot", type, opts, 0, false, false, false,
+        r = add_mount(what, "/sysroot", type, opts, 0, noauto, nofail, false, false,
                       NULL, NULL, NULL, SPECIAL_INITRD_ROOT_FS_TARGET, "/proc/cmdline");

         return (r < 0) ? r : 0;
--
1.8.3.1
Re: [systemd-devel] [PATCH] fstab-generator: introduce rd.weak_sysroot to bypass failures in sysroot.mount
On Fri, Aug 02, 2013 at 12:15:32PM +0200, Jan Engelhardt wrote:
On Tuesday 2013-07-30 20:41, Vivek Goyal wrote:

FYI, I don't see any CCs on the original mail as displayed on GMane via NNTP... Neither do I, with a normal (non-NNTP, non-Gmail) setup.

I am CCed on the original mail, and that's why I got a copy of it in my inbox.

If you did, you should be able to locate the second copy, received from the mailing list software.

I have received only one copy, and did not receive any copy from the mailing list. I think that is because I have not changed the default user options in the mailing list settings, which let one specify whether the mailing list should send a copy if one is explicitly CCed on a mail. I am not sure how Tom received that mail. If my email id somehow automatically got stripped, I have no idea how that could happen. I tried looking into the systemd-devel archives, but there does not seem to be any info on who was CCed on the mail.

Because there was no one CCed. Which means either that the original sender issued two different mail envelopes with two different mail bodies, or that the ML software stripped the CCs.

I suspect that's the case here: somehow the ML stripped the CC. In reply mails the CCs are intact, so I am not sure why it would happen only with the first mail.

Thanks
Vivek
Re: [systemd-devel] [PATCH] fstab-generator: introduce rd.weak_sysroot to bypass failures in sysroot.mount
On Wed, Jul 31, 2013 at 12:19:06PM +0200, Harald Hoyer wrote:
On 07/30/2013 09:14 PM, Vivek Goyal wrote:
On Wed, Jul 31, 2013 at 12:46:22AM +0800, WANG Chao wrote:
On 07/31/13 at 12:32am, WANG Chao wrote:
On 07/30/13 at 03:46pm, Zbigniew Jędrzejewski-Szmek wrote:
On Tue, Jul 30, 2013 at 09:43:16AM -0400, Vivek Goyal wrote:

[CC harald]

Not sure if this is the right way to do it or not, but I will give more background on the issue. The assumption seems to be built into the initramfs and systemd that root should always be mountable; if one can't mount root, it is a fatal failure. But in the case of the kdump initramfs, this assumption is no longer valid. The core might be being saved to a target which is not root (say over ssh), and even if mounting root fails, that is OK. So we kind of need a mode (possibly driven by a command line option) where, if mounting root fails, it is OK to continue with mounting other targets, and the kdump module will then handle errors.

Maybe rootflags=nofail could be used as this flag?

rootflags=nofail works, thanks. Although it results in a little difference from my approach, I prefer using this one to adding another cmdline param. But I just found that the nofail option only works when the mount device doesn't exist. What if the filesystem is corrupted? sysroot.mount will fail and initrd-root-fs.target will never be reached.

Right. In a kdump environment, for most users the default of dropping into a shell does not make sense. If some server crashes somewhere and we are not able to take a dump due to some failure, most users would like the system to reboot automatically so that services come back online.

I see that right now rd.action_on_fail is parsed by emergency.service, and the service does not start if this parameter is specified. Can't we interpret this parameter a little differently? That is, this parameter modifies the OnFailure= behavior. By default OnFailure= is emergency.service, which is equivalent to a shell. A user can force a change of behavior by specifying it on the command line.
- rd.action_on_failure=shell (OnFailure=emergency.service)
- rd.action_on_failure=reboot (OnFailure=reboot)
- rd.action_on_failure=continue (OnFailure=continue)

Now action_on_failure=continue would effectively pretend that the unit start was successful and go ahead with starting the next unit. This might be a little contentious, though, as other dependent units could fail in unknown ways. By default kdump could use rd.action_on_failure=continue and try to save the dump; if it can't due to previous failures, it will reboot the system anyway.

Also, if emergency.service stops parsing rd.action_on_failure, then the kdump module will be able to start emergency.service itself when it sees there is a problem. Right now, when the kdump module tries to start emergency.service, it fails because the service looks at the action_on_fail parameter (another issue Bao is trying to solve).

Thanks
Vivek

Why not install your own version of emergency.service in the kdump dracut module, which parses rd.action_on_failure and acts accordingly? Or replace emergency.service in the dracut cmdline hook according to rd.action_on_failure.

That is doable, but I think there is still one more issue: what happens to the rest of the systemd services? Once a service fails, systemd will recognize it as a failure and start emergency.service. Once emergency.service exits, what will systemd do? Will it continue to start other services which depend on the failed service? I would guess it will not, because dependent services are supposed to be started only if the previous service started successfully. If that's the case, then just replacing emergency.service is not a solution.

In fact, I think that's how things currently work: emergency.service does not start if action_on_fail=continue is specified on the command line:

    ConditionKernelCommandLine=!action_on_fail=continue

So the core of the problem here is that systemd needs to be aware that the user wants to continue starting other services despite the fact that a previous service failed.
And using an action_on_failure= command line option to trigger the change of behavior is one way of doing it.

Thanks
Vivek
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
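The rd.action_on_failure= option proposed above was never a shipped systemd or dracut interface. Purely as an illustration of the mapping the mail describes, a generator-style script could translate the proposed values into an OnFailure= drop-in for sysroot.mount. Everything below (the option name, the drop-in path, the value-to-unit mapping) is hypothetical, taken from the proposal:

```shell
#!/bin/sh
# Hypothetical sketch: map the *proposed* rd.action_on_failure=
# values onto OnFailure= settings via a unit drop-in.
# This is not a real systemd/dracut interface.

# Pick the failure unit for a given rd.action_on_failure= value.
failure_unit() {
    case "$1" in
        shell)    echo "emergency.service" ;;  # drop to a shell (default)
        reboot)   echo "reboot.target" ;;      # reboot instead of a shell
        continue) echo "" ;;                   # empty OnFailure=: keep booting
        *)        echo "emergency.service" ;;
    esac
}

# Write a drop-in for sysroot.mount based on the chosen action.
write_dropin() {
    action=$1 dir=$2
    unit=$(failure_unit "$action")
    mkdir -p "$dir/sysroot.mount.d"
    {
        echo "[Unit]"
        # An empty OnFailure= clears any previously configured failure units.
        echo "OnFailure=$unit"
    } > "$dir/sysroot.mount.d/50-action-on-failure.conf"
}
```

A dracut cmdline hook could call `write_dropin reboot /run/systemd/system` after parsing the kernel command line; the "continue" case is the contentious one, since clearing OnFailure= does nothing about dependent units still failing.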
Re: [systemd-devel] [PATCH] fstab-generator: introduce rd.weak_sysroot to bypass failures in sysroot.mount
On Wed, Jul 31, 2013 at 09:44:01AM -0400, Vivek Goyal wrote:

[...]

Ok, I noticed the commit that stops emergency.service from running when action_on_fail=continue is specified:

commit dcae873414ff643e1de790f256e414923e2aef8b
Author: Harald Hoyer <har...@redhat.com>
Date:   Thu May 30 11:14:39 2013 +0200

    systemd/emergency.service: do not run for action_on_fail=continue
    same as for dracut-emergency.service

Apart from the issue of other services not starting, we are facing another issue. The kdump module wants to start a bash shell upon failure, and does so by starting emergency.service. That now fails because we booted with action_on_fail=continue.
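The ConditionKernelCommandLine=!action_on_fail=continue condition discussed in this thread can be approximated in a few lines of shell to show why kdump's manual start of emergency.service fails. systemd implements this check in C; the sketch below is only an illustration of the behavior (whole-token match on the command line, negated by the leading "!"), not systemd's actual code:

```shell
#!/bin/sh
# Illustration only: approximate how a negated
# ConditionKernelCommandLine=!action_on_fail=continue behaves.
# systemd's real implementation is in C; this is not its code.

# Return 0 (true) if the given word appears as a whole
# whitespace-separated token on the kernel command line.
cmdline_has() {
    word=$1 cmdline=$2
    for tok in $cmdline; do
        [ "$tok" = "$word" ] && return 0
    done
    return 1
}

# emergency.service may start only when the token is absent,
# regardless of whether it was activated by OnFailure= or manually.
emergency_allowed() {
    ! cmdline_has "action_on_fail=continue" "$1"
}
```

The key point the thread makes: the condition applies no matter who requests the start, so `systemctl start emergency.service` from the kdump scripts is skipped on a `action_on_fail=continue` boot just like an OnFailure= activation would be.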
Re: [systemd-devel] [PATCH] fstab-generator: introduce rd.weak_sysroot to bypass failures in sysroot.mount
[...]

On Tue, Jul 30, 2013 at 07:53:11PM +0800, WANG Chao wrote:

If the kernel command line specifies rd.weak_sysroot, fstab-generator will generate a weaker version of sysroot.mount:

- It's not required by initrd-root-fs.target.
- It's not ordered before initrd-root-fs.target.

So a failure in the weaker sysroot.mount will not fail initrd-root-fs.target, and systemd will try to continue rather than entering isolated emergency mode.

Signed-off-by: WANG Chao <chaow...@redhat.com>
---
 man/kernel-command-line.xml           | 10 ++++++++++
 man/systemd-fstab-generator.xml       | 10 ++++++++++
 src/fstab-generator/fstab-generator.c |  5 ++++-
 3 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/man/kernel-command-line.xml b/man/kernel-command-line.xml
index a4b7d13..0c2e97d 100644
--- a/man/kernel-command-line.xml
+++ b/man/kernel-command-line.xml
@@ -274,6 +274,16 @@
                 </varlistentry>

                 <varlistentry>
+                        <term><varname>rd.weak_sysroot</varname></term>
+
+                        <listitem>
+                                <para>Configures the sysroot.mount
+                                logic in the initrd. For details, see
+                                <citerefentry><refentrytitle>systemd-fstab-generator</refentrytitle><manvolnum>8</manvolnum></citerefentry>.</para>
+                        </listitem>
+                </varlistentry>
+
+                <varlistentry>
                         <term><varname>modules-load=</varname></term>
                         <term><varname>rd.modules-load=</varname></term>

diff --git a/man/systemd-fstab-generator.xml b/man/systemd-fstab-generator.xml
index 4bd25bf..de0ed2f 100644
--- a/man/systemd-fstab-generator.xml
+++ b/man/systemd-fstab-generator.xml
@@ -101,6 +101,16 @@
                 the initrd.
                 </para></listitem>
                 </varlistentry>

+                <varlistentry>
+                        <term><varname>rd.weak_sysroot</varname></term>
+
+                        <listitem><para>If specified, systemd will
+                        ignore failures in sysroot.mount and try to
+                        continue rather than enter emergency mode.
+                        It is honored only by the initial RAM disk
+                        (initrd).</para></listitem>
+                </varlistentry>
+
         </variablelist>

diff --git a/src/fstab-generator/fstab-generator.c b/src/fstab-generator/fstab-generator.c
index c17299f..449e725 100644
--- a/src/fstab-generator/fstab-generator.c
+++ b/src/fstab-generator/fstab-generator.c
@@ -492,6 +492,7 @@ static int parse_new_root_from_proc_cmdline(void) {
         char *w, *state;
         int r;
         size_t l;
+        bool weak = false;

         r = read_one_line_file("/proc/cmdline", &line);
         if (r < 0) {
@@ -544,6 +545,8 @@ static int parse_new_root_from_proc_cmdline(void) {
                         free(opts);
                         opts = o;
+                } else if (streq(word, "rd.weak_sysroot")) {
+                        weak = true;
                 }
         }
@@ -558,7 +561,7 @@ static int parse_new_root_from_proc_cmdline(void) {
         }

         log_debug("Found entry what=%s where=/sysroot type=%s", what, type);
-        r = add_mount(what, "/sysroot", type, opts, 0, false, false, false,
+        r = add_mount(what, "/sysroot", type, opts, 0, false, weak, false,
                       false, NULL, NULL, NULL, SPECIAL_INITRD_ROOT_FS_TARGET, "/proc/cmdline");
         return (r < 0) ? r : 0;
--
1.8.3.1
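The effect of the (unmerged) patch above can be sketched by showing what the generator would and would not write into sysroot.mount. The output format below is simplified and the device path is a placeholder; the real generator emits more fields and also creates a .requires/ symlink, which the weak variant would likewise skip:

```shell
#!/bin/sh
# Behavioral sketch of the unmerged rd.weak_sysroot patch: with
# weak=true the generator omits the ordering link that makes
# initrd-root-fs.target wait for (and fail with) sysroot.mount.
# Simplified, hypothetical output; not the real generator's format.

emit_sysroot_mount() {
    weak=$1   # "true" or "false"
    echo "[Unit]"
    if [ "$weak" != true ]; then
        # Normal case: initrd-root-fs.target requires and is ordered
        # after sysroot.mount, so a mount failure fails the target.
        echo "Before=initrd-root-fs.target"
    fi
    echo "[Mount]"
    echo "What=/dev/disk/by-label/root"   # placeholder device
    echo "Where=/sysroot"
}
```

As Tom points out later in the thread, dropping the Before= line is exactly what introduces the race: nothing orders the root mount against whatever kdump does next.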
Re: [systemd-devel] [PATCH] fstab-generator: introduce rd.weak_sysroot to bypass failures in sysroot.mount
On Tue, Jul 30, 2013 at 02:05:08PM +0200, Tom Gundersen wrote: On Tue, Jul 30, 2013 at 1:53 PM, WANG Chao <chaow...@redhat.com> wrote:

[...]

Can you give an example case where this is useful? I.e., what is the setup, and how is boot supposed to succeed with a failing sysroot?

Hi,

Can you please not drop the people CC'ed on the original thread from the conversation?

Thanks
Vivek
Re: [systemd-devel] [PATCH] fstab-generator: introduce rd.weak_sysroot to bypass failures in sysroot.mount
On Tue, Jul 30, 2013 at 04:02:17PM +0200, Tom Gundersen wrote: On Tue, Jul 30, 2013 at 2:27 PM, WANG Chao <chaow...@redhat.com> wrote: On 07/30/13 at 02:05pm, Tom Gundersen wrote: On Tue, Jul 30, 2013 at 1:53 PM, WANG Chao <chaow...@redhat.com> wrote:

- It's not ordered before initrd-root-fs.target.

In the case of kdump, the 2nd kernel's initrd is used to mount a non-root local/remote filesystem and dump the vmcore there. The kdump script runs right before switch-root and will reboot after saving the vmcore. So mounting sysroot isn't really justified in this case. But it's still acceptable (since it's a read-only mount), as long as it's not keeping systemd from reaching initrd.target (so the kdump script can run later).

If you don't have the Before=initrd-root-fs.target, it means that you'll have a race: sometimes the rootfs will be mounted before kdump does whatever it does, and sometimes it won't. Would an option be to not specify root= at all in your case?

Not specifying root= is not an option, as it serves as a backup dump target for us. Our primary target might be to send the dump over the network, but if that fails for some reason, then based on a user config option we will dump the core to root in /var/crash.

Thanks
Vivek
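The primary/backup flow Vivek describes (network dump first, root filesystem as fallback) reduces to a small piece of decision logic. The helper names and the fallback directory below are illustrative only; the real kdump scripts in the dracut module are considerably more involved:

```shell
#!/bin/sh
# Illustrative decision logic for the kdump flow described above:
# try the primary (e.g. network) dump target first, and fall back to
# the mounted root filesystem if it fails. Hypothetical helpers; the
# real dracut kdump module is more involved and reboots afterwards.

save_vmcore() {
    primary=$1       # command that attempts the primary dump
    fallback_dir=$2  # e.g. /sysroot/var/crash once root is mounted

    if $primary; then
        echo "dumped over network"
    elif [ -d "$fallback_dir" ]; then
        # root= was still mountable (weak/nofail), use it as backup.
        cp /proc/vmcore "$fallback_dir/vmcore" 2>/dev/null
        echo "dumped to $fallback_dir"
    else
        echo "no dump target available"
    fi
}
```

This is why root= must stay on the command line even though a root mount failure must not abort the boot: the fallback branch needs the mount when it exists, but the whole flow must survive when it doesn't.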
Re: [systemd-devel] [PATCH] fstab-generator: introduce rd.weak_sysroot to bypass failures in sysroot.mount
On Tue, Jul 30, 2013 at 07:34:01PM +0100, Colin Guthrie wrote: 'Twas brillig, and Vivek Goyal at 30/07/13 15:26 did gyre and gimble:

[...]

FYI, I don't see any CC's on the original mail as displayed on GMane via NNTP...

I am CC'ed on the original mail, and that's why I got a copy of it in my Inbox. I am not sure how Tom received that mail. If my email id somehow got stripped automatically, I have no idea how that could happen. I tried looking into the systemd-devel archives, but there does not seem to be any info about who is CC'ed on a mail.

Thanks
Vivek
Re: [systemd-devel] [PATCH] fstab-generator: introduce rd.weak_sysroot to bypass failures in sysroot.mount

On Wed, Jul 31, 2013 at 12:46:22AM +0800, WANG Chao wrote:

[...]
Re: [systemd-devel] Systemd and cgroup management (and libcgroup)
On Mon, Oct 31, 2011 at 10:32:30PM +0100, Lennart Poettering wrote: On Mon, 31.10.11 14:43, Jan Safranek (jsafr...@redhat.com) wrote:

Even if there is, then it looks like systemd is a better place to manage it, as it is already setting up the whole system and the top-level hierarchies. Thanks to Jason for the suggestion.

Systemd pretty much covers most of the use cases. The main reason to keep a separate cgconfig is mounting several controllers together in one hierarchy - AFAIK systemd won't support this mounting. Still, systemd will happily put services into cgroups there.

We actually do support mounting hierarchies jointly these days. Use JoinControllers= to achieve that. By default we mount cpu and cpuacct together.

Lennart wrote once in a previous discussion [1]:

"systemd will create automatic groups for users and services, but will not help you to set up any more complex hierarchy than just 1:1 service-to-cgroup mappings. As soon as you want a more complex tree, with multiple levels or something like this, you will need something like cgconfig which allows you to create any tree you want."

The question is whether we really need complex cgroup hierarchies and/or multiple controllers in a hierarchy.

I am quite sure that sooner or later some folks will need complex cgroup hierarchies, for example if they want to give a group "teachers" more resources than the group "students" or so. I am very sure that some people want features like that, but I am also quite sure I don't want to cover that in systemd, which is why I am happy if cgconfig can fill that void. I think systemd will cover 90% of the use cases, but the 10% that are left are valid too, and cgconfig sounds like a useful tool to make that work.

I think it is a little problematic from the user's experience point of view: for some things/functionality, go talk to systemd or use its APIs, and for the rest, go talk to cgconfig/libcgroup.
If systemd is managing users, then it should not be too hard to provide a command line to put teacher users in one set of specified cgroups and student users in another.

To me, the cgconfig infrastructure has a big shortcoming: it can only create cgroups, and after that it can't enforce anything. One can create teacher or student cgroups, but one can't enforce where those sessions actually run. We tried to compensate for that using the pam_cgroup PAM plugin. But given the fact that systemd comes with a default policy of putting every user in a cgroup of its own, any user configuration also becomes dicey: one needs to override the default settings of systemd.

I think for the sake of a better user experience, as well as conceptually, it makes sense for systemd to take over all the cgconfig functionality. I don't think that leaving 10% of the use cases to cgconfig and letting it remain a separate subsystem is a good idea. It will conflict with systemd in various interesting ways, and the user will at best be confused about whom to talk to for what functionality.

Especially for me: I am trying to write a matahari agent for better cgroup management. Now if a user wants to change the default cgroup of a service, whom should it talk to: the cgroup agent or the service agent? (There is already a service agent providing basic operations like starting and stopping services.)

Thanks
Vivek
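The "teachers vs. students" example can be made concrete with the kind of raw cgroupfs operations cgconfig performs under the hood. The script below operates on a stand-in directory rather than a real /sys/fs/cgroup mount, and the cpu.shares values are purely illustrative:

```shell
#!/bin/sh
# Illustration of the static-hierarchy setup cgconfig provides:
# create 'teachers' and 'students' groups under a cpu hierarchy and
# give teachers a larger cpu.shares weight. CGROOT points at a
# stand-in directory in this sketch, not a real cgroup mount.

setup_class_groups() {
    CGROOT=$1
    for grp in teachers students; do
        mkdir -p "$CGROOT/cpu/$grp"
    done
    # Relative weights: teachers get 2x the CPU of students under contention.
    echo 2048 > "$CGROOT/cpu/teachers/cpu.shares"
    echo 1024 > "$CGROOT/cpu/students/cpu.shares"
}
```

This also makes the enforcement gap visible: creating the directories and writing the weights is all cgconfig can do. Actually moving each login session's PID into the right group (echoing it into the group's tasks file) is the part that needed pam_cgroup, and that systemd's own per-user cgroup policy now competes with.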
[systemd-devel] Systemd and cgroup management (and libcgroup)
Hi,

We have talked a lot about libcgroup and systemd in the past, and just when I thought that debate was settled, here come some more things to discuss. Previously libcg was doing cgroup management, and now systemd has taken over a lot of it:

- Creation of hierarchies. Taking control of hierarchies has taken away part of the cgconfig functionality.
- Providing automatic cgroups for users has taken away part of the functionality provided by the cgroup PAM plugin.
- Providing automatic cgroups for services has taken away the service management which in the past could potentially be done with the help of cgconfig.

Now systemd is managing services and users from a cgroup point of view, which past init systems never did, and that was part of the reason to have the pam_cgroup plugin and cgconfig. Given the fact that the new init system has taken over a lot of cgroup management, a few things come to mind:

- Should systemd provide a way to change the default cgroups of users, as it does for services?
- Should systemd provide a way to change the default resources of users' cgroups, as it does for services?
- Cgroups and their associated resources now become properties of the objects managed by systemd (services and users). To me it makes sense to provide an API so that an application can call into it and manage them. Should systemd provide an API to manage the cgroups and resources of the services and users it is managing? Lennart, I know you had said that editing the unit file is an API, but a real API will help.

Should cgconfig-equivalent functionality be owned by systemd? This is a contentious one, and I think there are two parts to it:

- Is cgconfig really needed? What are the use cases, given the fact that systemd now takes care of setting up the system?
- If cgconfig is needed, then who should really manage it: systemd or libcg?

I am finding it hard to think of any good use cases for cgconfig now, given the fact that systemd has taken over service and user management.
What else is left out? Everything is a child of either user sessions or services, and those should manage their own child cgroups. Where's the need for statically defining cgroups, resources, and what will be launched in them? Even if there is one, then it looks like systemd is a better place to manage it, as it is already setting up the whole system and the top-level hierarchies. Thanks to Jason for the suggestion.

Any thoughts on the above issues are appreciated.

Thanks
Vivek
Re: [systemd-devel] Systemd and cgroup management (and libcgroup)
On Tue, Oct 18, 2011 at 04:45:59PM -0400, Vivek Goyal wrote:

Hi,

Oops, got old addresses for dhaval and balbir. Fixing them. Please reply to this mail instead of the previous one.

Thanks
Vivek

[...]