On Wed, Jul 31, 2013 at 09:44:01AM -0400, Vivek Goyal wrote:
> On Wed, Jul 31, 2013 at 12:19:06PM +0200, Harald Hoyer wrote:
> > On 07/30/2013 09:14 PM, Vivek Goyal wrote:
> > > On Wed, Jul 31, 2013 at 12:46:22AM +0800, WANG Chao wrote:
> > >> On 07/31/13 at 12:32am, WANG Chao wrote:
> > >>> On 07/30/13 at 03:46pm, Zbigniew Jędrzejewski-Szmek wrote:
> > >>>> On Tue, Jul 30, 2013 at 09:43:16AM -0400, Vivek Goyal wrote:
> > >>>>> [CC harald]
> > >>>>>
> > >>>>> Not sure if this is the right way to do it or not, but I will give
> > >>>>> more background about the issue.
> > >>>>>
> > >>>>> This assumption seems to be built into initramfs and systemd that root
> > >>>>> should always be mountable. If one can't mount root, it is a fatal
> > >>>>> failure.
> > >>>>>
> > >>>>> But in case of kdump initramfs, this assumption is no longer valid.
> > >>>>> Core might be being saved to a target which is not root (say over
> > >>>>> ssh). And even if mounting root fails, it is ok.
> > >>>>>
> > >>>>> So we kind of need a mode (possibly driven by command line option)
> > >>>>> where if mounting root failed, it is ok and we continue with mounting
> > >>>>> other targets, and kdump module will then handle errors.
> > >>>> Maybe rootfsflags=nofail could be used as this flag?
> > >>>
> > >>> rootflags=nofail works. Thanks.
> > >>>
> > >>> Although the result differs a little from my approach, I prefer using
> > >>> this one to adding another cmdline param.
> > >>
> > >> I just found that the nofail option only works when the mount device
> > >> doesn't exist.
> > >>
> > >> What if the filesystem is corrupted? sysroot.mount will fail and
> > >> initrd-root-fs.target will never be reached.
> > >
> > > Right.
> > >
> > > In kdump environment, for most of the users default of dropping into a
> > > shell does not make sense. If some server crashes somewhere and we are
> > > not able to take dump due to some failure, most of the users will like
> > > that system reboots automatically and services are back up online.
> > >
> > > I see that right now rd.action_on_fail is parsed by emergency.service and
> > > service does not start if this parameter is specified.
> > >
> > > Can't we interpret this parameter a little differently? That is, this
> > > parameter modifies the OnFailure= behavior.
> > >
> > > So by default OnFailure= is emergency.service which is equivalent to
> > > a shell.
> > >
> > > A user can force change of behavior by specifying command line.
> > >
> > > rd.action_on_failure=shell (OnFailure=emergency.service)
> > > rd.action_on_failure=reboot (OnFailure=reboot)
> > > rd.action_on_failure=continue (OnFailure=continue)
> > >
> > > Now action_on_failure=continue will effectively pretend that unit start
> > > was successful and go ahead with starting next unit. This might be a
> > > little contentious though, as other dependent units will fail in
> > > unknown ways.
> > >
> > > Now by default kdump can use rd.action_on_failure=continue and try to
> > > save dump. If it can't due to previous failures, then it will anyway
> > > reboot the system.
> > >
> > > Also if emergency.service stops parsing rd.action_on_failure, then kdump
> > > module will be able to start emergency.service too when it sees there
> > > is a problem. Right now when kdump module tries to start emergency.service
> > > it fails because it looks at action_on_fail parameter (another issue Bao
> > > is trying to solve).
> > >
> > > Thanks
> > > Vivek
> > >
> >
> > Why not install your own version of emergency.service in the kdump dracut
> > module, which parses rd.action_on_failure and acts accordingly. Or replace
> > emergency.service in the dracut cmdline hook according to
> > rd.action_on_failure.
>
> That is doable but I think there is still one more issue. What happens
> to the rest of the systemd services? Once a service fails, systemd will
> recognize it as failure and start emergency.service. Once
> emergency.service exits, what will systemd do? Will it continue to
> start other services which are dependent on the failed service?
>
> I would guess it will not start them, because dependent services are
> supposed to be started only if the previous service started successfully.
>
> If that's the case, then just replacing emergency.service is not a
> solution. In fact, I think that's how things are currently working.
> emergency.service does not start if action_on_fail=continue is specified
> on command line.
>
> ConditionKernelCommandLine=!action_on_fail=continue
>
> So core of the problem here is that systemd needs to be aware that
> user wants to continue to start other services despite the fact that
> previous service failed. And using action_on_failure= command line to
> trigger change of behavior is one way of doing it.

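To make the proposed mapping a bit more concrete, here is a rough Python
sketch of how the value could be picked off the kernel command line and
turned into a failure action. This is only an illustration of the idea in
this thread, not systemd or dracut code: the function name and the table are
made up, and "reboot" and "continue" are the shorthand used in the proposal,
not OnFailure= values that systemd is known to accept today.

```python
# Proposed mapping from the thread (hypothetical, not existing behavior):
# option value -> what OnFailure= would effectively do.
PROPOSED_ACTIONS = {
    "shell": "emergency.service",
    "reboot": "reboot",
    "continue": "continue",
}


def action_on_failure(cmdline, default="shell"):
    """Return (value, action) for rd.action_on_failure= found in cmdline.

    The last occurrence on the command line wins. A missing or unknown
    value falls back to the default (drop to a shell).
    """
    value = default
    for word in cmdline.split():
        key, sep, val = word.partition("=")
        if sep and key == "rd.action_on_failure":
            value = val
    if value not in PROPOSED_ACTIONS:
        value = default
    return value, PROPOSED_ACTIONS[value]
```

On a running system the input would be the contents of /proc/cmdline. With
a kdump command line carrying rd.action_on_failure=continue, the function
returns ("continue", "continue"), and without the option it falls back to
("shell", "emergency.service").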
Ok, I noticed that the commit which stops emergency.service from running
with action_on_fail=continue has been mentioned.

commit dcae873414ff643e1de790f256e414923e2aef8b
Author: Harald Hoyer <har...@redhat.com>
Date:   Thu May 30 11:14:39 2013 +0200

    systemd/emergency.service: do not run for action_on_fail=continue

    same as for dracut-emergency.service

Apart from the issue of other services not starting, we are facing
another issue. The kdump module wants to start a bash shell upon failure
by starting emergency.service, and that now fails because we are booted
with action_on_fail=continue.

If we make systemd aware of action_on_fail=continue, then we can take
this condition out of emergency.service and the problem will be solved.

I guess the other option could be to modify emergency.service on the fly
(remove ConditionKernelCommandLine=!action_on_fail=continue), reload the
systemd configuration, and then start emergency.service. If this works,
it will take care of the second problem but not the first one.

Thanks
Vivek