'Twas brillig, and Colin Guthrie at 03/02/12 16:00 did gyre and gimble:
> 'Twas brillig, and David W. Hodgins at 03/02/12 08:04 did gyre and gimble:
>> On Tue, 17 Jan 2012 07:22:30 -0500, Colin Guthrie
>> <[email protected]> wrote:
>>
>>> Are things working OK for you now with dracut or is it still busted?
>>
>> Just to clarify why I think the problem is happening on single
>> core systems.
>>
>> On a multi-core system, the bash and udevd processes will be
>> running on different cores.
>> When the script executes the udev settle command, it continues
>> to execute, so the loop checking to see if udev is done finds
>> it isn't, so it then looks for/runs the initqueue jobs.
>>
>> On a single core system, the bash script waits for the settle
>> command to finish, so then finds it's done, and exits without
>> even trying to run the initqueue jobs.
>>
>> The patch in my prior message is effectively changing the script
>> from "udev done or jobs done" to "udev done and jobs done".
>
> Hmm, actually thinking about this more, I'm not 100% sure I agree with
> this argument. The number of cores should be irrelevant here as the
> program itself should be dealing with things synchronously anyway.
>
> I'm wondering if it's more of an issue relating to the fact that it's
> not specifically waiting for the LVM device to be ready. I guess your /
> is either not on LVM or is in a different Volume Group? In my tests it
> worked, but perhaps the dual core machine is simply that bit faster (and
> it's speed, not #cores that is important)?
>
> In the file parse-lvm.sh, it does a for loop and has a wait_for_dev
> call. This function will put stuff into the initqueue that should
> prevent the exiting of the loop until that device exists...
>
> for dev in $(getargs rd.lvm.vg rd_LVM_VG=) $(getargs rd.lvm.lv rd_LVM_LV=); do
>     wait_for_dev "/dev/$dev"
> done
>
> Now according to the man page, these options are only meant to be used
> to restrict what devices are activated so they shouldn't be needed per-se.
>
> But it brings an important point... there does not appear to be any
> "wait_for_dev" calls for the usrmount module So nothing is going to be
> waiting for the device to exist. If it takes a little while to come up
> it could lead to your error.
>
> And herein we have chicken and egg... we don't know where /usr is (i.e.
> which /dev/foo) until we mount / (as we have to read /etc/fstab). But
> by the time we've mounted /, we've already exited this loop and thus
> cannot re-enter the loop to wait for more devices.
>
> Tricky, and certainly something I'll discuss with Harald this weekend.
> He does have a separate branch that deals with usr mounting in a more
> holistic way (i.e. it handles /usr/bin being a separate mount if that
> floats your boat!), but I've not looked at this for a while to see if
> he's progressed any with it.
>
> All in all, it's perhaps just the fact that the first call to udevadm
> settle is skipped due to there being nothing in your initqueue/finished/
> folder? You can check via passing rd.break=initqueue and looking in the
> folder.
>
> If so, then all that should be needed to get this into shape is to put a
> dummy file in there as part of the 98usrmount module, have that file
> delete itself and return and error code, thus causing check_finished()
> to return non zero and thus the call to udevsettle will be reached.
>
>
> If this is NOT the issue, then it should just be a timing thing plain
> and simple. To confirm, this you should simply be able to pass
> rd.break=pre-pivot to the command line, wait a little while and then
> just type exit to continue the boot process. This extra time should be
> sufficient for udev to "see" the LVM stuff and for the mount command to
> succeed (I hope!)
>
> Sorry for the long reply. You will likely have to poke in the dracut
> code to understand everything I'm saying, but it looks like you're doing
> that happily already :D

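
For what it's worth, the dummy-file idea quoted above can be sketched roughly as follows. This is a standalone simulation only, not dracut code: check_finished_sketch, the temporary directory and the hook name usrmount-dummy.sh are all made up for illustration, and the real check_finished() may well differ in detail. The point is just that a hook which deletes itself and fails once makes the first "finished" check fail, so the loop would go round (and reach the settle call) at least once.

```shell
#!/bin/sh
# Hypothetical stand-in for the initqueue "finished" hook directory.
finished_dir=$(mktemp -d)

# Simplified stand-in for a check_finished()-style helper (an assumption,
# not the real dracut function): succeed only if every hook succeeds.
check_finished_sketch() {
    for f in "$finished_dir"/*.sh; do
        [ -e "$f" ] || continue      # glob matched nothing: no hooks left
        ( . "$f" ) || return 1       # any failing hook means "not finished"
    done
    return 0
}

# Dummy hook (hypothetical name): removes itself, then reports failure once.
# The heredoc is unquoted on purpose so $finished_dir is expanded now.
cat > "$finished_dir/usrmount-dummy.sh" <<EOF
rm -f "$finished_dir/usrmount-dummy.sh"
return 1
EOF

check_finished_sketch; first=$?    # hook present: fails and deletes itself
check_finished_sketch; second=$?   # hook gone: nothing left to block on
echo "first=$first second=$second"

rm -rf "$finished_dir"
```

The first check returns non-zero and the second returns zero, which is the "go round the loop once" behaviour described above. Whether one extra pass is actually enough for the LVM device to appear is a separate question.
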
OK, so I sadly didn't get a chance to speak to Harald in Brussels (only
saw him briefly during a talk so couldn't go through my list of issues
:)) but think my comments above were correct.

To summarise, a problem would occur if / was on ext4 and /usr was on
LVM. The LVM would never get activated. If / was on LVM too (but a
different VG to /usr) then all would be fine. I think this is the
scenario you had issues with.

Looking at the new code in dracut 015, I think it writes out the
variables I mentioned above (rd.lvm.vg) into a cmdline.d folder and thus
the LVM for /usr should now get activated.

In short, can you test the new dracut version just submitted?

Cheers

Col

-- 

Colin Guthrie
colin(at)mageia.org
http://colin.guthr.ie/

Day Job:
  Tribalogic Limited http://www.tribalogic.net/
Open Source:
  Mageia Contributor http://www.mageia.org/
  PulseAudio Hacker http://www.pulseaudio.org/
  Trac Hacker http://trac.edgewall.org/
