On 25/08/2014 17:51, Peter Humphrey wrote:
On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
On 25/08/2014 12:17, Peter Humphrey wrote:
<snip>
Well, it was simple. I just said "rc-update del mdraid boot" and all is
now well. I'd better revisit the docs to see if they still give the same
advice.
Very interesting indeed.
You wrote this e-mail after the other two, so I'll stick to this route,
leaving the other idea for later if needed.
I now wonder whether this is a race condition between the init script
running `mdadm -As` and the udev rules installed by the mdadm package,
which allow for automatic incremental assembly.
Isn't it just that, with the kernel auto-assembly of the root partition, and
udev rules having assembled the rest, all the work's been done by the time the
mdraid init script is called? I had wondered about the time that udev startup
takes; assembling the raids would account for it.
Yes, it's a possibility and would constitute a race condition - even
though it might ultimately be a harmless one. As touched upon in the
preceding post, I'd really like to know why mdadm sees fit to return a
non-zero exit code given that the arrays are actually assembled
successfully.
After all, even if the arrays are partially or fully assembled at the
point that mdadm is executed by the mdraid init script, it surely ought
not to matter. As long as the arrays are fully assembled by the time
mdadm exits, it should return 0 to signify success. Nothing else makes
sense, in my opinion. It's absurd that the mdraid script is drawn into
printing a blank error message where nothing has gone wrong.
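The failure mode can be sketched in miniature. Here a stub function
stands in for mdadm; that it exits non-zero with no message when the
arrays are already up is an assumption about mdadm's behaviour in this
situation, not something confirmed in the thread:

```shell
#!/bin/sh
# Sketch of the mdraid init-script logic under discussion. The real
# script runs `mdadm -As` and reports failure on any non-zero exit.
# mdadm_stub is hypothetical: it exits 1 with no output, mimicking
# arrays that udev had already assembled before the script ran.
mdadm_stub() {
    return 1   # arrays already up, yet the exit code signals "failure"
}
if output=$(mdadm_stub 2>&1); then
    status=started
else
    # The init script ends up on this branch, so its error message is
    # built from an empty string - hence the blank error at boot.
    status="error: '$output'"
fi
echo "$status"
```

If mdadm instead returned 0 whenever the arrays are assembled on exit,
the first branch would be taken and no spurious error would appear.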
Further, the mdadm ebuild still prints elog messages stating that mdraid
is a requirement for the boot runlevel but, with udev rules, I don't see
how that can be true. With udev being event-driven and calling mdadm
upon the introduction of a new device, the array should be up and
running as of the very moment that all the disks are seen, no matter
whether the mdraid init script is executed or not.
Refer to /lib/udev/rules.d/64-md-raid.rules and you'll see that it calls
`mdadm --incremental` for newly added devices.
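For reference, the incremental-assembly rule looks roughly like this.
The excerpt is illustrative only - the exact rule text differs between
mdadm versions:

```
# illustrative excerpt: /lib/udev/rules.d/64-md-raid.rules
SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
    RUN+="/sbin/mdadm --incremental $env{DEVNAME}"
```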
# ls -l /lib/udev/rules.d | grep raid
-rw-r--r-- 1 root root 2.1K Aug 23 10:34 63-md-raid-arrays.rules
-rw-r--r-- 1 root root 1.4K Aug 23 10:34 64-md-raid-assembly.rules
With that in mind, here's something else for you to try. Doing this will
render these udev rules null and void:
# touch /etc/udev/rules.d/64-md-raid.rules
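Touching an empty file of the same name works because udev resolves
rules files by basename, and a file in /etc/udev/rules.d takes
precedence over one in /lib/udev/rules.d. A minimal sketch of that
lookup, simulated with temporary directories (the filenames are from
this thread; the script itself is only an illustration):

```shell
#!/bin/sh
# udev's precedence, in miniature: when /etc/udev/rules.d and
# /lib/udev/rules.d both contain a file of the same name, only the
# /etc copy is read, so an empty /etc copy masks the /lib rules.
lib=$(mktemp -d)   # stands in for /lib/udev/rules.d
etc=$(mktemp -d)   # stands in for /etc/udev/rules.d
printf 'RUN+="/sbin/mdadm --incremental $env{DEVNAME}"\n' \
    > "$lib/64-md-raid-assembly.rules"
: > "$etc/64-md-raid-assembly.rules"   # what touch(1) creates: an empty mask
# Prefer the /etc copy when a file of the same basename exists there.
for f in "$lib"/*.rules; do
    base=${f##*/}
    if [ -e "$etc/$base" ]; then chosen="$etc/$base"; else chosen="$f"; fi
    printf '%s -> %s (%s bytes)\n' "$base" "$chosen" "$(wc -c < "$chosen")"
done
```

The empty override wins, so the `RUN+=` directive in the /lib copy is
never executed.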
I did that, but I think I need instead to
# touch /etc/udev/rules.d/63-md-raid-arrays.rules
# touch /etc/udev/rules.d/64-md-raid-assembly.rules
Ah, yes. Looks like the rules have changed in >=mdadm-3.3. I'm still
using mdadm-3.2.6-r1.
I'll try it now...
Thereafter, the mdraid script will be the only agent trying to assemble
the 1.x metadata arrays so make sure that it is re-enabled.
Right. Here's the position:
1. I've left /etc/init.d/mdraid out of all run levels. I have nothing but
comments in mdadm.conf, but then it's not likely to be read anyway if the
init script isn't running.
2. I have empty /etc/udev rules files as above.
3. I have kernel auto-assembly of raid enabled.
4. I don't use an init ram disk.
5. The root partition is on /dev/md5 (0.90 metadata)
6. All other partitions except /boot are under /dev/vg7 which is built on
top of /dev/md7 (1.x metadata).
7. The system boots normally.
I must confess that this boggles my mind. Under these circumstances, I
cannot fathom how - or when - the 1.x arrays are being assembled.
Something has to be executing mdadm at some point.
I'm not actually sure that there is any point in calling mdadm -As where
the udev rules are present. I would expect it to be one approach or the
other, but not both at the same time.
That makes sense to me too. Do I even need sys-fs/mdadm installed? Maybe I'll
try removing it. I have a little rescue system in the same box, so it'd be
easy to put it back if necessary.
Yes, you need mdadm because 1.x metadata arrays must be assembled in
userspace. In Gentoo, there are three contexts I know of in which this
may occur:-
1) Within an initramfs
2) As a result of the udev rules
3) As a result of the mdraid script
Incidentally, the udev rules were a source of controversy in the
following bug. Not everyone appreciates that they are installed by default.
https://bugs.gentoo.org/show_bug.cgi?id=401707
I'll have a look at that - thanks.
--Kerin