On 25/08/2014 17:51, Peter Humphrey wrote:
On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
On 25/08/2014 12:17, Peter Humphrey wrote:

<snip>

Well, it was simple. I just said "rc-update del mdraid boot" and all is now
well. I'd better revisit the docs to see if they still give the same advice.

Very interesting indeed.

You wrote this e-mail after the other two, so I'll stick to this route,
leaving the other idea for later if needed.

I now wonder whether this is a race condition between the init script running
`mdadm -As` and the udev rules installed by the mdadm package, which allow for
automatic incremental assembly.
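(To spell out the two mechanisms as I understand them: the init script
performs a one-shot scan, something like

  # mdadm -As

whereas the udev rule fires once per member device as it appears, roughly

  # mdadm --incremental /dev/sdXn

with /dev/sdXn standing in for whichever partition udev has just seen. If
both mechanisms run, whichever gets there first leaves the other with
nothing left to assemble.)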

Isn't it just that, with the kernel auto-assembling the root partition and the
udev rules having assembled the rest, all the work has been done by the time the
mdraid init script is called? I had wondered about the time that udev startup
takes; assembling the arrays would account for it.

Yes, it's a possibility and would constitute a race condition - even though it might ultimately be a harmless one. As touched upon in the preceding post, I'd really like to know why mdadm sees fit to return a non-zero exit code given that the arrays are actually assembled successfully.

After all, even if the arrays are already partially or fully assembled at the point that mdadm is executed by the mdraid init script, it surely ought not to matter. As long as the arrays are fully assembled by the time mdadm exits, it should return 0 to signify success. Nothing else makes sense, in my opinion. It's absurd that the mdraid script is drawn into printing a blank error message when nothing has gone wrong.
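If you want to see it for yourself, it should be enough to run the scan by
hand once the box is up and the arrays are already assembled, then inspect
the exit status:

  # mdadm -As
  # echo $?

I would expect 0 there if nothing has actually gone wrong; a non-zero value
would at least confirm where the blank error message is coming from.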

Further, the mdadm ebuild still prints elog messages stating that mdraid is a requirement for the boot runlevel but, with the udev rules in place, I don't see how that can be true. With udev being event-driven and calling mdadm upon the introduction of each new device, the array should be up and running from the moment that all the disks have been seen, whether or not the mdraid init script is executed.

Refer to /lib/udev/rules.d/64-md-raid.rules and you'll see that it calls
`mdadm --incremental` for newly added devices.
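A quick way to confirm it is to grep the installed rules for the call (the
wildcard is just to allow for the exact filename differing):

  # grep -i incremental /lib/udev/rules.d/64-md-raid*.rules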

# ls -l /lib/udev/rules.d | grep raid
-rw-r--r-- 1 root root 2.1K Aug 23 10:34 63-md-raid-arrays.rules
-rw-r--r-- 1 root root 1.4K Aug 23 10:34 64-md-raid-assembly.rules

With that in mind, here's something else for you to try. Doing this will
render these udev rules null and void:

# touch /etc/udev/rules.d/64-md-raid.rules
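An empty file of the same name under /etc/udev/rules.d takes precedence over
the copy shipped in /lib/udev/rules.d, so a bare touch is enough to mask the
packaged rule. If you want to double-check which rules files udev actually
reads for a given device, something along these lines ought to show it
(assuming my memory of udevadm's test output is right):

  # udevadm test /sys/block/sda 2>&1 | grep -i md-raid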

I did that, but I think I need instead to:
# touch /etc/udev/rules.d/63-md-raid-arrays.rules
# touch /etc/udev/rules.d/64-md-raid-assembly.rules

Ah, yes. Looks like the rule filenames have changed in >=mdadm-3.3. I'm still using mdadm-3.2.6-r1.


I'll try it now...

Thereafter, the mdraid script will be the only agent trying to assemble
the 1.x metadata arrays, so make sure that it is re-enabled.
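That is to say, something like:

  # rc-update add mdraid boot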

Right. Here's the position:
1.      I've left /etc/init.d/mdraid out of all run levels. I have nothing but
        comments in mdadm.conf, but then it's not likely to be read anyway if
        the init script isn't running.
2.      I have empty /etc/udev rules files as above.
3.      I have kernel auto-assembly of RAID enabled.
4.      I don't use an init ram disk.
5.      The root partition is on /dev/md5 (0.90 metadata).
6.      All other partitions except /boot are under /dev/vg7, which is built on
        top of /dev/md7 (1.x metadata).
7.      The system boots normally.

I must confess that this boggles my mind. Under these circumstances, I cannot fathom how - or when - the 1.x arrays are being assembled. Something has to be executing mdadm at some point.
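If you have a moment, a couple of quick checks might at least confirm the
state of play, for instance:

  # cat /proc/mdstat
  # mdadm --detail /dev/md7 | grep -i version

That would show whether md7 really has come up with 1.x metadata, though it
still wouldn't tell us what assembled it.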


I'm not actually sure that there is any point in calling mdadm -As when
the udev rules are present. I would expect it to be one approach or the
other, but not both at the same time.

That makes sense to me too. Do I even need sys-fs/mdadm installed? Maybe I'll
try removing it. I have a little rescue system in the same box, so it'd be
easy to put it back if necessary.

Yes, you need mdadm because 1.x metadata arrays must be assembled in userspace. In Gentoo, there are three contexts I know of in which this may occur:-

  1) Within an initramfs
  2) As a result of the udev rules
  3) As a result of the mdraid script
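For what it's worth, each of those can be ruled in or out with a quick look,
assuming the usual locations:

  # ls /boot
  # ls /etc/udev/rules.d /lib/udev/rules.d | grep -i md-raid
  # rc-update show boot | grep mdraid

The first hints at whether an initramfs is in play at all, the second shows
whether the udev rules are installed (and whether they have been masked),
and the third whether the mdraid script is in the boot runlevel.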


Incidentally, the udev rules were a source of controversy in the
following bug. Not everyone appreciates that they are installed by default.

https://bugs.gentoo.org/show_bug.cgi?id=401707

I'll have a look at that - thanks.


--Kerin
