Now with trying if raid5 makes problem I looked further into this whole raid stuff in grub.
Am Freitag, den 01.08.2008, 17:22 +0200 schrieb Felix Zielcke: > Am Freitag, den 01.08.2008, 15:36 +0200 schrieb Robert Millan: > > > The existing code is confusing. What is total_devs for? If we need to > > iterate up to 32, sounds like this variable is pointless? > > > > If it's useless, it should be removed (but maybe it isn't!). > > > > total_devs is a fast check if the array is readable. total_devs is the number of disks according to the superblock nr_devs is the number grub found. The value from the superblock is needed. nr_devs could be removed but as I already said above it's faster with it. If it doestn't exist you have to iterate over all 32 devices and then check the found number of disks with total_devs to find out if the array is readable (grub_is_array_readable in disk/raid.c) And the raid5 has a check, if one disk in the array is missing and then an error occurs while reading from a different one then stops reading further from the whole array. So I would suggest to just keep both, as long as you don't need the sizeof(unsigned int) this would save. But with the super 0.90 format both could be just an unsigned char, it should never happen that we have more then 28 devices which 0.90 supports and the current code just assumes 32 so there'll be problems anyway. > ... To make that a bit more clear: mdadm allocates the disk number stored in superblock begining with 0 If you remove the disk with number 0 then the others still keep there number. If you add it back, then it gets while it's syncing the first highest disk number which is free. So if you make a raid1 with 2 disks then the numbers are 0 and 1 if you remove the one with has 0 and then add it back then it becomes 2 If it's fully synced then it get's changed back to 0 And the problem in grub is that the disk number in superblock is used for the array->device[] so there can be holes with missing disks. I thought about changing grub_raid_scan_device so the disks get added to a temporary array with the superblock number just as it's currently done and then copy from that one to the used array->devices[] skipping all invalid entrys. This would be fine for Raid 0 and 1 But there's Raid 5 :( On Raid 5 one disk can be missing and everything can be still read it's just slower. The problem is we need every disk which is in the array and not just the working ones to calculate from which disk shall be read. So the Raid 1 case can be fixed fine with my previous patch which shouldn't slow down that much All 32 devices are just checked on grub-probe real grub stops when it found a working one which should be for the most users always a very low number Maybe I can find a solution for this Raid 5 problem, but I doubt. So we should probable warn users if their /boot is on a raid 5 Raid 5 worked for me fine with qemu just using 2 of the 3 disk images, when it was fully synced, the order of them didn't even matter. But if I removed a disk with mdadm then grub failed with the other 2 used. _______________________________________________ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel