Re: Bug in 2.6.17 / mdadm 2.5.1

2006-06-26 Thread Neil Brown
On Monday June 26, [EMAIL PROTECTED] wrote:
> Neil Brown wrote:
> 
> > Alternately you can apply the following patch to the kernel and
> > version-1 superblocks should work better.
> 
> -stable material?

Maybe.  I'm not sure it exactly qualifies, but I might try sending it
to them and see what they think.

NeilBrown
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in 2.6.17 / mdadm 2.5.1

2006-06-26 Thread Andre Tomt

Neil Brown wrote:

> Alternately you can apply the following patch to the kernel and
> version-1 superblocks should work better.

-stable material?


Re: Bug in 2.6.17 / mdadm 2.5.1

2006-06-25 Thread Neil Brown
On Monday June 26, [EMAIL PROTECTED] wrote:
> On Sunday June 25, [EMAIL PROTECTED] wrote:
> > Hi!
> > 
> > There's a bug in Kernel 2.6.17 and / or mdadm which prevents (re)adding
> > a disk to a degraded RAID5-Array.
> 
> Thank you for the detailed report.
> The bug is in the md driver in the kernel (not in mdadm), and only
> affects version-1 superblocks.  Debian recently changed the default
> (in /etc/mdadm/mdadm.conf) to use version-1 superblocks, which I thought
> would be OK (I've done some testing) but obviously I missed something. :-(
> 
> If you remove the "metadata=1" (or whatever it is) from
> /etc/mdadm/mdadm.conf and then create the array, it will be created
> with a version-0.90 superblock, which has had more testing.
> 
> Alternately you can apply the following patch to the kernel and
> version-1 superblocks should work better.

And as a third alternative, you can apply this patch to mdadm-2.5.1.
It will work around the kernel bug.

NeilBrown

diff .prev/Manage.c ./Manage.c
--- .prev/Manage.c  2006-06-20 10:01:17.0 +1000
+++ ./Manage.c  2006-06-26 11:46:56.0 +1000
@@ -271,8 +271,14 @@ int Manage_subdevs(char *devname, int fd
 * If so, we can simply re-add it.
 */
st->ss->uuid_from_super(duuid, dsuper);
-   
-   if (osuper) {
+
+   /* re-add doesn't work for version-1 superblocks
+* before 2.6.18 :-(
+*/
+   if (array.major_version == 1 &&
+   get_linux_version() <= 2006018)
+   ;
+   else if (osuper) {
st->ss->uuid_from_super(ouuid, osuper);
if (memcmp(duuid, ouuid, sizeof(ouuid))==0) {
/* look close enough for now.  Kernel
@@ -295,7 +301,10 @@ int Manage_subdevs(char *devname, int fd
}
}
}
-   for (j=0; j< st->max_devs; j++) {
+   /* due to a bug in 2.6.17 and earlier, we start
+* looking from raid_disks, not 0
+*/
+   for (j = array.raid_disks ; j< st->max_devs; j++) {
disc.number = j;
if (ioctl(fd, GET_DISK_INFO, &disc))
break;

diff .prev/super1.c ./super1.c
--- .prev/super1.c  2006-06-20 10:01:46.0 +1000
+++ ./super1.c  2006-06-26 11:47:12.0 +1000
@@ -277,6 +277,18 @@ static void examine_super1(void *sbv, ch
default: break;
}
printf("\n");
+   printf("Array Slot : %d (", __le32_to_cpu(sb->dev_number));
+   for (i= __le32_to_cpu(sb->max_dev); i> 0 ; i--)
+   if (__le16_to_cpu(sb->dev_roles[i-1]) != 0xffff)
+   break;
+   for (d=0; d < i; d++) {
+   int role = __le16_to_cpu(sb->dev_roles[d]);
+   if (d) printf(", ");
+   if (role == 0xffff) printf("empty");
+   else if(role == 0xfffe) printf("failed");
+   else printf("%d", role);
+   }
+   printf(")\n");
printf("   Array State : ");
for (d=0; d<__le32_to_cpu(sb->raid_disks); d++) {
int cnt = 0;
@@ -767,7 +779,8 @@ static int write_init_super1(struct supe
if (memcmp(sb->set_uuid, refsb->set_uuid, 16)==0) {
/* same array, so preserve events and dev_number */
sb->events = refsb->events;
-   sb->dev_number = refsb->dev_number;
+   if (get_linux_version() >= 2006018)
+   sb->dev_number = refsb->dev_number;
}
free(refsb);
}
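
The comparisons against 2006018 in the patch above read naturally if
get_linux_version() packs the kernel release into one integer as
major*1000000 + minor*1000 + patch, so that 2.6.18 becomes 2006018. That
encoding is inferred from the constant and is not shown in the patch. The
sketch below assumes it; encode_linux_version() and skip_readd() are
hypothetical names for illustration, not mdadm functions.

```c
#include <stdio.h>

/* Hypothetical helper: pack "major.minor.patch" into a single integer,
 * e.g. "2.6.17" -> 2006017, so releases compare with plain <, <=, >=.
 * Trailing text such as "-2-686" is ignored.
 * Returns -1 if the string does not start with three dotted numbers. */
static int encode_linux_version(const char *release)
{
	int major, minor, patch;

	if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) != 3)
		return -1;
	return major * 1000000 + minor * 1000 + patch;
}

/* Hypothetical predicate mirroring the test added in the first Manage.c
 * hunk: skip the re-add path for version-1 superblocks when the encoded
 * kernel version is at or below 2006018. */
static int skip_readd(int major_version, int kernel_version)
{
	return major_version == 1 && kernel_version <= 2006018;
}
```

Under this assumed encoding, the "<= 2006018" test also matches 2.6.18
itself, while the ">= 2006018" test in super1.c treats 2.6.18 as fixed.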


Re: Bug in 2.6.17 / mdadm 2.5.1

2006-06-25 Thread Neil Brown
On Sunday June 25, [EMAIL PROTECTED] wrote:
> Hi!
> 
> There's a bug in Kernel 2.6.17 and / or mdadm which prevents (re)adding
> a disk to a degraded RAID5-Array.

Thank you for the detailed report.
The bug is in the md driver in the kernel (not in mdadm), and only
affects version-1 superblocks.  Debian recently changed the default
(in /etc/mdadm/mdadm.conf) to use version-1 superblocks, which I thought
would be OK (I've done some testing) but obviously I missed something. :-(

If you remove the "metadata=1" (or whatever it is) from
/etc/mdadm/mdadm.conf and then create the array, it will be created
with a version-0.90 superblock, which has had more testing.

Alternately you can apply the following patch to the kernel and
version-1 superblocks should work better.

NeilBrown

-
Set desc_nr correctly for version-1 superblocks.

This has to be done in ->load_super, not ->validate_super

### Diffstat output
 ./drivers/md/md.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2006-06-26 11:02:43.0 +1000
+++ ./drivers/md/md.c   2006-06-26 11:02:46.0 +1000
@@ -1057,6 +1057,11 @@ static int super_1_load(mdk_rdev_t *rdev
if (rdev->sb_size & bmask)
rdev-> sb_size = (rdev->sb_size | bmask)+1;
 
+   if (sb->level == cpu_to_le32(LEVEL_MULTIPATH))
+   rdev->desc_nr = -1;
+   else
+   rdev->desc_nr = le32_to_cpu(sb->dev_number);
+
if (refdev == 0)
ret = 1;
else {
@@ -1165,7 +1170,6 @@ static int super_1_validate(mddev_t *mdd
 
if (mddev->level != LEVEL_MULTIPATH) {
int role;
-   rdev->desc_nr = le32_to_cpu(sb->dev_number);
role = le16_to_cpu(sb->dev_roles[rdev->desc_nr]);
switch(role) {
case 0xffff: /* spare */
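
The context lines of the second hunk look up a device's role by indexing
dev_roles[] with desc_nr, which is why desc_nr has to hold the right slot
number before that lookup runs. Below is a minimal sketch of how such
16-bit role entries can be interpreted, assuming the values seen in this
thread: 0xffff for a spare (empty) slot and 0xfffe for a failed one. The
macro and function names are made up for illustration and the entries are
assumed to be already converted to CPU byte order.

```c
#include <stdint.h>
#include <stdio.h>

/* Special role values; the macro names are hypothetical. */
#define ROLE_SPARE  0xffff
#define ROLE_FAILED 0xfffe

/* Hypothetical helper: describe one dev_roles[] entry. Any value other
 * than the two special ones is the device's position in the array. */
static const char *describe_role(uint16_t role, char *buf, size_t len)
{
	if (role == ROLE_SPARE)
		snprintf(buf, len, "spare");
	else if (role == ROLE_FAILED)
		snprintf(buf, len, "failed");
	else
		snprintf(buf, len, "%u", (unsigned)role);
	return buf;
}

/* Hypothetical helper: number of leading slots worth reporting, found by
 * scanning back from max_dev past trailing spare entries. */
static int used_slots(const uint16_t *roles, int max_dev)
{
	int i;

	for (i = max_dev; i > 0; i--)
		if (roles[i - 1] != ROLE_SPARE)
			break;
	return i;
}
```

With roles {0, 1, 0xfffe, 0xffff, 0xffff}, used_slots() stops at the
failed entry and reports three slots.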