Re: Raid 1, new disk can't be added after replacing faulty disk

2008-01-08 Thread Radu Rendec
On Tue, 2008-01-08 at 15:35 +1100, Neil Brown wrote:
> On Monday January 7, [EMAIL PROTECTED] wrote:
> > Looks like you are running into the issue described here:
> > http://marc.info/?l=linux-raid&m=119892098129022&w=2

Dan, thanks for your quick reply. It's exactly the same issue: the array
was created with mdadm 2.5.6.

> I cannot easily reproduce this.  I suspect it is sensitive to the
> exact size of the devices involved.

I reproduced this with 3 different device sizes: 518160384 bytes,
4005711360 bytes and 155021851648 bytes. Is it possible that I hit the
wrong size every time? Maybe it's related to the disk geometry
(because all partitions lie on exact cylinder boundaries, and thus all
partition sizes are multiples of the same cylinder size).

> Please test this patch and see if it fixes the problem.
> If not, please tell me the exact sizes of the partition being used
> (e.g. cat /proc/partitions) and I will try harder to reproduce it.

I'll test the patch and get back to you soon.

Thanks,

Radu Rendec


-
To unsubscribe from this list: send the line unsubscribe linux-raid in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Raid 1, new disk can't be added after replacing faulty disk

2008-01-08 Thread Radu Rendec
On Tue, 2008-01-08 at 15:35 +1100, Neil Brown wrote:
> On Monday January 7, [EMAIL PROTECTED] wrote:
> > On Jan 7, 2008 6:44 AM, Radu Rendec [EMAIL PROTECTED] wrote:
> > > I'm experiencing trouble when trying to add a new disk to a raid 1 array
> > > after having replaced a faulty disk.
>
> I cannot easily reproduce this.  I suspect it is sensitive to the
> exact size of the devices involved.
>
> Please test this patch and see if it fixes the problem.

I successfully applied the patch to v2.6.2 and it fixes the problem.
Many thanks for taking the time to look into this issue.

Radu Rendec




Raid 1, new disk can't be added after replacing faulty disk

2008-01-07 Thread Radu Rendec
I'm experiencing trouble when trying to add a new disk to a raid 1 array
after having replaced a faulty disk.

A few details about my configuration:

# cat /proc/mdstat 
Personalities : [raid1] [raid6] [raid5] [raid4] 
md1 : active raid1 sdb3[1]
  151388452 blocks super 1.0 [2/1] [_U]
  
md0 : active raid1 sdb2[1]
  3911816 blocks super 1.0 [2/1] [_U]
  
unused devices: <none>

# uname -a
Linux i.ines.ro 2.6.23.8-63.fc8 #1 SMP Wed Nov 21 18:51:08 EST 2007 i686
i686 i386 GNU/Linux

# mdadm --version
mdadm - v2.6.2 - 21st May 2007

So the story is this: disk sda failed and was physically replaced with a
new one. The new disk is identical and was partitioned exactly the same
way as the old one (and as sdb). Adding sda2 (from the fresh empty disk)
to the array does not work. This is what happens:

# mdadm /dev/md0 -a /dev/sda2
mdadm: add new device failed for /dev/sda2 as 2: Invalid argument

Kernel messages follow:
md: sda2 does not have a valid v1.0 superblock, not importing!
md: md_import_device returned -22
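
For readers decoding the two messages above: the -22 in the kernel log
is a negated errno value, and on Linux errno 22 is EINVAL, which is the
same "Invalid argument" that mdadm prints. A minimal sketch (the helper
name md_error_text is hypothetical and not part of mdadm or the kernel):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: turn a negative kernel-style return code,
 * such as the -22 in "md_import_device returned -22", into the
 * text that user space shows for the matching errno value. */
static const char *md_error_text(int kernel_ret)
{
	assert(kernel_ret < 0);		/* kernel codes are negated errno */
	return strerror(-kernel_ret);
}
```

With glibc in the default C locale, md_error_text(-22) yields the
string "Invalid argument".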

It's obvious that sda2 does not have a superblock (at all) since it's a
fresh empty disk. But I expected mdadm to create the superblock and
start rebuilding the array immediately.

However, this happens with both mdadm 2.6.2 and 2.6.4. I downgraded to
2.5.4 and it works like a charm.

If you reply, please add me to cc - I am not subscribed to the list.
Should I provide you further details or any kind of assistance for
testing, please let me know.

Thanks,

Radu Rendec



Re: Raid 1, new disk can't be added after replacing faulty disk

2008-01-07 Thread Dan Williams
On Jan 7, 2008 6:44 AM, Radu Rendec [EMAIL PROTECTED] wrote:
> I'm experiencing trouble when trying to add a new disk to a raid 1 array
> after having replaced a faulty disk.

[..]
> # mdadm --version
> mdadm - v2.6.2 - 21st May 2007

[..]
> However, this happens with both mdadm 2.6.2 and 2.6.4. I downgraded to
> 2.5.4 and it works like a charm.

Looks like you are running into the issue described here:
http://marc.info/?l=linux-raid&m=119892098129022&w=2


Re: Raid 1, new disk can't be added after replacing faulty disk

2008-01-07 Thread Neil Brown
On Monday January 7, [EMAIL PROTECTED] wrote:
> On Jan 7, 2008 6:44 AM, Radu Rendec [EMAIL PROTECTED] wrote:
> > I'm experiencing trouble when trying to add a new disk to a raid 1 array
> > after having replaced a faulty disk.
>
> [..]
> > # mdadm --version
> > mdadm - v2.6.2 - 21st May 2007
>
> [..]
> > However, this happens with both mdadm 2.6.2 and 2.6.4. I downgraded to
> > 2.5.4 and it works like a charm.
>
> Looks like you are running into the issue described here:
> http://marc.info/?l=linux-raid&m=119892098129022&w=2

I cannot easily reproduce this.  I suspect it is sensitive to the
exact size of the devices involved.

Please test this patch and see if it fixes the problem.
If not, please tell me the exact sizes of the partition being used
(e.g. cat /proc/partitions) and I will try harder to reproduce it.

Thanks,
NeilBrown



diff --git a/super1.c b/super1.c
index 2b096d3..9eec460 100644
--- a/super1.c
+++ b/super1.c
@@ -903,7 +903,7 @@ static int write_init_super1(struct supertype *st, void *sbv,
 	 * for a bitmap.
 	 */
 	array_size = __le64_to_cpu(sb->size);
-	/* work out how much space we left of a bitmap */
+	/* work out how much space we left for a bitmap */
 	bm_space = choose_bm_space(array_size);
 
 	switch(st->minor_version) {
@@ -913,6 +913,8 @@ static int write_init_super1(struct supertype *st, void *sbv,
 		sb_offset &= ~(4*2-1);
 		sb->super_offset = __cpu_to_le64(sb_offset);
 		sb->data_offset = __cpu_to_le64(0);
+		if (sb_offset - bm_space < array_size)
+			bm_space = sb_offset - array_size;
 		sb->data_size = __cpu_to_le64(sb_offset - bm_space);
 		break;
 	case 1:
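
The effect of the second hunk can be sketched in isolation. This is
only an illustration under stated assumptions, not the mdadm source:
the function names are hypothetical, all sizes are in 512-byte sectors
as in the patch, and the "8K before the end of the device" step is
assumed because it lies outside the hunk context shown above. The
bitmap reservation is passed in as a parameter rather than computed by
choose_bm_space().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed placement of a v1.0 superblock: 8K (16 sectors) before the
 * end of the device, rounded down to a 4K (8 sector) boundary.  Only
 * the rounding step is visible in the patch context. */
static uint64_t v1_0_sb_offset(uint64_t dsize)
{
	assert(dsize >= 8 * 2);
	return (dsize - 8 * 2) & ~(uint64_t)(4 * 2 - 1);
}

/* data_size as computed before the patch: the bitmap reservation is
 * always subtracted, even if that leaves less room than the array
 * needs.  array_size is unused here on purpose. */
static uint64_t data_size_unpatched(uint64_t dsize, uint64_t array_size,
				    uint64_t bm_space)
{
	(void)array_size;
	return v1_0_sb_offset(dsize) - bm_space;
}

/* data_size with the patch: shrink the bitmap reservation so that the
 * data area is never smaller than the existing array.  Like the patch,
 * this assumes array_size <= sb_offset. */
static uint64_t data_size_patched(uint64_t dsize, uint64_t array_size,
				  uint64_t bm_space)
{
	uint64_t sb_offset = v1_0_sb_offset(dsize);

	if (sb_offset - bm_space < array_size)
		bm_space = sb_offset - array_size;
	return sb_offset - bm_space;
}
```

As a worked example, take the 4005711360-byte partition reported in
this thread (7823655 sectors) and the md0 size from /proc/mdstat
(3911816 blocks of 1K, i.e. 7823632 sectors), with a hypothetical
bm_space of 8 sectors. Under the assumptions above the superblock
offset comes out at 7823632, so data_size_unpatched() returns 7823624,
which is smaller than the array, while data_size_patched() returns
7823632. A data area smaller than the array is a plausible reason for
the kernel to reject the new superblock.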