Re: Deleting mdadm RAID arrays

2008-02-08 Thread Marcin Krol
On Thursday 07 February 2008 22:35:45, Bill Davidsen wrote:
  As you may remember, I have configured udev to associate /dev/d_* devices
  with serial numbers (to keep them from changing depending on boot module
  loading sequence).

 Why do you care? 

Because /dev/sd* devices get swapped randomly depending on boot module insertion
sequence, as I explained earlier.

 If you are using UUID for all the arrays and mounts  
 does this buy you anything? 

This is exactly what is not clear to me: what is it that identifies a
drive/partition as part of the array? The /dev/sd name? The UUID in the
superblock? /dev/d_n?

If it's the UUID, I should be safe regardless of the /dev/sd* designation? Yes or no?

 And more to the point, the first time a
 drive fails and you replace it, will it cause you a problem? Require
 maintaining the serial-to-name data manually?

That's not the problem. I just want my array to be intact.

 I miss the benefit of forcing this instead of just building the 
 information at boot time and dropping it in a file.

I would prefer that, too - if it worked. I was getting both arrays messed
up randomly on boot - messed up in the sense of the arrays being composed
of different /dev/sd devices.


  And I made *damn* sure I zeroed all the superblocks before reassembling 
  the arrays. Yet it still shows the old partitions on those arrays!

 As I noted before, you said you had these on whole devices before, did 
 you zero the superblocks on the whole devices or the partitions? From 
 what I read, it was the partitions.

I actually tried it both ways (I rebuilt the arrays a few times; udev just
didn't want to associate WD-serialnumber-part1 with /dev/d_1p1 as it was
told - it still claimed it was /dev/d_1).

Regards,
Marcin Krol


Re: Deleting mdadm RAID arrays

2008-02-08 Thread Marcin Krol
On Friday 08 February 2008 13:44:18, Bill Davidsen wrote:

  This is exactly what is not clear to me: what is it that identifies a
  drive/partition as part of the array? The /dev/sd name? The UUID in the
  superblock? /dev/d_n?

  If it's the UUID, I should be safe regardless of the /dev/sd* designation? Yes or no?

 Yes, absolutely.

OK, that's what I needed to know. 


Regards,
Marcin Krol
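
A side note for readers of this archive: the UUID lives in each member's
superblock, so the binding can be checked directly. A minimal sketch,
assuming the /dev/d_* names used in this thread:

% mdadm --detail /dev/md0 | grep -i uuid     # UUID of the assembled array
% mdadm --examine /dev/d_1 | grep -i uuid    # UUID stored on the member itself

If the two match, assembly will find the member no matter which /dev/sd*
name the kernel happened to give it on that boot.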


Re: Deleting mdadm RAID arrays

2008-02-08 Thread Bill Davidsen

Marcin Krol wrote:

On Thursday 07 February 2008 22:35:45, Bill Davidsen wrote:
  

As you may remember, I have configured udev to associate /dev/d_* devices with
serial numbers (to keep them from changing depending on boot module loading 
sequence). 
  


  
Why do you care? 



Because /dev/sd* devices get swapped randomly depending on boot module insertion
sequence, as I explained earlier.

  

So there's no functional problem, just cosmetic?
If you are using UUID for all the arrays and mounts  
does this buy you anything? 



This is exactly what is not clear to me: what is it that identifies a
drive/partition as part of the array? The /dev/sd name? The UUID in the
superblock? /dev/d_n?

If it's the UUID, I should be safe regardless of the /dev/sd* designation? Yes or no?

  

Yes, absolutely.
And more to the point, the first time a
drive fails and you replace it, will it cause you a problem? Require
maintaining the serial-to-name data manually?



That's not the problem. I just want my array to be intact.

  
I miss the benefit of forcing this instead of just building the 
information at boot time and dropping it in a file.



I would prefer that, too - if it worked. I was getting both arrays messed
up randomly on boot - messed up in the sense of the arrays being composed
of different /dev/sd devices.

  
Different devices? Or just different names for the same devices? I 
assume just the names change, and I still don't see why you care... 
subtle beyond my understanding.
  
And I made *damn* sure I zeroed all the superblocks before reassembling 
the arrays. Yet it still shows the old partitions on those arrays!
  
  
As I noted before, you said you had these on whole devices before, did 
you zero the superblocks on the whole devices or the partitions? From 
what I read, it was the partitions.



I actually tried it both ways (I rebuilt the arrays a few times; udev just
didn't want to associate WD-serialnumber-part1 with /dev/d_1p1 as it was
told - it still claimed it was /dev/d_1).
  


I'm not talking about building the array, but zeroing the superblocks.
Did you use the partition name, /dev/sdb1, when you ran mdadm with
--zero-superblock, or did you zero the whole device, /dev/sdb, which is what
you were using when you first built the array on whole devices? If you
didn't zero the superblock of the whole device, that may explain why a
superblock is still found.


--
Bill Davidsen [EMAIL PROTECTED]
 Woe unto the statesman who makes war without a reason that will still
 be valid when the war is over... Otto von Bismarck




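
A side note: since the arrays here were first built on whole disks and later
rebuilt on partitions, a stale superblock can survive at either location. A
minimal cleanup sketch, assuming /dev/sdb was a former whole-disk member and
is not currently part of any running array:

% mdadm --zero-superblock /dev/sdb1    # superblock of the partition-based array
% mdadm --zero-superblock /dev/sdb     # leftover superblock of the whole-disk array

With 0.90 metadata the superblock sits near the end of the device, so the
whole-disk copy and the partition copy usually occupy different sectors and
each needs its own pass.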


Re: Deleting mdadm RAID arrays

2008-02-07 Thread Bill Davidsen

Marcin Krol wrote:

On Thursday 07 February 2008 03:36:31, Neil Brown wrote:

  

   8     0  390711384 sda
   8     1  390708801 sda1
   8    16  390711384 sdb
   8    17  390708801 sdb1
   8    32  390711384 sdc
   8    33  390708801 sdc1
   8    48  390710327 sdd
   8    49  390708801 sdd1
   8    64  390711384 sde
   8    65  390708801 sde1
   8    80  390711384 sdf
   8    81  390708801 sdf1
   3    64   78150744 hdb
   3    65    1951866 hdb1
   3    66    7815622 hdb2
   3    67    4883760 hdb3
   3    68          1 hdb4
   3    69     979933 hdb5
   3    70     979933 hdb6
   3    71   61536951 hdb7
   9     1  781417472 md1
   9     0  781417472 md0
  

So all the expected partitions are known to the kernel - good.



It's not good, really!

I can't trust /dev/sd* devices - they get swapped randomly depending
on the sequence of module loading! I have two drivers: ahci for the onboard
SATA controllers and sata_sil for the additional controller.

Sometimes the system boots ahci first and sata_sil later, sometimes 
in reverse sequence. 

Then, sda becomes sdc, sdb becomes sdd, etc. 


That is exactly the problem: I cannot rely on the kernel's information about
which physical drive is which logical device!

  

Then
  mdadm /dev/md0 -f /dev/d_1

will fail d_1, abort the recovery, and release d_1.

Then
  mdadm --zero-superblock /dev/d_1

should work.



Thanks, though I managed to fail the drives, remove them, zero superblocks 
and reassemble the arrays anyway. 

The problem I have now is that mdadm seems to be of 'two minds' when it comes 
to where it gets the info on which disk is what part of the array. 


As you may remember, I have configured udev to associate /dev/d_* devices with
serial numbers (to keep them from changing depending on boot module loading 
sequence). 

  
Why do you care? If you are using UUID for all the arrays and mounts
does this buy you anything? And more to the point, the first time a
drive fails and you replace it, will it cause you a problem? Require
maintaining the serial-to-name data manually?


I miss the benefit of forcing this instead of just building the 
information at boot time and dropping it in a file.


Now, when I swap two (random) drives in order to test if it keeps device names
associated with serial numbers, I get the following effect:


1. mdadm -Q --detail /dev/md* gives correct results before *and* after the 
swapping:

% mdadm -Q --detail /dev/md0
/dev/md0:
[...]
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/d_1
       1       8       17        1      active sync   /dev/d_2
       2       8       81        2      active sync   /dev/d_3

% mdadm -Q --detail /dev/md1
/dev/md1:
[...]
    Number   Major   Minor   RaidDevice State
       0       8       49        0      active sync   /dev/d_4
       1       8       65        1      active sync   /dev/d_5
       2       8       33        2      active sync   /dev/d_6


2. However, cat /proc/mdstat shows a different layout of the arrays!

BEFORE the swap:

% cat mdstat-16_51
Personalities : [raid6] [raid5] [raid4]
md1 : active raid5 sdb1[2] sdf1[0] sda1[1]
  781417472 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active raid5 sde1[2] sdc1[0] sdd1[1]
  781417472 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>


AFTER the swap:

% cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md1 : active(auto-read-only) raid5 sdd1[0] sdc1[2] sde1[1]
  781417472 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

md0 : active(auto-read-only) raid5 sda1[0] sdf1[2] sdb1[1]
  781417472 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>

I have no idea now if the array is functioning correctly (meaning it keeps the
drives according to the /dev/d_* devices, and the superblock info is
unimportant) or if my arrays fell apart because of that swapping.

And I made *damn* sure I zeroed all the superblocks before reassembling 
the arrays. Yet it still shows the old partitions on those arrays!
  
As I noted before, you said you had these on whole devices before, did 
you zero the superblocks on the whole devices or the partitions? From 
what I read, it was the partitions.


--
Bill Davidsen [EMAIL PROTECTED]
 Woe unto the statesman who makes war without a reason that will still
 be valid when the war is over... Otto von Bismarck




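
A side note: /proc/mdstat always prints the kernel's own sd* names, while
mdadm -Q --detail maps major:minor numbers back to whatever /dev nodes udev
created - which is why the two listings above look different. A
name-independent view comes from scanning the superblocks themselves; a
sketch, with output abbreviated and UUIDs omitted:

% mdadm --examine --scan
ARRAY /dev/md0 level=raid5 num-devices=3 UUID=...
ARRAY /dev/md1 level=raid5 num-devices=3 UUID=...

If both arrays still show up with num-devices=3 under their usual UUIDs after
the swap, nothing fell apart - only the kernel names moved.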


Re: Deleting mdadm RAID arrays

2008-02-06 Thread Marcin Krol
On Tuesday 05 February 2008 21:12:32, Neil Brown wrote:

  % mdadm --zero-superblock /dev/sdb1
  mdadm: Couldn't open /dev/sdb1 for write - not zeroing
 
 That's weird.
 Why can't it open it?

Hell if I know. It's the first time I've seen such a thing.

 Maybe you aren't running as root (The '%' prompt is suspicious).

I am running as root; the % prompt is part of the obfuscation (I have
configured bash to display the IP as part of the prompt).

 Maybe the kernel has  been told to forget about the partitions of
 /dev/sdb.

But fdisk/cfdisk has no problem whatsoever finding the partitions.

 mdadm will sometimes tell it to do that, but only if you try to
 assemble arrays out of whole components.

 If that is the problem, then
blockdev --rereadpt /dev/sdb

I deleted LVM devices that were sitting on top of RAID and reinstalled mdadm.

% blockdev --rereadpt /dev/sdf
BLKRRPART: Device or resource busy

% mdadm /dev/md2 --fail /dev/sdf1
mdadm: set /dev/sdf1 faulty in /dev/md2

% blockdev --rereadpt /dev/sdf
BLKRRPART: Device or resource busy

% mdadm /dev/md2 --remove /dev/sdf1
mdadm: hot remove failed for /dev/sdf1: Device or resource busy

lsof /dev/sdf1 gives ZERO results.

arrrRRRGH

Regards,
Marcin Krol
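
A side note: lsof only sees userspace processes; a partition can also be held
open from inside the kernel (by md or device-mapper), and that never shows up
in lsof. On a 2.6 kernel sysfs records this stacking - a sketch, assuming
/dev/sdf1 as above:

% ls /sys/block/sdf/sdf1/holders    # md/dm devices built on top of sdf1
% cat /proc/mdstat                  # is sdf1 still listed in an array?
% dmsetup table                     # dm targets that may pin an md device

The dmsetup output elsewhere in this thread turned out to be the relevant clue.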


Re: Deleting mdadm RAID arrays

2008-02-06 Thread Marcin Krol
On Tuesday 05 February 2008 12:43:31, Moshe Yudkowsky wrote:

  1. Where does this info on the array reside?! I have deleted /etc/mdadm/mdadm.conf
  and the /dev/md devices, and yet it comes seemingly out of nowhere.

 /boot has a copy of mdadm.conf so that / and other drives can be started 
 and then mounted. update-initramfs will update /boot's copy of mdadm.conf.

Yeah, I found that while deleting the mdadm package...

Thanks for the answers, everyone.

Regards,
Marcin Krol




Re: Deleting mdadm RAID arrays

2008-02-06 Thread Peter Rabbitson

Marcin Krol wrote:

On Tuesday 05 February 2008 21:12:32, Neil Brown wrote:


% mdadm --zero-superblock /dev/sdb1
mdadm: Couldn't open /dev/sdb1 for write - not zeroing

That's weird.
Why can't it open it?


Hell if I know. It's the first time I've seen such a thing.


Maybe you aren't running as root (The '%' prompt is suspicious).


I am running as root; the % prompt is part of the obfuscation (I have
configured bash to display the IP as part of the prompt).


Maybe the kernel has  been told to forget about the partitions of
/dev/sdb.


But fdisk/cfdisk has no problem whatsoever finding the partitions.


mdadm will sometimes tell it to do that, but only if you try to
assemble arrays out of whole components.



If that is the problem, then
   blockdev --rereadpt /dev/sdb


I deleted LVM devices that were sitting on top of RAID and reinstalled mdadm.

% blockdev --rereadpt /dev/sdf
BLKRRPART: Device or resource busy

% mdadm /dev/md2 --fail /dev/sdf1
mdadm: set /dev/sdf1 faulty in /dev/md2

% blockdev --rereadpt /dev/sdf
BLKRRPART: Device or resource busy

% mdadm /dev/md2 --remove /dev/sdf1
mdadm: hot remove failed for /dev/sdf1: Device or resource busy

lsof /dev/sdf1 gives ZERO results.



What does this say:

dmsetup table



Re: Deleting mdadm RAID arrays

2008-02-06 Thread Neil Brown
On Wednesday February 6, [EMAIL PROTECTED] wrote:
 
  Maybe the kernel has  been told to forget about the partitions of
  /dev/sdb.
 
 But fdisk/cfdisk has no problem whatsoever finding the partitions.

It is looking at the partition table on disk.  Not at the kernel's
idea of partitions, which is initialised from that table...

What does

  cat /proc/partitions

say?

 
  mdadm will sometimes tell it to do that, but only if you try to
  assemble arrays out of whole components.
 
  If that is the problem, then
 blockdev --rereadpt /dev/sdb
 
 I deleted LVM devices that were sitting on top of RAID and reinstalled mdadm.
 
 % blockdev --rereadpt /dev/sdf
 BLKRRPART: Device or resource busy
 

Implies that some partition is in use.

 % mdadm /dev/md2 --fail /dev/sdf1
 mdadm: set /dev/sdf1 faulty in /dev/md2
 
 % blockdev --rereadpt /dev/sdf
 BLKRRPART: Device or resource busy
 
 % mdadm /dev/md2 --remove /dev/sdf1
 mdadm: hot remove failed for /dev/sdf1: Device or resource busy

OK, that's weird.  If sdf1 is faulty, then you should be able to
remove it.  What does
  cat /proc/mdstat
  dmesg | tail

say at this point?

NeilBrown


Re: Deleting mdadm RAID arrays

2008-02-06 Thread Marcin Krol
On Wednesday 06 February 2008 11:11:51, Peter Rabbitson wrote:
  lsof /dev/sdf1 gives ZERO results.
  
 
 What does this say:
 
   dmsetup table


% dmsetup table
vg-home: 0 61440 linear 9:2 384

Regards,
Marcin Krol
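
A side note: the 9:2 in that table line is the major:minor of the backing
device, and major 9 is md - i.e. the vg-home logical volume still sits on
/dev/md2 and holds it open, which lsof cannot show. A sketch of confirming
and releasing it, assuming LVM2 with a volume group named vg and an LV named
home:

% ls -l /dev/md2              # should show major 9, minor 2
% lvchange -an /dev/vg/home   # deactivate the LV...
% dmsetup remove vg-home      # ...or tear down the mapping directly

Once nothing maps onto md2, failing/removing its members and zeroing their
superblocks should stop returning Device or resource busy.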


Re: Deleting mdadm RAID arrays

2008-02-06 Thread David Greaves
Marcin Krol wrote:
 Hello everyone,
 
 I have had a problem with a RAID array (udev messed up the disk names; I had
 RAID on whole disks, without RAID partitions)

Do you mean that you originally used /dev/sdb for the RAID array? And now you
are using /dev/sdb1?

Given the system seems confused I wonder if this may be relevant?

David



Re: Deleting mdadm RAID arrays

2008-02-06 Thread Marcin Krol
Wednesday 06 February 2008 12:22:00:
  I have had a problem with RAID array (udev messed up disk names, I've had 
  RAID on
  disks only, without raid partitions)
 
 Do you mean that you originally used /dev/sdb for the RAID array? And now you
 are using /dev/sdb1?

That's reconfigured now, and it doesn't matter (I started the host up in
single-user mode and created partitions, as opposed to running RAID on whole
disks as before).
 
 Given the system seems confused I wonder if this may be relevant?

I don't think so; I tried most mdadm operations (fail, remove, etc.) on disks
(like sdb) and partitions (like sdb1) and got identical messages for both.


-- 
Marcin Krol



Re: Deleting mdadm RAID arrays

2008-02-06 Thread Marcin Krol
Wednesday 06 February 2008 11:43:12:
 On Wednesday February 6, [EMAIL PROTECTED] wrote:
  
   Maybe the kernel has  been told to forget about the partitions of
   /dev/sdb.
  
  But fdisk/cfdisk has no problem whatsoever finding the partitions.
 
 It is looking at the partition table on disk.  Not at the kernel's
 idea of partitions, which is initialised from that table...

Aha! Thanks for this bit. I get it now.

 What does
 
   cat /proc/partitions
 
 say?

Note: I have reconfigured udev now to associate device names with serial
numbers (below)

% cat /proc/partitions
major minor  #blocks  name

   8     0  390711384 sda
   8     1  390708801 sda1
   8    16  390711384 sdb
   8    17  390708801 sdb1
   8    32  390711384 sdc
   8    33  390708801 sdc1
   8    48  390710327 sdd
   8    49  390708801 sdd1
   8    64  390711384 sde
   8    65  390708801 sde1
   8    80  390711384 sdf
   8    81  390708801 sdf1
   3    64   78150744 hdb
   3    65    1951866 hdb1
   3    66    7815622 hdb2
   3    67    4883760 hdb3
   3    68          1 hdb4
   3    69     979933 hdb5
   3    70     979933 hdb6
   3    71   61536951 hdb7
   9     1  781417472 md1
   9     0  781417472 md0



/dev/disk/by-id % ls -l

total 0
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-ST380023A_3KB0MV22 -> ../../hdb
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part1 -> ../../hdb1
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part2 -> ../../hdb2
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part3 -> ../../hdb3
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part4 -> ../../hdb4
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part5 -> ../../hdb5
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part6 -> ../../hdb6
lrwxrwxrwx 1 root root 10 2008-02-06 13:34 ata-ST380023A_3KB0MV22-part7 -> ../../hdb7
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1696130 -> ../../d_6
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1696130-part1 -> ../../d_6
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1707974 -> ../../d_5
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1707974-part1 -> ../../d_5
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1795228 -> ../../d_1
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1795228-part1 -> ../../d_1
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1795364 -> ../../d_3
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1795364-part1 -> ../../d_3
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1798692 -> ../../d_2
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1798692-part1 -> ../../d_2
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1800255 -> ../../d_4
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 ata-WDC_WD4000KD-00N-WD-WMAMY1800255-part1 -> ../../d_4
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1696130 -> ../../d_6
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1696130-part1 -> ../../d_6
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1707974 -> ../../d_5
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1707974-part1 -> ../../d_5
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1795228 -> ../../d_1
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1795228-part1 -> ../../d_1
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1795364 -> ../../d_3
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1795364-part1 -> ../../d_3
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1798692 -> ../../d_2
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1798692-part1 -> ../../d_2
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1800255 -> ../../d_4
lrwxrwxrwx 1 root root  9 2008-02-06 13:34 scsi-S_WD-WMAMY1800255-part1 -> ../../d_4

I have no idea why udev can't allocate /dev/d_1p1 to partition 1 on disk d_1.
I have explicitly asked it to do that:

/etc/udev/rules.d % cat z24_disks_domeny.rules


KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795228", NAME="d_1"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795228-part1", NAME="d_1p1"

KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1798692", NAME="d_2"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1798692-part1", NAME="d_2p1"

KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795364", NAME="d_3"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795364-part1", NAME="d_3p1"

KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1800255", NAME="d_4"
KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1800255-part1", NAME="d_4p1"

KERNEL=="sd*", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1707974", NAME="d_5"
KERNEL=="sd*", SUBSYSTEM=="block",
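
A side note on the rules above: the -part1 matches likely never fire, because
ID_SERIAL_SHORT carries only the disk's serial (e.g. WD-WMAMY1795228) for the
partition as well - the -partN suffix is something udev appends when building
the /dev/disk/by-id symlink names, not part of the serial itself. And since
KERNEL=="sd*" also matches partition nodes, partitions fall through to the
disk rule, which is why both d_6 and its -part1 symlink point at the same
node in the listing above. A corrected sketch, assuming udev of the Etch era
(with the stock rules importing ID_SERIAL_SHORT for partitions too; %n is the
kernel's partition number):

KERNEL=="sd[a-z]", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795228", NAME="d_1"
KERNEL=="sd[a-z][0-9]", SUBSYSTEM=="block", ENV{ID_SERIAL_SHORT}=="WD-WMAMY1795228", NAME="d_1p%n"

The first rule now matches only the bare disk; the second catches sdX1,
sdX2, ... and names them d_1p1, d_1p2, ...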

Re: Deleting mdadm RAID arrays

2008-02-06 Thread Neil Brown
On Wednesday February 6, [EMAIL PROTECTED] wrote:
 
 % cat /proc/partitions
 major minor  #blocks  name
 
   8     0  390711384 sda
   8     1  390708801 sda1
   8    16  390711384 sdb
   8    17  390708801 sdb1
   8    32  390711384 sdc
   8    33  390708801 sdc1
   8    48  390710327 sdd
   8    49  390708801 sdd1
   8    64  390711384 sde
   8    65  390708801 sde1
   8    80  390711384 sdf
   8    81  390708801 sdf1
   3    64   78150744 hdb
   3    65    1951866 hdb1
   3    66    7815622 hdb2
   3    67    4883760 hdb3
   3    68          1 hdb4
   3    69     979933 hdb5
   3    70     979933 hdb6
   3    71   61536951 hdb7
   9     1  781417472 md1
   9     0  781417472 md0

So all the expected partitions are known to the kernel - good.

 
 /etc/udev/rules.d % cat /proc/mdstat
 Personalities : [raid1] [raid6] [raid5] [raid4]
 md0 : active(auto-read-only) raid5 sdc1[0] sde1[3](S) sdd1[1]
   781417472 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
 
 md1 : active(auto-read-only) raid5 sdf1[0] sdb1[3](S) sda1[1]
   781417472 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
 
 md0 consists of sdc1, sde1 and sdd1 even though when creating it I asked it
 to use d_1, d_2 and d_3 (this is probably written on the particular
 disk/partition itself, but I have no idea how to clean this up -
 mdadm --zero-superblock /dev/d_1 again produces mdadm: Couldn't open
 /dev/d_1 for write - not zeroing)
 

I suspect it is related to the (auto-read-only).
The array is degraded and has a spare, so it wants to do a recovery to
the spare.  But it won't start the recovery until the array is not
read-only.

But the recovery process has partly started (you'll see an md1_resync
thread) so it won't let go of any failed devices at the moment.
If you 
  mdadm -w /dev/md0

the recovery will start.
Then
  mdadm /dev/md0 -f /dev/d_1

will fail d_1, abort the recovery, and release d_1.

Then
  mdadm --zero-superblock /dev/d_1

should work.

It is currently failing with EBUSY - --zero-superblock opens the
device with O_EXCL to ensure that it isn't currently in use, and as
long as it is part of an md array, O_EXCL will fail.
I should make that more explicit in the error message.

NeilBrown
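
Putting Neil's steps together, the full release-and-wipe sequence for one
member would look like this - a sketch, with the --remove step added for the
case where the failed device stays attached to the array:

% mdadm -w /dev/md0                  # leave auto-read-only; recovery starts
% mdadm /dev/md0 --fail /dev/d_1     # failing the member aborts the recovery
% mdadm /dev/md0 --remove /dev/d_1   # detach it from the array
% mdadm --zero-superblock /dev/d_1   # the O_EXCL open now succeeds

Once the device is no longer part of any array, the exclusive open that
--zero-superblock performs stops failing with EBUSY.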


Deleting mdadm RAID arrays

2008-02-05 Thread Marcin Krol
Hello everyone,

I have had a problem with a RAID array (udev messed up the disk names; I had
RAID on whole disks, without RAID partitions) on a Debian Etch server with 6
disks, and so I decided to rearrange this.

I deleted the disks from the (2 RAID-5) arrays, deleted the md* devices from
/dev, created /dev/sd[a-f]1 Linux raid autodetect partitions and rebooted the
host.

Now the mdadm startup script is writing, in a loop, a message like: mdadm:
warning: /dev/sda1 and /dev/sdb1 have similar superblocks. If they are not
identical, --zero the superblock ...

The host can't boot up now because of this.

If I boot the server with some disks, I can't even zero that superblock:

% mdadm --zero-superblock /dev/sdb1
mdadm: Couldn't open /dev/sdb1 for write - not zeroing

It's the same even after:

% mdadm --manage /dev/md2 --fail /dev/sdb1
mdadm: set /dev/sdb1 faulty in /dev/md2


Now, I have NEVER created a /dev/md2 array, yet it shows up automatically!

% cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [raid1]
md2 : active(auto-read-only) raid1 sdb1[1]
  390708736 blocks [3/1] [_U_]

md1 : inactive sda1[2]
  390708736 blocks

unused devices: <none>


Questions:

1. Where does this info on the array reside?! I have deleted /etc/mdadm/mdadm.conf
and the /dev/md devices, and yet it comes seemingly out of nowhere.

2. How can I delete that damn array so it doesn't hang my server up in a loop?


-- 
Marcin Krol

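
A side note: the usual way out of this loop is to stop the auto-assembled
arrays first and only then wipe the member superblocks - a sketch, assuming
nothing is mounted on them:

% mdadm --stop /dev/md1
% mdadm --stop /dev/md2
% mdadm --zero-superblock /dev/sda1
% mdadm --zero-superblock /dev/sdb1

While an array is active (even auto-read-only), the kernel keeps its members
open, which is exactly why the zeroing attempts above fail.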


Re: Deleting mdadm RAID arrays

2008-02-05 Thread Moshe Yudkowsky


1. Where does this info on the array reside?! I have deleted /etc/mdadm/mdadm.conf
and the /dev/md devices, and yet it comes seemingly out of nowhere.


/boot has a copy of mdadm.conf so that / and other drives can be started 
and then mounted. update-initramfs will update /boot's copy of mdadm.conf.


--
Moshe Yudkowsky * [EMAIL PROTECTED] * www.pobox.com/~moshe


Re: Deleting mdadm RAID arrays

2008-02-05 Thread Janek Kozicki
Marcin Krol said: (by the date of Tue, 5 Feb 2008 11:42:19 +0100)

 2. How can I delete that damn array so it doesn't hang my server up in a loop?

dd if=/dev/zero of=/dev/sdb1 bs=1M count=10

I'm not using mdadm.conf at all. Everything is stored in the
superblock of the device. So if you don't erase it, info about the raid
array will still be found automatically.

-- 
Janek Kozicki |


Re: Deleting mdadm RAID arrays

2008-02-05 Thread Michael Tokarev
Janek Kozicki wrote:
 Marcin Krol said: (by the date of Tue, 5 Feb 2008 11:42:19 +0100)
 
 2. How can I delete that damn array so it doesn't hang my server up in a 
 loop?
 
 dd if=/dev/zero of=/dev/sdb1 bs=1M count=10

This works provided the superblocks are at the beginning of the
component devices.  Which is not the case by default (0.90
superblocks, at the end of components), or with 1.0 superblocks.

  mdadm --zero-superblock /dev/sdb1

is the way to go here.

 I'm not using mdadm.conf at all. Everything is stored in the
 superblock of the device. So if you don't erase it - info about raid
 array will be still automatically found.

That's wrong, as you need at least something to identify the array
components.  UUID is the most reliable and commonly used.  You
assemble the arrays as

  mdadm --assemble /dev/md1 --uuid=123456789

or something like that anyway.  If not, your arrays may not start
properly in case you shuffled disks (e.g. replaced a bad one), or
your disks were renumbered after a kernel or other hardware change
and so on.  The most convenient place to store that info is mdadm.conf.
Here, it looks just like:

DEVICE partitions
ARRAY /dev/md1 UUID=4ee58096:e5bc04ac:b02137be:3792981a
ARRAY /dev/md2 UUID=b4dec03f:24ec8947:1742227c:761aa4cb

By default mdadm offers additional information which helps to
diagnose possible problems, namely:

ARRAY /dev/md5 level=raid5 num-devices=4 
UUID=6dc4e503:85540e55:d935dea5:d63df51b

This new info isn't necessary for mdadm to work (but the UUID is),
yet it comes in handy sometimes.

/mjt
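
A side note: the 0.90 superblock lives in the last 64 KiB-aligned 64 KiB
block of the device, which is why zeroing the first megabyte with dd misses
it. A sketch of peeking at it directly, assuming bash arithmetic, 0.90
metadata and a little-endian (x86) host (the 0.90 magic is 0xa92b4efc):

% DEV=/dev/sdb1
% SIZE=$(blockdev --getsize64 $DEV)
% OFFSET=$(( (SIZE & ~65535) - 65536 ))
% dd if=$DEV bs=1 skip=$OFFSET count=4 2>/dev/null | od -An -tx1
 fc 4e 2b a9

Seeing that magic means a 0.90 superblock is still present;
mdadm --zero-superblock clears exactly that region.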


Re: Auto generation of mdadm.conf (was: Deleting mdadm RAID arrays)

2008-02-05 Thread Janek Kozicki
Michael Tokarev said: (by the date of Tue, 05 Feb 2008 16:52:18 +0300)

 Janek Kozicki wrote:
  I'm not using mdadm.conf at all. 
 
 That's wrong, as you need at least something to identify the array
 components. 

I was afraid of that ;-) So, is this a correct way to automatically
generate a correct mdadm.conf? I did it after some digging in the man pages:

  echo 'DEVICE partitions' > mdadm.conf
  mdadm --examine --scan --config=mdadm.conf >> ./mdadm.conf

Now, when I do 'cat mdadm.conf' I get:

 DEVICE partitions
 ARRAY /dev/md/0 level=raid1 metadata=1 num-devices=3 
UUID=75b0f87879:539d6cee:f22092f4:7a6e6f name='backup':0
 ARRAY /dev/md/2 level=raid1 metadata=1 num-devices=3 
UUID=4fd340a6c4:db01d6f7:1e03da2d:bdd574 name=backup:2
 ARRAY /dev/md/1 level=raid5 metadata=1 num-devices=3 
UUID=22f22c3599:613d5231:d407a655:bdeb84 name=backup:1

Looks quite reasonable. Should I append it to /etc/mdadm/mdadm.conf?
This file currently contains: (commented lines are left out)

  DEVICE partitions
  CREATE owner=root group=disk mode=0660 auto=yes
  HOMEHOST system
  MAILADDR root

This is the default content of /etc/mdadm/mdadm.conf on a fresh Debian
Etch install.

best regards
-- 
Janek Kozicki
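
A side note: appending the generated ARRAY lines to the existing
/etc/mdadm/mdadm.conf is the usual approach, and on Debian the copy embedded
in the initramfs then needs refreshing (as noted elsewhere in this thread).
A sketch:

% mdadm --examine --scan >> /etc/mdadm/mdadm.conf
% update-initramfs -u      # regenerate the initramfs and its mdadm.conf copy

The DEVICE partitions line already present covers the generated entries, so
only the ARRAY lines need to be appended.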


Re: Deleting mdadm RAID arrays

2008-02-05 Thread Michael Tokarev
Moshe Yudkowsky wrote:
 Michael Tokarev wrote:
 Janek Kozicki wrote:
 Marcin Krol said: (by the date of Tue, 5 Feb 2008 11:42:19 +0100)

 2. How can I delete that damn array so it doesn't hang my server up
 in a loop?
 dd if=/dev/zero of=/dev/sdb1 bs=1M count=10

 This works provided the superblocks are at the beginning of the
 component devices.  Which is not the case by default (0.90
 superblocks, at the end of components), or with 1.0 superblocks.

   mdadm --zero-superblock /dev/sdb1
 
 Would that work even if he doesn't update his mdadm.conf inside the
 /boot image? Or would mdadm attempt to build the array according to the
 instructions in mdadm.conf? I expect that it might depend on whether the
 instructions are given in terms of UUID or in terms of devices.

After zeroing superblocks, mdadm will NOT assemble the array,
regardless if using UUIDs or devices or whatever.  In order
to assemble the array, all component devices MUST have valid
superblocks and the superblocks must match each other.

mdadm --assemble in initramfs will simply fail to do its work.

/mjt


Re: Deleting mdadm RAID arrays

2008-02-05 Thread Moshe Yudkowsky

Michael Tokarev wrote:

Janek Kozicki wrote:

Marcin Krol said: (by the date of Tue, 5 Feb 2008 11:42:19 +0100)


2. How can I delete that damn array so it doesn't hang my server up in a loop?

dd if=/dev/zero of=/dev/sdb1 bs=1M count=10


This works provided the superblocks are at the beginning of the
component devices.  Which is not the case by default (0.90
superblocks, at the end of components), or with 1.0 superblocks.

  mdadm --zero-superblock /dev/sdb1


Would that work even if he doesn't update his mdadm.conf inside the
/boot image? Or would mdadm attempt to build the array according to the 
instructions in mdadm.conf? I expect that it might depend on whether the 
instructions are given in terms of UUID or in terms of devices.


--
Moshe Yudkowsky * [EMAIL PROTECTED] * www.pobox.com/~moshe
 I think it a greater honour to have my head standing on the ports
  of this town for this quarrel, than to have my portrait in the
  King's bedchamber. -- Montrose, 20 May 1650


Re: Deleting mdadm RAID arrays

2008-02-05 Thread Neil Brown
On Tuesday February 5, [EMAIL PROTECTED] wrote:
 
 % mdadm --zero-superblock /dev/sdb1
 mdadm: Couldn't open /dev/sdb1 for write - not zeroing

That's weird.
Why can't it open it?

Maybe you aren't running as root (The '%' prompt is suspicious).
Maybe the kernel has  been told to forget about the partitions of
/dev/sdb.
mdadm will sometimes tell it to do that, but only if you try to
assemble arrays out of whole components.

If that is the problem, then
   blockdev --rereadpt /dev/sdb

will fix it.

NeilBrown