Re: Strange intermittent errors + RAID doesn't fail the disk.

2006-07-06 Thread Christian Pernegger

 I suggest you find a SATA related mailing list to post this to (Look
 in the MAINTAINERS file maybe) or post it to linux-kernel.


linux-ide couldn't help much, aside from recommending a bleeding-edge
patchset which should fix a lot of SATA issues:
http://home-tj.org/files/libata-tj-stable/

What fixed the error, though, was exchanging one of the cables. (Just
my luck, it was new and supposedly quality, ... oh well)

I'm still interested in why the md code didn't fail the disk. While it
was 'up' any access to the array would hang for a long time,
ultimately fail and corrupt the fs to boot. When I failed the disk
manually everything was fine (if degraded) again.
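
(For reference, failing and removing a member by hand is typically the two mdadm calls below -- a sketch only, with /dev/md0 and /dev/sdX standing in for the array and the affected disk:)

  # mark the member as failed, then pull it out of the array
  mdadm /dev/md0 --fail /dev/sdX
  mdadm /dev/md0 --remove /dev/sdX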

Regards,

C.


Can't get md array to shut down cleanly

2006-07-06 Thread Christian Pernegger

Still more problems ... :(

My md raid5 still does not always shut down cleanly. The last few
lines of the shutdown sequence are always as follows:

[...]
Will now halt.
md: stopping all md devices.
md: md0 still in use.
Synchronizing SCSI cache for disk /dev/sdd:
Synchronizing SCSI cache for disk /dev/sdc:
Synchronizing SCSI cache for disk /dev/sdb:
Synchronizing SCSI cache for disk /dev/sda:
Shutdown: hde
System halted.

Most of the time the md array comes up clean on the next boot, but often
enough it does not. Having the array rebuild after every other reboot
is not my idea of fun, because the only reason to take it down is to
exchange a failing disk.

Again, help appreciated - I don't dare put the system into
production like that.

Regards

C.


Re: RAID 5 crash, is there any way to recover some data ?

2006-07-06 Thread Sevrin Robstad

Is there not any way for me to recover my data?

As I said, I have already rebuilt the raid with the wrong disc setup... Now
I have built the raid with the right setup, and one missing disk.
But there's no ext3 partition on it anymore. Is there any way to
search through the md0 device and find some of the data?



Sevrin


issue with internal bitmaps

2006-07-06 Thread Luca Berra

Hello, I just realized that internal bitmaps do not seem to work
anymore.

kernel 2.6.17
mdadm 2.5.2

[EMAIL PROTECTED] ~]# mdadm --create --level=1 -n 2 -e 1 --bitmap=internal 
/dev/md100 /dev/sda1 /dev/sda2
mdadm: array /dev/md100 started.

... wait awhile ...

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid1]
md100 : active raid1 sda2[1] sda1[0]
 1000424 blocks super 1.0 [2/2] [UU]
 bitmap: 4/4 pages [16KB], 128KB chunk

unused devices: <none>
[EMAIL PROTECTED] ~]# mdadm -X /dev/md100
   Filename : /dev/md100
  Magic : 
mdadm: invalid bitmap magic 0x0, the bitmap file appears to be corrupted
Version : 0
mdadm: unknown bitmap version 0, either the bitmap file is corrupted or you 
need to upgrade your tools

[EMAIL PROTECTED] ~]# cat /proc/mdstat
Personalities : [raid1]
md100 : active raid1 sda2[1] sda1[0]
 1000424 blocks super 1.0 [2/2] [UU]
 bitmap: 0/4 pages [0KB], 128KB chunk

unused devices: <none>

[EMAIL PROTECTED] ~]# mdadm -D /dev/md100
/dev/md100:
   Version : 01.00.03
 Creation Time : Thu Jul  6 16:05:10 2006
Raid Level : raid1
Array Size : 1000424 (977.14 MiB 1024.43 MB)
   Device Size : 1000424 (977.14 MiB 1024.43 MB)
  Raid Devices : 2
 Total Devices : 2
Preferred Minor : 100
   Persistence : Superblock is persistent

 Intent Bitmap : Internal

   Update Time : Thu Jul  6 16:07:11 2006
 State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
 Spare Devices : 0

  Name : 100
  UUID : 60cd0dcb:fde52377:699453f7:da96b9d4
Events : 1

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8        2        1      active sync   /dev/sda2



--
Luca Berra -- [EMAIL PROTECTED]
   Communication Media  Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
 X  AGAINST HTML MAIL
/ \


Re: Can't get md array to shut down cleanly

2006-07-06 Thread Niccolo Rigacci
 My md raid5 still does not always shut down cleanly. The last few
 lines of the shutdown sequence are always as follows:
 
 [...]
 Will now halt.
 md: stopping all md devices.
 md: md0 still in use.
 Synchronizing SCSI cache for disk /dev/sdd:
 Synchronizing SCSI cache for disk /dev/sdc:
 Synchronizing SCSI cache for disk /dev/sdb:
 Synchronizing SCSI cache for disk /dev/sda:
 Shutdown: hde
 System halted.


Maybe your shutdown script is doing halt -h? Halting the disks
immediately without letting the RAID settle to a clean state
could be the cause.

I see that my Debian avoids the -h option if running RAID,
from /etc/init.d/halt:


# Don't shut down drives if we're using RAID.
hddown="-h"
if grep -qs '^md.*active' /proc/mdstat
then
    hddown=""
fi

# If INIT_HALT=HALT don't poweroff.
poweroff="-p"
if [ "$INIT_HALT" = "HALT" ]
then
    poweroff=""
fi

log_action_msg "Will now halt"
sleep 1
halt -d -f -i $poweroff $hddown


-- 
Niccolo Rigacci
Firenze - Italy

Iraq, peace mission: 38839 dead - www.iraqbodycount.net


Re: Can't get md array to shut down cleanly

2006-07-06 Thread Christian Pernegger

Maybe your shutdown script is doing halt -h? Halting the disks
immediately without letting the RAID settle to a clean state
could be the cause.


I'm using Debian as well and my halt script has the fragment you posted.
Besides, shouldn't the array be marked clean at this point:


md: stopping all md devices.


Apparently it isn't ... :


md: md0 still in use.


If someone thinks it might make a difference I could remove everything
EVMS-related and create a pure md array with mdadm. (Directly on the disks
or on partitions? Which partition type?)
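
For what it's worth, a bare md array of that sort is usually created along these lines -- a sketch only, with device names, RAID level and disk count as placeholders; partitions used this way are conventionally set to type 0xfd (Linux raid autodetect):

  # hypothetical example: RAID-5 across six whole disks, default chunk size
  mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/sd[a-f]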

How does a normal shutdown look?

Will try 2.6.16 and 2.6.15 now ... the boring part is that I have to
wait for the resync to complete before the next test ...

Thank you,

C.


second controller: what will my discs be called, and does it matter?

2006-07-06 Thread Dexter Filmore
Currently I have 4 discs on a 4 channel sata controller which does its job 
quite well for 20 bucks. 
Now, if I wanted to grow the array I'd probably go for another one of these.

How can I tell if the discs on the new controller will become sd[e-h] or if 
they'll be the new a-d and push the existing ones back?

Next question: assembling by UUID, does that matter at all?
(And while talking UUID - can I safely migrate to a udev-kernel? Someone on 
this list recently ran into trouble because of such an issue.)

Dex

-- 
-BEGIN GEEK CODE BLOCK-
Version: 3.12
GCS d--(+)@ s-:+ a- C UL++ P+++ L+++ E-- W++ N o? K-
w--(---) !O M+ V- PS+ PE Y++ PGP t++(---)@ 5 X+(++) R+(++) tv--(+)@ 
b++(+++) DI+++ D- G++ e* h++ r* y?
--END GEEK CODE BLOCK--

http://www.stop1984.com
http://www.againsttcpa.com


Re: Can't get md array to shut down cleanly

2006-07-06 Thread thunder7
From: Christian Pernegger [EMAIL PROTECTED]
Date: Thu, Jul 06, 2006 at 07:18:06PM +0200
 Maybe your shutdown script is doing halt -h? Halting the disks
 immediately without letting the RAID settle to a clean state
 could be the cause.
 
 I'm using Debian as well and my halt script has the fragment you posted.
 Besides, shouldn't the array be marked clean at this point:
 
 md: stopping all md devices.
 
 Apparently it isn't ... :
 
 md: md0 still in use.
 
 If someone thinks it might make a difference I could remove everything
 EVMS-related and create a pure md array with mdadm. (Directly on the disks
 or on partitions? Which partition type?)
 
 How does a normal shutdown look?
 
 Will try 2.6.16 and 2.6.15 now ... the boring part is that I have to
 wait for the resync to complete before the next test ...
 
I get these messages too on Debian Unstable, but since enabling the
bitmaps on my devices, resyncing is so fast that I don't even notice it
at boot. There's no waiting for a resync here. I'm seeing it on
my RAID-1 root partition.
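
For anyone wanting to do the same, adding a write-intent bitmap to an existing array is a one-liner with a reasonably recent kernel and mdadm (a sketch; /dev/md0 is a placeholder):

  # add an internal write-intent bitmap to a running array
  mdadm --grow --bitmap=internal /dev/md0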

Good luck,
Jurriaan
-- 
Debian (Unstable) GNU/Linux 2.6.17-rc4-mm3 2815 bogomips load 2.02


Re: Can't get md array to shut down cleanly

2006-07-06 Thread Christian Pernegger

I get these messages too on Debian Unstable, but since enabling the
bitmaps on my devices, resyncing is so fast that I don't even notice it
on booting.


Bitmaps are great, but the speed of the rebuild is not the problem.
The box doesn't have hotswap bays, so I have to shut it down to
replace a failed disk. If the array decides that it wasn't clean after
the exchange, I'm suddenly looking at a dead array. Yes, forcing
assembly _should_ work, but I'd rather have it shut down cleanly in the
first place.
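
(For the record, forced assembly of an array that went down unclean is usually along these lines -- a sketch, with placeholder device names:)

  mdadm --assemble --force /dev/md0 /dev/sd[a-d]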

Regards,

C.


Re: second controller: what will my discs be called, and does it matter?

2006-07-06 Thread John Stoffel

Dexter Currently I have 4 discs on a 4 channel sata controller which
Dexter does its job quite well for 20 bucks.  Now, if I wanted to
Dexter grow the array I'd probably go for another one of these.

So, which SATA controller are you using?  I'm thinking my next box
will go SATA, but I'm still deciding

Dexter How can I tell if the discs on the new controller will become
Dexter sd[e-h] or if they'll be the new a-d and push the existing
Dexter ones back?

Dexter Next question: assembling by UUID, does that matter at all?
Dexter (And while talking UUID - can I safely migrate to a
Dexter udev-kernel? Someone on this list recently ran into trouble
Dexter because of such an issue.)

I'm using udev and it's not a problem, but I admit I'm also
auto-assembling my arrays using the kernel detection stuff.  All I
have in my mdadm.conf file is 

MAILADDR john
PARTITIONS

and it just works for me.  But using the UUID is the way to go.
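
A UUID-keyed entry in mdadm.conf needs only a couple of lines along these lines (a sketch; the UUID shown is a placeholder -- mdadm --detail --scan prints the real one):

  DEVICE partitions
  ARRAY /dev/md0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx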

Though I admit my system may not be set up like you think.  I have a
pair of 120gb drives which are mirrored.  On top of them I have LVM
configured and a pair of partitions setup, so I can grow/move/shrink
them at some point.  Hasn't happened too often yet.  

So I'm not using ext3 -> MD -> disks, I'm adding in the LVM layer.

John


Re: Strange intermittent errors + RAID doesn't fail the disk.

2006-07-06 Thread Neil Brown
On Thursday July 6, [EMAIL PROTECTED] wrote:
   I suggest you find a SATA related mailing list to post this to (Look
   in the MAINTAINERS file maybe) or post it to linux-kernel.
 
 linux-ide couldn't help much, aside from recommending a bleeding-edge
 patchset which should fix a lot of SATA issues:
 http://home-tj.org/files/libata-tj-stable/
 
 What fixed the error, though, was exchanging one of the cables. (Just
 my luck, it was new and supposedly quality, ... oh well)
 
 I'm still interested in why the md code didn't fail the disk. While it
 was 'up' any access to the array would hang for a long time,
 ultimately fail and corrupt the fs to boot. When I failed the disk
 manually everything was fine (if degraded) again.

md is very dependent on the driver doing the right thing.  It doesn't
do any timeouts or anything like that - it assumes the driver will. 
md simply trusts the return status from the drive, and fails a drive
if and only if a write to the drive is reported as failing (if a read
fails, md tries to over-write with good data first).

I don't know exactly how the driver was responding to the bad cable,
but it clearly wasn't returning an error, so md didn't fail it.

NeilBrown


Re: Can't get md array to shut down cleanly

2006-07-06 Thread Neil Brown
On Thursday July 6, [EMAIL PROTECTED] wrote:
 Still more problems ... :(
 
 My md raid5 still does not always shut down cleanly. The last few
 lines of the shutdown sequence are always as follows:
 
 [...]
 Will now halt.
 md: stopping all md devices.
 md: md0 still in use.
 Synchronizing SCSI cache for disk /dev/sdd:
 Synchronizing SCSI cache for disk /dev/sdc:
 Synchronizing SCSI cache for disk /dev/sdb:
 Synchronizing SCSI cache for disk /dev/sda:
 Shutdown: hde
 System halted.
 
 Most of the time the md array comes up clean on the next boot, but often
 enough it does not. Having the array rebuild after every other reboot
 is not my idea of fun, because the only reason to take it down is to
 exchange a failing disk.
 

How are you shutting down the machine?  Is something sending SIGKILL
to all processes?  If it does, then md really should shut down cleanly
every time...

That said, I do see some room for improvement in the md shutdown
sequence - it shouldn't give up at that point just because the device
seems to be in use ... I'll look into that.
You could try the following patch.  I think it should be safe.

NeilBrown

Signed-off-by: Neil Brown [EMAIL PROTECTED]

### Diffstat output
 ./drivers/md/md.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c   2006-07-07 08:11:43.0 +1000
+++ ./drivers/md/md.c   2006-07-07 08:12:15.0 +1000
@@ -3217,7 +3217,7 @@ static int do_md_stop(mddev_t * mddev, i
 	struct gendisk *disk = mddev->gendisk;
 
 	if (mddev->pers) {
-		if (atomic_read(&mddev->active)>2) {
+		if (mode != 1 && atomic_read(&mddev->active)>2) {
 			printk("md: %s still in use.\n",mdname(mddev));
 			return -EBUSY;
 		}
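
(Applying it is the usual routine -- a sketch, assuming the patch is saved as md-stop.patch at the top of a matching kernel tree; the filename is hypothetical:)

  patch -p1 < md-stop.patch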


Re: issue with internal bitmaps

2006-07-06 Thread Neil Brown
On Thursday July 6, [EMAIL PROTECTED] wrote:
 hello, i just realized that internal bitmaps do not seem to work
 anymore.

I cannot imagine why.  Nothing you have listed shows anything wrong
with md...

Maybe you were expecting
   mdadm -X /dev/md100
to do something useful.  Like -E, -X must be applied to a component
device.  Try
   mdadm -X /dev/sda1

Patches to the documentation are always welcome :-)

NeilBrown


Re: second controller: what will my discs be called, and does it matter?

2006-07-06 Thread Neil Brown
On Thursday July 6, [EMAIL PROTECTED] wrote:
 Currently I have 4 discs on a 4 channel sata controller which does its job 
 quite well for 20 bucks. 
 Now, if I wanted to grow the array I'd probably go for another one of these.
 
 How can I tell if the discs on the new controller will become sd[e-h] or if 
 they'll be the new a-d and push the existing ones back?

No idea.  Probably depends on where you plug it in, but also on the
phase of the moon :-)

 
 Next question: assembling by UUID, does that matter at all?

No.

 (And while talking UUID - can I safely migrate to a udev-kernel? Someone on 
 this list recently ran into trouble because of such an issue.)

Depends on what you mean by 'safely'.

If you mean "can I change something that fundamental and expect
everything will just work perfectly without me having to think", then
no.

If you mean "should it be fairly easy to fix anything that breaks, and
can I be confident that my data will be safe even if something does
break", then yes.

NeilBrown


zeroing old superblocks upgrading...

2006-07-06 Thread John Stoffel

Neil,

First off, thanks for all your hard work on this software, it's really
a great thing to have.

But I've got some interesting issues here.  Though not urgent.  As
I've said in other messages, I've got a pair of 120gb HDs mirrored.
I'm using MD across partitions, /dev/hde1 and /dev/hdg1.  Works
nicely.

But I see that I have an old superblock sitting around on /dev/hde
(notice, no partition here!) which I'd like to clean up.

# mdadm -E /dev/hde
/dev/hde:
  Magic : a92b4efc
Version : 00.90.00
   UUID : 9835ebd0:5d02ebf0:907edc91:c4bf97b2
  Creation Time : Fri Oct 24 19:11:02 2003
 Raid Level : raid1
Device Size : 117220736 (111.79 GiB 120.03 GB)
 Array Size : 117220736 (111.79 GiB 120.03 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

Update Time : Fri Oct 24 19:21:59 2003
  State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0
   Checksum : 79d2a6fd - correct
 Events : 0.2


      Number   Major   Minor   RaidDevice State
this     0       3        0        0      active sync   /dev/hda

   0     0       3        0        0      active sync   /dev/hda
   1     1       0        0        1      faulty


Here are the correct ones:

# mdadm -E /dev/hde1
/dev/hde1:
  Magic : a92b4efc
Version : 00.90.00
   UUID : 2e078443:42b63ef5:cc179492:aecf0094
  Creation Time : Fri Oct 24 19:23:41 2003
 Raid Level : raid1
Device Size : 117218176 (111.79 GiB 120.03 GB)
 Array Size : 117218176 (111.79 GiB 120.03 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

Update Time : Thu Jul  6 18:21:08 2006
  State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
   Checksum : 210069e5 - correct
 Events : 0.7762540


      Number   Major   Minor   RaidDevice State
this     0      33        1        0      active sync   /dev/hde1

   0     0      33        1        0      active sync   /dev/hde1
   1     1      34        1        1      active sync   /dev/hdg1


I can't seem to zero it out:

  # mdadm --misc --zero-superblock /dev/hde
  mdadm: Couldn't open /dev/hde for write - not zeroing

Should I just ignore this, or should I break off /dev/hde from the
array and scrub the disk and then re-add it back in?

Also, can I upgrade my superblock to the latest version without any
problems?

Thanks,
John


Re: Strange intermittent errors + RAID doesn't fail the disk.

2006-07-06 Thread Christian Pernegger

md is very dependent on the driver doing the right thing.  It doesn't
do any timeouts or anything like that - it assumes the driver will.
md simply trusts the return status from the drive, and fails a drive
if and only if a write to the drive is reported as failing (if a read
fails, md tries to over-write with good data first).

I don't know exactly how the driver was responding to the bad cable,
but it clearly wasn't returning an error, so md didn't fail it.


There were a lot of errors in dmesg -- seems like they did not get
passed up to md? I find it surprising that the md layer doesn't have
its own timeouts, but then I know nothing about such things :)

Thanks for clearing this up for me,

C.

[...]
ata2: port reset, p_is 800 is 2 pis 0 cmd 44017 tf d0 ss 123 se 0
ata2: status=0x50 { DriveReady SeekComplete }
sdc: Current: sense key: No Sense
  Additional sense: No additional sense information
ata2: handling error/timeout
ata2: port reset, p_is 0 is 0 pis 0 cmd 44017 tf 150 ss 123 se 0
ata2: status=0x50 { DriveReady SeekComplete }
ata2: error=0x01 { AddrMarkNotFound }
sdc: Current: sense key: No Sense
  Additional sense: No additional sense information
[repeat]


Re: zeroing old superblocks upgrading...

2006-07-06 Thread Neil Brown
On Thursday July 6, [EMAIL PROTECTED] wrote:
 
 Neil,
 
 First off, thanks for all your hard work on this software, it's really
 a great thing to have.
 
 But I've got some interesting issues here.  Though not urgent.  As
 I've said in other messages, I've got a pair of 120gb HDs mirrored.
 I'm using MD across partitions, /dev/hde1 and /dev/hdg1.  Works
 nicely.
 
 But I see that I have an old superblock sitting around on /dev/hde
 (notice, no partition here!) which I'd like to clean up.
 
...
 
 I can't seem to zero it out:
 
   # mdadm --misc --zero-superblock /dev/hde
   mdadm: Couldn't open /dev/hde for write - not zeroing
 
 Should I just ignore this, or should I break off /dev/hde from the
 array and scrub the disk and then re-add it back in?

You could ignore it - it shouldn't hurt.
But if you wanted to (and were running a fairly recent kernel) you
could
  mdadm --grow --bitmap=internal /dev/md0
  mdadm /dev/md0 --fail /dev/hde1 --remove /dev/hde1
  mdadm --zero-superblock /dev/hde
  mdadm /dev/md0 --add /dev/hde1
  mdadm --grow --bitmap=none /dev/md0

and it should work with minimal resync...

Though thinking about it - after the first --grow, check that the
unwanted superblock is still there.  It is quite possible that the
internal bitmap will over-write the unwanted superblock (depending on
the exact size and alignment of hde1 compared with hde).
If it is gone, then don't bother with the rest of the sequence. 
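
(A quick way to check is to re-examine the bare device -- a sketch:

  mdadm -E /dev/hde

If the stale superblock survived, it will still be printed; otherwise mdadm will report that it found no superblock there.)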


 
 Also, can I upgrade my superblock to the latest version with out any
 problems?

The only problem with superblock version numbers is that they are
probably confusing.  If you don't worry about them, they should just
do the right thing.

NeilBrown


Array will not assemble

2006-07-06 Thread Richard Scobie
Perhaps I am misunderstanding how assemble works, but I have created a 
new RAID 1 array on a pair of SCSI drives and am having difficulty 
re-assembling it after a reboot.


The relevant mdadm.conf entry looks like this:


ARRAY /dev/md3 level=raid1 num-devices=2 
UUID=72189255:acddbac3:316abdb0:9152808d devices=/dev/sdc,/dev/sdd


The above is all one line.

If I run mdadm -As, I get the error:

mdadm: no devices found for /dev/md3.

If I then run:

mdadm --assemble /dev/md3 /dev/sdc /dev/sdd

it all starts up fine.

Thanks,

Richard


Re: Array will not assemble

2006-07-06 Thread Neil Brown
On Friday July 7, [EMAIL PROTECTED] wrote:
 Perhaps I am misunderstanding how assemble works, but I have created a 
 new RAID 1 array on a pair of SCSI drives and am having difficulty 
 re-assembling it after a reboot.
 
 The relevent mdadm.conf entry looks like this:
 
 
 ARRAY /dev/md3 level=raid1 num-devices=2 
 UUID=72189255:acddbac3:316abdb0:9152808d devices=/dev/sdc,/dev/sdd

Add
  DEVICE /dev/sd?
or similar on a separate line.
Remove
  devices=/dev/sdc,/dev/sdd
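
Put together, the corrected config might look something like this (a sketch reusing the UUID from the original post):

  DEVICE /dev/sd?
  ARRAY /dev/md3 level=raid1 num-devices=2 UUID=72189255:acddbac3:316abdb0:9152808d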

NeilBrown


Re: Can't get md array to shut down cleanly

2006-07-06 Thread Christian Pernegger

How are you shutting down the machine?  If something sending SIGKILL
to all processes?


First SIGTERM, then SIGKILL, yes.


You could try the following patch.  I think it should be safe.


Hmm, it said "chunk failed", so I replaced the line by hand. That didn't
want to compile because 'mode' supposedly wasn't defined ... was that
supposed to be mddev->safemode? Closest thing to a 'mode' I could find
...

Anyway, this is much better: (lines with * are new)

Done unmounting local file systems.
*md: md0 stopped
*md: unbind<sdf>
*md: export_rdev(sdf)
*[last two lines for each disk.]
*Stopping RAID arrays ... done (1 array(s) stopped).
Mounting root filesystem read-only ... done
Will now halt.
md: stopping all md devices
* md: md0 switched to read-only mode
Synchronizing SCSI cache for disk /dev/sdf:
[...]

As you can see, the error message is gone now. Much more interesting
are the lines before the "Will now halt." line. Those were not there
before -- apparently this first attempt (by whatever triggers it) to
shut down the array previously failed silently.

Not sure if this actually fixes the resync problem (I sure hope so --
after the last of these, no fs could be found on the device anymore),
but it's 5 past 3 already; I'll try tomorrow.

Thanks,

C.


Re: Array will not assemble

2006-07-06 Thread Richard Scobie

Neil Brown wrote:


Add
  DEVICE /dev/sd?
or similar on a separate line.
Remove
  devices=/dev/sdc,/dev/sdd


Thanks.

My mistake - I thought that after having assembled the arrays initially,
the output of:


 mdadm --detail --scan >> mdadm.conf

could be used directly.

I'm using CentOS 4.3, which I believe matches the latest RHEL 4, and they are
only on mdadm 1.6  :(


Regards,

Richard


Mounting array was read write for about 3 minutes, then Read-only file system error

2006-07-06 Thread Brian Bonner

I created a raid1 array using /dev/disk/by-id with (2) 250GB USB 2.0
Drives.  It was working for about 2 minutes until I tried to copy a
directory tree from one drive to the array and then cancelled it
midstream.  After cancelling the copy, when I list the contents of the
directory it doesn't show anything there.

When I try to create a file, I get the following error msg:

[EMAIL PROTECTED] ~]# cd /mnt/usb250
[EMAIL PROTECTED] usb250]# ls
lost+found
[EMAIL PROTECTED] usb250]# touch test.txt
touch: cannot touch `test.txt': Read-only file system
[EMAIL PROTECTED] usb250]#

even though when I show the mounts, it displays:

[EMAIL PROTECTED] ~]# mount -l
/dev/mapper/VolGroup01-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/md0 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/mapper/VolGroup01-LogVol02 on /home type ext3 (rw)
/dev/mapper/VolGroup01-LogVol01 on /tmp type ext3 (rw)
/dev/mapper/VolGroup01-LogVol03 on /usr type ext3 (rw)
/dev/mapper/VolGroup01-LogVol04 on /var type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
automount(pid2195) on /net type autofs (rw,fd=4,pgrp=2195,minproto=2,maxproto=4)

/dev/md3 on /mnt/usb250 type ext3 (rw)

Here's my mdadm.conf file, and the details on the array.

Does anyone have any thoughts on why this is acting this way?  I'm stumped.

Thanks,

Brian

/etc/mdadm.conf

DEVICE partitions /dev/disk/by-id/usb-ST325082_3A_000DCC /dev/disk/by-id/usb-ST325082_3A_001A41
MAILADDR root
ARRAY /dev/md3 level=raid1 num-devices=2 devices=/dev/disk/by-id/usb-ST325082_3A_000DCC,/dev/disk/by-id/usb-ST325082_3A_001A41
ARRAY /dev/md2 level=raid1 num-devices=2 uuid=a94ec909:c3c5386b:5e8b311e:6eea3fb4
ARRAY /dev/md0 level=raid1 num-devices=2 uuid=8406:0792eb6b:aa28aac9:6217b695
ARRAY /dev/md1 level=raid1 num-devices=2 uuid=0e5beeb3:49ffc108:4b09e081:5966b20c



[EMAIL PROTECTED] ~]# mdadm --detail /dev/md3
/dev/md3:
   Version : 00.90.03
 Creation Time : Wed Jul  5 06:28:18 2006
Raid Level : raid1
Array Size : 244198464 (232.89 GiB 250.06 GB)
   Device Size : 244198464 (232.89 GiB 250.06 GB)
  Raid Devices : 2
 Total Devices : 2
Preferred Minor : 3
   Persistence : Superblock is persistent

   Update Time : Thu Jul  6 09:07:44 2006
 State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
 Spare Devices : 0

  UUID : ff7d7a3c:a458efba:16a174b4:bd4b002e
Events : 0.70

    Number   Major   Minor   RaidDevice State
       0       8       48        0      active sync   /dev/sdd
       1       8       64        1      active sync   /dev/sde
[EMAIL PROTECTED] ~]#

[EMAIL PROTECTED] ~]# more /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sdd[0] sde[1]
 244198464 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
 200704 blocks [2/2] [UU]

md1 : active raid1 sdb2[1] sda2[0]
 2096384 blocks [2/2] [UU]

md2 : active raid1 sdb3[1] sda3[0]
 75826688 blocks [2/2] [UU]

unused devices: <none>