Re: raid problem

2013-11-30 Thread François Patte
On 30/11/2013 06:39, Stan Hoeppner wrote:
 On 11/29/2013 4:43 PM, François Patte wrote:
 Good evening,

 I have a problem with 2 raid arrays: I have 2 disks (sdc and sdd) in
 raid1 arrays.

 One disk (sdc) failed and I replaced it with a new one. I copied the
 partition table from the sdd disk using sfdisk:

 sfdisk -d /dev/sdd | sfdisk /dev/sdc

 then I added the 2 partitions (sdc1 and sdc3) to the arrays md0 and md1:

 mdadm --add /dev/md0 /dev/sdc1

 mdadm --add /dev/md1 /dev/sdc3

 There was no problem with the md0 array:


 cat /proc/mdstat gives:

 md0 : active raid1 sdc1[1] sdd1[0]
   1052160 blocks [2/2] [UU]


 But for the md1 array, I get:

 md1 : active raid1 sdc3[2](S) sdd3[0]
   483138688 blocks [2/1] [U_]


 And mdadm --detail /dev/md1 returns:

 /dev/md1:
 Version : 0.90
   Creation Time : Sat Mar  7 11:48:30 2009
  Raid Level : raid1
  Array Size : 483138688 (460.76 GiB 494.73 GB)
   Used Dev Size : 483138688 (460.76 GiB 494.73 GB)
Raid Devices : 2
   Total Devices : 2
 Preferred Minor : 1
 Persistence : Superblock is persistent

 Update Time : Fri Nov 29 21:23:25 2013
   State : clean, degraded
  Active Devices : 1
 Working Devices : 2
  Failed Devices : 0
   Spare Devices : 1

UUID : 2e8294de:9b0d8d96:680a5413:2aac5c13
  Events : 0.72076

 Number   Major   Minor   RaidDevice State
    0       8       51        0      active sync   /dev/sdd3
    2       0        0        2      removed

    2       8       35        -      spare   /dev/sdc3

 While mdadm --examine /dev/sdc3 returns:

 /dev/sdc3:
   Magic : a92b4efc
 Version : 0.90.00
UUID : 2e8294de:9b0d8d96:680a5413:2aac5c13
   Creation Time : Sat Mar  7 11:48:30 2009
  Raid Level : raid1
   Used Dev Size : 483138688 (460.76 GiB 494.73 GB)


  Array Size : 483138688 (460.76 GiB 494.73 GB)
Raid Devices : 2
   Total Devices : 2
 Preferred Minor : 1

 Update Time : Fri Nov 29 23:03:41 2013
   State : clean
  Active Devices : 1
 Working Devices : 2
  Failed Devices : 1
   Spare Devices : 1
Checksum : be8bd27f - correct
  Events : 72078


   Number   Major   Minor   RaidDevice State
 this   2       8       35        2      spare   /dev/sdc3

    0   0       8       51        0      active sync   /dev/sdd3
    1   1       0        0        1      faulty removed
    2   2       8       35        2      spare   /dev/sdc3


 What is the problem? And how can I recover a correct md1 array?
 
 IIRC Linux md rebuilds multiple degraded arrays sequentially, not in
 parallel.  This is due to system performance impact and other reasons.
 When the rebuild of md0 is finished, the rebuild of md1/sdc3 should
 start automatically.  If this did not occur please let us know and we'll
 go from there.

I thought it was clear enough that the mdadm --detail and mdadm
--examine output I quoted is what they return *after* the rebuild
of the md1 array.

On reboot, I am warned that md1 is started with one disk out of two and
one spare and recovery starts immediately:

md1 : active raid1 sdd3[0] sdc3[2]
  483138688 blocks [2/1] [U_]
  [=...]  recovery =  7.5% (36521408/483138688)
finish=89.1min speed=83445K/sec

After that, the situation is as quoted in my previous message.

Regards.


-- 
François Patte
UFR de mathématiques et informatique
Laboratoire CNRS MAP5, UMR 8145
Université Paris Descartes
45, rue des Saints Pères
F-75270 Paris Cedex 06
Tél. +33 (0)1 8394 5849
http://www.math-info.univ-paris5.fr/~patte





Re: raid problem

2013-11-30 Thread Andre Majorel
On 2013-11-29 23:43 +0100, François Patte wrote:

 I have a problem with 2 raid arrays: I have 2 disks (sdc and sdd) in
 raid1 arrays.
 
 One disk (sdc) failed and I replaced it with a new one. I copied the
 partition table from the sdd disk using sfdisk:
 
 sfdisk -d /dev/sdd | sfdisk /dev/sdc
 
 then I added the 2 partitions (sdc1 and sdc3) to the arrays md0 and md1:
 
 mdadm --add /dev/md0 /dev/sdc1
 
 mdadm --add /dev/md1 /dev/sdc3
 
 There was no problem with the md0 array:
 
 
 cat /proc/mdstat gives:
 
 md0 : active raid1 sdc1[1] sdd1[0]
   1052160 blocks [2/2] [UU]
 
 
 But for the md1 array, I get:
 
 md1 : active raid1 sdc3[2](S) sdd3[0]
   483138688 blocks [2/1] [U_]

 What is the problem? And how can I recover a correct md1 array?

The root of your problem would be that /dev/sdc3 is considered
spare, not active. Not sure why.

Guess #1 : before physically changing the disks, you forgot
  mdadm /dev/md1 --fail   /dev/sdc3
  mdadm /dev/md1 --remove /dev/sdc3

Guess #2 : maybe there were I/O errors during the add. How far
did the sync go ? Run smartctl -d ata -A /dev/sdc3 and look for
non-zero raw values for Reallocated_Sector_Ct and
Current_Pending_Sector. What does badblocks /dev/sdc3 say ?

Guess #3 : it's a software hiccup and all /dev/sdc3 needs is to
be removed from /dev/md1 and re-added.
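
For what it's worth, that would be something along these lines (a sketch
only, assuming the device names above):

  mdadm /dev/md1 --remove /dev/sdc3    # drop the stuck spare
  mdadm /dev/md1 --add /dev/sdc3       # add it back and let it resync
  cat /proc/mdstat                     # watch the rebuild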

-- 
André Majorel http://www.teaser.fr/~amajorel/
bugs.debian.org, a spammer's favourite.





Re: raid problem

2013-11-30 Thread François Patte
On 30/11/2013 12:56, Andre Majorel wrote:
 On 2013-11-29 23:43 +0100, François Patte wrote:
 
 I have a problem with 2 raid arrays: I have 2 disks (sdc and sdd) in
 raid1 arrays.

 One disk (sdc) failed and I replaced it with a new one. I copied the
 partition table from the sdd disk using sfdisk:

 sfdisk -d /dev/sdd | sfdisk /dev/sdc

 then I added the 2 partitions (sdc1 and sdc3) to the arrays md0 and md1:

 mdadm --add /dev/md0 /dev/sdc1

 mdadm --add /dev/md1 /dev/sdc3

 There was no problem with the md0 array:


 cat /proc/mdstat gives:

 md0 : active raid1 sdc1[1] sdd1[0]
   1052160 blocks [2/2] [UU]


 But for the md1 array, I get:

 md1 : active raid1 sdc3[2](S) sdd3[0]
   483138688 blocks [2/1] [U_]

 What is the problem? And how can I recover a correct md1 array?
 
 The root of your problem would be that /dev/sdc3 is considered
 spare, not active. Not sure why.

Thank you for answering

 
 Guess #1 : before physically changing the disks, you forgot
   mdadm /dev/md1 --fail   /dev/sdc3
   mdadm /dev/md1 --remove /dev/sdc3

No, I didn't!

 
 Guess #2 : maybe there were I/O errors during the add. How far
 did the sync go ? Run smartctl -d ata -A /dev/sdc3 and look for
 non-zero raw values for Reallocated_Sector_Ct and
 Current_Pending_Sector. What does badblocks /dev/sdc3 say ?

No non-zero values for these two... no badblocks on sdc3

 
 Guess #3 : it's a software hiccup and all /dev/sdc3 needs is to
 be removed from /dev/md1 and re-added.

I tried without any success...

But something is strange: there are some bad blocks on sdd3! logwatch
returns errors for the sdd disk:

md/raid1:md1: sdd: unrecoverable I/O read error for block 834749 ...:  3
Time(s)
res 41/40:00:6f:56:61/00:00:32:00:00/40 Emask 0x409 (media error) F
...:  24 Time(s)
sd 5:0:0:0: [sdd]  Add. Sense: Unrecovered read error - auto reallocat
...:  6 Time(s)
sd 5:0:0:0: [sdd]  Sense Key : Medium Error [current] [descr ...:  6 Time(s)

mdmonitor returns:

This is an automatically generated mail message from mdadm
running on dipankar

A FailSpare event had been detected on md device /dev/md1.

It could be related to component device /dev/sdc3.


To summarize the situation: the faulty disk (the one with bad blocks) is
sdd3, yet it is the only active disk in the md1 array, and I can fully
access its data, which is mounted normally at boot time, while sdc3 has
no bad blocks and is the one mdadm declares faulty!!

I don't understand something!
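
(A plausible explanation, offered only as a guess: while rebuilding onto
the new sdc3, md has to read every sector of sdd3; when it hits the
unreadable sectors it aborts the recovery and drops the freshly added
sdc3 back to spare, which would match the FailSpare mail above. If that
is what is happening, one common approach - just a sketch, with the
array stopped and a backup in hand - is to clone the ailing member with
GNU ddrescue before rebuilding anything:

  # copy as much of sdd3 as possible onto the new partition,
  # keeping a log of the unreadable areas; -f allows writing
  # to a block device
  ddrescue -f /dev/sdd3 /dev/sdc3 sdd3-rescue.log

and then retire the failing sdd.)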

Anyway, I can delete this array and create a new one from scratch (after
replacing the faulty disk).

Is it enough to run these commands:

mdadm --zero-superblock /dev/sdc3

mdadm --zero-superblock /dev/sdd3

Or do I also have to modify the /etc/mdadm/mdadm.conf file?

Thank you for your answer.


-- 
François Patte
UFR de mathématiques et informatique
Laboratoire CNRS MAP5, UMR 8145
Université Paris Descartes
45, rue des Saints Pères
F-75270 Paris Cedex 06
Tél. +33 (0)1 8394 5849
http://www.math-info.univ-paris5.fr/~patte





Re: raid problem

2013-11-30 Thread Andre Majorel
On 2013-11-30 19:48 +0100, François Patte wrote:

 If I summarize the situation: the faulty disk (with badblocks)
 is sdd3, but it is the only active disk in the md1 array and I
 can fully access the data of this disk which is normally
 mounted at boot time, while the disk sdc3 has no badblocks and
 is declared as faulty by mdadm!!

To me it sounds like mdadm is confused and thinks the old sdc3
is still here. When you add the new sdc3, mdadm views it as a
third device, which is one more device than md1 was created
with, so the new sdc3 is marked as a spare.

I know the output of mdadm --detail /dev/md1 contradicts this
but it's my best shot.

 I don't understand something!
 
 Anyway. I can delete this array and create a new one from
 scratch (after replacing the faulty disk).

Yes, re-creating the array from scratch seems the next thing to
try.

Note that unless you first manually copy the contents of sdd3 onto
sdc3, you had better create md1 in two steps (first with just
sdd3, then --add sdc3).
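
Roughly, and only as a sketch (double-check before running anything, and
keep a backup - re-creating the array rewrites the superblocks and relies
on the new array using the same metadata layout and offsets as the old
one):

  mdadm --stop /dev/md1
  mdadm --zero-superblock /dev/sdc3
  # -e 0.90 matches the existing superblock format of the old array
  mdadm --create /dev/md1 -e 0.90 --level=1 --raid-devices=2 /dev/sdd3 missing
  mdadm /dev/md1 --add /dev/sdc3     # the sync runs from sdd3 to the new disk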

 Is it enough to run these commands:
 
 mdadm --zero-superblock /dev/sdc3
 
 mdadm --zero-superblock /dev/sdd3
 
 Or do I also have to modify the /etc/mdadm/mdadm.conf file?

No idea, I don't use mdadm.conf.

You may get better help from linux-r...@vger.kernel.org
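
(For reference, if an mdadm.conf is in use, the usual Debian way to
refresh it after re-creating an array is something along these lines -
a sketch, adjust to taste:

  mdadm --detail --scan >> /etc/mdadm/mdadm.conf   # then prune the stale ARRAY line
  update-initramfs -u                              # so the initramfs knows the new UUID
)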

-- 
André Majorel http://www.teaser.fr/~amajorel/
Thanks to the Debian project for going to such lengths never to
disclose the email addresses of their users. Think of all the
spam we would get if they didn't !





raid problem

2013-11-29 Thread François Patte
Good evening,

I have a problem with 2 raid arrays: I have 2 disks (sdc and sdd) in
raid1 arrays.

One disk (sdc) failed and I replaced it with a new one. I copied the
partition table from the sdd disk using sfdisk:

sfdisk -d /dev/sdd | sfdisk /dev/sdc

then I added the 2 partitions (sdc1 and sdc3) to the arrays md0 and md1:

mdadm --add /dev/md0 /dev/sdc1

mdadm --add /dev/md1 /dev/sdc3

There was no problem with the md0 array:


cat /proc/mdstat gives:

md0 : active raid1 sdc1[1] sdd1[0]
  1052160 blocks [2/2] [UU]


But for the md1 array, I get:

md1 : active raid1 sdc3[2](S) sdd3[0]
  483138688 blocks [2/1] [U_]


And mdadm --detail /dev/md1 returns:

/dev/md1:
Version : 0.90
  Creation Time : Sat Mar  7 11:48:30 2009
 Raid Level : raid1
 Array Size : 483138688 (460.76 GiB 494.73 GB)
  Used Dev Size : 483138688 (460.76 GiB 494.73 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Fri Nov 29 21:23:25 2013
  State : clean, degraded
 Active Devices : 1
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 1

   UUID : 2e8294de:9b0d8d96:680a5413:2aac5c13
 Events : 0.72076

Number   Major   Minor   RaidDevice State
   0       8       51        0      active sync   /dev/sdd3
   2       0        0        2      removed

   2       8       35        -      spare   /dev/sdc3

While mdadm --examine /dev/sdc3 returns:

/dev/sdc3:
  Magic : a92b4efc
Version : 0.90.00
   UUID : 2e8294de:9b0d8d96:680a5413:2aac5c13
  Creation Time : Sat Mar  7 11:48:30 2009
 Raid Level : raid1
  Used Dev Size : 483138688 (460.76 GiB 494.73 GB)


 Array Size : 483138688 (460.76 GiB 494.73 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 1

Update Time : Fri Nov 29 23:03:41 2013
  State : clean
 Active Devices : 1
Working Devices : 2
 Failed Devices : 1
  Spare Devices : 1
   Checksum : be8bd27f - correct
 Events : 72078


  Number   Major   Minor   RaidDevice State
this   2       8       35        2      spare   /dev/sdc3

   0   0       8       51        0      active sync   /dev/sdd3
   1   1       0        0        1      faulty removed
   2   2       8       35        2      spare   /dev/sdc3


What is the problem? And how can I recover a correct md1 array?

Thank you.


-- 
François Patte
UFR de mathématiques et informatique
Laboratoire CNRS MAP5, UMR 8145
Université Paris Descartes
45, rue des Saints Pères
F-75270 Paris Cedex 06
Tél. +33 (0)1 8394 5849
http://www.math-info.univ-paris5.fr/~patte





Re: raid problem

2013-11-29 Thread Stan Hoeppner
On 11/29/2013 4:43 PM, François Patte wrote:
 Good evening,
 
 I have a problem with 2 raid arrays: I have 2 disks (sdc and sdd) in
 raid1 arrays.
 
 One disk (sdc) failed and I replaced it with a new one. I copied the
 partition table from the sdd disk using sfdisk:
 
 sfdisk -d /dev/sdd | sfdisk /dev/sdc
 
 then I added the 2 partitions (sdc1 and sdc3) to the arrays md0 and md1:
 
 mdadm --add /dev/md0 /dev/sdc1
 
 mdadm --add /dev/md1 /dev/sdc3
 
 There was no problem with the md0 array:
 
 
 cat /proc/mdstat gives:
 
 md0 : active raid1 sdc1[1] sdd1[0]
   1052160 blocks [2/2] [UU]
 
 
 But for the md1 array, I get:
 
 md1 : active raid1 sdc3[2](S) sdd3[0]
   483138688 blocks [2/1] [U_]
 
 
 And mdadm --detail /dev/md1 returns:
 
 /dev/md1:
 Version : 0.90
   Creation Time : Sat Mar  7 11:48:30 2009
  Raid Level : raid1
  Array Size : 483138688 (460.76 GiB 494.73 GB)
   Used Dev Size : 483138688 (460.76 GiB 494.73 GB)
Raid Devices : 2
   Total Devices : 2
 Preferred Minor : 1
 Persistence : Superblock is persistent
 
 Update Time : Fri Nov 29 21:23:25 2013
   State : clean, degraded
  Active Devices : 1
 Working Devices : 2
  Failed Devices : 0
   Spare Devices : 1
 
UUID : 2e8294de:9b0d8d96:680a5413:2aac5c13
  Events : 0.72076
 
 Number   Major   Minor   RaidDevice State
    0       8       51        0      active sync   /dev/sdd3
    2       0        0        2      removed
 
    2       8       35        -      spare   /dev/sdc3
 
 While mdadm --examine /dev/sdc3 returns:
 
 /dev/sdc3:
   Magic : a92b4efc
 Version : 0.90.00
UUID : 2e8294de:9b0d8d96:680a5413:2aac5c13
   Creation Time : Sat Mar  7 11:48:30 2009
  Raid Level : raid1
   Used Dev Size : 483138688 (460.76 GiB 494.73 GB)
 
 
  Array Size : 483138688 (460.76 GiB 494.73 GB)
Raid Devices : 2
   Total Devices : 2
 Preferred Minor : 1
 
 Update Time : Fri Nov 29 23:03:41 2013
   State : clean
  Active Devices : 1
 Working Devices : 2
  Failed Devices : 1
   Spare Devices : 1
Checksum : be8bd27f - correct
  Events : 72078
 
 
   Number   Major   Minor   RaidDevice State
 this   2       8       35        2      spare   /dev/sdc3
 
    0   0       8       51        0      active sync   /dev/sdd3
    1   1       0        0        1      faulty removed
    2   2       8       35        2      spare   /dev/sdc3
 
 
 What is the problem? And how can I recover a correct md1 array?

IIRC Linux md rebuilds multiple degraded arrays sequentially, not in
parallel.  This is due to system performance impact and other reasons.
When the rebuild of md0 is finished, the rebuild of md1/sdc3 should
start automatically.  If this did not occur please let us know and we'll
go from there.
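
While md0 is still rebuilding, /proc/mdstat would typically show the
second array as waiting its turn, something like this (illustrative
output only):

  md1 : active raid1 sdc3[2] sdd3[0]
        483138688 blocks [2/1] [U_]
          resync=DELAYED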

-- 
Stan





Re: Marvell SATA/Raid problem

2012-06-25 Thread Neal Murphy
On Saturday 23 June 2012 05:28:17 Camaleón wrote:
  Are there any known problems with Marvell SATA and Wheezy 64-bit?
 
 (...)
 
 None that I'm aware of :-?
 
 Anyway, something that was working fine in Squeeze is expected to be
 working in upcoming kernel versions. Unless you missed something,
 consider opening a bug report for a possible regression.

The Highpoint's problem persisted when I tried Squeeze 32-bit. So I think it's 
hardware. I'll ponder a bug report for that (low probability) as well as one 
for inotifywait (a separate thread).

Monday, I replaced the Highpoint with a 'Best Connectivity' SIL3132-based
2-port RAID board and it's just like the farmer of song: it's working happily in
the Dell. Boots and reboots with and without the drive in the dataport turned 
on, hot plug is working as expected, and my plug-n-play backup is nearly 
flawless (with a workaround for the inotifywait problem).

N





Marvell SATA/Raid problem

2012-06-23 Thread Neal Murphy
Wheezy 64-bit. Marvell PCIE SATA/Raid card with one drive (in a CRU DP10), 
non-RAID. Main board has two identical Hitachi 1TB drives on on-board SATA 
ports used in md RAID.

Running Squeeze 32-bit, it was handling hot-plugged drives just fine. Switched 
to Wheezy 64-bit and it no longer detects hot-plugged drives. In fact, it 
won't boot with a drive connected to the Marvell card (plugged into the DP10).

Are there any known problems with Marvell SATA and Wheezy 64-bit?

I have the system here; I'll re-try Wheezy 64-bit this weekend, and try Wheezy 
32-bit and Squeeze. (If Squeeze 32-bit doesn't work, it'll be a good clue, 
since it worked before.)





Re: Marvell SATA/Raid problem

2012-06-23 Thread Camaleón
On Sat, 23 Jun 2012 03:04:16 -0400, Neal Murphy wrote:

 Wheezy 64-bit. Marvell PCIE SATA/Raid card with one drive (in a CRU
 DP10), non-RAID. Main board has two identical Hitachi 1TB drives on
 on-board SATA ports used in md RAID.

So you have a total of 3 hard disks, one connected to the PCI-e card and 
the other two attached to the stock mainboard sata ports, right?

 Running Squeeze 32-bit, it was handling hot-plugged drives just fine.
 Switched to Wheezy 64-bit and it no longer detects hot-plugged drives.
 In fact, it won't boot with a drive connected to the Marvell card
 (plugged into the DP10).

The change of the architecture (32 → 64 bits) seems to indicate that you 
did a clean install, from scratch, right?

Questions:

- Did the installer even see any of the 3 drives?

- Output of lspci (so we can see the Marvell chipsets involved)

- Output of messages (when you cannot boot, kernel logs, dmesg...)

- What's your hdd layout? Where is the system installed, and what raid level
are you using? I guess it is a raid 1 but I prefer to ask. Give more details
about your system; you provided sparse data, and precision is a must for
this kind of problem :-)

 Are there any known problems with Marvell SATA and Wheezy 64-bit?

(...)

None that I'm aware of :-?

Anyway, something that was working fine in Squeeze is expected to be 
working in upcoming kernel versions. Unless you missed something, 
consider opening a bug report for a possible regression.
 
Greetings,

-- 
Camaleón





Re: Can't reboot after power failure (RAID problem?)

2011-02-02 Thread David Gaudine

On 11-01-31 8:47 PM, Andrew Reid wrote:


   The easy way out is to boot from a rescue disk, fix the mdadm.conf
file, rebuild the initramfs, and reboot.

   The Real Sysadmin way is to start the array by hand from inside
the initramfs.  You want mdadm -A /dev/md0 (or possibly
mdadm -A -u your-uuid) to start it, and once it's up, ctrl-d out
of the initramfs and hope.  The part I don't remember is whether or
not this creates the symlinks in /dev/disk that your root-fs-finder
is looking for.


All's well.  After the Real Sysadmin way got me into the system 
one-time-only, I could do the easy way which is more permanent without 
needing a rescue disk.  Thank you so much.


I have one more question, just out of curiosity, so bottom priority.
Why does this work?  mdadm.conf is in the initramfs which is in /boot
which is on /dev/md0, but /dev/md0 doesn't exist until the arrays are 
assembled, which requires mdadm.conf.


David





Re: Can't reboot after power failure (RAID problem?)

2011-02-02 Thread Boyd Stephen Smith Jr.
In 4d49816b.1040...@alcor.concordia.ca, David Gaudine wrote:
I have one more question, just out of curiosity, so bottom priority.
Why does this work?  mdadm.conf is in the initramfs which is in /boot
which is on /dev/md0, but /dev/md0 doesn't exist until the arrays are
assembled, which requires mdadm.conf.

Finding the initramfs on disk and copying it into RAM is not actually done by 
the kernel.  It is done by the boot loader, the same way the boot loader finds 
the kernel image on disk and copies it into RAM.

As such, it doesn't use kernel features to load the initramfs.  There are a 
number of techniques that boot loaders take to be able to do this magic.  
GRUB normally uses the gap between the partition table and the first partition 
to store enough modules to emulate the kernel's dm/md layer and one or more of 
the kernel's file system modules in order to do the loading.  If those modules 
are not available or not in sync with how the kernel handles things, GRUB 
could fail to read the kernel image or initramfs or it could think it read 
both and transfer control to a kernel that is just random data from the 
disk.
-- 
Boyd Stephen Smith Jr.   ,= ,-_-. =.
b...@iguanasuicide.net   ((_/)o o(\_))
ICQ: 514984 YM/AIM: DaTwinkDaddy `-'(. .)`-'
http://iguanasuicide.net/\_/




Re: Can't reboot after power failure (RAID problem?)

2011-02-01 Thread Pascal Hambourg
Hello,

dav...@alcor.concordia.ca wrote:
 My system went down because of a power failure, and now it won't start.  I
 use RAID 1, and I don't know if that's related to the problem.  The screen
 shows the following.
 
 Loading, please wait...
 Gave up waiting for root device.  Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
 ALERT! /dev/disk/by-uuid/47173345-34e3-4ab3-98b5-f39e80424191 does not exist.
 Dropping to a shell!
 
 I don't know if that uuid is an MD device, but it seems likely.  Grub is
 installed on each disk, and I previously tested the RAID 1 arrays by
 unplugging each disk one at a time and was able to boot to either.

The kernel and initramfs started, so grub did its job and does not seem
to be the problem.

Are the disks, partitions and RAID devices present in /proc/partitions
and /dev/ ? What does /proc/mdstat contain ? Any related messages in dmesg ?





Re: Can't reboot after power failure (RAID problem?)

2011-02-01 Thread David Gaudine

On 11-01-31 8:47 PM, Andrew Reid wrote:

On Monday 31 January 2011 10:51:04 dav...@alcor.concordia.ca wrote:

I posted in a panic and left out a lot of details.  I'm using Squeeze, and
set up the system about a month ago, so there have been some upgrades.  I
wonder if maybe the kernel or Grub was upgraded and I neglected to install
Grub again, but I would expect it to automatically be reinstalled on at
least the first disk.  If I remove either disk I get the same error
message.

I did look at /proc/cmdline.  It shows the same uuid for the root device
as in the menu, so that seems to prove it's an MD device that isn't ready
since my boot and root partitions are each on MD devices.  /proc/modules
does show md_mod.

   What about the actual device?  Does /dev/md/0 (or /dev/md0, or whatever)
exist?

   If the module is loaded but the device does not exist, then it's possible
there's a problem with your mdadm.conf file, and the initramfs doesn't
have the array info in it, so it wasn't started.

   The easy way out is to boot from a rescue disk, fix the mdadm.conf
file, rebuild the initramfs, and reboot.

   The Real Sysadmin way is to start the array by hand from inside
the initramfs.  You want mdadm -A /dev/md0 (or possibly
mdadm -A -u your-uuid) to start it, and once it's up, ctrl-d out
of the initramfs and hope.  The part I don't remember is whether or
not this creates the symlinks in /dev/disk that your root-fs-finder
is looking for.

   It may be better to boot with break=premount to get into the
initramfs in a more controlled state, instead of trying to fix it
in the already-error-ed state, assuming you try the initramfs
thing at all.

   And further assuming that the mdadm.conf file is the problem,
which was pretty much guesswork on my part...

-- A.


I found the problem.  You're right, mdadm.conf was the problem, which is 
amazing considering that I had previously restarted without changing 
mdadm.conf.  I edited it in the initramfs, then did mdadm -A /dev/md0 
as you suggested and control-d worked.  I assume I'll still have to 
rebuild the initramfs; I might need handholding, but I'll google first.


I think what went wrong might interest some people, since it answers a 
question I previously raised under the subject

RAID1 with multiple partitions
There was no consensus, so I made the wrong choice.

The cause of the problem is, I set up my system under a temporary 
hostname and then changed the hostname.  The hostname appeared at the 
end of each ARRAY line in mdadm.conf, and I didn't know whether I should 
change it there because I didn't know whether it has to match the
current hostname in the current /etc/host, has to match the current 
hostname, or is just a meaningless label.  I changed it to the new 
hostname at the same time that I changed the hostname, then shut down 
and restarted.  It booted fine.  I did the same thing on another 
computer, and I'm sure I restarted that one successfully several times.  
So, I foolishly thought I was safe.  After the power failure it wouldn't 
boot.  After following your advice I was sufficiently inspired to edit 
mdadm.conf back to the original hostname, mount my various md's, and 
control-d.  I assume I'll have to do that every time I boot until I 
rebuild the initramfs.


Thank you very much.  I'd already recovered everything from a backup, 
but I needed to find the solution or I'd be afraid to use RAID in the future.


David





Re: Can't reboot after power failure (RAID problem?)

2011-02-01 Thread Tom H
On Tue, Feb 1, 2011 at 10:38 AM, David Gaudine
dav...@alcor.concordia.ca wrote:
 On 11-01-31 8:47 PM, Andrew Reid wrote:
 On Monday 31 January 2011 10:51:04 dav...@alcor.concordia.ca wrote:

 I posted in a panic and left out a lot of details.  I'm using Squeeze,
 and
 set up the system about a month ago, so there have been some upgrades.  I
 wonder if maybe the kernel or Grub was upgraded and I neglected to
 install
 Grub again, but I would expect it to automatically be reinstalled on at
 least the first disk.  If I remove either disk I get the same error
 message.

 I did look at /proc/cmdline.  It shows the same uuid for the root device
 as in the menu, so that seems to prove it's an MD device that isn't ready
 since my boot and root partitions are each on MD devices.  /proc/modules
 does show md_mod.

   What about the actual device?  Does /dev/md/0 (or /dev/md0, or whatever)
 exist?

   If the module is loaded but the device does not exist, then it's
 possible
 there's a problem with your mdadm.conf file, and the initramfs doesn't
 have the array info in it, so it wasn't started.

   The easy way out is to boot from a rescue disk, fix the mdadm.conf
 file, rebuild the initramfs, and reboot.

   The Real Sysadmin way is to start the array by hand from inside
 the initramfs.  You want mdadm -A /dev/md0 (or possibly
 mdadm -A -u your-uuid) to start it, and once it's up, ctrl-d out
 of the initramfs and hope.  The part I don't remember is whether or
 not this creates the symlinks in /dev/disk that your root-fs-finder
 is looking for.

   It may be better to boot with break=premount to get into the
 initramfs in a more controlled state, instead of trying to fix it
 in the already-error-ed state, assuming you try the initramfs
 thing at all.

   And further assuming that the mdadm.conf file is the problem,
 which was pretty much guesswork on my part...

                                        -- A.

 I found the problem.  You're right, mdadm.conf was the problem, which is
 amazing considering that I had previously restarted without changing
 mdadm.conf.  I edited it in the initramfs, then did mdadm -A /dev/md0 as
 you suggested and control-d worked.  I assume I'll still have to rebuild the
 initramfs; I might need handholding, but I'll google first.

 I think what went wrong might interest some people, since it answers a
 question I previously raised under the subject
 RAID1 with multiple partitions
 There was no consensus, so I made the wrong choice.

 The cause of the problem is, I set up my system under a temporary hostname
 and then changed the hostname.  The hostname appeared at the end of each
 ARRAY line in mdadm.conf, and I didn't know whether I should change it there
 because I didn't know whether it has to match the current hostname in the
 current /etc/host, has to match the current hostname, or is just a
 meaningless label.  I changed it to the new hostname at the same time that I
 changed the hostname, then shut down and restarted.  It booted fine.  I did
 the same thing on another computer, and I'm sure I restarted that one
 successfully several times.  So, I foolishly thought I was safe.  After the
 power failure it wouldn't boot.  After following your advice I was
 sufficiently inspired to edit mdadm.conf back to the original hostname,
 mount my various md's, and control-d.  I assume I'll have to do that every
 time I boot until I rebuild the initramfs.

 Thank you very much.  I'd already recovered everything from a backup, but I
 needed to find the solution or I'd be afraid to use RAID in the future.

If you'd like to have homehost in mdadm.conf be the same as the
hostname, you could break your boot in initramfs and assemble the
array with
mdadm --assemble /dev/mdX --homehost=whatever --update=homehost /dev/sdXX.





Re: Can't reboot after power failure (RAID problem?)

2011-01-31 Thread davidg
I posted in a panic and left out a lot of details.  I'm using Squeeze, and
set up the system about a month ago, so there have been some upgrades.  I
wonder if maybe the kernel or Grub was upgraded and I neglected to install
Grub again, but I would expect it to automatically be reinstalled on at
least the first disk.  If I remove either disk I get the same error
message.

I did look at /proc/cmdline.  It shows the same uuid for the root device
as in the menu, so that seems to prove it's an MD device that isn't ready
since my boot and root partitions are each on MD devices.  /proc/modules
does show md_mod.

David

 Original Message 
Subject: Can't reboot after power failure (RAID problem?)
From:dav...@alcor.concordia.ca
Date:Mon, January 31, 2011 10:18 am
To:  debian-user@lists.debian.org
--

My system went down because of a power failure, and now it won't start.  I
use RAID 1, and I don't know if that's related to the problem.  The screen
shows the following.

Loading, please wait...
Gave up waiting for root device.  Common problems:
- Boot args (cat /proc/cmdline)
  - Check rootdelay= (did the system wait long enough?)
  - Check root= (did the system wait for the right device?)
- Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/disk/by-uuid/47173345-34e3-4ab3-98b5-f39e80424191 does not exist.
Dropping to a shell!

I don't know if that uuid is an MD device, but it seems likely.  Grub is
installed on each disk, and I previously tested the RAID 1 arrays by
unplugging each disk one at a time and was able to boot to either.

Ideas?

David







Can't reboot after power failure (RAID problem?)

2011-01-31 Thread davidg
My system went down because of a power failure, and now it won't start.  I
use RAID 1, and I don't know if that's related to the problem.  The screen
shows the following.

Loading, please wait...
Gave up waiting for root device.  Common problems:
- Boot args (cat /proc/cmdline)
  - Check rootdelay= (did the system wait long enough?)
  - Check root= (did the system wait for the right device?)
- Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/disk/by-uuid/47173345-34e3-4ab3-98b5-f39e80424191 does not exist.
Dropping to a shell!

I don't know if that uuid is an MD device, but it seems likely.  Grub is
installed on each disk, and I previously tested the RAID 1 arrays by
unplugging each disk one at a time and was able to boot to either.

Ideas?

David






Re: Can't reboot after power failure (RAID problem?)

2011-01-31 Thread Andrew Reid
On Monday 31 January 2011 10:51:04 dav...@alcor.concordia.ca wrote:
 I posted in a panic and left out a lot of details.  I'm using Squeeze, and
 set up the system about a month ago, so there have been some upgrades.  I
 wonder if maybe the kernel or Grub was upgraded and I neglected to install
 Grub again, but I would expect it to automatically be reinstalled on at
 least the first disk.  If I remove either disk I get the same error
 message.

 I did look at /proc/cmdline.  It shows the same uuid for the root device
 as in the menu, so that seems to prove it's an MD device that isn't ready
 since my boot and root partitions are each on MD devices.  /proc/modules
 does show md_mod.

  What about the actual device?  Does /dev/md/0 (or /dev/md0, or whatever)
exist?  

  If the module is loaded but the device does not exist, then it's possible
there's a problem with your mdadm.conf file, and the initramfs doesn't
have the array info in it, so it wasn't started.

  The easy way out is to boot from a rescue disk, fix the mdadm.conf
file, rebuild the initramfs, and reboot.

  The Real Sysadmin way is to start the array by hand from inside
the initramfs.  You want mdadm -A /dev/md0 (or possibly
mdadm -A -u your-uuid) to start it, and once it's up, ctrl-d out
of the initramfs and hope.  The part I don't remember is whether or
not this creates the symlinks in /dev/disk that your root-fs-finder
is looking for.

  It may be better to boot with break=premount to get into the 
initramfs in a more controlled state, instead of trying to fix it 
in the already-error-ed state, assuming you try the initramfs 
thing at all.

  And further assuming that the mdadm.conf file is the problem,
which was pretty much guesswork on my part...
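
  On Debian, the easy way out usually boils down to something like the
following from the rescue environment (a sketch; it assumes root on
/dev/md1 and /boot on /dev/md0, so adjust to your layout):

  mount /dev/md1 /mnt
  mount /dev/md0 /mnt/boot
  mount -t proc proc /mnt/proc
  chroot /mnt
  mdadm --detail --scan >> /etc/mdadm/mdadm.conf   # then remove the stale lines
  update-initramfs -u -k all
  exit                                             # leave the chroot and reboot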

-- A.  
-- 
Andrew Reid / rei...@bellatlantic.net





Debian Etch - LSI 8704ELP hardware raid problem.

2008-04-17 Thread Jarek Jarzebowski
Hi all,

I have installed Etch on an LSI 8704ELP HW RAID controller. The installation
process went just fine, but after the reboot the boot process froze.

If you have any suggestions, I will be glad to hear them.

Regards,
Jarek





Re: Raid Problem

2006-10-18 Thread Daniel Haensse
After reinstalling Sarge I still get the error
mdadm --examine /dev/md0
mdadm: No super block found on /dev/md0 (Expected magic a92b4efc, got )

but the two disks still come up as an array. Strange!





Raid Problem

2006-10-16 Thread Daniel Haensse
Hello list,

I still have the problem that my RAID1 is started with only one drive
after a system restart.

The following was suggested on the list, but it does not help:
mdadm --zero-superblock /dev/hda1
mdadm --zero-superblock /dev/hdc1
dpkg-reconfigure mdadm

Can someone explain to me what this superblock actually is? Even dd does
not seem to be able to solve the problem when I mirror the two disks.

Regards, Dani

fd-fhk-03657:~# mdadm --misc --detail /dev/md0
/dev/md0:
Version : 00.90.01
  Creation Time : Tue Oct  3 13:45:01 2006
 Raid Level : raid1
 Array Size : 242187776 (230.97 GiB 248.00 GB)
Device Size : 242187776 (230.97 GiB 248.00 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
Persistence : Superblock is persistent

Update Time : Mon Oct 16 18:13:53 2006
  State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

   UUID : 876ebbcd:22d4e45a:885f088a:e77f9967
 Events : 0.31136

Number   Major   Minor   RaidDevice State
   0       3        1        0      active sync   /dev/hda1
   1      22        1        1      active sync   /dev/hdc1

fd-fhk-03657:~# mdadm --examine /dev/md0
mdadm: No super block found on /dev/md0 (Expected magic a92b4efc, got )

fd-fhk-03657:~# dpkg-reconfigure mdadm
Starting raid devices: done.
Starting RAID monitor daemon: mdadm -F.
fd-fhk-03657:~# mdadm --examine /dev/md0
mdadm: No super block found on /dev/md0 (Expected magic a92b4efc, got )

fd-fhk-03657:~# fsck /dev/md0

fd-fhk-03657:~# mdadm --examine /dev/md0
mdadm: No super block found on /dev/md0 (Expected magic a92b4efc, got )
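
(For what it's worth: with 0.90 metadata the md superblock lives on the
member partitions, not on the assembled device, so mdadm --examine
/dev/md0 is expected to find nothing. Examining the members should show
it, e.g.:

  mdadm --examine /dev/hda1
  mdadm --examine /dev/hdc1

mdadm --detail /dev/md0, as used above, is the right query for the
assembled array.)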





RAID problem: data lost on RAID 6...

2006-04-18 Thread James Sutherland
Hi,

My Debian 3.1 (x86_64) system has suffered a very nasty mishap.

First, the IOMMU code ran out of space to map I/O to the SATA drives.
This looked to the md code like a faulty drive - so one by one, drives
were marked 'failed', until three components had failed out of the
seven-drive array, at which point it no longer functioned.

After rebooting, I got five drives back into the array - enough for it
to 'run' and be fscked. Almost recovered!

Then, a genuine drive failure - lots of entries like this in syslog:

***
end_request: I/O error, dev sde, sector 4057289
Buffer I/O error on device sde2, logical block 962111
ATA: abnormal status 0xD8 on port 0xC2010287
ATA: abnormal status 0xD8 on port 0xC2010287
ATA: abnormal status 0xD8 on port 0xC2010287
ata7: command 0x25 timeout, stat 0xd8 host_stat 0x1
ata7: translated ATA stat/err 0xd8/00 to SCSI SK/ASC/ASCQ 0xb/47/00
ata7: status=0xd8 { Busy }
sd 6:0:0:0: SCSI error: return code = 0x802
sde: Current: sense key=0xb
ASC=0x47 ASCQ=0x0
***

Of course, with two drives already (wrongly) marked 'failed', there's
nothing to rebuild with...

Is there a way I can 'unfail' another of the two drives and rebuild
from that? Trying to 'assemble' the array just results in the other two
being marked as spare components, then I'm told that 4 drives and 1
spare isn't enough to start a 7 drive array.
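
(One avenue that is often suggested in this situation - a sketch only,
and worth running past the linux-raid list before touching anything - is
a forced assembly from the members with the highest event counts:

  mdadm --stop /dev/md0
  mdadm --assemble --force /dev/md0 /dev/sd[a-g]1

where /dev/md0 and /dev/sd[a-g]1 are placeholders for the actual array
and member partitions. --force tells mdadm to accept members whose event
counters are slightly stale instead of demoting them to spares; whether
that is safe depends on how much was written after they were kicked out.)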


James.





Re: SW Raid Problem

2005-12-23 Thread Torsten Geile

Klaus Zerwes wrote:

 


:-) you can't be serious
http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html



As I understand it, this HOWTO is based on the raidtools, which one
should apparently no longer use. There, RAID 1 is described with the
configuration in /etc/raidtab as follows:


Set up the /etc/raidtab file like this:

raiddev /dev/md0
   raid-level  1
   nr-raid-disks   2
   nr-spare-disks  0
   persistent-superblock 1
   device  /dev/sdb6
   raid-disk   0
   device  /dev/sdc5
   raid-disk   1

Since I want to work with mdadm, this variant seems unusable to me. But
since, according to man mdadm,


mdadm does not use /etc/raidtab, the raidtools configuration  file,  at
  all.  It has a different configuration file with a different format and
  a different purpose.


/etc/raidtab is not needed, I cannot do all that much with the HOWTO. I
did find other pages, but they are only ever partly helpful.

Regards

Torsten









Re: SW Raid Problem

2005-12-23 Thread Ralf Schmidt
Hello Torsten,

On Fri, 23 Dec 2005 19:08:25 +0100, Torsten Geile wrote:

 
 Since I want to work with mdadm, this variant seems unusable to me.
 But since, according to man mdadm,

I felt the same way.
 
 /etc/raidtab is not needed, I cannot do all that much with the HOWTO.
 I did find other pages, but they are only ever partly helpful.


Maybe the HOWTO at
http://ralf-schmidt.de/privat/computer/raid1-lvm.htm will help.


HTH

 ciao

   Ralf Schmidt





Re: SW Raid Problem

2005-12-21 Thread Klaus Zerwes

Torsten Geile wrote:

Hi,

On Mon, 19 Dec 2005 12:00:47 +0100, Klaus Zerwes wrote:

You have a fundamental problem there: you have not understood
software RAID.
You create file systems on a device (md0) and not on the
individual components of the device.



Actually, that much was clear to me, but the same error occurred even before
creating the file system, which is why I tried it that way. What was not
clear to me, however, is that you first have to set up a raid with the
missing device and then copy the data over later.


Example:
 - your raid is to be level 1 and contain the partitions sda1 and sdb1.

 - at the moment sda1 is formatted as a normal partition with ext2.
 - if you simply create a raid over sda1 and sdb1 in the current state
and then put a filesystem on it, your data is gone!
 - if you have enough storage space to park your data somewhere, proceed
as follows:

- copy the data
- unmount sda1
- create the raid1 with both devices
- put a filesystem on it
- mount the raid
- copy the data back
 - if not, you have to use the missing trick
- create the raid md0 with sda1 as missing and run mkfs
- mount md0 and copy the data onto it
- unmount sda1 and add it to the raid = sync
 - there is actually a way to convert a non-raid partition to raid in place -
but I have never used it, so I will stay out of that topic.


in all cases, if you want autodetection, you have to set the partition
type to 0xFD and use a persistent superblock


If you are nice to Google and feed it the right terms, you end up on
the right page - the Software Raid HowTo ;-)


I had already fed it, but the docs were all not that in-depth.


:-) you can't be serious
http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html


[...]


A backup BEFORE you start goes without saying


of course..

Regards

Torsten





--
Klaus Zerwes
http://www.zero-sys.net





Re: SW Raid Problem

2005-12-20 Thread Torsten Geile
Hi,

On Mon, 19 Dec 2005 12:00:47 +0100, Klaus Zerwes wrote:


 You have a fundamental problem there: you have not understood
 software RAID.
 You create file systems on a device (md0) and not on the
 individual components of the device.

Actually, that much was clear to me, but the same error occurred even before
creating the file system, which is why I tried it that way. What was not
clear to me, however, is that you first have to set up a raid with the
missing device and then copy the data over later.
 
 If you are nice to Google and feed it the right terms, you end up on
 the right page - the Software Raid HowTo ;-)

I had already fed it, but the docs were all not that in-depth.


 Retrofitting it goes roughly like this:
   - set up a raid with a missing device on the new disk
   - transfer the data from the old disk
   - test whether the raid device is initialised at boot
   - (possibly build an initrd / rebuild the kernel ...)
   - add the old disk to the raid
 
 A backup BEFORE you start goes without saying

of course..

Regards

Torsten





SW Raid Problem

2005-12-19 Thread Torsten Geile
Hello,

I have added an identical SATA HDD to the system after the fact and now
want it to work at RAID1 level. To that end I replicated sda's partitioning
onto sdb block for block with fdisk. I left out the swap partition.

With

 mdadm --create --verbose /dev/md0 --level=1 --raid-device=2 /dev/sda8
/dev/sdb7

I get the error message

mdadm: ADD_NEW_DISK for /dev/sda8 failed: Device or resource busy

although sda8 (the /home directory) is not mounted. sdb7 was created with
mkfs.ext3.

How exactly do I set up the SW raid1 now?

Regards

Torsten





Re: SW Raid Problem

2005-12-19 Thread Klaus Zerwes

Torsten Geile wrote:

Hello,

I have added an identical SATA HDD to the system after the fact and now
want it to work at RAID1 level. To that end I replicated sda's partitioning
onto sdb block for block with fdisk. I left out the swap partition.

With


 mdadm --create --verbose /dev/md0 --level=1 --raid-device=2 /dev/sda8
/dev/sdb7

I get the error message

mdadm: ADD_NEW_DISK for /dev/sda8 failed: Device or resource busy

although sda8 (the /home directory) is not mounted. sdb7 was created with
mkfs.ext3.


You have a fundamental problem there: you have not understood
software RAID.
You create file systems on a device (md0) and not on the
individual components of the device.


If you are nice to Google and feed it the right terms, you end up on
the right page - the Software Raid HowTo ;-)

You should read it thoroughly.
And the documentation for mdadm


How exactly do I set up the SW raid1 now?


Retrofitting it goes roughly like this (see the sketch below):
 - set up a raid with a missing device on the new disk
 - transfer the data from the old disk
 - test whether the raid device is initialised at boot
 - (possibly build an initrd / rebuild the kernel ...)
 - add the old disk to the raid

A backup BEFORE you start goes without saying
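
In mdadm terms the missing-device route is roughly this (a sketch only,
using the partitions from the original mail and assuming the data on
sda8 is reachable under /mnt/old):

  mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb7
  mkfs.ext3 /dev/md0
  mount /dev/md0 /mnt/new
  cp -a /mnt/old/. /mnt/new/       # copy the data onto the degraded array
  umount /mnt/old /mnt/new
  mdadm /dev/md0 --add /dev/sda8   # the sync onto the old partition starts here

plus setting the partition type of both members to fd with fdisk if you
want autodetection, as mentioned above.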



Regards

Torsten





--
Klaus Zerwes
http://www.zero-sys.net





Software RAID problem - disks names change in case one fails

2005-09-09 Thread initiators
Hello,

I'm testing a server before I put it in production, and I've got a problem with
mdraid.

The config:
- Dell PowerEdge 800
- 4 x 250 Go SATA attached to the mobo
- /boot 4 x 1 GB (1 GB available)in RAID1, 3 active + 1 spare
- / 4 x 250 GB (500 GB available) in RAID5, 3 active + 1 spare
No problems at install, and the server runs OK.


Then I stop the server and remove /dev/sdb to simulate a hard disk failure that
has caused a crash and a reboot.
With the second disk removed the disk names change: the 3rd disk /dev/sdc
becomes /dev/sdb and the 4th disk (that was the spare disk) /dev/sdd becomes
/dev/sdc.
During the boot process md detects that there is a problem, but then complains
it can't find the /dev/sdd spare disk and the boot process stops with a kernel
panic error.
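
One thing that usually helps with shifting device names is to let md
identify members by their superblock UUID rather than by /dev/sdX name,
e.g. an /etc/mdadm/mdadm.conf along these lines (a sketch; the UUIDs
come from mdadm --detail, and the xxxx/yyyy values are placeholders):

  DEVICE partitions
  ARRAY /dev/md0 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx
  ARRAY /dev/md1 UUID=yyyyyyyy:yyyyyyyy:yyyyyyyy:yyyyyyyy

With DEVICE partitions mdadm scans whatever partitions exist at boot, so
a renumbered disk is still found by its UUID.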





Re: Software RAID problem - disks names change in case one fails

2005-09-09 Thread Alvin Oga


On Fri, 9 Sep 2005 [EMAIL PROTECTED] wrote:

 I'm testing a server before I put it in production, and I've got a problem 
 with
 mdraid.
 
 The config:
 - Dell PowerEdge 800
 - 4 x 250 GB SATA attached to the mobo
 - /boot 4 x 1 GB (1 GB available) in RAID1, 3 active + 1 spare
 - / 4 x 250 GB (500 GB available) in RAID5, 3 active + 1 spare
 No problems at install, and the server runs OK.
 
 
 Then I stop the server and remove /dev/sdb to simulate a hard disk failure 
 that
 has caused a crash and a reboot.

purrfect test  and do the same for each disk 

it is pointless to have 1 spare disk in the raid array

- have you ever wondered about other folks that try to build a sata-based
  raid subsystem ??
- how did their sata pass the failed disk test if it 
reassigns its drive numbers upon reboot

 With the second disk removed the disk names change,

exactly that is the problem with scsi
- pull the power cord from it to simulate the disk failure
or pull the sata cable ...

- in either case, if the disk drives rename itself, based on
who's alive, raid won't work to boot after the failed disk
( but it will stay running until its booted )

 the 3rd disk /dev/sdc
 becomes /dev/sdb and the 4th disk (that was the spare disk) /dev/sdd becomes
 /dev/sdc.

that's always been true of scsi

 During the boot process md detects that there is a problem, but then complains
 it can't find the /dev/sdd spare disk and the boot process stops with a kernel
 panic error.

exactly

c ya
alvin





Re: Software RAID problem - disks names change in case one fails

2005-09-09 Thread C Shore
 - have you ever wondered about other folks that try to build a sata-based
   raid subsystem ??
   - how did their sata pass the failed disk test if it 
   reassigns its drive numbers upon reboot
 
  With the second disk removed the disks names are changed,
 
 exactly that is the problem with scsi
   - pull the power cord from it to simulate the disk failure
   or pull the sata cable ...
 
   - in either case, if the disk drives rename itself, based on
   who's alive, raid won't work to boot after the failed disk
   ( but it will stay running until its booted )
 
  the 3rd disk /dev/sdc
  becomes /dev/sdb and the 4th disk (that was the spare disk) /dev/sdd becomes
  /dev/sdc.
 
 that's always been true of scsi

That's why devfs used names that wouldn't change, e.g. 
/dev/scsi/bus0/host0/target0/lun0/part1

Of course that's been deprecated in the newer 2.6 kernels and I don't 
know udev so I don't know if it has a similarly helpful naming 
scheme...anyone?




Raid 10, mdadm, mdadm-raid problem

2005-01-04 Thread Roger Ellison
I've been having an enjoyable time tinkering with software raid with
Sarge and the RC2 installer.  The system boots fine with Raid 1 for
/boot and Raid 5 for /.  I decided to experiment with Raid 10 for /opt
since there's nothing there to destroy :).  Using mdadm to create a Raid
0 array from two Raid 1 arrays was simple enough, but getting the Raid
10 array activated at boot isn't working well.  I used update-rc.d to
add the symlinks to mdadm-raid using the defaults, but the Raid 10 array
isn't assembled at boot time.  After getting kicked to a root shell, if
I check /proc/mdstat only md1 (/) is started.  After running mdadm-raid
start, md0 (/boot), md2, and md3 start.  If I run mdadm-raid start again
md4 (/opt) starts.  Fsck'ing the newly assembled arrays before
successfully issuing 'mount -a' shows no filesystem errors.  I'm at a
loss and haven't found any similar issue mentions on this list or the
linux-raid list.  Here's mdadm.conf:

DEVICE partitions
DEVICE /dev/md*
ARRAY /dev/md4 level=raid0 num-devices=2
UUID=bf3456d3:2af15cc9:18d816bf:d630c183
   devices=/dev/md2,/dev/md3
ARRAY /dev/md3 level=raid1 num-devices=2
UUID=a51da14e:41eb27ad:b6eefb94:21fcdc95
   devices=/dev/sdb5,/dev/sde5
ARRAY /dev/md2 level=raid1 num-devices=2
UUID=ac25a75b:3437d397:c00f83a3:71ea45de
   devices=/dev/sda5,/dev/sdc5
ARRAY /dev/md1 level=raid5 num-devices=4 spares=1
UUID=efec4ae2:1e74d648:85582946:feb98f0c
   devices=/dev/sda3,/dev/sdb3,/dev/sdc3,/dev/sde3,/dev/sdd3
ARRAY /dev/md0 level=raid1 num-devices=4 spares=1
UUID=04209b62:6e46b584:06ec149f:97128bfb
   devices=/dev/sda1,/dev/sdb1,/dev/sdc1,/dev/sde1,/dev/sdd1


Roger
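
(One guess, offered without much confidence: a single assembly pass
cannot start /dev/md4 before its members /dev/md2 and /dev/md3 exist,
which would explain why a second 'mdadm-raid start' finishes the job.
Listing the member arrays before the stacked one in mdadm.conf, i.e.

  ARRAY /dev/md2 level=raid1 num-devices=2 UUID=ac25a75b:3437d397:c00f83a3:71ea45de
  ARRAY /dev/md3 level=raid1 num-devices=2 UUID=a51da14e:41eb27ad:b6eefb94:21fcdc95
  ARRAY /dev/md4 level=raid0 num-devices=2 UUID=bf3456d3:2af15cc9:18d816bf:d630c183 devices=/dev/md2,/dev/md3

or simply running a second assembly pass late in the boot sequence, may
be worth trying.)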





RAID problem

2002-10-03 Thread Jacek Krzyzanowski
Hello :)

I have never dealt with RAID before, and now I have to install the
system on a machine with raid5.
Somehow Debian does not see such a device after starting...
What should I do about it? Prepare a boot floppy on another machine?

The computer is a Dell PowerEdge 2500
-- 

KRZYZAK LC4 640 Czerwony Grejfrut :)
Skierniewicehttp://www.krzyzak.motocykle.org




Re: RAID problem

2002-10-03 Thread Robert Pyciarz
On Thu, Oct 03, 2002 at 10:47:38AM +0200, Jacek Krzyzanowski wrote:
 Hello :)
 
 I have never dealt with RAID before, and now I have to install the
 system on a machine with raid5.
 Somehow Debian does not see such a device after starting...
 What should I do about it? Prepare a boot floppy on another machine?
 
 The computer is a Dell PowerEdge 2500

You need to have support for the raid controllers compiled in - the ones
fitted in Dells are usually handled by the megaraid or aacraid modules.

If they are not in the installation kernel (IIRC the first one is, even
in potato), the simplest thing will be to attach one disk to a plain
scsi controller on the board and install the system on it, then compile
a suitable kernel and move everything over to the raid.

regards,
rp.

-- 
I WAS AT A PARTY, he added, a shade reproachfully.
-- Death is summoned by the Wizards
   (Terry Pratchett, The Light Fantastic)



Re: RAID problem

2002-10-03 Thread Cezary Bogner
If it is potato, try running the installer from CD2.
There is a kernel on it that includes (I think all) the raid drivers.
If it is woody, you have to check which disc is also bootable
and try from that one.
When I installed potato, that is exactly what I did.
regards
CB
- Original Message -
From: Robert Pyciarz [EMAIL PROTECTED]
To: debian-user-list debian-user-polish@lists.debian.org
Sent: Thursday, October 03, 2002 11:42 AM
Subject: Re: RAID problem


 On Thu, Oct 03, 2002 at 10:47:38AM +0200, Jacek Krzyzanowski wrote:
  Hello :)
 
  I have never dealt with RAID before, and now I have to install the
  system on a machine with raid5.
  Somehow Debian does not see such a device after starting...
  What should I do about it? Prepare a boot floppy on another machine?
 
  The computer is a Dell PowerEdge 2500

 You need to have support for the raid controllers compiled in - the ones
 fitted in Dells are usually handled by the megaraid or aacraid modules.

 If they are not in the installation kernel (IIRC the first one is, even
 in potato), the simplest thing will be to attach one disk to a plain
 scsi controller on the board and install the system on it, then compile
 a suitable kernel and move everything over to the raid.

 regards,
 rp.

 --
 I WAS AT A PARTY, he added, a shade reproachfully.
 -- Death is summoned by the Wizards
(Terry Pratchett, The Light Fantastic)







Re: RAID problem

2002-10-03 Thread Marcin Juszkiewicz
On Thu, Oct 03, 2002 at 11:42:50AM +0200, Robert Pyciarz wrote:
 On Thu, Oct 03, 2002 at 10:47:38AM +0200, Jacek Krzyzanowski wrote:

  I have never dealt with RAID before, and now I have to install the
  system on a machine with raid5.
  Somehow Debian does not see such a device after starting...
  What should I do about it? Prepare a boot floppy on another machine?
  
  The computer is a Dell PowerEdge 2500
 
 You need to have support for the raid controllers compiled in - the ones
 fitted in Dells are usually handled by the megaraid or aacraid modules.
 
 If they are not in the installation kernel (IIRC the first one is, even
 in potato), the simplest thing will be to attach one disk to a plain
 scsi controller on the board and install the system on it, then compile
 a suitable kernel and move everything over to the raid.

  You can also compile your own installation kernel (its config is on the
disc in discs/) with raid support and start the installation either via
loadlin (if you have a vfat partition and DOS) or from floppies (one is a
suitably prepared root.bin with the kernel swapped in, the other is
rescue.bin).

  Although, to tell the truth, I am not sure whether the module selection
step of the installation does not in fact offer raid modules - after
loading them the array should normally be accessible.

-- 
Marcin 'Szczepan|Hrw' Juszkiewicz
mailto: marcinatamigadotpl
my Debian packages: deb http://users.stone.pl/szczepan/ apt/



Re: RAID problem

2002-10-03 Thread Grzegorz Kusnierz
On Thu, Oct 03, 2002 at 10:47:38AM +0200, Jacek Krzyzanowski wrote:
 Hello :)
 
 I have never dealt with RAID before, and now I have to install the
 system on a machine with raid5.
 Somehow Debian does not see such a device after starting...
 What should I do about it? Prepare a boot floppy on another machine?

Is this some kind of hardware array, or Linux MD? If MD, then for the kernel
to detect the array at boot it is enough to use fdisk to change the type of
the partitions in the array to 'fd' (hex), i.e. 'linux raid autodetect'.
In general I recommend http://www.tldp.org/HOWTO/Software-RAID-HOWTO.html.

konik