Re: [gentoo-user] recovering RAID from an old server

2010-02-21 Thread Andrea Conti
 Agreed, however Iain also said that he tried to mount individual
 partitions and this failed. This should work with RAID1

Only if you force the filesystem type (i.e. mount -t xxx, or use
mount.xxx directly).

However, while I know this works with ext2/ext3/ext4, I have no idea if
xfs is also smart enough to ignore the raid superblock and mount the
filesystem anyway in this case.

andrea



Re: [gentoo-user] recovering RAID from an old server

2010-02-21 Thread Iain Buchanan
On Sat, 2010-02-20 at 16:22 +0100, Andrea Conti wrote:

 AFAICT this is all you need to know -- you definitely have two software
 (mdraid) RAID 1 volumes:
 
 md100 with hda2, hde2 and hdg2 as members
 md101 with hda5, hde5 and hdg5 as members
 
 Both arrays seem to have lost a member (I guess hdc2 and hdc5 respectively).
 
 Honestly I don't know what is the point of running RAID1 volumes with
 four mirrors, but that seems to be the way it was configured.

strange, I'm pretty sure I didn't configure it like this - however it
has an inbuilt snapshot feature so maybe that's what the mirrors are
for...

I'm having some luck chasing up the original CDs so I think I'll try
that first.

thanks :)
-- 
Iain Buchanan iaindb at netspace dot net dot au

If it smells it's chemistry, if it crawls it's biology, if it doesn't work
it's physics.




Re: [gentoo-user] recovering RAID from an old server

2010-02-21 Thread Neil Bothwick
On Sun, 21 Feb 2010 09:10:41 +0100, Andrea Conti wrote:

  Agreed, however Iain also said that he tried to mount individual
  partitions and this failed. This should work with RAID1  
 
 Only if you force the filesystem type (i.e. mount -t xxx, or use
 mount.xxx directly).

Not in my experience.

 However, while I know this works with ext2/ext3/ext4, I have no idea if
 xfs is also smart enough to ignore the raid superblock and mount the
 filesystem anyway in this case.

The RAID superblock is at the end of the filesystem, to avoid any
conflicts with the filesystem superblock.
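
As a rough check of where the superblock sits, the arithmetic below assumes the 0.90 metadata layout, in which the superblock occupies the last 64 KiB-aligned 64 KiB block of the member device. The function name `sb_offset_090` is made up for illustration; sizes are in 1 KiB blocks, as in the partition listing quoted elsewhere in this thread.

```shell
#!/bin/sh
# Sketch: offset (in 1 KiB blocks) of a 0.90-format md superblock.
# Assumption: the superblock lives in the last 64 KiB-aligned 64 KiB
# block of the device, so round the size down to a multiple of 64
# and step back by 64.
sb_offset_090() {
    size_kib=$1
    echo $(( (size_kib & ~63) - 64 ))
}

sb_offset_090 546210   # size of hd?2 in the partition listing
sb_offset_090 273104   # size of hd?5 in the partition listing
```

For those two sizes this gives 546112 and 273024, the same figures the old kernel reports as "sb offset" for the md100 and md101 members, which is consistent with a superblock near the end of each partition.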


-- 
Neil Bothwick

Never get into fights with ugly people because they have nothing to lose.




Re: [gentoo-user] recovering RAID from an old server

2010-02-21 Thread Andrea Conti
 The RAID superblock is at the end of the filesystem, to avoid any
 conflicts with the filesystem superblock.

It can be either at the start, at the end or even 4K into the device,
depending on which format (metadata revision) is used. In this case I
suppose it's 0.90, which is stored at the end.

andrea





Re: [gentoo-user] recovering RAID from an old server

2010-02-20 Thread Mick
On Saturday 20 February 2010 06:29:03 Iain Buchanan wrote:

 so it looks like there's some problems with hdc.  Are there any disk
 hardware testing tools on the gentoo minimal live cd?

If you want to check the disk use sys-apps/smartmontools, but this problem may 
be a fs corruption - which could of course have been caused by the hardware 
failing.
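
A minimal sketch of such a check with smartctl from smartmontools. The `run` and `smart_check` helpers are hypothetical names for illustration, and /dev/hdc is just the suspect disk from this thread. By default the script only prints the commands (DRY_RUN=1), so nothing touches the disk until that is changed.

```shell
#!/bin/sh
# Print the command, or execute it when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: basic SMART checks for one disk.
smart_check() {
    dev=$1
    run smartctl -H "$dev"            # overall health verdict
    run smartctl -a "$dev"            # attributes and error log
    run smartctl -t short "$dev"      # start a short self-test
    run smartctl -l selftest "$dev"   # read self-test results afterwards
}

smart_check /dev/hdc
```

Run it with DRY_RUN=0 as root to actually query the drive; the self-test takes a couple of minutes before its result shows up in the self-test log.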

-- 
Regards,
Mick



Re: [gentoo-user] recovering RAID from an old server

2010-02-20 Thread Stroller


On 20 Feb 2010, at 04:31, Iain Buchanan wrote:


On Fri, 2010-02-19 at 14:44 +, Stroller wrote:

On 19 Feb 2010, at 12:15, Iain Buchanan wrote:

...
Can I randomly mount partitions read-only or will this screw things up
further?


If this is unsafe I will have ketchup & mustard on my baseball cap.


er... could you translate that?  How about dead horse on my baggy
green?


http://idioms.thefreedictionary.com/I'll+eat+my+hat

I just don't see how you can break anything *as long as* you don't let  
the system write anything to the disks. How can read-only be unsafe?


One might be paranoid enough to clone images of the drive before  
proceeding, however.
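
A sketch of what such a clone could look like with plain dd, assuming there is somewhere with enough free space to hold the image. The helper name `image_disk` and the paths in the comment are placeholders, not anything from the original setup.

```shell
#!/bin/sh
# Hypothetical helper: image one disk (or partition) to a file.
# conv=noerror,sync keeps going past read errors and pads unreadable
# blocks with zeros so that offsets in the image stay aligned.
# bs=4k keeps the amount lost per bad read small.
image_disk() {
    src=$1
    img=$2
    dd if="$src" of="$img" bs=4k conv=noerror,sync
}

# Example, left commented out on purpose (paths are placeholders):
# image_disk /dev/hdg /mnt/backup/hdg.img
```

Any later experiments can then be done on a loop-mounted copy of the image rather than on the original disk.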


My one concern is over how you know this system uses software RAID.  
You know that EIDE hardware RAID was available, right? I'm sure this  
would rarely be available built-in to the motherboard.


Stroller.




Re: [gentoo-user] recovering RAID from an old server

2010-02-20 Thread Iain Buchanan
On Sat, 2010-02-20 at 13:39 +, Stroller wrote:
 On 20 Feb 2010, at 04:31, Iain Buchanan wrote:
 
  On Fri, 2010-02-19 at 14:44 +, Stroller wrote:
  On 19 Feb 2010, at 12:15, Iain Buchanan wrote:
  ...
  Can I randomly mount partitions read-only or will this screw  
  things up
  further?
 
  If this is unsafe I will have ketchup & mustard on my baseball cap.
 
  er... could you translate that?  How about dead horse on my baggy
  green?
 
 http://idioms.thefreedictionary.com/I'll+eat+my+hat

yeah, I got that, I was just picking on your use of ketchup & baseball.
Over here it's tomato sauce (dead horse) and cricket (baggy greens) :)
Most of my jokes need explaining %-)

 I just don't see how you can break anything *as long as* you don't let  
 the system write anything to the disks. How can read-only be unsafe?

Perhaps something to do with the superblock or last mount time or
something?  I don't know!  I know that mounting a drive while a system
is hibernated, even ro, will kill kittens.

 One might be paranoid enough to clone images of the drive before  
 proceeding, however.

I don't have enough spare...

 My one concern is over how you know this system uses software RAID.  
 You know that EIDE hardware RAID was available, right? I'm sure this  
 would rarely be available built-in to the motherboard.

well there appears to be no RAID controller, unless it's onboard, but as
I mentioned to Francessco the BIOS can see all drives, so can gentoo
minimal...  

I've since found that the OS is in flash RAM, and only the help files
are on disk, so maybe I have bigger problems if I can't boot :(  I hope
to get a copy of Guardian OS somehow...

thanks,
-- 
Iain Buchanan iaindb at netspace dot net dot au

Go ahead, bake my quiche
-- Magrat instructs the castle cook
   (Terry Pratchett, Lords and Ladies)




Re: [gentoo-user] recovering RAID from an old server

2010-02-20 Thread Andrea Conti
 md: bind<hdg2,1>
 md: bind<hde2,2>
 md: bind<hda2,3>
 raid1: raid set md100 active with 3 out of 4 mirrors

 md: bind<hdg5,1>
 md: bind<hde5,2>
 md: bind<hda5,3>
 raid1: raid set md101 active with 3 out of 4 mirrors

AFAICT this is all you need to know -- you definitely have two software
(mdraid) RAID 1 volumes:

md100 with hda2, hde2 and hdg2 as members
md101 with hda5, hde5 and hdg5 as members

Both arrays seem to have lost a member (I guess hdc2 and hdc5 respectively).

Honestly I don't know what is the point of running RAID1 volumes with
four mirrors, but that seems to be the way it was configured.

I would suggest that you take a *single* disk (let's say hdg) out of the
thing and hook it up to a fully functional Gentoo system with mdraid
(and of course XFS) compiled in the kernel and sys-fs/mdadm installed.

Then you can bring up each RAID volume in degraded state from the single
mirror:

#mdadm -A /dev/md100 -R /dev/hdX2
#mdadm -A /dev/md101 -R /dev/hdX5

(substitute hdX with the actual device name of the transplanted disk; in
any case mdadm has a very useful man page)

At this point you should be able to mount md100 and md101 *read-only*
and salvage any data you need.
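
The whole sequence can be sketched as below. The function only prints the commands so they can be reviewed before being run by hand as root. The name `recovery_plan`, the mount points and the `norecovery` XFS option are assumptions added for illustration; md100 is taken to be XFS because the old kernel log says so, while the filesystem on md101 is not known from this thread.

```shell
#!/bin/sh
# Print (do not run) the recovery commands for one transplanted disk.
# $1 is the kernel name of that disk without /dev/, e.g. "hdg".
recovery_plan() {
    disk=$1
    # Assemble each array from its single remaining member and start it degraded.
    echo "mdadm -A /dev/md100 -R /dev/${disk}2"
    echo "mdadm -A /dev/md101 -R /dev/${disk}5"
    # Check that both arrays show up as active.
    echo "cat /proc/mdstat"
    echo "mkdir -p /mnt/md100 /mnt/md101"
    # Read-only mount; norecovery asks XFS not to replay its log.
    echo "mount -t xfs -o ro,norecovery /dev/md100 /mnt/md100"
    # Filesystem type unknown here, so let mount probe it.
    echo "mount -o ro /dev/md101 /mnt/md101"
}

recovery_plan hdX
```

Replace hdX with the real device name before copying any of the printed lines.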

Andrea



Re: [gentoo-user] recovering RAID from an old server

2010-02-20 Thread Neil Bothwick
On Sat, 20 Feb 2010 16:22:51 +0100, Andrea Conti wrote:

  md: bind<hdg5,1>
  md: bind<hde5,2>
  md: bind<hda5,3>
  raid1: raid set md101 active with 3 out of 4 mirrors  
 
 AFAICT this is all you need to know -- you definitely have two software
 (mdraid) RAID 1 volumes:

Agreed, however Iain also said that he tried to mount individual
partitions and this failed. This should work with RAID1, so I wonder if
something else is at work here; either some sort of logical volumes or a
weird filesystem in use.


-- 
Neil Bothwick

Heisenberg's Uncertainty Principle: The location of all objects cannot be
known simultaneously. Corollary: If a lost thing is found, something else
will disappear.





Re: [gentoo-user] recovering RAID from an old server

2010-02-19 Thread Stroller


On 19 Feb 2010, at 12:15, Iain Buchanan wrote:

...
Can I randomly mount partitions read-only or will this screw things up
further?


If this is unsafe I will have ketchup & mustard on my baseball cap.

Stroller.




Re: [gentoo-user] recovering RAID from an old server

2010-02-19 Thread Iain Buchanan
On Fri, 2010-02-19 at 14:44 +, Stroller wrote:
 On 19 Feb 2010, at 12:15, Iain Buchanan wrote:
  ...
  Can I randomly mount partitions read-only or will this screw things up
  further?
 
 If this is unsafe I will have ketchup & mustard on my baseball cap.

er... could you translate that?  How about dead horse on my baggy
green?

Should I be able to mount them automatically and let the SW RAID module
sort it out or do I have to know how they're tied together beforehand?

The message from the kernel is:

Linux version 2.4.19-snap (r...@buildsys) (gcc version egcs-2.91.66
19990314/Linux (egcs-1.1.2 release)) #1 Tue Jul 13 20:24:35 PDT 2004

and later there's output from md which is (I assume) the linux
software raid module (this is a grep, so there are other messages in
between):

md: linear personality registered as nr 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
md: spare personality registered as nr 8
md: md driver 0.91.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: Autodetecting RAID arrays.
md: autorun ...
md: ... autorun DONE.
md: bind<hdg2,1>
md: bind<hde2,2>
md: bind<hda2,3>
md: hda2's event counter: 039d
md: hde2's event counter: 039d
md: hdg2's event counter: 039d
md: md100: raid array is not clean -- starting background reconstruction
md: RAID level 1 does not need chunksize! Continuing anyway.
md100: max total readahead window set to 124k
md100: 1 data-disks, max readahead per data-disk: 124k
raid1: md100, not all disks are operational -- trying to recover array
raid1: raid set md100 active with 3 out of 4 mirrors
md: updating md100 RAID superblock on device
md: hda2 [events: 039e]6(write) hda2's sb offset: 546112
md: recovery thread got woken up ...
md: looking for a shared spare drive
md100: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: hde2 [events: 039e]6(write) hde2's sb offset: 546112
md: hdg2 [events: 039e]6(write) hdg2's sb offset: 546112
md: bind<hdg5,1>
md: bind<hde5,2>
md: bind<hda5,3>
md: hda5's event counter: 03a4
md: hde5's event counter: 03a4
md: hdg5's event counter: 03a4
md: md101: raid array is not clean -- starting background reconstruction
md: RAID level 1 does not need chunksize! Continuing anyway.
md101: max total readahead window set to 124k
md101: 1 data-disks, max readahead per data-disk: 124k
raid1: md101, not all disks are operational -- trying to recover array
raid1: raid set md101 active with 3 out of 4 mirrors
md: updating md101 RAID superblock on device
md: hda5 [events: 03a5]6(write) hda5's sb offset: 273024
md: recovery thread got woken up ...
md: looking for a shared spare drive
md101: no spare disk to reconstruct array! -- continuing in degraded mode
md: looking for a shared spare drive
md100: no spare disk to reconstruct array! -- continuing in degraded mode
md: recovery thread finished ...
md: hde5 [events: 03a5]6(write) hde5's sb offset: 273024
md: hdg5 [events: 03a5]6(write) hdg5's sb offset: 273024
XFS mounting filesystem md(9,100)
Ending clean XFS mount for filesystem: md(9,100)

The partitions look like:
   9   100     546112 md100
   9   101     273024 md101
  34     0   78150744 hdg
  34     1      16041 hdg1
  34     2     546210 hdg2
  34     3          1 hdg3
  34     4   76656636 hdg4
  34     5     273104 hdg5
  34     6     273104 hdg6
  33     0   78150744 hde
  33     1      16041 hde1
  33     2     546210 hde2
  33     3          1 hde3
  33     4   76656636 hde4
  33     5     273104 hde5
  33     6     273104 hde6
  22     0   78150744 hdc
  22     1      16041 hdc1
  22     2     546210 hdc2
  22     3          1 hdc3
  22     4   76656636 hdc4
  22     5     273104 hdc5
  22     6     273104 hdc6
   3     0   78150744 hda
   3     1      16041 hda1
   3     2     546210 hda2
   3     3          1 hda3
   3     4   76656636 hda4
   3     5     273104 hda5
   3     6     273104 hda6

many thanks!
-- 
Iain Buchanan iaindb at netspace dot net dot au

By golly, I'm beginning to think Linux really *is* the best thing since
sliced bread.
-- Vance Petree, Virginia Power




Re: [gentoo-user] recovering RAID from an old server

2010-02-19 Thread Iain Buchanan
On Sat, 2010-02-20 at 14:01 +0930, Iain Buchanan wrote:
 On Fri, 2010-02-19 at 14:44 +, Stroller wrote:
  On 19 Feb 2010, at 12:15, Iain Buchanan wrote:
   ...
   Can I randomly mount partitions read-only or will this screw things up
   further?

OK, I've randomly mounted partitions, and now I'm stuck because I don't
know what the original /etc/raidtab was.  /proc/mdstat just says:

Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
unused devices: <none>

which looks like nothing is used in any RAID set.  Autodetect seems not
to be working, perhaps because the ID wasn't set to 0xFD or 253.  Each
drive has identical partitions:
   Device Boot  Start   End    Blocks   Id  System
/dev/hda1   *       1     2     16041+  83  Linux
/dev/hda2           3    70    546210   83  Linux
/dev/hda3          71   138    546210    5  Extended
/dev/hda4         139  9682  76656636   83  Linux
/dev/hda5          71   104    273104+  83  Linux
/dev/hda6         105   138    273104+  83  Linux

and /dev/hd[aceg]1 is /boot on each one.

all the other /dev/hd[aceg][2-6] mount says:
 mount: unknown filesystem type 'linux_raid_member'
obviously this is the raid.  But how do I get to it?
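
One way to find out how the members are tied together, without the old /etc/raidtab, is to read the md superblock on each partition with `mdadm --examine`, which reports the array UUID, RAID level and device count and does not write to the disk. The sketch below only prints the commands; the helper name `examine_cmds` and the choice of partitions (everything except /boot and the extended container) are assumptions for illustration.

```shell
#!/bin/sh
# Print one `mdadm --examine` command per candidate RAID member.
# Disks a, c, e, g and partitions 2, 4, 5, 6 follow the layout
# described in this message; partition 3 is the extended container.
examine_cmds() {
    for d in a c e g; do
        for p in 2 4 5 6; do
            echo "mdadm --examine /dev/hd${d}${p}"
        done
    done
}

examine_cmds
```

Partitions that report the same array UUID belong to the same array.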

All /boots mount ok and are readable with some kernel files and stuff,
however /dev/hdc1 gives some errors:

hdc: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=585, sector=575
hdc: possibly failed opcode: 0x25
end_request: I/O error, dev hdc, sector 575
__ratelimit: 22 callbacks suppressed
Buffer I/O error on device hdc1, logical block 528
Buffer I/O error on device hdc1, logical block 529
Buffer I/O error on device hdc1, logical block 530
Buffer I/O error on device hdc1, logical block 531
Buffer I/O error on device hdc1, logical block 532
Buffer I/O error on device hdc1, logical block 533
Buffer I/O error on device hdc1, logical block 534
Buffer I/O error on device hdc1, logical block 535
Buffer I/O error on device hdc1, logical block 536
Buffer I/O error on device hdc1, logical block 537

so it looks like there's some problems with hdc.  Are there any disk
hardware testing tools on the gentoo minimal live cd?

thanks,
-- 
Iain Buchanan iaindb at netspace dot net dot au

It's simply unbelievable how much energy and creativity people have
invested into creating contradictory, bogus and stupid licenses...
--- Sven Rudolph about licences in debian/non-free.