[gentoo-user] Re: recovering RAID from an old server

2010-02-20 Thread Francesco Talamona
> Should I be able to mount them automatically and let the SW RAID
> module sort it out or do I have to know how they're tied together
> beforehand?
>
> md: looking for a shared spare drive
> md100: no spare disk to reconstruct array! -- continuing in degraded
> mode
> md: recovery thread finished ...
> md: hde5 [events: 03a5]6(write) hde5's sb offset: 273024
> md: hdg5 [events: 03a5]6(write) hdg5's sb offset: 273024
> XFS mounting filesystem md(9,100)
> Ending clean XFS mount for filesystem: md(9,100)
>
> The partitions look like:
> 9   100 546112 md100
> 9   101 273024 md101

It seems it has correctly mounted its partition... Can't you find it?

I have the feeling that you are messing it up. If I understand
correctly, the server has a hardware RAID controller, which has to be
managed via its drivers.

Software RAID tools aren't suited to mounting this setup correctly. I
would mount a random partition for testing purposes only, and only on a
spare machine.

The wiser thing to do is to find an old LiveCD that supports PERC SAS
RAID cards (or whatever RAID card is in that Snap) and assemble the
array in degraded mode for data recovery.

One more thing that may be useful: we once had a similar problem and
ended up borrowing an identical disk from another running server to
bring the array back online. We recovered our data, then restored the
other server's array.

HTH
Francesco

-- 
Linux Version 2.6.32-gentoo-r5, Compiled #2 SMP PREEMPT Wed Feb 17 
20:30:02 CET 2010
Two 1GHz AMD Athlon 64 Processors, 4GB RAM, 4021.84 Bogomips Total
aemaeth



Re: [gentoo-user] Re: recovering RAID from an old server

2010-02-20 Thread Iain Buchanan
On Sat, 2010-02-20 at 10:46 +0100, Francesco Talamona wrote:
> > Should I be able to mount them automatically and let the SW RAID
> > module sort it out or do I have to know how they're tied together
> > beforehand?
> >
> > md: looking for a shared spare drive
> > md100: no spare disk to reconstruct array! -- continuing in degraded
> > mode
> > md: recovery thread finished ...
> > md: hde5 [events: 03a5]6(write) hde5's sb offset: 273024
> > md: hdg5 [events: 03a5]6(write) hdg5's sb offset: 273024
> > XFS mounting filesystem md(9,100)
> > Ending clean XFS mount for filesystem: md(9,100)
> >
> > The partitions look like:
> > 9   100 546112 md100
> > 9   101 273024 md101
>
> It seems it has correctly mounted its partition... Can't you find it?

This is with the server recovery console, which is basically just a web
page.  No shell access.  There's not much I can do to get at md100 and
md101 (is this what software RAID devices usually appear as?)

> I have the feeling that you are messing it up. If I understand
> correctly, the server has a hardware RAID controller, which has to be
> managed via its drivers.

I think it's software RAID.  There is no RAID controller AFAICT.  All 4
drives are visible to the BIOS as Primary and Secondary Master and
Slaves.

> One more thing that may be useful: we once had a similar problem and
> ended up borrowing an identical disk from another running server to
> bring the array back online. We recovered our data, then restored the
> other server's array.

That's a possibility given what I can find on Google, however these are
few and far between, so I'd have to find someone willing to send their
drive to me (or vice versa) or send me the OS, which overlandstorage
doesn't like!

thanks,
-- 
Iain Buchanan iaindb at netspace dot net dot au

Come quickly, I am tasting stars!
-- Dom Perignon, upon discovering champagne.




[gentoo-user] Re: recovering RAID from an old server

2010-02-20 Thread Francesco Talamona
On Saturday 20 February 2010, Iain Buchanan wrote:
> On Sat, 2010-02-20 at 10:46 +0100, Francesco Talamona wrote:
> > > Should I be able to mount them automatically and let the SW RAID
> > > module sort it out or do I have to know how they're tied
> > > together beforehand?
> > >
> > > md: looking for a shared spare drive
> > > md100: no spare disk to reconstruct array! -- continuing in
> > > degraded mode
> > > md: recovery thread finished ...
> > > md: hde5 [events: 03a5]6(write) hde5's sb offset: 273024
> > > md: hdg5 [events: 03a5]6(write) hdg5's sb offset: 273024
> > > XFS mounting filesystem md(9,100)
> > > Ending clean XFS mount for filesystem: md(9,100)
> > >
> > > The partitions look like:
> > > 9   100 546112 md100
> > > 9   101 273024 md101
> >
> > It seems it has correctly mounted its partition... Can't you find
> > it?
>
> This is with the server recovery console, which is basically just a
> web page.  No shell access.  There's not much I can do to get at
> md100 and md101 (is this what software RAID devices usually appear
> as?)
>
> > I have the feeling that you are messing it up. If I understand
> > correctly, the server has a hardware RAID controller, which has to
> > be managed via its drivers.
>
> I think it's software RAID.  There is no RAID controller AFAICT.  All
> 4 drives are visible to the BIOS as Primary and Secondary Master and
> Slaves.

This isn't proof: most hardware RAID setups are really proprietary
software solutions pretending to be hardware. Without the driver, Linux
can't see the logical volume and shows all the physical drives instead.
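
One way to tell the two cases apart from a spare machine is to look for
a Linux md superblock on the member partitions. The sketch below is
only that, a sketch: it assumes the old 0.90 superblock format (magic
0xa92b4efc, stored in the last 64 KiB-aligned 64 KiB block of the
device), and the function names are made up for illustration. It opens
the device read-only and never writes to it.

```python
import os
import struct

MD_SB_MAGIC = 0xA92B4EFC        # magic number of an md superblock
MD_RESERVED_BYTES = 64 * 1024   # the 0.90 superblock lives in a 64 KiB slot


def md090_sb_offset(device_size: int) -> int:
    """Byte offset where a 0.90 superblock would sit on a device this big."""
    aligned = device_size & ~(MD_RESERVED_BYTES - 1)  # round down to 64 KiB
    return aligned - MD_RESERVED_BYTES


def has_md090_superblock(path: str) -> bool:
    """Open path read-only and check for the md magic at the 0.90 offset."""
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)   # also works for block devices
        if size < MD_RESERVED_BYTES:
            return False                  # too small to hold a superblock
        dev.seek(md090_sb_offset(size))
        raw = dev.read(4)
    if len(raw) < 4:
        return False
    # 0.90 superblocks are written in host byte order, so accept either.
    little = struct.unpack("<I", raw)[0]
    big = struct.unpack(">I", raw)[0]
    return MD_SB_MAGIC in (little, big)
```

If the magic is there, the partition was a member of a Linux software
RAID array and the superblock records how the members fit together. If
the "sb offset: 273024" figure in the log is in KiB, it is at least
consistent with this layout, since 273024 is a multiple of 64. A miss
proves little on its own: the array could use a different metadata
format or a vendor-specific one.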

You should do some research on that server's hardware... Aren't Snap
servers equipped with a PERC controller?

> > One more thing that may be useful: we once had a similar problem
> > and ended up borrowing an identical disk from another running
> > server to bring the array back online. We recovered our data, then
> > restored the other server's array.
>
> That's a possibility given what I can find on Google, however these
> are few and far between, so I'd have to find someone willing to send
> their drive to me (or vice versa) or send me the OS, which
> overlandstorage doesn't like!

What happens if you physically remove the drive marked as bad?

You could image it for backup, then low-level format it and put it back
in place as if it were brand new. Or add a similar disk for the
controller to treat as a spare (given that it is looking for a spare
disk in the first place).

Most controllers have automated procedures to manage failures, disk
swaps and so on.

For this reason you can't be sure that the inspection operations you
are doing are read-only, unless the drives are attached to another
machine with a trusted OS that does nothing on its own.
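
For the imaging step, with the drive attached to such a machine, a
plain chunked copy is enough in principle. This is a minimal sketch and
not a recovery tool: the function name is invented for illustration,
the source is opened read-only, the destination is created with mode
"xb" so an existing image is never overwritten, and a SHA-256 digest is
returned so that later copies can be compared against it.

```python
import hashlib


def image_device(src_path: str, dst_path: str, chunk_size: int = 1024 * 1024):
    """Copy src_path to a new file dst_path; return (bytes_copied, sha256)."""
    digest = hashlib.sha256()
    copied = 0
    # "rb" never writes to the source; "xb" fails if dst_path already exists.
    with open(src_path, "rb") as src, open(dst_path, "xb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:          # end of device or file reached
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

It would be called with the member partition as the source and a file
on a different disk as the destination. Note that it simply stops with
an exception at the first read error, so for a disk that is really
failing a dedicated tool such as GNU ddrescue is a better fit.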

Any of the ideas above could cost you all of your data, so be very
careful and patient.

Good luck.
Francesco

-- 
Linux Version 2.6.32-gentoo-r5, Compiled #2 SMP PREEMPT Wed Feb 17 
20:30:02 CET 2010
Two 2.9GHz AMD Athlon 64 Processors, 4GB RAM, 11659 Bogomips Total
aemaeth