Re: [SLUG] recovering xfs

2009-05-15 Thread jam
On Friday 15 May 2009 21:03:06 slug-requ...@slug.org.au wrote:
> > Lessons learnt:
> >
> > - A journalling file system is bigger than what you see: 3TB is really
> > 3.3TB when doing a direct copy.
> >
> > - Get lots of hard disk at the beginning.  750G drives really only give
> > you 698G.  It is annoying to be 300G short and have to go to the shop
> > again.
> >
> > - Expect lots of wait time; hard errors on RAID take a long time to
> > give up.
> >
> > - Don't promise anything, expect it to fail.
> >
> > - LVM is really cool and well worth the time to read up on it.  I am now
> > going to LVM my home system.
>
> I'm planning to do this as well.
> I was thinking back to Mary's backup post last year and wondering if I could
> do LVM snapshots with an external hard drive.  Still a bit new to LVM
> though. I think you have to install from the alternate Ubuntu CD to get LVM,
> right? (unless you are using the server install instead of the desktop).

Terry Pratchett says the chance of a million-to-one event happening is 9/10.
LVM is really kewl iff you do frequent (daily) backups. Otherwise we await
your plaintive cry about how you lost 100 zigabytes of unique photos ...
james
--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html


Re: [SLUG] recovering xfs

2009-05-15 Thread Daniel Pittman
jam j...@tigger.ws writes:
> On Friday 15 May 2009 21:03:06 slug-requ...@slug.org.au wrote:
> > > Lessons learnt:
> > >
> > > - A journalling file system is bigger than what you see: 3TB is
> > > really 3.3TB when doing a direct copy.
> > >
> > > - Get lots of hard disk at the beginning.  750G drives really only
> > > give you 698G.  It is annoying to be 300G short and have to go to
> > > the shop again.
> > >
> > > - Expect lots of wait time; hard errors on RAID take a long time
> > > to give up.
> > >
> > > - Don't promise anything, expect it to fail.
> > >
> > > - LVM is really cool and well worth the time to read up on it.  I
> > > am now going to LVM my home system.
> >
> > I'm planning to do this as well.
> >
> > I was thinking back to Mary's backup post last year and wondering if I
> > could do LVM snapshots with an external hard drive.

LVM snapshots are useful for getting a consistent backup, but a real
backup still means copying the bits somewhere else.  You might take a
snapshot and then use it as the source of your backup if you run a
database server or something similar, though...
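
A rough sketch of that snapshot-then-copy sequence, offered as an illustration rather than anything posted in this thread. The volume group `vg0`, the logical volume `home`, the 5G snapshot size and both mount points are made-up names. The script only prints the commands it would run; setting `DRY_RUN=0` would execute them, which needs root and a real LVM setup.

```shell
#!/bin/sh
# Sketch: snapshot an LV, copy the snapshot to an external disk, then drop
# the snapshot. All names below (vg0, home, mount points) are hypothetical.
VG=vg0
LV=home
SNAP=${LV}_snap
SNAP_MNT=/mnt/snap
DEST=/mnt/external/backup

# Print each command; run it only when DRY_RUN=0 is set explicitly.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

snapshot_backup() {
    # Freeze a point-in-time view of the volume (5G of copy-on-write space).
    run lvcreate --snapshot --size 5G --name "$SNAP" "/dev/$VG/$LV"
    # Mount it read-only; an XFS snapshot would also need "-o ro,nouuid".
    run mount -o ro "/dev/$VG/$SNAP" "$SNAP_MNT"
    # The actual backup: copy the bits to a different physical disk.
    run rsync -a "$SNAP_MNT/" "$DEST/"
    run umount "$SNAP_MNT"
    run lvremove -f "/dev/$VG/$SNAP"
}

plan=$(snapshot_backup)
echo "$plan"
```

The snapshot only gives a stable source to read from; the rsync step onto separate hardware is what makes it a backup.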

> > Still a bit new to LVM though. I think you have to install from the
> > alternate Ubuntu CD to get LVM, right? (unless you are using the
> > server install instead of the desktop).

IIRC you can get LVM, but not software RAID, from the Ubuntu graphical
installer.  The alternate installer can do all of the above.

> Terry Pratchett says the chance of a million-to-one event happening is 9/10.

...which is great for comedic effect, but not actually an expression of
real statistics.

Well, and a comment on the pretty awful ability of humans to accurately
assess and rate risk.


> LVM is really kewl iff you do frequent (daily) backups.  Otherwise we
> await your plaintive cry about how you lost 100 zigabytes of unique
> photos ...

So, can you actually support your claim that LVM is disastrously
unreliable with some, you know, evidence?

I am curious to know, you see, because I have tens of terabytes of data
sitting on LVM[1] across a wide range of machines, scaling from
single-disk workstations through software-RAID servers to serious
bulk-storage systems with hardware RAID.

We have ... well.  It probably suffices to say that we would notice
fairly quickly if LVM was, in fact, unreliable.

Regards,
Daniel

Besides, no one here is stupid enough to have their systems running
*without* a daily backup, and without routinely checking it, right?

Footnotes: 
[1]  ...and XFS, to introduce the other technology that people love to
 make grandiose claims about the data-destroying abilities of.



Re: [SLUG] recovering xfs

2009-05-14 Thread foskey
Quoting Adrian Chadd adr...@creative.net.au:

> On Thu, May 14, 2009, fos...@tpg.com.au wrote:
>
> > Still working on this, the whole thing appeared to lock up and we
> > have rebooted.  We have now replaced a drive in the RAID and hopefully
> > this will work better.  Of course this is going to take ages to
> > rebuild.
>
> Let's hope you don't throw another disk during the rebuild with all of
> that parallel IO going on.

Update:

dd has a problem with fatal disk errors: it will stop and loop.  There
are two programs that will help with this, ddrescue and dd_rescue; they
are different programs.  dd_rescue appears to be the one that can use
seek to skip these fatal errors.

I downloaded a statically linked version from the internet and I am now
running the recovery.  There are two disks in the original RAID that
are cactus; when the copy hits them it just dies totally.  After a period
of time it gets a fatal error, and then dd_rescue dumps nulls in its
place and moves on.
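
The "give up on the bad spot, write nulls, move on" behaviour can be imitated with plain dd, one block at a time. This is only a sketch of the idea under stated assumptions, not how dd_rescue itself is implemented: the function name, the 4096-byte block size and the scratch files are all made up, and the demo runs against an ordinary temporary file rather than a failing disk, so the error branch is not exercised here.

```shell
#!/bin/sh
# Copy SRC to DST one block at a time. If a block cannot be read, write a
# block of zeros at the same offset and carry on. Prints the number of
# blocks that had to be zero-filled.
copy_skipping_errors() {
    src=$1
    dst=$2
    bs=$3
    total_bytes=$4
    blocks=$(( (total_bytes + bs - 1) / bs ))
    i=0
    bad=0
    while [ "$i" -lt "$blocks" ]; do
        # skip= positions the read, seek= positions the write, and
        # conv=notrunc keeps the blocks already written to DST.
        if ! dd if="$src" of="$dst" bs="$bs" skip="$i" seek="$i" count=1 \
                conv=notrunc 2>/dev/null; then
            dd if=/dev/zero of="$dst" bs="$bs" seek="$i" count=1 \
                conv=notrunc 2>/dev/null
            bad=$((bad + 1))
        fi
        i=$((i + 1))
    done
    echo "$bad"
}

# Demo on a 10000-byte scratch file (hypothetical paths under a temp dir).
tmp=$(mktemp -d)
dd if=/dev/urandom of="$tmp/src" bs=1000 count=10 2>/dev/null
bad_blocks=$(copy_skipping_errors "$tmp/src" "$tmp/dst" 4096 10000)
echo "zero-filled blocks: $bad_blocks"
```

A dedicated tool is still the better choice on real hardware, since it handles retries, block-size fallback and progress reporting.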

Lessons learnt:

- A journalling file system is bigger than what you see: 3TB is really
3.3TB when doing a direct copy.

- Get lots of hard disk at the beginning.  750G drives really only give
you 698G.  It is annoying to be 300G short and have to go to the shop again.

- Expect lots of wait time; hard errors on RAID take a long time to
give up.

- Don't promise anything, expect it to fail.

- LVM is really cool and well worth the time to read up on it.  I am now
going to LVM my home system.
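
The 750G versus 698G gap in the list above is most likely just units: drives are sold in decimal gigabytes (10^9 bytes), while many tools report binary gigabytes (2^30 bytes) under the same "G". A small worked check, assuming that is the whole explanation:

```shell
#!/bin/sh
# Convert a marketed (decimal) drive size into binary gigabytes.
marketed_gb=750
bytes=$(( marketed_gb * 1000 * 1000 * 1000 ))
reported_g=$(( bytes / (1024 * 1024 * 1024) ))
echo "${marketed_gb}G marketed = ${reported_g}G reported"
```

The integer division rounds down, which matches the 698G figure quoted in the list.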

Ta
Ken


Re: [SLUG] recovering xfs

2009-05-14 Thread Daniel Bush
2009/5/15 fos...@tpg.com.au

> Quoting Adrian Chadd adr...@creative.net.au:
>
> > On Thu, May 14, 2009, fos...@tpg.com.au wrote:
>
> ...
>
>
> Lessons learnt:
>
> - A journalling file system is bigger than what you see: 3TB is really
> 3.3TB when doing a direct copy.
>
> - Get lots of hard disk at the beginning.  750G drives really only give
> you 698G.  It is annoying to be 300G short and have to go to the shop
> again.
>
> - Expect lots of wait time; hard errors on RAID take a long time to
> give up.
>
> - Don't promise anything, expect it to fail.
>
> - LVM is really cool and well worth the time to read up on it.  I am now
> going to LVM my home system.


I'm planning to do this as well.
I was thinking back to Mary's backup post last year and wondering if I could
do LVM snapshots with an external hard drive.  Still a bit new to LVM though.
I think you have to install from the alternate Ubuntu CD to get LVM, right?
(unless you are using the server install instead of the desktop).

-- 
Daniel Bush

http://blog.web17.com.au


[SLUG] recovering xfs

2009-05-13 Thread Ken Foskey

I have a very large XFS file system that is corrupt, and there is an IO
error in the middle of the file system; xfs_repair crashes.  After a bit
of reading I have a solution, and I thought I would put it out there in
case I have forgotten something.

I am booting the new Xeon server with a 32-bit Ubuntu live CD.  All the
hard disks on the new server are in LVM, giving me 3.76TB.  I am copying
3.3TB from the other server.

First I need to grab a copy of the data across the network.  From my new
server I access the old server:

ssh r...@server 'dd if=/dev/vg1/lv1 conv=noerror,sync' | dd of=/dev/vg/lv01
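
For reference, `conv=noerror` tells dd to keep going after a read error, and `sync` pads every short input block with NUL bytes up to the block size so later data stays at the right offset. The padding is easy to see on a scratch file. This is a small demonstration, assuming a 512-byte block size and temporary files; no real device is touched.

```shell
#!/bin/sh
# Demo: conv=sync pads a short final block with NULs up to the block size.
tmp=$(mktemp -d)
# 1000 bytes of input: one full 512-byte block plus a 488-byte partial one.
dd if=/dev/zero of="$tmp/src" bs=1000 count=1 2>/dev/null
dd if="$tmp/src" of="$tmp/dst" bs=512 conv=noerror,sync 2>/dev/null
src_size=$(wc -c < "$tmp/src" | tr -d ' ')
dst_size=$(wc -c < "$tmp/dst" | tr -d ' ')
echo "source: $src_size bytes, copy: $dst_size bytes"
```

The same padding applies to any block that comes back short, so the copy can end up slightly larger than the source.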

Next I simply repair it:

xfs_repair  /dev/vg/lv01

Then mount it.  I figure this bit is easy from my reading, as Ubuntu
handles XFS out of the box (read-only, so that it cannot be used as a
proper server).

I am reading from a 32-bit server and booting a Xeon server with a 32-bit
live CD.  This means the NFS should match correctly.

Is the above basically correct?  Are there any hints that I might need?

Ta
Ken




Re: [SLUG] recovering xfs

2009-05-13 Thread Tony Sceats
I'm actually not sure I'm reading this right, but I don't think you want to
dd to an NFS share? Maybe you meant XFS should match correctly?

If you're trying to do it over a network, I would have thought you'd have
more luck piping dd through an nc connection, then dd back to a disk on
the target,

i.e.,

on old server: dd if=/dev/vg1/lv1 conv=noerror,sync | nc NEWSERVER PORT

on new server: nc -l PORT | dd of=/dev/vg/lv01


Also, I'm not sure how the LVM is going to interact, because I don't tend to
use LVM on my production servers with XFS.




Re: [SLUG] recovering xfs

2009-05-13 Thread Ken Foskey

Thanks for replying.

nc in your command simply replaces ssh, so I think this is the same
thing.  I am not using NFS; I cannot mount anything, so it is a raw copy,
going logical volume to logical volume.  They are slightly different
sizes; I hope that will not kill anything.

The server that is being copied originally had an LVM, so there is no
obvious conflict here.  I don't care whether it is fast; it is a
recovery operation, not a prime server.
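
On the different sizes: for a raw volume-to-volume copy the main thing to confirm is that the target is at least as large as the source, since a smaller target would cut the image short. A minimal check, assuming the 3.3TB and 3.76TB figures from this thread are decimal terabytes; the `fits` helper is made up for illustration.

```shell
#!/bin/sh
# Succeeds when a source of $1 bytes fits into a target of $2 bytes.
fits() {
    [ "$1" -le "$2" ]
}

# Illustrative sizes from this thread: 3.3TB of source into 3.76TB of target.
src_bytes=3300000000000
dst_bytes=3760000000000
# On real block devices, `blockdev --getsize64 /dev/vg/lv01` prints the
# size in bytes (needs root); the numbers above stand in for that.
if fits "$src_bytes" "$dst_bytes"; then
    verdict="target is large enough"
else
    verdict="target is too small"
fi
echo "$verdict"
```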

Ta
Ken





Re: [SLUG] recovering xfs

2009-05-13 Thread Tony Sceats
Ah, it appears as though I missed the close quote on your ssh command when I
read it last night, and I also saw NFS, which was why I was a bit confused
as to what you were trying :)



Re: [SLUG] recovering xfs

2009-05-13 Thread foskey

Problem

dd sat and looped on a single point last night.   I have downloaded a
statically linked version of dd_rescue and I am now trying this.   
Fingers crossed.

ssh r...@server '/root/dd_rescue /dev/vg1/lv1 -' | cat > /dev/vg1/lv1


It is coming up with this error though:

dd_rescue: (info): ipos: 34031.5k, opos: 34031.5k, xferd: 34031.5k
*  errs:  7, errxfer: 3.5k, succxfer: 34028.0k
 +curr.rate:11628kB/s, avg.rate:  149kB/s, avg.load: 0.1%
dd_rescue: (warning): /dev/vg1/lv1 (34031.5k): Invalid argument!





Re: [SLUG] recovering xfs

2009-05-13 Thread Adrian Chadd
And what is being logged in dmesg?

Your kernel should be spewing a whole lot of error messages if your physical
media is returning errors.



Adrian


-- 
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -


Re: [SLUG] recovering xfs

2009-05-13 Thread foskey
Quoting Adrian Chadd adr...@creative.net.au:

> And what is being logged in dmesg?
>
> Your kernel should be spewing a whole lot of error messages if your
> physical media is returning errors.

Still working on this; the whole thing appeared to lock up and we have
rebooted.  We have now replaced a drive in the RAID and hopefully this
will work better.  Of course this is going to take ages to rebuild.


Ta
Ken


Re: [SLUG] recovering xfs

2009-05-13 Thread Adrian Chadd
On Thu, May 14, 2009, fos...@tpg.com.au wrote:

> Still working on this; the whole thing appeared to lock up and we have
> rebooted.  We have now replaced a drive in the RAID and hopefully this
> will work better.  Of course this is going to take ages to rebuild.

Let's hope you don't throw another disk during the rebuild with all of that
parallel IO going on.

:P



Adrian
