Re: [SLUG] recovering xfs
On Friday 15 May 2009 21:03:06 slug-requ...@slug.org.au wrote:

Lessons learnt:
- a journalling file system is bigger than what you see, 3TB is really 3.3TB when doing a direct copy.
- Get lots of hard disk in the beginning. 750G drives really only give you 698G. It is annoying to be 300G short and have to go to the shop again.
- Expect lots of wait time, hard errors on raid take a long time to give up.
- Don't promise anything, expect it to fail.
- LVM is really cool and well worth the time to read up on it. I am now going to LVM my home system.

I'm planning to do this as well. I was thinking back to Mary's backup post last year and thinking if I could do lvm snapshots with an external hard drive. Still a bit new to lvm though. I think you have to install the alternate ubuntu cd to get lvm right? (unless you are using the server install instead of the desktop).

Terry Pratchett says the chance of a million to one event happening is 9 / 10.

LVM is really kewl iff you do frequent (daily) backups. Otherwise we await your plaintive cry about how you lost 100 zigabytes of unique photos ...

james

--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
Re: [SLUG] recovering xfs
jam j...@tigger.ws writes:

On Friday 15 May 2009 21:03:06 slug-requ...@slug.org.au wrote:

Lessons learnt:
- a journalling file system is bigger than what you see, 3TB is really 3.3TB when doing a direct copy.
- Get lots of hard disk in the beginning. 750G drives really only give you 698G. It is annoying to be 300G short and have to go to the shop again.
- Expect lots of wait time, hard errors on raid take a long time to give up.
- Don't promise anything, expect it to fail.
- LVM is really cool and well worth the time to read up on it. I am now going to LVM my home system.

I'm planning to do this as well. I was thinking back to Mary's backup post last year and thinking if I could do lvm snapshots with an external hard drive.

LVM snapshots are useful for getting a consistent backup, but just copying the bits is necessary for a real one. You might snapshot, then use the snapshot as the source of your backup if you run a database server or something similar though...

Still a bit new to lvm though. I think you have to install the alternate ubuntu cd to get lvm right? (unless you are using the server install instead of the desktop).

IIRC you can get LVM, but not software RAID, from the Ubuntu graphical installer. The alternate installer can do all of the above.

Terry Pratchett says the chance of a million to one event happening is 9/10.

...which is great for comedic effect, but not actually an expression of real statistics. Well, and a comment on the pretty awful ability of humans to accurately assess and rate risk.

LVM is really kewl iff you do frequent (daily) backups. Otherwise we await your plaintive cry about how you lost 100 zigabytes of unique photos ...

So, can you actually support your claim that LVM is disastrously unreliable with some, you know, evidence?

I am curious to know, you see, because I have tens of terabytes of data sitting on LVM[1] across a wide range of machines, scaling from single-disk workstations through software-RAID servers to serious bulk-storage systems with hardware RAID. We have ... well. It probably suffices to say that we would notice fairly quickly if LVM was, in fact, unreliable.

Regards,
Daniel

Besides, no one here is stupid enough to have their systems running *without* a daily backup, and without routinely checking it, right?

Footnotes:
[1] ...and XFS, to introduce the other technology that people love to make grandiose claims about the data-destroying abilities of.
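The "snapshot, then use the snapshot as the source of your backup" idea can be sketched roughly as below. This is only an illustration, not anyone's actual setup: the volume group, volume, snapshot and mount point names are hypothetical placeholders, the 1G snapshot size is an arbitrary assumption, and rsync is just one possible way of copying the bits. By default the function only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: back up from an LVM snapshot instead of the live volume.
# All names used in the example call (vg0, home, homesnap, /mnt/snap,
# /mnt/external) are hypothetical placeholders, not taken from the thread.

# Print each command by default; set DRY_RUN=0 to really execute (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

snapshot_backup() {
    vg=$1 lv=$2 snap=$3 mnt=$4 dest=$5
    # The size is the space reserved for blocks that change on the origin
    # while the snapshot exists (1G is an arbitrary assumption).
    run lvcreate --snapshot --size 1G --name "$snap" "/dev/$vg/$lv" || return 1
    # Mount the frozen view read-only (an XFS snapshot also needs -o nouuid).
    run mount -o ro "/dev/$vg/$snap" "$mnt" || return 1
    # Copy the bits somewhere else; this copy is the real backup.
    run rsync -a "$mnt/" "$dest/"
    status=$?
    # Always clean up, even if the copy failed.
    run umount "$mnt"
    run lvremove --force "/dev/$vg/$snap"
    return $status
}
```

Calling `snapshot_backup vg0 home homesnap /mnt/snap /mnt/external` with the default dry run prints the five commands in order, which makes it easy to review them before running anything for real.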
Re: [SLUG] recovering xfs
Quoting Adrian Chadd adr...@creative.net.au:

On Thu, May 14, 2009, fos...@tpg.com.au wrote:

Still working on this, the whole thing appeared to lock up and we have rebooted. We have now replaced a drive in the raid and hopefully this will work better. Of course this is going to take ages to rebuild.

let's hope you don't throw another disk during the rebuild with all of that parallel IO going on.

Update: dd has a problem with fatal disk errors, it will stop and loop. There are two programs that will help with this, ddrescue and dd_rescue; they are different. dd_rescue appears to be the one that can use seek to skip these fatal errors. I downloaded a statically linked version from the internet and I am now running the recovery.

There are two disks in the original RAID that are cactus; when the copy hits them it just dies totally. After a period of time it gets a fatal error and then dd_rescue dumps nulls in its place and moves on.

Lessons learnt:
- a journalling file system is bigger than what you see, 3TB is really 3.3TB when doing a direct copy.
- Get lots of hard disk in the beginning. 750G drives really only give you 698G. It is annoying to be 300G short and have to go to the shop again.
- Expect lots of wait time, hard errors on raid take a long time to give up.
- Don't promise anything, expect it to fail.
- LVM is really cool and well worth the time to read up on it. I am now going to LVM my home system.

Ta
Ken
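The 750G versus 698G figure in the lessons above is what you get when a drive labelled in decimal gigabytes (10^9 bytes) is reported in binary units (2^30 bytes). A small worked example follows; the function names are made up for illustration. The second function is only a guess that the same unit mismatch could explain part of the 3 versus 3.3 figure, which the thread itself attributes to the journalling file system.

```shell
#!/bin/sh
# Convert decimal gigabytes (10^9 bytes, as printed on the drive label)
# to binary gibibytes (2^30 bytes, as many tools report), rounding down.
gb_to_gib() {
    echo $(( $1 * 1000000000 / 1073741824 ))
}

# Convert binary tebibytes (2^40 bytes) to decimal gigabytes, rounding down.
tib_to_gb() {
    echo $(( $1 * 1099511627776 / 1000000000 ))
}

gb_to_gib 750   # prints 698
tib_to_gb 3     # prints 3298
```

Sizing the replacement storage in bytes rather than in "G" or "T" avoids this kind of surprise.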
Re: [SLUG] recovering xfs
2009/5/15 fos...@tpg.com.au

Quoting Adrian Chadd adr...@creative.net.au:

On Thu, May 14, 2009, fos...@tpg.com.au wrote:

...

Lessons learnt:
- a journalling file system is bigger than what you see, 3TB is really 3.3TB when doing a direct copy.
- Get lots of hard disk in the beginning. 750G drives really only give you 698G. It is annoying to be 300G short and have to go to the shop again.
- Expect lots of wait time, hard errors on raid take a long time to give up.
- Don't promise anything, expect it to fail.
- LVM is really cool and well worth the time to read up on it. I am now going to LVM my home system.

I'm planning to do this as well. I was thinking back to Mary's backup post last year and thinking if I could do lvm snapshots with an external hard drive. Still a bit new to lvm though. I think you have to install the alternate ubuntu cd to get lvm right? (unless you are using the server install instead of the desktop).

--
Daniel Bush
http://blog.web17.com.au
[SLUG] recovering xfs
I have a very large xfs file system that is corrupt and there is an IO error in the middle of the file system; xfs_repair crashes. A bit of reading and I have a solution, just thought I would put it out there in case I have forgotten something.

Booting xeon server with 32 bit Ubuntu live CD. All the hard disks on new server in LVM giving me 3.76 TB. I am copying 3.3TB from the other server.

First I need to grab a copy of the data across the network, from my new server I access the old server:

    ssh r...@server 'dd if=/dev/vg1/lv1 conv=noerror,sync' | dd of=/dev/vg/lv01

next I simply repair it

    xfs_repair /dev/vg/lv01

Mount it. I figure this bit is easy by my reading, Ubuntu handles xfs out of the box (read only so that it cannot be used as a proper server)

I am reading from a 32 bit server and booting a Xeon server with 32 bit live CD. This means the NFS should match correctly.

Is the above basically correct? Are there any hints that I might need?

Ta
Ken
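For anyone unsure what `conv=noerror,sync` in the dd command above does: `noerror` tells dd to keep going after a read error, and `sync` pads every short or failed input block with NUL bytes up to the block size, so data after a bad spot stays at the same offset in the copy. The padding half can be seen safely on a few bytes of text. This is a small demonstration only; the function name is made up, and no disk is touched.

```shell
#!/bin/sh
# Show what conv=sync does: a short input block is padded with NUL bytes
# up to the block size, so the output length stays a multiple of bs.
padded_size() {
    # $1 = input text, $2 = block size in bytes
    printf '%s' "$1" | dd bs="$2" conv=noerror,sync 2>/dev/null | wc -c | tr -d ' '
}

padded_size abc 8    # 3 bytes in, 8 bytes out
```

One consequence worth keeping in mind: with a large block size, a single unreadable sector causes the whole block to be replaced by zeros, so a smaller `bs` loses less data around each bad spot at the cost of speed.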
Re: [SLUG] recovering xfs
I'm actually not sure I'm reading this right, but I don't think you want to dd to an NFS share? Maybe you meant XFS should match correctly?

If you're trying to do it over a network, I would have thought you'd have more luck by piping dd through an nc connection, then dd back to a disk on the target, i.e.

on old server:

    dd if=/dev/vg1/lv1 conv=noerror,sync | nc NEWSERVER PORT

on new server:

    nc -l PORT | dd of=/dev/vg/lv01

Also I'm not sure how the LVM is going to interact because I don't tend to use LVM on my production servers with XFS

On Wed, May 13, 2009 at 7:14 PM, Ken Foskey fos...@tpg.com.au wrote:

I have a very large xfs file system that is corrupt and there is an IO error in the middle of the file system, xfs_repair crashes. A bit of reading and I have a solution, just thought I would put it out there in case I have forgotten something.

Booting xeon server with 32 bit Ubuntu live CD. All the hard disks on new server in LVM giving me 3.76 TB. I am copying 3.3TB from the other server.

First I need to grab a copy of the data across the network, from my new server I access the old server:

    ssh r...@server 'dd if=/dev/vg1/lv1 conv=noerror,sync' | dd of=/dev/vg/lv01

next I simply repair it

    xfs_repair /dev/vg/lv01

Mount it. I figure this bit is easy by my reading, Ubuntu handles xfs out of the box (read only so that it cannot be used as a proper server)

I am reading from a 32 bit server and booting a Xeon server with 32 bit live CD. This means the NFS should match correctly.

Is the above basically correct? Are there any hints that I might need?

Ta
Ken
Re: [SLUG] recovering xfs
Thanks for replying.

nc in your command simply replaces ssh, so I think this is the same thing. I am not using nfs, I cannot mount anything so it is raw, going logical volume to logical volume. They are slightly different sizes, hope that will not kill anything.

The server that is being copied originally had an LVM so there is no obvious conflict here. I don't care whether it is fast, it is a recovery operation not a prime server.

Ta
Ken

On Wed, 2009-05-13 at 21:10 +1000, Tony Sceats wrote:

I'm actually not sure I'm reading this right, but I don't think you want to dd to an NFS share? Maybe you meant XFS should match correctly?

If you're trying to do it over a network, I would have thought you'd have more luck by piping dd through an nc connection, then dd back to a disk on the target, i.e.

on old server:

    dd if=/dev/vg1/lv1 conv=noerror,sync | nc NEWSERVER PORT

on new server:

    nc -l PORT | dd of=/dev/vg/lv01

Also I'm not sure how the LVM is going to interact because I don't tend to use LVM on my production servers with XFS

On Wed, May 13, 2009 at 7:14 PM, Ken Foskey fos...@tpg.com.au wrote:

I have a very large xfs file system that is corrupt and there is an IO error in the middle of the file system, xfs_repair crashes. A bit of reading and I have a solution, just thought I would put it out there in case I have forgotten something.

Booting xeon server with 32 bit Ubuntu live CD. All the hard disks on new server in LVM giving me 3.76 TB. I am copying 3.3TB from the other server.

First I need to grab a copy of the data across the network, from my new server I access the old server:

    ssh r...@server 'dd if=/dev/vg1/lv1 conv=noerror,sync' | dd of=/dev/vg/lv01

next I simply repair it

    xfs_repair /dev/vg/lv01

Mount it. I figure this bit is easy by my reading, Ubuntu handles xfs out of the box (read only so that it cannot be used as a proper server)

I am reading from a 32 bit server and booting a Xeon server with 32 bit live CD. This means the NFS should match correctly.

Is the above basically correct? Are there any hints that I might need?

Ta
Ken
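On the "slightly different sizes" worry: a raw volume-to-volume copy is generally fine when the destination is at least as large as the source (the file system simply keeps its original size), while a smaller destination would cut off the end of the image. A minimal pre-flight check might look like the sketch below. The function name is made up, and the byte counts are illustrative round numbers based on the 3.3TB and 3.76 TB figures in the thread, not measured values.

```shell
#!/bin/sh
# Succeeds only if a destination of dst_bytes can hold a raw image of src_bytes.
fits() {
    src_bytes=$1 dst_bytes=$2
    [ "$dst_bytes" -ge "$src_bytes" ]
}

# On real systems the sizes could come from, for example:
#   blockdev --getsize64 /dev/vg1/lv1
# The figures below are illustrative round numbers in bytes.
if fits 3300000000000 3760000000000; then
    echo "destination is large enough"
else
    echo "destination is too small"
fi
```

Comparing exact byte counts on both machines before starting is cheap compared with discovering the shortfall many hours into the copy.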
Re: [SLUG] recovering xfs
ah, it appears as though I missed the close quote on your ssh command when I read it last night, and I also saw NFS, which was why I was a bit confused as to what you were trying :)

On Wed, May 13, 2009 at 9:56 PM, Ken Foskey fos...@tpg.com.au wrote:

Thanks for replying.

nc in your command simply replaces ssh, so I think this is the same thing. I am not using nfs, I cannot mount anything so it is raw, going logical volume to logical volume. They are slightly different sizes, hope that will not kill anything.

The server that is being copied originally had an LVM so there is no obvious conflict here. I don't care whether it is fast, it is a recovery operation not a prime server.

Ta
Ken

On Wed, 2009-05-13 at 21:10 +1000, Tony Sceats wrote:

I'm actually not sure I'm reading this right, but I don't think you want to dd to an NFS share? Maybe you meant XFS should match correctly?

If you're trying to do it over a network, I would have thought you'd have more luck by piping dd through an nc connection, then dd back to a disk on the target, i.e.

on old server:

    dd if=/dev/vg1/lv1 conv=noerror,sync | nc NEWSERVER PORT

on new server:

    nc -l PORT | dd of=/dev/vg/lv01

Also I'm not sure how the LVM is going to interact because I don't tend to use LVM on my production servers with XFS

On Wed, May 13, 2009 at 7:14 PM, Ken Foskey fos...@tpg.com.au wrote:

I have a very large xfs file system that is corrupt and there is an IO error in the middle of the file system, xfs_repair crashes. A bit of reading and I have a solution, just thought I would put it out there in case I have forgotten something.

Booting xeon server with 32 bit Ubuntu live CD. All the hard disks on new server in LVM giving me 3.76 TB. I am copying 3.3TB from the other server.

First I need to grab a copy of the data across the network, from my new server I access the old server:

    ssh r...@server 'dd if=/dev/vg1/lv1 conv=noerror,sync' | dd of=/dev/vg/lv01

next I simply repair it

    xfs_repair /dev/vg/lv01

Mount it. I figure this bit is easy by my reading, Ubuntu handles xfs out of the box (read only so that it cannot be used as a proper server)

I am reading from a 32 bit server and booting a Xeon server with 32 bit live CD. This means the NFS should match correctly.

Is the above basically correct? Are there any hints that I might need?

Ta
Ken
Re: [SLUG] recovering xfs
Problem: dd sat and looped on a single point last night. I have downloaded a statically linked version of dd_rescue and I am now trying this. Fingers crossed.

    ssh r...@server '/root/dd_rescue /dev/vg1/lv1 -' | cat /dev/vg1/lv1

It is coming up with this error though:

    dd_rescue: (info): ipos: 34031.5k, opos: 34031.5k, xferd: 34031.5k
    * errs: 7, errxfer: 3.5k, succxfer: 34028.0k
    +curr.rate:11628kB/s, avg.rate: 149kB/s, avg.load: 0.1%
    dd_rescue: (warning): /dev/vg1/lv1 (34031.5k): Invalid argument!
Re: [SLUG] recovering xfs
And what is being logged in dmesg? Your kernel should be spewing a whole lot of error messages if your physical media is returning errors.

Adrian

On Thu, May 14, 2009, fos...@tpg.com.au wrote:

Problem: dd sat and looped on a single point last night. I have downloaded a statically linked version of dd_rescue and I am now trying this. Fingers crossed.

    ssh r...@server '/root/dd_rescue /dev/vg1/lv1 -' | cat /dev/vg1/lv1

It is coming up with this error though:

    dd_rescue: (info): ipos: 34031.5k, opos: 34031.5k, xferd: 34031.5k
    * errs: 7, errxfer: 3.5k, succxfer: 34028.0k
    +curr.rate:11628kB/s, avg.rate: 149kB/s, avg.load: 0.1%
    dd_rescue: (warning): /dev/vg1/lv1 (34031.5k): Invalid argument!

--
- Xenion - http://www.xenion.com.au/ - VPS Hosting - Commercial Squid Support -
- $25/pm entry-level VPSes w/ capped bandwidth charges available in WA -
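Checking the kernel log as suggested above can be narrowed down with a simple filter. The sketch below is only a starting point: the function name is made up, and the patterns are assumptions about how disk failures commonly show up in the log, so they will not catch every driver's wording.

```shell
#!/bin/sh
# Keep only kernel log lines that look like disk read/write failures.
# The patterns are a starting point, not an exhaustive list.
disk_errors() {
    grep -i -E 'I/O error|medium error|unrecovered read error'
}

# Typical use (needs permission to read the kernel log):
#   dmesg | disk_errors
```

If the filter prints nothing while the copy is stalling, the whole unfiltered `dmesg` output is still worth reading, since RAID controllers often log timeouts and resets in their own format.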
Re: [SLUG] recovering xfs
Quoting Adrian Chadd adr...@creative.net.au:

And what is being logged in dmesg? Your kernel should be spewing a whole lot of error messages if your physical media is returning errors.

Still working on this, the whole thing appeared to lock up and we have rebooted. We have now replaced a drive in the raid and hopefully this will work better. Of course this is going to take ages to rebuild.

Ta
Ken
Re: [SLUG] recovering xfs
On Thu, May 14, 2009, fos...@tpg.com.au wrote:

Still working on this, the whole thing appeared to lock up and we have rebooted. We have now replaced a drive in the raid and hopefully this will work better. Of course this is going to take ages to rebuild.

let's hope you don't throw another disk during the rebuild with all of that parallel IO going on. :P

Adrian