Re: hard disk failure - now what?

2009-08-27 Thread Sebastian Seidl

George Davidovich wrote:
 On Wed, Aug 26, 2009 at 04:45:40PM -0400, Jerry McAllister wrote:
  On Wed, Aug 26, 2009 at 10:23:47PM +0200, Roland Smith wrote:
   On Wed, Aug 26, 2009 at 12:13:48PM -0700, George Davidovich wrote:
   I remember this special non-conductive 3M fluid that can be used to
   cool electronics. A group of hackers dunked a complete PC minus the
   case and power supply in this stuff. The fluid itself was cooled
   with liquid nitrogen. They overclocked it something wicked. Not very
   practical though. :-)

  A number of supercomputers from Cray and Control Data and maybe some
  other places used this sort of thing on some experimental systems.  I
  don't know if any were ever put into commercial production.  They
  submerged whole boards in it and then supercooled the fluid.   I
  don't remember the chemical names.

 I do, but have no idea why.

 http://en.wikipedia.org/wiki/Perfluorohexane

  The fluid was a relative of Freon and held sufficient levels of oxygen
  to support lung breathers.  They used to have a tank with a live mouse
  submerged in it bouncing around and seeming to have no trouble not
  choking or drowning.

  A variation of it was also researched as a blood substitute for some
  special medical needs.  I don't know how far that went.  I know it
  is not all fantasy because I saw the live mouse.

 I believe you.  I saw a similar scene in a movie, so I already knew it
 had to be true.  Bonus points for anyone that can add to this thread's
 collection of off-topic but semi-interesting trivia and name the movie.

  I didn't try the blood substitute.

 How do you save a drowning mouse?
 Use mouse to mouse resuscitation.

 Thanks, I'll be here all week.  Try the veal instead.

If the freezer doesn't work, I suggest finding an identical drive and
swapping its electronics board onto the failed one. That has worked for
many damaged drives.


Regards,
Sebastian Seidl

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to freebsd-questions-unsubscr...@freebsd.org


Re: hard disk failure - now what?

2009-08-27 Thread Mark Stapper
Gary Gatten wrote:
 Naw, I don't recall the POST error exactly, but from what I remember it
 couldn't find a boot device.  Could've been the controller, but from
 what I recall I swapped the drive (later) and all was good.  I really
 don't recall though - I could've put the bad drive in a good laptop
 and fixed it that way - really don't recall details.  Wish I could fix
 some other problems by throwing them in a freezer!
   
Some try to solve their marital problems with a freezer... and an axe ;-)





Re: hard disk failure - now what?

2009-08-26 Thread cpghost
On Mon, Aug 24, 2009 at 02:51:41PM -0600, Tim Judd wrote:
  Buy spinrite, no matter what.
 
 It's OS/FS independent.  It works on the bits stored on the magnetic
 platters, NOT on a filesystem.  TiVo, Linux, BSD and Mac OSX drives
 are treated the same.  Bits on a magnetic platter.  Its recovery
 stems from the randomization and movement of the head to the sector in
 question that allows it to salvage any bits it can (for example, other
 recovery will abandon 512bytes if 1 bit cannot be read.  spinrite will
 recover 512bytes-1bit to a hard drive's spare sector once spinrite
 says "I'm done working with this sector.")  It leads to a very
 good success rate.

(Disclaimer: I'm not familiar with spinrite.)

512bytes-1bit may be read back, but you can't be sure that those are
the correct bytes! IIRC, sectors are usually protected by some kind of
ECC. Simply ignoring the ECC and reading raw magnetic data will all
too often result in corrupt sectors.

Of course, if you have out-of-band error correction or at least error
detection mechanisms (like .PAR or md5/sha1 checksums), raw magnetic
recovery is better than nothing, if you're desperate.
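
A minimal sketch of such an out-of-band check, assuming a digest of the
file was recorded while the data was still known to be good. The digest
helper and the scratch file names are made up for illustration, SHA-256
stands in for md5/sha1, and the one-byte overwrite only simulates what a
raw read past a failed ECC might hand back.

```sh
#!/bin/sh
# digest FILE: print the SHA-256 of FILE (hypothetical helper).
digest() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1    # GNU coreutils
    else
        sha256 -q "$1"                    # FreeBSD base system
    fi
}

work=$(mktemp -d)
printf 'important data\n' > "$work/original"
known_good=$(digest "$work/original")     # recorded before the failure

# Stand-in for a raw recovery that got one byte wrong.
cp "$work/original" "$work/recovered"
printf 'X' | dd of="$work/recovered" bs=1 seek=3 conv=notrunc 2>/dev/null

if [ "$(digest "$work/recovered")" = "$known_good" ]; then
    status=intact
else
    status=corrupt
fi
echo "recovered copy is $status"
```

Without the recorded digest there would be no way to tell the two files
apart by looking at the recovered copy alone.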

-cpghost.

-- 
Cordula's Web. http://www.cordula.ws/


Re: hard disk failure - now what?

2009-08-26 Thread Jerry
On Wed, 26 Aug 2009 18:10:38 +0200
cpghost cpgh...@cordula.ws wrote:

 On Mon, Aug 24, 2009 at 02:51:41PM -0600, Tim Judd wrote:
   Buy spinrite, no matter what.
  
  It's OS/FS independent.  It works on the bits stored on the magnetic
  platters, NOT on a filesystem.  TiVo, Linux, BSD and Mac OSX drives
  are treated the same.  Bits on a magnetic platter.  Its recovery
  stems from the randomization and movement of the head to the sector
  in question that allows it to salvage any bits it can (for example,
  other recovery will abandon 512bytes if 1 bit cannot be read.
  spinrite will recover 512bytes-1bit to a hard drive's spare sector
  once spinrite says "I'm done working with this sector.")  It leads
  to a very good success rate.
 
 (Disclaimer: I'm not familiar with spinrite.)
 
 512bytes-1bit may be read back, but you can't be sure that those are
 the correct bytes! IIRC, sectors are usually protected by some kind of
 ECC. Simply ignoring the ECC and reading raw magnetic data will all
 too often result in corrupt sectors.
 
 Of course, if you have out-of-band error correction or at least error
 detection mechanisms (like .PAR or md5/sha1 checksums), raw magnetic
 recovery is better than nothing, if you're desperate.
 
 -cpghost.

I have used Spinrite several times with excellent results. In fact, I
recently used it to recover a Laptop drive that had become unusable.

Spinrite tries to turn off ECC if possible. It is not the cheapest
product; however, it works better than anything else I have tried on
bonked discs. Use it at its highest recovery level and it will recover
the drive, although it may take a while.

http://www.grc.com/intro.htm

-- 
Jerry
ges...@yahoo.com

Lord, defend me from my friends; I can account for my enemies.

Charles D'Hericault


Re: hard disk failure - now what?

2009-08-26 Thread Roland Smith
On Tue, Aug 25, 2009 at 11:46:50PM -0600, Kelly Martin wrote:
 plugging the drive in and accessing it, I heard those tell-tale signs
 of hard drive failure: clicks and pops and other unusual noises, so I
 know that it has some damage. I hate those sounds, having heard them
 on failing drives too many times before.

If the drive is that bad, it is doubtful that dd or ddrescue will be able to
get a good copy.

  My question: what kind of checks and/or repair tools should I run on
  the damaged drive after it's mounted?
 
  As others have mentioned, first make a copy (with the disk unmounted) of the
  partitions on that disk with dd, saving them to another drive. That way you
  can experiment with the data without further deterioration of the
  original.
 
 I ran dd and it took over 20 hours to complete. In fact it just
 finished this evening, after running all day. Lots of FAILURE errors
 were reported along the way, enough to fill two console screens or
 more. And of course to complicate things I didn't have a spare drive
 as an output device that was the *same size*, so I used a smaller
 drive thinking that it wouldn't matter since the source drive wasn't
 full anyway. I have no idea if data is scattered around on the FFS
 filesystem such that cloning a mostly empty, larger drive onto
 something smaller might lose data... I searched Google and couldn't
 find the answer, so I proceeded anyway. It doesn't matter now though,
 as I have a new drive now and another plan.

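Using dd you make a block-for-block copy; dd doesn't know about filesystems,
so a smaller target simply cuts the copy off where the target ends, whether
or not the filesystem had data beyond that point.
You could pipe the output from dd through a compression program like gzip or
bzip2. That could yield a smaller image. But you'd have to uncompress it in
order to use it.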
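
A rough sketch of the compressed-image idea, run against a scratch file
instead of a real device so it can be tried safely. The file names are
made up; on the real machine the source would be something like
/dev/ad4s1f. Note that unused blocks on a real disk are not necessarily
zeroes, so the saving may be much smaller than in this demo.

```sh
#!/bin/sh
work=$(mktemp -d)
src="$work/fake-partition"    # stand-in; on the real system e.g. /dev/ad4s1f

# Create a mostly empty 1 MiB "partition" so the demo needs no real disk.
dd if=/dev/zero of="$src" bs=64k count=16 2>/dev/null

# Block copy through gzip; long runs of zeroes compress very well.
dd if="$src" bs=64k 2>/dev/null | gzip -c > "$work/part.img.gz"

# The image has to be uncompressed again before it can be used or checked.
if gzip -dc "$work/part.img.gz" | cmp -s - "$src"; then
    roundtrip=ok
else
    roundtrip=mismatch
fi

raw_size=$(($(wc -c < "$src")))
gz_size=$(($(wc -c < "$work/part.img.gz")))
echo "raw: $raw_size bytes, compressed: $gz_size bytes, roundtrip: $roundtrip"
```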

Or you could try just copying the filesystems separately. E.g. copy from
ad4s1f instead of the whole ad4. That way you can split the data over several
files which you can store in different places.
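
Something along these lines, again with scratch files standing in for the
devices. Only ad4s1f comes from the example above; ad4s1a and ad4s1d are
assumed names for illustration, and devdir would be /dev on the real
system.

```sh
#!/bin/sh
work=$(mktemp -d)
devdir="$work/dev"            # stand-in for /dev
outdir="$work/images"         # could just as well be spread over several disks
mkdir "$devdir" "$outdir"

# Fake partitions so the loop can be tried without touching a real disk.
for part in ad4s1a ad4s1d ad4s1f; do
    dd if=/dev/zero of="$devdir/$part" bs=4k count=4 2>/dev/null
done

copied=0
for part in ad4s1a ad4s1d ad4s1f; do
    # One image file per filesystem instead of one for the whole ad4.
    if dd if="$devdir/$part" of="$outdir/$part.img" bs=64k 2>/dev/null; then
        copied=$((copied + 1))
    fi
done
echo "copied $copied partition images"
```

Copying the most important filesystem first means the least-needed ones
are the ones at risk if the drive gives out halfway.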

 I'm going to try dd a second time, but this time I'll use ddrescue as
 some people suggested and I'll make the target drive an
 identical-sized 500 Gbyte drive, which I purchased today. I imagine it
 will take a long time to create this cloned disk... hopefully with
 fewer errors than dd gave me, though we'll see.
 
I hope you get a good copy, but it doesn't sound too likely. I'm not a hardware
expert, but if the disk is really breaking down in the hardware or
electronics, it is not inconceivable that even reading might further
deteriorate it. If you do not get a good 1:1 copy, you'll have extra errors in
your data! Depending on the options you give dd, it will either skip blocks
with errors or fill them with zeroes or other characters. See the part of the
dd manual page that describes the 'noerror' conversion.
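
For illustration, a small sketch of the noerror conversion combined with
sync, run on a scratch file (the file names are made up, and a scratch
file will not actually produce read errors). The block size of 512 is an
assumption: a smaller block means less good data is thrown away around
each bad spot, at the cost of speed.

```sh
#!/bin/sh
work=$(mktemp -d)
src="$work/fake-disk"         # stand-in for e.g. /dev/ad4s1f
img="$work/copy.img"

# 8 sectors of 512 bytes filled with 'A' so the copy can be checked.
dd if=/dev/zero bs=512 count=8 2>/dev/null | tr '\000' 'A' > "$src"

# noerror: keep going after a read error instead of aborting.
# sync:    pad every short or failed input block with NUL bytes up to bs,
#          so later data stays at its original offset in the image.
dd if="$src" of="$img" bs=512 conv=noerror,sync 2>/dev/null

img_size=$(($(wc -c < "$img")))
echo "image size: $img_size bytes"
```

With noerror alone, a failed block is left out of the output and everything
after it shifts, which is usually worse for a filesystem image than a block
of zeroes in the right place.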

 Indeed some of the partitions seem to be beyond repair. In particular
 my /var partition is totally fubar'ed. When using fsck_ffs I got all
 sorts of errors when trying to repair the partition, things like:
 
 BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE
 So I used the -b option suggested in the man page, fsck_ffs -y -b 160
 /dev/ad0s1d and it ran and fixed a few things, but then stopped with
 the following error:
 
 fsck_ufs: cannot alloc 4294967292 bytes for inoinfo

The meaning of the errors is explained in Appendix A of "Fsck - The UNIX File
System Check Program". You can find it as
/usr/share/doc/smm/03.fsck/paper.ascii.gz
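
One observation on the inoinfo message, offered as my own reading rather
than something taken from that paper: the requested size is exactly four
bytes short of 2^32, which is what -4 looks like as an unsigned 32-bit
number. That points to a size computed from a damaged on-disk value rather
than a genuine memory requirement. The arithmetic is easy to check (the
variable names are just for illustration):

```sh
#!/bin/sh
requested=4294967292          # from the fsck_ufs message
limit=4294967296              # 2^32
short_by=$((limit - requested))
hex=$(printf '%X' "$requested")
echo "2^32 - $requested = $short_by"
echo "hex: 0x$hex"
```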

 MySQL databases are normally stored in /var/db/mysql. But then I
 remembered my MySQL server was actually running in a Jail environment,
 and therefore it was located at /usr/jails/myjail/var/db/mysql instead
 of /var/db/mysql, and therefore the jailed MySQL database was on a
 totally different partition. Lucky! And I was also very lucky that I
 could mount the large /usr partition in read-only mode and copy off
 the most critical files I needed, starting with the database. No
 errors on that part of the disk so far, at least with the few critical
 files I've copied over. Whew!

Congratulations!
 
 Until just a few minutes ago I didn't think there'd be a happy ending.
 But I've got the most critical data copied over now, the rest can
 wait. I'm going to go run dd a second time (well, ddrescue) now and
 then start work on the copy once it finishes, in a day or two.

Time to start thinking about a solid backup strategy as well. :-)


Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: hard disk failure - now what?

2009-08-26 Thread George Davidovich
On Wed, Aug 26, 2009 at 08:07:41PM +0200, Roland Smith wrote:
 On Tue, Aug 25, 2009 at 11:46:50PM -0600, Kelly Martin wrote:
  plugging the drive in and accessing it, I heard those tell-tale
  signs of hard drive failure: clicks and pops and other unusual
  noises, so I know that it has some damage. I hate those sounds,
  having heard them on failing drives too many times before.
 
 If the drive is that bad, it is doubtful that dd or ddrescue will be
 able to get a good copy.

Probably true.  I hesitate to suggest this, but sticking the drive in a
freezer (preferably in a ziplock bag) for a few hours or overnight
might help.  Stories from people claiming "I swear it works!" go back
years.  

To the extent it does work, it might give Kelly enough time to attempt
recovery.  If more time is required, he can try and find a creative
workaround for the 5 meter max length for USB cables.  Also,
experimenting with dry ice or acetone baths might prove to be
interesting, or at least educational. ;-)

-- 
George


Re: hard disk failure - now what?

2009-08-26 Thread Roland Smith
On Wed, Aug 26, 2009 at 12:13:48PM -0700, George Davidovich wrote:
snip
  If the drive is that bad, it is doubtful that dd or ddrescue will be
  able to get a good copy.
 
 Probably true.  I hesitate to suggest this, but sticking the drive in a
 freezer (preferably in a ziplock bag) for a few hours or overnight
 might help.  Stories from people claiming "I swear it works!" go back
 years.  

Interesting.

 To the extent it does work, it might give Kelly enough time to attempt
 recovery.  If more time is required, he can try and find a creative
 workaround for the 5 meter max length for USB cables.  Also,
 experimenting with dry ice or acetone baths might prove to be
 interesting, or at least educational. ;-)

Acetone and electronics are _not_ a good mix! Acetone is extremely
flammable. It evaporates easily and can form explosive mixtures in air over a
wide range of concentrations. Not to mention that it would degrade/destroy
printed circuit boards; acetone breaks down the resin that binds the glass
fibers in the laminates! Not as fast as N-Methyl-2-pyrrolidone, but fast
enough.

I remember this special non-conductive 3M fluid that can be used to cool
electronics. A group of hackers dunked a complete PC minus the case and power
supply in this stuff. The fluid itself was cooled with liquid nitrogen. They
overclocked it something wicked. Not very practical though. :-)

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: hard disk failure - now what?

2009-08-26 Thread Jerry McAllister
On Wed, Aug 26, 2009 at 10:23:47PM +0200, Roland Smith wrote:

 On Wed, Aug 26, 2009 at 12:13:48PM -0700, George Davidovich wrote:
 snip
   If the drive is that bad, it is doubtful that dd or ddrescue will be
   able to get a good copy.
  
  Probably true.  I hesitate to suggest this, but sticking the drive in a
  freezer (preferably in a ziplock bag) for a few hours or overnight
  might help.  Stories from people claiming "I swear it works!" go back
  years.  
 
 Interesting.
 
  To the extent it does work, it might give Kelly enough time to attempt
  recovery.  If more time is required, he can try and find a creative
  workaround for the 5 meter max length for USB cables.  Also,
  experimenting with dry ice or acetone baths might prove to be
  interesting, or at least educational. ;-)
 
 
 I remember this special non-conductive 3M fluid that can be used to cool
 electronics. A group of hackers dunked a complete PC minus the case and power
 supply in this stuff. The fluid itself was cooled with liquid nitrogen. They
 overclocked it something wicked. Not very practical though. :-)

A number of supercomputers from Cray and Control Data and maybe some
other places used this sort of thing on some experimental systems.  I
don't know if any were ever put into commercial production.  They submerged
whole boards in it and then supercooled the fluid.   I don't remember
the chemical names.  

The fluid was a relative of Freon and held sufficient levels of oxygen 
to support lung breathers.  They used to have a tank with a live mouse 
submerged in it, bouncing around and seeming to have no trouble with 
choking or drowning.  A variation of it was also researched as a blood 
substitute for some special medical needs.  I don't know how far that 
went.  I know it is not all fantasy because I saw the live mouse.   
I didn't try the blood substitute.

jerry


 
 Roland
 -- 
 R.F.Smith   http://www.xs4all.nl/~rsmith/
 [plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
 pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: hard disk failure - now what?

2009-08-26 Thread Polytropon
On Wed, 26 Aug 2009 12:13:48 -0700, George Davidovich free...@optimis.net 
wrote:
 Probably true.  I hesitate to suggest this, but sticking the drive in a
 freezer (preferably in a ziplock bag) for a few hours or overnight
 might help.  Stories from people claiming "I swear it works!" go back
 years.  

I heard a similar suggestion from a guy who tried to get the
protection code out of a car radio. :-)



 To the extent it does work, it might give Kelly enough time to attempt
 recovery.  If more time is required, he can try and find a creative
 workaround for the 5 meter max length for USB cables. 

5 meters? I always thought USB is specified for 2 meters only.
I've never seen a 5 meters long USB cable, by the way.





-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: hard disk failure - now what?

2009-08-26 Thread George Davidovich
On Wed, Aug 26, 2009 at 04:45:40PM -0400, Jerry McAllister wrote:
 On Wed, Aug 26, 2009 at 10:23:47PM +0200, Roland Smith wrote:
 
  On Wed, Aug 26, 2009 at 12:13:48PM -0700, George Davidovich wrote:
  I remember this special non-conductive 3M fluid that can be used to
  cool electronics. A group of hackers dunked a complete PC minus the
  case and power supply in this stuff. The fluid itself was cooled
  with liquid nitrogen. They overclocked it something wicked. Not very
  practical though. :-)
 
 A number of supercomputers from Cray and Control Data and maybe some
 other places used this sort of thing on some experimental systems.  I
 don't know if any were ever put into commercial production.  They
 submerged whole boards in it and then supercooled the fluid.   I
 don't remember the chemical names.  

I do, but have no idea why.

http://en.wikipedia.org/wiki/Perfluorohexane

 The fluid was a relative of Freon and held sufficient levels of oxygen 
 to support lung breathers.  They used to have a tank with a live mouse 
 submerged in it bouncing around and seeming to have no trouble not 
 choking or drowning.  

 A variation of it was also researched as a blood substitute for some
 special medical needs.  I don't know how far that went.  I know it
 is not all fantasy because I saw the live mouse.   

I believe you.  I saw a similar scene in a movie, so I already knew it
had to be true.  Bonus points for anyone that can add to this thread's
collection of off-topic but semi-interesting trivia and name the movie. 

 I didn't try the blood substitute.

How do you save a drowning mouse?
Use mouse to mouse resuscitation.

Thanks, I'll be here all week.  Try the veal instead.

-- 
George


Re: hard disk failure - now what?

2009-08-26 Thread Scott Schappell

On Aug 26, 2009, at 14:14:51, George Davidovich wrote:

I believe you.  I saw a similar scene in a movie, so I already knew it
had to be true.  Bonus points for anyone that can add to this thread's
collection of off-topic but semi-interesting trivia and name the  
movie.


What is The Abyss for 1000, Alex?

:)

Scott


RE: hard disk failure - now what?

2009-08-26 Thread Gary Gatten
I had a laptop years ago that started to die, but seemed to work OK when
first removed from a cold car.  After an hour or so it would die.  I
eventually put it in the freezer long enough to get what I needed off
the drive, so in some cases I would agree that cold is good!

-Original Message-
From: owner-freebsd-questi...@freebsd.org
[mailto:owner-freebsd-questi...@freebsd.org] On Behalf Of Polytropon
Sent: Wednesday, August 26, 2009 4:13 PM
To: George Davidovich
Cc: freebsd-questions@freebsd.org
Subject: Re: hard disk failure - now what?

On Wed, 26 Aug 2009 12:13:48 -0700, George Davidovich
free...@optimis.net wrote:
 Probably true.  I hesitate to suggest this, but sticking the drive in a
 freezer (preferably in a ziplock bag) for a few hours or overnight
 might help.  Stories from people claiming "I swear it works!" go back
 years.  

I heard a similar suggestion from a guy who tried to get the
protection code out of a car radio. :-)



 To the extent it does work, it might give Kelly enough time to attempt
 recovery.  If more time is required, he can try and find a creative
 workaround for the 5 meter max length for USB cables. 

5 meters? I always thought USB is specified for 2 meters only.
I've never seen a 5 meters long USB cable, by the way.





-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...





This email is intended to be reviewed by only the intended recipient
 and may contain information that is privileged and/or confidential.
 If you are not the intended recipient, you are hereby notified that
 any review, use, dissemination, disclosure or copying of this email
 and its attachments, if any, is strictly prohibited.  If you have
 received this email in error, please immediately notify the sender by
 return email and delete this email from your system.



Re: hard disk failure - now what?

2009-08-26 Thread Jerry McAllister
On Wed, Aug 26, 2009 at 02:14:51PM -0700, George Davidovich wrote:

  
  A number of supercomputers from Cray and Control Data and maybe some
  other places used this sort of thing on some experimental systems.  I
  don't know if any were ever put into commercial production.  They
  submerged whole boards in it and then supercooled the fluid.   I
  don't remember the chemical names.  
 
 I do, but have no idea why.
 
 http://en.wikipedia.org/wiki/Perfluorohexane
 
  The fluid was a relative of Freon and held sufficient levels of oxygen 
  to support lung breathers.  They used to have a tank with a live mouse 
  submerged in it bouncing around and seeming to have no trouble not 
  choking or drowning.  
 
  A variation of it was also researched as a blood substitute for some
  special medical needs.  I don't know how far that went.  I know it
  is not all fantasy because I saw the live mouse.   
 
 I believe you.  I saw a similar scene in a movie, so I already knew it
 had to be true.  Bonus points for anyone that can add to this thread's
 collection of off-topic but semi-interesting trivia and name the movie. 

I vaguely remember a movie with it in, but I saw it in
person at Cray headquarters back when.

 
  I didn't try the blood substitute.
 
   How do you save a drowning mouse?
   Use mouse to mouse resuscitation.
 
 Thanks, I'll be here all week.  Try the veal instead.

Only with the asparagus.

jerry

 
 -- 
 George


Re: hard disk failure - now what?

2009-08-26 Thread Polytropon
On Wed, 26 Aug 2009 16:30:59 -0500, Gary Gatten ggat...@waddell.com wrote:
 I had a laptop years ago that started to die, but seemed to work OK when
 first removed from a cold car.  After an hour or so it would die.  I
 eventually put it in the freezer long enough to get what I needed off
 the drive, so in some cases I would agree that cold is good!

That really sounds like a thermal problem (defective cooling)...



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


RE: hard disk failure - now what?

2009-08-26 Thread Gary Gatten
Naw, I don't recall the POST error exactly, but from what I remember it
couldn't find a boot device.  Could've been the controller, but from
what I recall I swapped the drive (later) and all was good.  I really
don't recall though - I could've put the bad drive in a good laptop
and fixed it that way - really don't recall details.  Wish I could fix
some other problems by throwing them in a freezer!

-Original Message-
From: Polytropon [mailto:free...@edvax.de] 
Sent: Wednesday, August 26, 2009 5:54 PM
To: Gary Gatten
Cc: George Davidovich; freebsd-questions@freebsd.org
Subject: Re: hard disk failure - now what?

On Wed, 26 Aug 2009 16:30:59 -0500, Gary Gatten ggat...@waddell.com
wrote:
 I had a laptop years ago that started to die, but seemed to work OK when
 first removed from a cold car.  After an hour or so it would die.  I
 eventually put it in the freezer long enough to get what I needed off
 the drive, so in some cases I would agree that cold is good!

That really sounds like a thermal problem (defective cooling)...



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...







Re: hard disk failure - now what?

2009-08-26 Thread Polytropon
On Wed, 26 Aug 2009 20:07:41 +0200, Roland Smith rsm...@xs4all.nl wrote:
 If the drive is that bad, it is doubtful that dd or ddrescue will be able to
 get a good copy.

There's an additional problem: Let's assume dd creates a 1:1 copy
of the file system in its actual state - nobody guarantees that
this file system is fully intact, or can be repaired.

I have (!) the problem myself: I got the dd copy of the partition
holding my home directory just fine, but the file system itself is
damaged to such an extent that fsck_ffs cannot repair it. At least, I
could get data off it - EXCEPT my home directory, sadly. But that's
not a (physical) disk problem, but a file system related one.



 Using dd you make a block-for-block copy; dd doesn't know about filesystems.
 You could pipe the output from dd through a compression program like gzip or
 bzip2. That could yield a smaller image. But you'd have to uncompress it in
 order to use it.

I'm often told that hard disks are cheap today, and it's much
more relaxing operating on a plain image than on a compressed
one.




 Or you could try just copying the filesystems separately. E.g. copy from
 ad4s1f instead of the whole ad4. That way you can split the data over several
 files which you can store in different places.

That is the encouraged method. If you have separate file systems,
that's close to the ideal case. For example, you don't need
to mess around with a 20 GB /tmp partition if you are happy
to lose its data anyway.



 I hope you get a good copy, but it doesn't sound too likely. I'm not a 
 hardware
 expert, but if the disk is really breaking down in the hardware or
 electronics, it is not inconceivable that even reading might further
 deteriorate it.

With hardware defects that cause growing problems,
it's wise to get the data (1st) as fast as possible and (2nd)
as accurately as possible - before the disk completely dies.

Even after that, it's still possible to recover data, e.g. by
mounting the platters in another drive unit. But if the platters
themselves are defective...


 If you do not get a good 1:1 copy, you'll have extra errors in
 your data! Depending on the options you give dd, it will either skip blocks
 with errors or fill them with zeroes or other characters. See the part of the
 dd manual page that describes the 'noerror' conversion.

As far as I remember, dd_rescue or ddrescue can handle such
problems. In case of errors, they retry and keep reading.
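
A sketch of how that usually looks with GNU ddrescue, assuming it is
installed (this is not dd_rescue, which has a different command line). The
scratch file, the image and mapfile names and the retry count of 3 are all
made up for illustration; on the real system the source would be the
failing device or partition.

```sh
#!/bin/sh
work=$(mktemp -d)
src="$work/fake-disk"         # stand-in for the failing device
img="$work/rescue.img"
map="$work/rescue.map"        # ddrescue records its progress here
dd if=/dev/zero of="$src" bs=64k count=16 2>/dev/null

if command -v ddrescue >/dev/null 2>&1; then
    # Pass 1: grab everything that reads easily and skip the slow phase
    # that picks at damaged areas.
    ddrescue -q -n "$src" "$img" "$map"
    # Pass 2: go back to the bad areas listed in the mapfile, 3 retry passes.
    ddrescue -q -r3 "$src" "$img" "$map"
    rescue_status=done
else
    rescue_status=skipped     # GNU ddrescue is not installed here
fi
echo "ddrescue run: $rescue_status"
```

Because the mapfile remembers what has already been read, an interrupted
run can be restarted with the same three arguments without rereading the
good areas.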



  fsck_ufs: cannot alloc 4294967292 bytes for inoinfo
 
 The meaning of the errors is explained in Appendix A of "Fsck - The UNIX File
 System Check Program". You can find it as
 /usr/share/doc/smm/03.fsck/paper.ascii.gz

When I tried to repair my defective partition in another system
with less RAM, I got a similar error:

cannot alloc 1073796864 bytes for inoinfo

The real (usual) error is

fsck_4.2bsd: bad inode number 306176 to nextinode

It seems that more RAM is needed to store information.



 Time to start thinking about a solid backup strategy as well. :-)

The correct time to do so is BEFORE you start storing data. :-)



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: hard disk failure - now what?

2009-08-26 Thread Al Plant

Gary Gatten wrote:

I had a laptop years ago that started to die, but seemed to work OK when
first removed from a cold car.  After an hour or so it would die.  I
eventually put it in the freezer long enough to get what I needed off
the drive, so in some cases I would agree that cold is good!

-Original Message-
From: owner-freebsd-questi...@freebsd.org
[mailto:owner-freebsd-questi...@freebsd.org] On Behalf Of Polytropon
Sent: Wednesday, August 26, 2009 4:13 PM
To: George Davidovich
Cc: freebsd-questions@freebsd.org
Subject: Re: hard disk failure - now what?

On Wed, 26 Aug 2009 12:13:48 -0700, George Davidovich
free...@optimis.net wrote:

Probably true.  I hesitate to suggest this, but sticking the drive in a
freezer (preferably in a ziplock bag) for a few hours or overnight
might help.  Stories from people claiming "I swear it works!" go back
years.  


I heard a similar suggestion from a guy who tried to get the
protection code out of a car radio. :-)




To the extent it does work, it might give Kelly enough time to attempt
recovery.  If more time is required, he can try and find a creative
workaround for the 5 meter max length for USB cables. 


5 meters? I always thought USB is specified for 2 meters only.
I've never seen a 5 meters long USB cable, by the way.






Aloha,

Off topic, but very funny as well as interesting.

I have a USB cable, bought online, that is 15 meters long. I have used
it with a small video camera and it works OK.



~Al Plant - Honolulu, Hawaii -  Phone:  808-284-2740
  + http://hawaiidakine.com + http://freebsdinfo.org +
  + http://aloha50.net   - Supporting - FreeBSD 6.* - 7.* - 8.* +
   email: n...@hdk5.net 
All that's really worth doing is what we do for others.- Lewis Carroll



Re: hard disk failure - now what?

2009-08-26 Thread Roland Smith
On Thu, Aug 27, 2009 at 01:03:58AM +0200, Polytropon wrote:
 On Wed, 26 Aug 2009 20:07:41 +0200, Roland Smith rsm...@xs4all.nl wrote:
  If the drive is that bad, it is doubtful whether dd or ddrescue will be able to
  get a good copy.
 
 There's an additional problem: Let's assume dd creates a 1:1 copy
 of the file system in its actual state - nobody guarantees that
 this file system is fully intact, or can be repaired.

Certainly. If filesystem data is missing, there is only so much that fsck_ufs
can do about it.
 
  Using dd you make a block-for-block copy; dd doesn't know about filesystems.
  You could pipe the output from dd through a compression program like gzip or
  bzip2. That could yield a smaller image. But you'd have to uncompress it in
  order to use it.
 
 I'm often told that hard disks are cheap today, and it's much
 more relaxing operating on a plain image than on a compressed
 one.

Of course. But if you are operating under restricted space constraints...
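
A minimal sketch of the compressed-image idea, assuming plain POSIX sh, dd and
gzip. The function names and the stand-in image file are made up for
illustration; on a real system the source would be the unmounted partition
device rather than a file.

```shell
#!/bin/sh
# image_compressed SRC OUT.gz
# Stream a block-for-block copy of SRC through gzip into OUT.gz.
image_compressed() {
    dd if="$1" bs=64k conv=noerror,sync | gzip -c > "$2"
}

# restore_compressed IN.gz DST
# Decompress back to a plain image before fsck or mdconfig can use it.
restore_compressed() {
    gzip -dc "$1" | dd of="$2" bs=64k
}

# Demonstration on a mostly-empty stand-in image (1 MiB of zeros).
workdir=$(mktemp -d)
dd if=/dev/zero of="$workdir/part.img" bs=64k count=16 2>/dev/null
image_compressed "$workdir/part.img" "$workdir/part.img.gz" 2>/dev/null
restore_compressed "$workdir/part.img.gz" "$workdir/restored.img" 2>/dev/null
ls -l "$workdir"
```

How much space this saves depends entirely on the data; unused areas that
were never written tend to compress well, already-compressed files do not.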

  I hope you get a good copy, but it doesn't sound too likely. I'm not a
  hardware expert, but if the disk is really breaking down in the hardware
  or electronics, it is not inconceivable that even reading might further
  deteriorate it.
 
 In case of hardware defects that cause growing problems,
 it's wise to get the data (1st) as fast as possible and (2nd)
 as accurate as possible - before the disk completely dies.

And (3rd) in as few tries as possible!

 In such a case, it's still possible to recover data, e. g. to
 mount the disks (the cylinders or platters) into another drive
 unit. But if the disks are defective themselves...

I wonder if that is still possible with current drives? My impression was
(from a paper that I can't locate ATM) that data densities are so high that it
is extremely difficult to read the data with a different arm/head assembly than
the one it was written with.

  Time to start thinking about a solid backup strategy as well. :-)
 
 The correct time to do so is BEFORE you start storing data. :-)

Very true! But since the lack of backups was what got the OP in this mess in
the first place...

Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)




Re: hard disk failure - now what?

2009-08-25 Thread perryh
Lowell Gilbert freebsd-questions-lo...@be-well.ilk.org wrote:
 Kelly Martin kellymar...@gmail.com writes:
  I just experienced a hard drive failure on one of my
  FreeBSD 7.2 production servers with no backup!
...
 First, try copying the entire disk, *without* mounting it.

Yep.

 Use dd(1) to get a copy of the whole disk.  I believe that
 conv=noerror may be necessary.

Much better:  use sysutils/ddrescue, which was written
specifically to deal with this sort of situation.


Re: hard disk failure - now what?

2009-08-25 Thread Jerry McAllister
On Mon, Aug 24, 2009 at 10:26:11PM +0200, Polytropon wrote:

 On Mon, 24 Aug 2009 12:29:19 -0600, Kelly Martin kellymar...@gmail.com 
 wrote:
  My question: what kind of checks and/or repair tools should I run on
  the damaged drive after it's mounted? Or should I mount it as
  read-only and start backing it up?
 
 Thou shalt not manipulate thy file systems while they are mounted. :-)
 Perform an fsck on the partitions first, then mount them ro. Copy
 the files you need.
 
 In case you can't reach essential files, you have the chance to
 use forensic tools to get them.
 
 Finally, keep in mind that for further diagnostics and restore
 operations it's always wise not to use the original file systems,
 i. e. the original disk. Make dd copies of the partitions onto
 a working disk and use them instead. Luckily, most operations
 work on plain files as well as on block device specials.

dd will barf on bad bits too.
You can tinker to make it skip over the bad block, but it
won't read it.   

jerry


 
  I am hoping most of my data is
  still there, but also don't want to damage it further.
 
 Good idea. This encourages you to follow the advice given above.
 
 
 
  I desperately
  need to salvage the data, what do the kind people on this list
  recommend?
 
 BACKUPS!!! =^_^=
 
 
 
 -- 
 Polytropon
 Magdeburg, Germany
 Happy FreeBSD user since 4.0
 Andra moi ennepe, Mousa, ...


Re: hard disk failure - now what?

2009-08-25 Thread Lowell Gilbert
per...@pluto.rain.com writes:

 Lowell Gilbert freebsd-questions-lo...@be-well.ilk.org wrote:
 Kelly Martin kellymar...@gmail.com writes:
  I just experienced a hard drive failure on one of my
  FreeBSD 7.2 production servers with no backup!
 ...
 First, try copying the entire disk, *without* mounting it.

 Yep.

 Use dd(1) to get a copy of the whole disk.  I believe that
 conv=noerror may be necessary.

 Much better:  use sysutils/ddrescue, which was written
 specifically to deal with this sort of situation.

Excellent suggestion.
-- 
Lowell Gilbert, embedded/networking software engineer, Boston area
http://be-well.ilk.org/~lowell/


Re: hard disk failure - now what?

2009-08-25 Thread Polytropon
On Tue, 25 Aug 2009 11:04:38 -0400, Jerry McAllister jerr...@msu.edu wrote:
 dd will barf on bad bits too.
 You can tinker to make it skip over the bad block, but it
 won't read it.   

As it has been suggested, there are interesting tools in the
ports collection. I'll post my famous list again. Among them,
note ddrescue and dd_rescue. But base system tools such as the
fetch program can help.


System:
    dd
    fsck_ffs
    clri
    fsdb
    fetch -rR device
    recoverdisk (!)

Ports:
    ddrescue
    dd_rescue
    ffs2recov
    magicrescue
    testdisk
    The Sleuth Kit:
        fls
        dls
        ils
        autopsy
    scan_ffs
    recoverjpeg
    foremost
    photorec

Those programs are not ordered in any way.


-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: hard disk failure - now what?

2009-08-25 Thread Kelly Martin
First, thanks to everyone for the really great replies. Many
suggestions were quite helpful and have kept me on track. I'll quote a
couple of people and then add some comments below.

On Mon, Aug 24, 2009 at 4:32 PM, Roland Smith rsm...@xs4all.nl wrote:
 It _could_ just be a bad or improperly connected SATA cable. Try changing or
 re-seating the cable.

I thought of that too, but no luck.

 Read errors cannot damage your data, but write errors can! Immediately stop
 all writing to the disk. Re-mount the partitions on that disk as read-only, or
 unmount them.

That was a consensus among everyone who replied, so I made that step
#1. I mounted the partitions read-only and crossed my fingers. Trying
to check the integrity of the data, or even get directory listings was
another matter, as I got various strange errors... which told me I
quite likely had some data loss.

 To see if a disk really is broken, install sysutils/smartmontools, and run
 'smartctl -a' on the disk. If you see errors in its report (e.g. reallocated
 sectors), the disk is dying and should be unplugged to prevent it from getting
 worse.

That's a good idea and I'll try to use it in the future. After
plugging the drive in and accessing it, I heard those tell-tale signs
of hard drive failure: clicks and pops and other unusual noises, so I
know that it has some damage. I hate those sounds, having heard them
on failing drives too many times before.


 My question: what kind of checks and/or repair tools should I run on
 the damaged drive after it's mounted?

 As others have mentioned, first make a copy (with the disk unmounted) of the
 partitions on that disk with dd, saving them to another drive. That way you
 can experiment with the data without further deterioration of the
 original.

I ran dd and it took over 20 hours to complete. In fact it just
finished this evening, after running all day. Lots of FAILURE errors
were reported along the way, enough to fill two console screens or
more. And of course to complicate things I didn't have a spare drive
as an output device that was the *same size*, so I used a smaller
drive thinking that it wouldn't matter since the source drive wasn't
full anyway. I have no idea if data is scattered around on the FFS
filesystem such that cloning a mostly empty, larger drive onto
something smaller might lose data... I searched Google and couldn't
find the answer, so I proceeded anyway. It doesn't matter now though,
as I have a new drive now and another plan.

 You can use this disk image e.g. as a vnode-backed memory disk, see
 mdconfig(8). If you cannot get a good copy of the disk partitions it might be
 a good idea to get a quote from a professional hard drive data recovery
 company to do that for you. I've never had occasion to try this (hooray for
 backups) but I've heard it can be quite expensive. :-/

I'm going to try dd a second time, but this time I'll use ddrescue as
some people suggested and I'll make the target drive an
identical-sized 500 Gbyte drive, which I purchased today. I imagine it
will take a long time to create this cloned disk... hopefully with
fewer errors than dd gave me, though we'll see.
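
A rough sketch of what a two-pass ddrescue session could look like. The flags
follow the example in the current GNU ddrescue manual and may differ in the
2009 port; /dev/ad4, /dev/ad6 and rescue.map are hypothetical names. The
function only prints the commands, so the source and target can be
double-checked before anything touches a disk.

```shell
#!/bin/sh
# ddrescue_plan SRC DST MAPFILE
# Prints (does not run) a two-pass GNU ddrescue session.
ddrescue_plan() {
    src=$1 dst=$2 map=$3
    # Pass 1: copy everything that reads easily. -n skips the slow
    # scraping of bad areas; -f is needed when DST is a device.
    echo "ddrescue -f -n $src $dst $map"
    # Pass 2: return to the bad areas with direct disk access (-d) and
    # up to three retry passes (-r3). The mapfile records what has
    # already been rescued, so the run can be interrupted and resumed.
    echo "ddrescue -f -d -r3 $src $dst $map"
}

# Hypothetical device and file names, for illustration only.
ddrescue_plan /dev/ad4 /dev/ad6 rescue.map
```

Because the mapfile tracks progress, a second run does not have to re-read
the areas that were already copied, which matters on a drive that gets worse
with every read.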

 Try using fsck_ffs on (copies of) the disk image to see if that can restore
 the damage. If the damage is beyond repair for fsck_ffs, you have a real
 problem. Of course, if you have a good disk image, your data is still
 there, but you might have to use a forensics program like sysutils/sleuthkit
 or hexdump to try and piece files together. And even then you cannot be sure
 that there is no corrupted data in the files themselves. Good luck with that. 
 :-(

Indeed some of the partitions seem to be beyond repair. In particular
my /var partition is totally fubar'ed. When using fsck_ffs I got all
sorts of errors when trying to repair the partition, things like:

BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE

So I used the -b option suggested in the man page, fsck_ffs -y -b 160
/dev/ad0s1d, and it ran and fixed a few things, but then stopped with
the following error:

fsck_ufs: cannot alloc 4294967292 bytes for inoinfo
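
A quick bit of arithmetic on that number, as a sketch in POSIX sh (assuming a
shell with 64-bit arithmetic). The requested size sits just below 2^32, which
would be consistent with a small negative value read out of a damaged on-disk
field and treated as an unsigned 32-bit size. That reading is an inference
from the number alone, not something the error message states.

```shell
#!/bin/sh
# The size fsck_ufs tried to allocate, taken from the error message.
requested=4294967292
# 2^32, the number of distinct values in a 32-bit field.
two32=4294967296

# Distance below 2^32.
echo "2^32 - requested = $((two32 - requested))"
# The same bit pattern read as a signed 32-bit integer.
echo "as signed 32-bit: $((requested - two32))"
```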

The worst part of all is that the /var partition would normally be
okay to lose if it didn't have my MySQL database on it - the most
important data on the server. I just about choked down a golf ball
when I discovered my /var partition was in such rough shape and I
might be forced to use real recovery tools, or hire a professional for
$$$, or be out-of-luck.

MySQL databases are normally stored in /var/db/mysql. But then I
remembered my MySQL server was actually running in a Jail environment,
and therefore it was located at /usr/jails/myjail/var/db/mysql instead
of /var/db/mysql, and therefore the jailed MySQL database was on a
totally different partition. Lucky! And I was also very lucky that I
could mount the large /usr partition in read-only mode and copy off
the most critical files I needed, starting with the database. No
errors on that part of the disk so 

Re: hard disk failure - now what?

2009-08-24 Thread Tim Judd
On 8/24/09, Kelly Martin kellymar...@gmail.com wrote:
 I just experienced a hard drive failure on one of my FreeBSD 7.2
 production servers with no backup! I am so mad at myself for not
 backing up!! Now it's a salvage operation. Here are the type of errors
 I was getting on the console, over-and-over:

 ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=441633503
 ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout -
 completing request directly
 ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout -
 completing request directly
 ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
 ad4: FAILURE - WRITE_DMA48 timed out LBA=441633375
 g_vfs_done():ad4s1f[WRITE(offset=216338284544, length=16384)]error = 5

 I could still login to the machine (after an eternity) but got lots of
 read/write errors along the way.  The offset shown in the errors kept
 changing, so I thought it was a hardware eSATA controller issue
 instead of a bad sector on the drive -  I replaced the motherboard,
 but the problem persisted. So I bought a new hard drive and have
 re-installed FreeBSD 7.2 on it. I'd like to plug in the old hard drive
 today, mount it and salvage as much as I can... especially the
 database files, config files, etc.

 My question: what kind of checks and/or repair tools should I run on
 the damaged drive after it's mounted? Or should I mount it as
 read-only and start backing it up? I am hoping most of my data is
 still there, but also don't want to damage it further. I desperately
 need to salvage the data, what do the kind people on this list
 recommend?

 thanks,
 kelly


If I were you, I'd get a copy of spinrite (from grc.com) and always keep
it handy.  It can be risky on a drive already failing.  Here's what
I'd do:

Buy spinrite, no matter what.

slave the bad drive, read-only mount..  even if the FS is dirty,
read-only.. no fsck.
copy the data you can (if any).
reboot and run spinrite on the bad drive, deepest analysis (level 4 or
5) [may take days or weeks; there are even reports of months]
re-slave the bad drive to the system, fsck and mount read-only.
compare and copy any additional data, if any/if applicable, you can.

Scrap/destroy the drive if it has sensitive data.  I crack open the
drive and dismantle the HDD platters from the spindle, break the
read-write head ribbon cable, and remove the circuit board on the
drive when I destroy drives.

Each component should be recycled (being the responsible citizen),
maybe on separate runs to remove the possibility of someone nosy
getting into your stuff.


Re: hard disk failure - now what?

2009-08-24 Thread Lowell Gilbert
Kelly Martin kellymar...@gmail.com writes:

 I just experienced a hard drive failure on one of my FreeBSD 7.2
 production servers with no backup! I am so mad at myself for not
 backing up!! Now it's a salvage operation. Here are the type of errors
 I was getting on the console, over-and-over:

 ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=441633503
 ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout -
 completing request directly
 ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout -
 completing request directly
 ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
 ad4: FAILURE - WRITE_DMA48 timed out LBA=441633375
 g_vfs_done():ad4s1f[WRITE(offset=216338284544, length=16384)]error = 5

 I could still login to the machine (after an eternity) but got lots of
 read/write errors along the way.  The offset shown in the errors kept
 changing, so I thought it was a hardware eSATA controller issue
 instead of a bad sector on the drive -  I replaced the motherboard,
 but the problem persisted. So I bought a new hard drive and have
 re-installed FreeBSD 7.2 on it. I'd like to plug in the old hard drive
 today, mount it and salvage as much as I can... especially the
 database files, config files, etc.

 My question: what kind of checks and/or repair tools should I run on
 the damaged drive after it's mounted? Or should I mount it as
 read-only and start backing it up? I am hoping most of my data is
 still there, but also don't want to damage it further. I desperately
 need to salvage the data, what do the kind people on this list
 recommend?

First, try copying the entire disk, *without* mounting it.  Use dd(1) to
get a copy of the whole disk.  I believe that conv=noerror may be necessary.
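
A minimal sketch of such a dd invocation, assuming POSIX sh. The function
name is made up, and an ordinary file stands in for the failing disk; on the
real system the source would be the unmounted device (something like
/dev/ad4, a hypothetical name here). Adding conv=sync alongside noerror pads
unreadable blocks with NULs so the data after them stays at the right offset.

```shell
#!/bin/sh
# copy_with_noerror SRC DST
#   conv=noerror  keep going after a read error instead of stopping
#   conv=sync     pad each short or failed input block with NULs so
#                 later data keeps its original offset in the copy
copy_with_noerror() {
    dd if="$1" of="$2" bs=64k conv=noerror,sync
}

# Demonstration on a 256 KiB file standing in for the disk.
workdir=$(mktemp -d)
head -c 262144 /dev/urandom > "$workdir/disk.img"
copy_with_noerror "$workdir/disk.img" "$workdir/copy.img" 2>/dev/null
if cmp -s "$workdir/disk.img" "$workdir/copy.img"; then
    echo "copy matches source"
fi
```

A smaller bs loses less data around each bad spot but makes the copy slower;
64k is only a starting point, not a recommendation from the thread.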

-- 
Lowell Gilbert, embedded/networking software engineer, Boston area
http://be-well.ilk.org/~lowell/


Re: hard disk failure - now what?

2009-08-24 Thread Polytropon
On Mon, 24 Aug 2009 12:29:19 -0600, Kelly Martin kellymar...@gmail.com wrote:
 My question: what kind of checks and/or repair tools should I run on
 the damaged drive after it's mounted? Or should I mount it as
 read-only and start backing it up?

Thou shalt not manipulate thy file systems while they are mounted. :-)
Perform an fsck on the partitions first, then mount them ro. Copy
the files you need.

In case you can't reach essential files, you have the chance to
use forensic tools to get them.

Finally, keep in mind that for further diagnostics and restore
operations it's always wise not to use the original file systems,
i. e. the original disk. Make dd copies of the partitions onto
a working disk and use them instead. Luckily, most operations
work on plain files as well as on block device specials.



 I am hoping most of my data is
 still there, but also don't want to damage it further.

Good idea. This encourages you to follow the advice given above.



 I desperately
 need to salvage the data, what do the kind people on this list
 recommend?

BACKUPS!!! =^_^=



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: hard disk failure - now what?

2009-08-24 Thread Polytropon
On Mon, 24 Aug 2009 14:13:22 -0600, Tim Judd taj...@gmail.com wrote:
 If I were you, get a copy of spinrite (from grc.com) and always keep
 it handy.  It can be risky on a drive already failing.  Here's what
 I'd do
 
 Buy spinrite, no matter what.

Is it really such a good tool? From my own problems, I researched
that common recovery tools are R-Studio and UFS Explorer. Both
do not natively run on BSD, but the first one offers a bootable
CD. Without buying, you can run the full diagnostics mode.
For recovery, you need to buy the program.

The Spinrite web page reads as follows:

The industry's #1 hard drive data recovery
software is NOW COMPATIBLE with NTFS,
FAT, Linux, and ALL OTHER file systems!

What? Linux and other file systems?

Is this just marketing, in order to look good to the not very
educated ones? Or do they not know what they're talking about?

In fact, I will keep an eye on this program. Maybe it can help me
get my data back (inode defect of $HOME entry). I'm reading their
web page some more right now.



 slave the bad drive, read-only mount..  even if the FS is dirty,
 read-only.. no fsck.

You can at least do one fsck run without any modification options,
like a read only file system check. This of course can - like
any read operation on the disk - be risky if the disk is fast
degrading, simply by using it.





-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: hard disk failure - now what?

2009-08-24 Thread Tim Judd
On 8/24/09, Polytropon free...@edvax.de wrote:
 On Mon, 24 Aug 2009 14:13:22 -0600, Tim Judd taj...@gmail.com wrote:
 If I were you, get a copy of spinrite (from grc.com) and always keep
 it handy.  It can be risky on a drive already failing.  Here's what
 I'd do

 Buy spinrite, no matter what.

 Is it really such a good tool? From my own problems, I researched
 that common recovery tools are R-Studio and UFS Explorer. Both
 do not natively run on BSD, but the first one offers a bootable
 CD. Without buying, you can run the diagnostics mode fullwise.
 For recovery, you need to buy the program.

 The Spinrite web page reads as follows:

   The industry's #1 hard drive data recovery
   software is NOW COMPATIBLE with NTFS,
   FAT, Linux, and ALL OTHER file systems!

It's OS/FS independent.  It works on the bits stored on the magnetic
platters, NOT on a filesystem.  TiVo, Linux, BSD and Mac OS X drives
are treated the same.  Bits on a magnetic platter.  Its recovery
stems from the randomization and movement of the head to the sector in
question, which allows it to salvage any bits it can (for example, other
recovery tools will abandon 512 bytes if 1 bit cannot be read; spinrite
will recover 512 bytes minus 1 bit to one of the hard drive's spare
sectors once spinrite says "I'm done working with this sector").  It
leads to a very high success rate.


 What? Linux and other file systems?

 Is this just marketing, in order to look good to the not very
 educated ones? Or do they not know what they're talking about?

 In fact, I will keep an eye on this program. Maybe it can help me
 get my data back (inode defect of $HOME entry). I'm reading their
 web page some more right now.


Again, works on the bits.  If it's a bit problem, it will do its best
to fix the problem, unless it's a hardware defect and cannot be
relocated.  If enough sectors are relocated, and the drive has run out
of spare sectors, it's time to scrap the drive anyway.


 slave the bad drive, read-only mount..  even if the FS is dirty,
 read-only.. no fsck.

 You can at least do one fsck run without any modification options,
 like a read only file system check. This of course can - like
 any read operation on the disk - be risky if the disk is fast
 degrading, simply by using it.


Which is why I recommend against making changes to the disk until a
spinrite run has completed.


Personally, I set up spinrite to be net-bootable (not officially
supported).  I can write a walkthrough for people who want to net-boot
it.  I won't provide spinrite, of course.


I currently netboot:
  FreeBSD
  memtest86
  spinrite

with no changes to my setup any time I want to boot anything.



 --
 Polytropon
 Magdeburg, Germany
 Happy FreeBSD user since 4.0
 Andra moi ennepe, Mousa, ...



Re: hard disk failure - now what?

2009-08-24 Thread Polytropon
On Mon, 24 Aug 2009 14:51:41 -0600, Tim Judd taj...@gmail.com wrote:
 It's OS/FS independent.  it works on the bits stored on the magnetic
 platters, NOT on a filesystem.

Ah, I see. So it's primarily intended for diagnosing and recovering
from physically defective disks. Good to know, because there are
times when you need to do exactly this. So it's much more hardware
oriented than the usual candidates for recovery programs.

So the strange mention of Linux and other file systems just
seems to be of a marketing nature. :-)





-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: hard disk failure - now what?

2009-08-24 Thread Tim Judd
On 8/24/09, Polytropon free...@edvax.de wrote:
 On Mon, 24 Aug 2009 14:51:41 -0600, Tim Judd taj...@gmail.com wrote:
 It's OS/FS independent.  it works on the bits stored on the magnetic
 platters, NOT on a filesystem.

 Ah, I see. So it's primarily intended for diagnosing and recovering
 from physically defective disks. Good to know, because there are
 times when you exactly need to do this. So it's much more hardware
 oriented than the usual candidates for recovery programs.

 So the strange mentioning of Linux and other file systems just
 seems to be of a marketing nature. :-)

Whatever you would like to call it, I find it an accurate description of
the product and it avoids false advertising.


Not just diagnostics and recovery, it's for preventive maintenance
and healthy operations too.  Most people who use it are in a
diagnostics-and-recovery situation, but if you always use it as preventive
maintenance, you'll never need to use it for diagnostics and recovery.


People complain about it: "I keep running spinrite, but it never finds
problems!"  Exactly, it's doing its job and not having to
recover.  It's doing the work the drive needs to swap out bad sectors
and everything.



 --
 Polytropon
 Magdeburg, Germany
 Happy FreeBSD user since 4.0
 Andra moi ennepe, Mousa, ...



Re: hard disk failure - now what?

2009-08-24 Thread Polytropon
On Mon, 24 Aug 2009 15:32:05 -0600, Tim Judd taj...@gmail.com wrote:
 Not just diagnostics and recovery, it's for preventive maintenance,
 and healthy operations too.  Most people who use it are in a
 diagnostics and recovery, but if you always use it as preventive
 maintenance, you'll never need to use it for diagnostics and recovery.
 
 People complain about it: I keep running spinrite, but it never finds
 problems!  exactly, it's doing it's job and not having to
 recover.  It's doing the work the drive needs to swap out bad sectors
 and everything.

Well, and its price is not as high as most recovery tools.
So prevention is cheaper than intervention here. :-)



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


Re: hard disk failure - now what?

2009-08-24 Thread Roland Smith
On Mon, Aug 24, 2009 at 12:29:19PM -0600, Kelly Martin wrote:
 I just experienced a hard drive failure on one of my FreeBSD 7.2
 production servers with no backup! I am so mad at myself for not
 backing up!!

Welcome to the club. :-)

 Now it's a salvage operation. Here are the type of errors
 I was getting on the console, over-and-over:
 
 ad4: TIMEOUT - WRITE_DMA48 retrying (0 retries left) LBA=441633503
 ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout -
 completing request directly
 ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout -
 completing request directly
 ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
 ad4: FAILURE - WRITE_DMA48 timed out LBA=441633375
 g_vfs_done():ad4s1f[WRITE(offset=216338284544, length=16384)]error = 5

It _could_ just be a bad or improperly connected SATA cable. Try changing or
re-seating the cable.

Read errors cannot damage your data, but write errors can! Immediately stop
all writing to the disk. Re-mount the partitions on that disk as read-only, or
unmount them.

To see if a disk really is broken, install sysutils/smartmontools, and run
'smartctl -a' on the disk. If you see errors in its report (e.g. reallocated
sectors), the disk is dying and should be unplugged to prevent it from getting
worse.
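
As a sketch of what to look for in that report, the small filter below pulls
the raw reallocated-sector count out of smartctl's attribute table. It
assumes the usual ten-column table layout; the function name, the device
name, and the sample row are made up for illustration and are not output from
any real drive.

```shell
#!/bin/sh
# reallocated_count
# Read a smartctl attribute table on stdin and print the raw value
# (10th column) of attribute 5, Reallocated_Sector_Ct.
reallocated_count() {
    awk '$2 == "Reallocated_Sector_Ct" { print $10 }'
}

# Intended use (device name is hypothetical):
#   smartctl -a /dev/ad4 | reallocated_count
# Demonstration on one made-up table row:
printf '%s\n' '  5 Reallocated_Sector_Ct   0x0033   098   098   036    Pre-fail  Always       -       17' | reallocated_count
```

A non-zero count that keeps growing between runs is the usual sign of a drive
on its way out; a single stable value is less clear-cut.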

 My question: what kind of checks and/or repair tools should I run on
 the damaged drive after it's mounted?

As others have mentioned, first make a copy (with the disk unmounted) of the
partitions on that disk with dd, saving them to another drive. That way you
can experiment with the data without further deterioration of the
original. You can use this disk image e.g. as a vnode-backed memory disk, see
mdconfig(8). If you cannot get a good copy of the disk partitions it might be
a good idea to get a quote from a professional hard drive data recovery
company to do that for you. I've never had occasion to try this (hooray for
backups) but I've heard it can be quite expensive. :-/

Try using fsck_ffs on (copies of) the disk image to see if that can restore
the damage. If the damage is beyond repair for fsck_ffs, you have a real
problem. Of course, if you have a good disk image, your data is still
there, but you might have to use a forensics program like sysutils/sleuthkit
or hexdump to try and piece files together. And even then you cannot be sure
that there is no corrupted data in the files themselves. Good luck with that. 
:-(
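
The image-based workflow described above could be sketched roughly as
follows. The function only prints the FreeBSD commands instead of running
them; var.img, md unit 0 and /mnt/rescue are hypothetical names, and the
sequence is one possible ordering rather than a tested recipe.

```shell
#!/bin/sh
# image_check_plan IMAGE UNIT MOUNTPOINT
# Prints (does not run) the commands for examining a copy of a
# partition image instead of the failing disk itself.
image_check_plan() {
    img=$1 unit=$2 mnt=$3
    # Work on a copy so a failed repair attempt can be started over.
    echo "cp $img $img.work"
    # Attach the copy as a vnode-backed memory disk, /dev/md<unit>.
    echo "mdconfig -a -t vnode -f $img.work -u $unit"
    # First a check that answers "no" to every repair question.
    echo "fsck_ffs -n /dev/md$unit"
    # Mount read-only to copy files off.
    echo "mount -o ro /dev/md$unit $mnt"
    # Clean up when finished.
    echo "umount $mnt"
    echo "mdconfig -d -u $unit"
}

# Hypothetical image, unit number and mount point.
image_check_plan var.img 0 /mnt/rescue
```

If the read-only check looks hopeful, a repairing fsck_ffs run can be tried
on the working copy, with the untouched image kept as a fallback.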


Roland
-- 
R.F.Smith   http://www.xs4all.nl/~rsmith/
[plain text _non-HTML_ PGP/GnuPG encrypted/signed email much appreciated]
pgp: 1A2B 477F 9970 BA3C 2914  B7CE 1277 EFB0 C321 A725 (KeyID: C321A725)

