Hi all,

(For those on both lists, I also posted this on DC-404.)

I just had an experience with the hard drive on my laptop that I thought I'd 
share.  I think I created some bad sectors on my hdd without meaning to, and
then I figured out how to fix them.  Note that I'm referring to soft
errors only, which do not involve head crashes, failed servos, failed heads, or 
failed controllers, or anything else physically wrong with the drive.

The hdd on my laptop is a Seagate hybrid hdd (with some flash cache).  (I don't 
know what the acronym is ... SSHD?)  Anyway, it's fundamentally a spinning 
drive.  It has about 1 year of run time on it and has never given me a problem.
The other day, my CrystalDiskInfo disk monitoring utility gave me an alert
that said 8 sectors were about to be reallocated.  The SMART C5 and C6
attributes, current pending sector count and uncorrectable sector count, were
both set to 8.  I was bothered by the prospect of replacing a virtually new drive.
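
For anyone who wants to watch those two counters from a script, here is a
minimal sketch.  It assumes an attribute table laid out the way smartmontools'
`smartctl -A` prints it (not CrystalDiskInfo, which is what raised the alert
above), and the sample text is made up for illustration, not captured from the
drive in question.  C5 and C6 in hex are attribute IDs 197 and 198 in decimal.

```python
# Sketch: pull the reallocated (5), pending (0xC5 = 197) and uncorrectable
# (0xC6 = 198) raw counts out of a smartctl-style attribute table.
# SAMPLE is illustrative text, not real output from any drive.

SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       8
"""

# Attribute IDs we care about, mapped to short (hypothetical) names.
WATCHED = {5: "reallocated", 197: "pending", 198: "uncorrectable"}


def parse_counts(text):
    """Return {name: raw_value} for the watched attributes found in text."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not a 10-column attribute row.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id in WATCHED:
            counts[WATCHED[attr_id]] = int(fields[9])  # RAW_VALUE column
    return counts


def needs_attention(counts):
    """True if any sector is pending reallocation or flagged uncorrectable."""
    return counts.get("pending", 0) > 0 or counts.get("uncorrectable", 0) > 0


counts = parse_counts(SAMPLE)
print(counts)
print(needs_attention(counts))
```

With the sample above, the pending and uncorrectable counts both come back as
8 while the reallocated count stays at 0, which is the pattern described here.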

The first thing I do when a drive starts throwing bad sectors is run Spinrite 
on it.  (Fireproof suit on.)  My main intent here is not to discuss Spinrite,
per se, but to discuss drive failure modes.  I know from research and personal 
experience that Spinrite can often recover otherwise unreadable sectors.  It 
has very advanced statistical methods which can recover partial sectors in many 
cases, and I know of no other program that can do this.  But, alas, in this
case, although it said it recovered most of the data, it could not declare
these sectors fully recovered.  Keep in mind that the OS was ALREADY about to
give up on these sectors.

In the mode that I use Spinrite in, it does an exhaustive surface analysis by 
reading each sector, inverting and writing the data, reading it again, and 
inverting and writing it again.  This verifies whether every possible sector 
can or cannot accept data and retain it.  It maps out the results on a 
graphical map on the screen.  At the time of this writing, I've scanned about
2/3 of this 750 GB drive.  I'm watching the drive map as the results come in,
and every one of the ~125 million sectors scanned, other than those 8, is
working perfectly for both reading and writing.
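
The read, invert, write, read, invert, write pass described above can be
pictured with a small simulation.  This is only a sketch on an in-memory
buffer, not Spinrite's actual code; the 512-byte sector size and the function
names are assumptions made for the example.

```python
# Sketch of a read / invert / write / read / invert / write pass over one
# sector, simulated on a bytearray standing in for the disk surface.

SECTOR_SIZE = 512  # assumed sector size for this simulation


def invert(data):
    """Flip every bit in a block of bytes."""
    return bytes(b ^ 0xFF for b in data)


def exercise_sector(disk, index):
    """Test one sector in place; return True if it held both bit patterns.

    On success the sector ends up with exactly its original contents.
    """
    start = index * SECTOR_SIZE
    end = start + SECTOR_SIZE

    original = bytes(disk[start:end])           # read
    flipped = invert(original)
    disk[start:end] = flipped                   # invert and write
    if bytes(disk[start:end]) != flipped:       # read again and compare
        return False

    disk[start:end] = invert(flipped)           # invert and write again
    return bytes(disk[start:end]) == original   # data is back as found


disk = bytearray(range(256)) * 8    # 2048 bytes = 4 simulated sectors
before = bytes(disk)
results = [exercise_sector(disk, i) for i in range(4)]
print(results, bytes(disk) == before)
```

Because every bit is written as both a 0 and a 1 during the pass, a sector that
comes through unchanged has shown it can accept and retain either value.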

So, I'm thinking to myself, it's a fairly new drive.  The overwhelming majority 
of the drive is working flawlessly.  There are no mechanical noises that are 
unusual.  So, I'm thinking and hoping that these are soft errors.  Yes, it 
could have been a minor head crash, but I treat my laptop gingerly.  It could 
be some magnetic coating decay, but you'd think there'd be evidence elsewhere 
on the disk.  Yes it could be a bad servo, but the drive is relatively new.

So, I ask myself, what kind of soft error could create some unreadable sectors?
A power failure comes to mind.  If this happens while writing, the magnetic
domains could be stuck in limbo, neither properly on nor off.  The result is
digital gibberish.  But, I didn't remember any power failures recently.  Then, 
it hits me.  The system became unresponsive a couple of weeks ago, and I forced
a power shutdown when Windows wouldn't shut down properly on its own.  I
decide to proceed on the premise that these 8 sectors were scrambled when I did 
that, and that there's nothing mechanically or electrically wrong with the 
drive.  (If I'm wrong, CrystalDiskInfo will likely show me more errors in the 
not too distant future.)

So, if you've scrambled your sectors to the point that the OS won't read them,
and neither will an advanced utility, how do you fix it?  Well, assuming
there's nothing electrically and mechanically wrong with the drive, you have to 
write to the sectors in question, and then thoroughly test them to make sure 
that they really can retain data.  Now, I didn't want to wipe the drive and 
reformat and restore all the data as that would be a major pain.

So, how do I write the sectors in place without destroying all my other 500 GB 
of data that's already there?  Well, it turns out that Spinrite does this by 
default.  For good sectors, it does the read invert write read invert write 
cycle and leaves the data exactly the way it found it.  For bad sectors, it 
statistically, and exhaustively, analyses an otherwise unreadable sector, which 
the OS CAN'T read anyway.  It determines what it thinks the data in the sector 
was.  Then it writes the data back to the sector, and verifies that the data 
was written correctly.  If it cannot write the data correctly, it relocates the 
sector to a spare.
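
To make the soft versus hard distinction concrete, here is a toy model of that
flow.  Everything in it is hypothetical (the class, the state names, the
placeholder "best guess" data); it is not how Spinrite or drive firmware is
implemented, just the decision logic described above in runnable form.

```python
# Toy model: each sector is "ok", "scrambled" (soft error: unreadable until
# it is rewritten) or "damaged" (hard error: writes do not stick).


class ToyDisk:
    def __init__(self, states):
        self.states = list(states)
        self.data = [b"original"] * len(self.states)
        self.reallocated = 0

    def pending(self):
        """Number of sectors that currently cannot be read."""
        return sum(1 for s in self.states if s != "ok")

    def read(self, i):
        if self.states[i] != "ok":
            raise OSError(f"sector {i} is unreadable")
        return self.data[i]

    def write(self, i, payload):
        """Write a sector; return False if the write does not stick."""
        if self.states[i] == "damaged":
            return False
        self.data[i] = payload
        self.states[i] = "ok"    # a clean write clears a soft error
        return True

    def reallocate(self, i, payload):
        """Swap in a spare sector for one that cannot be written."""
        self.data[i] = payload
        self.states[i] = "ok"
        self.reallocated += 1


def repair(disk, best_guess=b"best guess"):
    """Rewrite every sector in place, relocating those that fail to write."""
    for i in range(len(disk.states)):
        try:
            payload = disk.read(i)
        except OSError:
            payload = best_guess   # stand-in for whatever could be recovered
        if not disk.write(i, payload):
            disk.reallocate(i, payload)


soft = ToyDisk(["ok"] * 4 + ["scrambled"] * 8)
pending_before = soft.pending()
repair(soft)
print(pending_before, soft.pending(), soft.reallocated)

hard = ToyDisk(["ok", "damaged", "ok"])
repair(hard)
print(hard.pending(), hard.reallocated)
```

In the first case the 8 scrambled sectors become readable again with nothing
reallocated; in the second, the damaged sector is the only one that consumes a
spare.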

My point is that the uncorrectable sectors were automatically rewritten with
the data that Spinrite thought should be there, even though it acknowledged 
that some was lost.  This is not a problem, as, with any other utility, all of 
the sector data would have been lost.  But, the main point is that the sector 
data was written.  Therefore, it is NO LONGER scrambled and unreadable.  If my 
theory was right, those sectors on the disk should now be fully usable.  They 
may contain partial or erroneous data as far as the file system is concerned, 
but again, they would have been completely lost anyway.  Some application may 
barf because of them.  But the hdd itself should be usable.

So, how do I know for sure?  Spinrite has a final maximum mode of
operation that not only exhaustively does a surface analysis, but it also marks 
previously questionable sectors as usable again, assuming they pass these 
tests.  I rarely use this mode.  Most of the time, if I believe the hard drive 
really is developing bad sectors, I replace it.  But, in this case, I just 
didn't believe that.  So, I ran Spinrite in maximum mode and told it to restore 
all previously questionable but usable sectors to health.  To save time, I only 
ran it on the region that had failed before.

The previously unreadable sectors passed the surface analysis test with flying 
colors.  I rebooted the computer and checked CrystalDiskInfo.  It was perfectly 
happy, with all blue lights (blue is good) instead of yellow (yellow is bad).  
The reallocated sectors count had always been 0 and still was.  The pending
sector and uncorrectable sector counts had gone from 8 back to 0.  As far as I
know, my hdd is as functional as it ever was, will hopefully last much longer,
and was probably throwing fake sector errors because I had accidentally
scrambled the data in those sectors.

It is my theory that many hdd errors in the consumer space are of this type.
This is because consumers are very fond of forcing the computer to shut down
before it's ready.  There's even a prompt in Windows when you're shutting down
that says "xyz app is preventing your computer from shutting down" and there is 
a "force shutdown" button.  You should never press that unless you've waited a 
good while and think you have no choice.  What I did was worse, as I forcibly 
deprived the unit of power, although I thought I had no choice.

It's also possible that many soft but correctable errors occur in the server
space, errors that appropriate data scrubbing could fix but that instead cause
drives to be replaced unnecessarily.

So, that's how I created some bad sectors on my hard drive, then fixed them, 
all without destroying the data on the drive or replacing the drive.

Sincerely,

Ron



--

Sent from my Android Acer A500 tablet with bluetooth keyboard and K-9 Mail.
Please excuse my potential brevity if I'm typing on the touch screen.

(PS - If you email me and don't get a quick response, you might want to
call on the phone.  I get about 300 emails per day from alternate energy
mailing lists and such.  I don't always see new email messages very quickly.)

Ron Frazier
770-205-9422 (O)   Leave a message.
linuxdude AT techstarship.com
