On Fri, Oct 22, 2010 at 09:13, William Lutter <[email protected]> wrote:
> I have a desktop PC at work that shows a bad block. The PC runs Scientific
> Linux 5.0 and has a 2 TB WD Green HD (Caviar Green WD20000CSRTL). The drive
> has worked fine out of the box for several months. No problems.
>
> Yesterday, the SMART diagnostics program smartctl (version 5.36) showed a bad
> block. Deciding to waste some time on it, I used the
> http://smartmontools.sourceforge.net/badblockhowto.html approach.
>
> So, I unmounted, figured out the block and that it had a file associated with
> it, and determined the ext3 file system inode. But I could not deduce the
> file, as it could not read the next file inode. I zeroed out the position
> using dd, and on rerunning smartctl it showed another bad block:
>
> # 3  Extended offline   Completed: read failure      90%   2151   3764125871
> # 4  Short offline      Completed without error      00%   2151   -
> # 5  Short offline      Completed without error      00%   2150   -
> # 6  Short offline      Completed: read failure      90%   2146   3764125865
> # 7  Extended offline   Completed without error      00%   2097
>
> The LBA is in the one partition on the HD:
> Disk /dev/sdb: 2000.3 GB, 2000398934016 bytes
> 255 heads, 63 sectors/track, 243201 cylinders, total 3907029168 sectors
> Units = sectors of 1 * 512 = 512 bytes
>    Device Boot      Start         End      Blocks   Id  System
> /dev/sdb1              63  3907024064  1953512001   83  Linux
>
> Since it's a new HD and I was not expecting catastrophic failure, I did not
> run ddrescue. Having a copy of SpinRite around, I ran that and the HD came out
> squeaky clean. I use SpinRite occasionally on Windows XP and Linux HDs where
> I expect only one bad block. Never had problems with it. SpinRite did not
> find any more bad blocks. Of course, I had zeroed out the original one.
> Rebooting and running e2fsck, the file system is clean.
>
> Running smartctl again, I again find a bad block at LBA 3764125871:
> # 1  Extended offline   Completed: read failure      90%   2169   3764125871
> # 2  Short offline      Completed without error      00%   2169   -
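
For reference, the arithmetic behind the howto step quoted above (turning a failing LBA into a file system block number) can be sketched in Python. This is only a rough sketch, not the howto's own procedure: the function name is made up for illustration, and the 4096-byte ext3 block size is an assumption, since the original message does not state it (tune2fs -l reports the real value). The partition start and the LBAs are taken from the quoted fdisk and smartctl output.

```python
SECTOR_SIZE = 512  # bytes per sector, per the fdisk line "Units = sectors of 1 * 512"


def lba_to_fs_block(lba, part_start, fs_block_size=4096, sector_size=SECTOR_SIZE):
    """Map a whole-disk LBA to (file system block, sector offset inside that block).

    fs_block_size=4096 is an ASSUMPTION; the original post does not give it.
    """
    if lba < part_start:
        raise ValueError("LBA lies before the start of the partition")
    sectors_per_block = fs_block_size // sector_size
    # Quotient is the block number counted from the partition start,
    # remainder is the sector position inside that block.
    return divmod(lba - part_start, sectors_per_block)


if __name__ == "__main__":
    part_start, part_end = 63, 3907024064  # /dev/sdb1 Start and End from fdisk
    for lba in (3764125865, 3764125871):   # failing LBAs from the self-test log
        assert part_start <= lba <= part_end  # both fall inside /dev/sdb1
        block, offset = lba_to_fs_block(lba, part_start)
        print(f"LBA {lba} -> fs block {block}, sector {offset} within the block")
```

Under that block-size assumption, the two reported LBAs fall in adjacent file system blocks (470515725 and 470515726), so the second failure sits right next to the spot that was zeroed.
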

My understanding of SMART is that once an event occurs it cannot be cleaned
up, so smartctl is going to 'see' a bad block until the disk drive is
replaced. Basically, the bad block might have been remapped or not 'used',
but the onboard counters only go up, not down. [That history is kept because
it could be indicative of other failures that might occur soon.] Every time
I have had this sort of issue with a drive, I just had to replace the drive.

-- 
Stephen J Smoogen.
“The core skill of innovators is error recovery, not failure avoidance.”
Randy Nelson, President of Pixar University.
"We have a strategic plan. It's called doing things."
Herb Kelleher, founder Southwest Airlines
