At 2004-08-08T18:06:56+1200, Volker Kuhlmann wrote:
> In my case, that sector was already written several times, but still
> causes the kernel to log errors every second time it's read. For some
> reason the drive appears to refuse to reallocate it (as the read
> problems don't stop).

Ah, it's that sort of failure.  As I suggested before, and since you
already know the LBA of the failing sector(s), just work out the
filesystem block(s) that are affected and add them to the filesystem bad
block list... hopefully that'll allow you to work around the problem
area of the media.
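
For what it's worth, the arithmetic can be sketched roughly as below.
This is only an illustration in Python: the LBA and partition start are
made-up numbers, and I'm assuming 512 byte sectors (the unit 'fdisk -lu'
reports in) and a 4096 byte filesystem block size, so check both against
your own drive and filesystem before trusting the result.

```python
SECTOR_SIZE = 512  # bytes per sector; an assumption, matches 'fdisk -lu' units


def lba_to_fs_block(lba, part_start, fs_block_size=4096, sector_size=SECTOR_SIZE):
    """Convert an absolute LBA into a block number relative to the
    start of the partition that holds the filesystem."""
    if lba < part_start:
        raise ValueError("LBA lies before the start of the partition")
    # Byte offset of the sector within the partition, then integer
    # division by the filesystem block size.
    return (lba - part_start) * sector_size // fs_block_size


# Hypothetical values: failing LBA 1000063, partition starting at sector 63.
print(lba_to_fs_block(1000063, 63))  # 125000
```

Several consecutive LBAs map onto the same filesystem block, so a handful
of failing sectors may well come down to a single entry in the list.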

> It's difficult to make mistakes there. Use -b 512 and stick in the
> numbers from fdisk -lu.

Mistakes happen.  I didn't know exactly what you were running, so it
seemed worth suggesting that badblocks might have been invoked
incorrectly, especially given the warnings in the man page.

What you've posted will "work", but (to be pedantic) your '-b' parameter
is potentially incorrect--I'd be surprised if your filesystem was using
512 byte blocks.  Read the warning in the man page regarding the use of
bad block lists produced by badblocks with an incorrect block size
setting.
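
To show why the mismatch matters, here is a small Python sketch that
rescales block numbers counted in 512 byte units into 4096 byte units.
The sizes and the sample list are assumptions for illustration only, and
it presumes the numbers are relative to the start of the same partition
the filesystem lives on.

```python
def rescale_bad_blocks(blocks, from_size=512, to_size=4096):
    """Map block numbers counted in from_size byte units onto
    to_size byte units, dropping duplicates."""
    if from_size <= 0 or to_size <= 0:
        raise ValueError("block sizes must be positive")
    # Each block number becomes a byte offset, then the index of the
    # larger block that contains that offset.
    return sorted({b * from_size // to_size for b in blocks})


# Hypothetical list as produced with '-b 512'.
print(rescale_bad_blocks([8, 9, 15, 16]))  # [1, 2]
```

Fed to the filesystem unconverted, the same list would mark blocks 8, 9,
15 and 16 as bad, none of which is where the damage actually is.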

In your case, you might not care if the reported bad blocks are
incorrect because you're using badblocks for testing--in that case,
there's no need to specify the block size at all.

You didn't specify what device you're running badblocks on.  Since you
mentioned using the numbers from 'fdisk -lu', I suspect you're running
it against part of the whole drive rather than a specific partition.
Given that you're specifying the drive's sector size as the block size,
that _might_ result in badblocks scanning the correct part of the drive,
but it seems like a strange way to run badblocks.

Run badblocks against the affected partition (feel free to slim down the
area it tests using the start/end block parameters from there) rather
than "part of" the whole disk.  Use multiple passes if you haven't tried
that, and do a destructive write test with '-t random' and a largish
number of passes if you can and haven't tried already.

Note that this is general advice for using badblocks, since in your case
you already know which areas of the disk are bad due to the errors
logged by SMART and the kernel.

> Not strange at all.

Well, it's not _as_ strange when you explain exactly what's going on.

> badblocks verifies within seconds, at most minutes, after writing. The

Tried a large number of passes (-p)?

> magnetic surface holds the data for that long, ergo nothing to report
> by badblocks. When I do a dd if=/dev/hda of=/dev/null an hour later,
> the recording has sufficiently degraded for the drive to need several
> attempts to read it. At least that's how I interpret the kernel error,
> though I'm always open to better/more correct explanations
> (technically sound ones...).

Seems like a reasonable explanation.  It's difficult to offer many
others given that you provide additional bits of information about
what's going on in dribs and drabs.

-mjg
-- 
Matthew Gregan                     |/
                                  /|                [EMAIL PROTECTED]
