At 2004-08-01T03:27:02+1200, Volker Kuhlmann wrote:
> Yes. That's pretty much the only disk-internal info available.

Yes, thus what I said originally... the disk's sector remapping is
invisible outside of the disk.

> It would be possible for the disk to remap on read problems, but not on
> a final read failure. It's not necessarily possible to remap on write
> problems unless the disk did a read afterwards, but that would seriously
> impair performance.

No.  Think about it.  If you read the sector and get an error, where are
you going to get a good copy of the data from so that you can do sector
remapping and write the data out to a good sector?  On write failures,
the drive has the data in a buffer while it's writing--if the write
fails, it can do a remapping quite easily.  Performance is hardly a
consideration in this sort of recovery case--are you really going to
notice that writing a sector to disk took 100ms instead of whatever it
usually takes?  No, particularly when there are two or three layers of
buffers between the user and the disk.

> I've mentioned what was relevant. I can't tell from the logs exactly
> what day the disk started to play up and on what day I updated the
> kernel (security update from 2.4 to 2.4). In any case an updated kernel
> seems an unlikely cause for disk surface errors. In real life I don't
> spend lots of time investigating the 0.0001% probabilities first. I had
> made clear that the problem started before I upgraded to SuSE 9.1.

If you go back and read your original post, you gave no information at
all about what had changed on your system recently.

> I don't need to spend heaps of time with debuganyfs, or in fact any
> time, to find that file first - the disk was kind enough to tell me the
> LBA before I even started. My exercise with dd was to verify that

Having the LBA address won't tell you which file is affected.

And you're right, you don't need to spend heaps of time with debug.*fs
if you've stopped caring which file was affected after asking how to
find out... at which point this whole thread becomes pointless.

> Re. kernel turning off DMA: whether I put more than one device on an
> IDE bus is my decision. Yes it affects performance. Yes drives can be
> a mutual hindrance. Yes it can make IDE bus debugging more difficult.
> And yes, it makes zero difference when there's a problem with reading
> something from a magnetic surface.

Of course it's your decision.  But don't complain when the bus and all
attached devices are reset.  It is well known that IDE behaves this way.
Complaining about it isn't going to make it any better.

> Turning DMA off during read errors on initially booting the system is
> smart, because if the hardware can't do DMA the kernel won't boot

How, exactly, will the kernel know when the boot sequence has begun and
ended?  While the kernel boots there is little to no I/O done to your
disks.  Once the kernel boots and hands control over to init the system
is up as far as the kernel is concerned.

Turning DMA off during recovery makes the recovery easier and more
likely to succeed.  This fact has been determined by people who have a
clue about how IDE works.  If you really think they're wrong, produce a
patch and send it to LKML.

> [...deleted rest of clueless rant about IDE error recovery...]

Cheers,
-mjg
-- 
Matthew Gregan                     |/
                                  /|                [EMAIL PROTECTED]
