Robert Hancock wrote:


Kuan Luo wrote:
Hi Robert,
One customer reported that their system received an NMI after
issuing "dd if=/dev/sdb of=/dev/null" on a defective disk in rhel4u6.
I tested it and found that my system hung on both rhel4u6 (2.6.9-67) and
2.6.24-rc7.
The patch works well, but I am not sure whether it has other
potential effects on ADMA.
I attached a file in case the lines get broken.

The info below, from Gunther Mayer, describes how to reproduce the issue.
"
I used a Seagate ST3500841NS 3.AE for my test; probably other Seagate drives are also capable of creating media errors with the new hdparm-8.1:
- compile hdparm-8.1
- hdparm --yes-i-know-what-i-am-doing --make-bad-sector 60000 /dev/sdb
Unfortunately this does not succeed on the nvidia sata controller (timeouts
et al.), but it worked fine on an AHCI machine (e.g. FSC R640).
When I insert this newly created defective disk in an Ultra 20, it reboots within seconds after issuing "dd if=/dev/sdb of=/dev/null".
"

Signed-off-by: [EMAIL PROTECTED]

---
drivers/ata/sata_nv.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/ata/sata_nv.c b/drivers/ata/sata_nv.c
index ed5473b..e824260 100644
--- a/drivers/ata/sata_nv.c
+++ b/drivers/ata/sata_nv.c
@@ -837,9 +837,10 @@ static void nv_adma_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
        all shortly be aborted anyway. We assume that NCQ commands are not
        issued via passthrough, which is the only way that switching into
        ADMA mode could abort outstanding commands. */
-    nv_adma_register_mode(ap);
+    struct nv_adma_port_priv *pp = ap->private_data;
- ata_tf_read(ap, tf);
+    if (pp->flags & NV_ADMA_PORT_REGISTER_MODE)
+        ata_tf_read(ap, tf);
 }
 static unsigned int nv_adma_tf_to_cpb(struct ata_taskfile *tf, __le16 *cpb)

This is basically avoiding the switch into register mode, right? I don't think this is a very good solution: the point of the tf_read function is to read the taskfile provided by the drive in order to diagnose the error, so skipping that read isn't a good thing.

Agree with this analysis -- if ->tf_read() is being called, then obviously the core wants a current copy of the device's ATA registers.

It is not a good solution to simply avoid returning meaningful data, because -- as Robert notes -- we need tf_read for analysis.

        Jeff
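
To make the objection above concrete, here is a minimal userspace sketch of the two behaviours. It is not the sata_nv code: every name in it (mock_port, mock_taskfile, MOCK_PORT_REGISTER_MODE, tf_read_switching, tf_read_conditional, read_reports_error) is hypothetical, and the status/error values are only illustrative. It assumes a port that starts out in ADMA mode and a device whose registers hold an error. It does not model the NMI or hang that the patch was trying to avoid, only what the caller gets back from the read.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, not the real libata or sata_nv definitions. */
#define MOCK_PORT_REGISTER_MODE 0x1u

struct mock_taskfile {
    uint8_t status; /* ATA status register */
    uint8_t error;  /* ATA error register */
};

struct mock_port {
    unsigned int flags;       /* MOCK_PORT_REGISTER_MODE set when in register mode */
    struct mock_taskfile dev; /* what the device's registers currently hold */
};

typedef void (*tf_read_fn)(struct mock_port *, struct mock_taskfile *);

/* Behaviour before the patch: force register mode, then read. */
void tf_read_switching(struct mock_port *p, struct mock_taskfile *tf)
{
    assert(p != NULL && tf != NULL);
    p->flags |= MOCK_PORT_REGISTER_MODE;
    *tf = p->dev;
}

/* Behaviour with the patch: read only if already in register mode. */
void tf_read_conditional(struct mock_port *p, struct mock_taskfile *tf)
{
    assert(p != NULL && tf != NULL);
    if (p->flags & MOCK_PORT_REGISTER_MODE)
        *tf = p->dev;
}

/* Returns 1 if the caller ends up seeing the device's error state. */
int read_reports_error(tf_read_fn read_fn)
{
    /* Port starts in ADMA mode (flag clear); device reports an error.
       0x51 and 0x40 are illustrative status/error values. */
    struct mock_port port = {
        .flags = 0,
        .dev = { .status = 0x51, .error = 0x40 },
    };
    struct mock_taskfile tf;

    memset(&tf, 0, sizeof(tf));
    read_fn(&port, &tf);
    return tf.status == 0x51 && tf.error == 0x40;
}
```

In this mock, read_reports_error(tf_read_switching) returns 1, while read_reports_error(tf_read_conditional) returns 0: with the port still in ADMA mode the conditional variant leaves the caller's taskfile untouched, so the error handler has nothing to analyse.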


