Re: HSM violation errors on sata_promise

2007-12-27 Thread Mikael Pettersson
ction 0x2 > ata4.00: port_status 0x2008 > ata4.00: cmd c8/00:08:3f:00:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 in > res 50/00:00:46:00:00/00:00:00:00:00/e0 Emask 0x2 (HSM violation) This is a familiar issue, one that may be fixed or at least made less frequent in recent

Re: HSM violation errors

2007-12-25 Thread Robert Hancock
Jeff Mitchell wrote: I'm seeing errors in dmesg and the like. It appears to be somewhat similar to the issue reported here: http://kerneltrap.org/mailarchive/linux-kernel/2007/8/25/164711 except that my machine doesn't freeze, and everything seems normal -- hopefully nothing like silent corrupti

HSM violation errors

2007-12-24 Thread Jeff Mitchell
00:01:00:00/40 Emask 0x2 (HSM violation) ata1.00: cmd 61/08:10:a6:fb:c5/00:00:01:00:00/40 tag 2 cdb 0x0 data 4096 out res 50/00:08:46:4c:d4/00:00:01:00:00/40 Emask 0x2 (HSM violation) ata1.00: cmd 61/08:18:fe:00:c8/00:00:01:00:00/40 tag 3 cdb 0x0 data 4096 out res 50/00:08:46:4c:
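
The "cmd" lines in these reports are a hex dump of the taskfile that was issued. As a reading aid, here is a minimal sketch of decoding one, assuming the field order libata uses when printing (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and assuming an NCQ command (0x60 or 0x61), where the tag sits in bits 7:3 of nsect and the sector count in the feature fields. The names struct tf_dump, parse_tf, tf_lba48, tf_ncq_tag and tf_ncq_sectors are made up for this example and are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One taskfile dump as printed on a libata "cmd" line, in printed order:
 * command/feature:nsect:lbal:lbam:lbah/
 * hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
 * (hypothetical helper type, not a kernel structure) */
struct tf_dump {
    unsigned command, feature, nsect, lbal, lbam, lbah;
    unsigned hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
    unsigned device;
};

/* Parse the twelve hex bytes; returns 0 on success, -1 on malformed input. */
int parse_tf(const char *s, struct tf_dump *tf)
{
    assert(s != NULL && tf != NULL);
    int n = sscanf(s, "%2x/%2x:%2x:%2x:%2x:%2x/%2x:%2x:%2x:%2x:%2x/%2x",
                   &tf->command, &tf->feature, &tf->nsect,
                   &tf->lbal, &tf->lbam, &tf->lbah,
                   &tf->hob_feature, &tf->hob_nsect, &tf->hob_lbal,
                   &tf->hob_lbam, &tf->hob_lbah, &tf->device);
    return n == 12 ? 0 : -1;
}

/* Assemble the 48-bit LBA from the low and high-order-byte registers. */
uint64_t tf_lba48(const struct tf_dump *tf)
{
    return ((uint64_t)tf->hob_lbah << 40) | ((uint64_t)tf->hob_lbam << 32) |
           ((uint64_t)tf->hob_lbal << 24) | ((uint64_t)tf->lbah << 16) |
           ((uint64_t)tf->lbam << 8) | (uint64_t)tf->lbal;
}

/* For NCQ commands the queue tag is carried in bits 7:3 of nsect. */
unsigned tf_ncq_tag(const struct tf_dump *tf)
{
    return tf->nsect >> 3;
}

/* For NCQ commands the sector count is carried in the feature fields. */
unsigned tf_ncq_sectors(const struct tf_dump *tf)
{
    return (tf->hob_feature << 8) | tf->feature;
}
```

Fed the string 61/08:10:a6:fb:c5/00:00:01:00:00/40 from the report above, this gives tag 2 and 8 sectors, which agrees with the "tag 2" and "data 4096 out" printed on the same line.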

Re: [PATCH 02/15] libata: zero xfer length on ATAPI data xfer IRQ is HSM violation

2007-12-05 Thread Albert Lee
Tejun Heo wrote: > From: Albert Lee <[EMAIL PROTECTED]> > > Treat zero xfer length as HSM violation. While at it, add > unlikely()'s to ATAPI ireason and transfer length checks. > > tj: Formatted patch and added unlikely()'s. > > Signed-off-by: Albert

[PATCH 02/15] libata: zero xfer length on ATAPI data xfer IRQ is HSM violation

2007-12-04 Thread Tejun Heo
From: Albert Lee <[EMAIL PROTECTED]> Treat zero xfer length as HSM violation. While at it, add unlikely()'s to ATAPI ireason and transfer length checks. tj: Formatted patch and added unlikely()'s. Signed-off-by: Albert Lee <[EMAIL PROTECTED]> Signed-off-by: Tejun
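
To make the checks named in the patch description concrete, here is a minimal standalone sketch of validating an ATAPI data-transfer interrupt. It is not the patch itself: the function name check_atapi_data_irq, the enum and the parameter names are invented for this example, and the unlikely() macro is re-created locally because the kernel's definition is not available in user space. The bit meanings (CoD in bit 0 and IO in bit 1 of the interrupt reason, byte count split across two 8-bit registers) follow the ATAPI register definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Branch hint in the style of the kernel's unlikely(); a no-op elsewhere. */
#if defined(__GNUC__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif

#define ATAPI_COD (1u << 0)   /* 1 = command packet phase, 0 = data phase */
#define ATAPI_IO  (1u << 1)   /* 1 = device to host, 0 = host to device */

enum xfer_verdict { XFER_OK, XFER_HSM_VIOLATION };

/* Hypothetical helper: decide whether a data-phase interrupt is sane.
 * ireason      - interrupt reason register value
 * bc_lo, bc_hi - low and high bytes of the byte-count registers
 * host_writes  - true if the queued command transfers host to device */
enum xfer_verdict check_atapi_data_irq(unsigned ireason, unsigned bc_lo,
                                       unsigned bc_hi, bool host_writes)
{
    unsigned bytes;
    bool dev_wants_write;

    assert(bc_lo <= 0xff && bc_hi <= 0xff);
    bytes = (bc_hi << 8) | bc_lo;

    /* CoD must be clear while data is being transferred. */
    if (unlikely(ireason & ATAPI_COD))
        return XFER_HSM_VIOLATION;

    /* The direction the device announces must match the command. */
    dev_wants_write = (ireason & ATAPI_IO) == 0;
    if (unlikely(dev_wants_write != host_writes))
        return XFER_HSM_VIOLATION;

    /* A data interrupt that announces zero bytes is the new error case. */
    if (unlikely(!bytes))
        return XFER_HSM_VIOLATION;

    return XFER_OK;
}
```

All three failure paths are expected to be rare, which is why each one is wrapped in unlikely() so the compiler can lay out the success path as the straight-line case.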

Hitachi SATA HSM violation (Was: Re: Adding SATA disk with broken NCQ)

2007-11-06 Thread Simos Xenitellis
9] ata1.00: cmd 60/08:00:49:64:9f/00:00:06:00:00/40 tag 0 cdb 0x0 data 4096 in [ 51.294222] res 50/00:08:f9:5b:5a/00:00:06:00:00/40 Emask 0x2 (HSM violation) [ 51.294232] ata1.00: cmd 60/08:10:f9:5b:5a/00:00:06:00:00/40 tag 2 cdb 0x0 data 4096 in [ 51.294235] res 50/00:08:f

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-29 Thread Mark Lord
Alan Cox wrote: Why 512 words ? Though I have queued Mark's patch to be applied, my gut feeling would lean towards a single DRQ block, rather than 512. Why not just work from the old IDE code. ata_altstatus(ap); - ata_chk_status(ap); + ata_drain_fifo(ap, qc); ap->ops->c

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation (try#2)

2007-09-28 Thread Jeff Garzik
Mark Lord wrote: I think this original patch still applies cleanly on at least 2.6.23-rc7. Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Signed-off-by: Mark Lord <[EMAIL PROTECTED]> --- --- old/drivers/ata/libata

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Alan Cox
> > Why 512 words ? > > Though I have queued Mark's patch to be applied, my gut feeling would > lean towards a single DRQ block, rather than 512. Why not just work from the old IDE code. > > > >>ata_altstatus(ap); > >> - ata_chk_status(ap); > >> + ata_drain_fifo(ap, qc); > > > > ap->ops

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Jeff Garzik
Alan Cox wrote: Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Why 512 words ? Though I have queued Mark's patch to be applied, my gut feeling would lean towards a single DRQ block, rather tha

[PATCH] libata drain fifo on stuck DRQ HSM violation (try#2)

2007-09-28 Thread Mark Lord
Alan Cox wrote: Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting stuck there forever. Why 512 words ? ata_altstatus(ap); - ata_chk_status(ap); + ata_drain_fifo(ap, qc); ap->ops->cleanup(); might be wiser Actua

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Alan Cox
> Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, > rather than just getting stuck there forever. Why 512 words ? > ata_altstatus(ap); > - ata_chk_status(ap); > + ata_drain_fifo(ap, qc); ap->ops->cleanup(); might be wiser - To unsub

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Tejun Heo
> Nacked-by: scripts/checkpatch.pl Mark, it seems you'll have to get ACK from this dude first. :-) -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Andrew Morton
On Fri, 28 Sep 2007 02:48:28 -0700 Tejun Heo <[EMAIL PROTECTED]> wrote: > Mark Lord wrote: > > Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, > > rather than just getting stuck there forever. > > > > Signed-Off-By: Mark Lord <[EMAIL

Re: [PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-28 Thread Tejun Heo
Mark Lord wrote: > Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, > rather than just getting stuck there forever. > > Signed-Off-By: Mark Lord <[EMAIL PROTECTED]> Acked-by: Tejun Heo <[EMAIL PROTECTED]> -- tejun

Re: Stardom SATA HSM violation

2007-09-27 Thread Mark Lord
Tejun Heo wrote: Alan Cox wrote: I think there have been enough cases where this draining was necessary. IIRC, ata_piix was involved in those cases, right? If so, can you please submit a patch which applies this only to affected controllers? I don't feel too confident about applying this to al

[PATCH] libata drain fifo on stuck DRQ HSM violation

2007-09-27 Thread Mark Lord
can you be bothered to regenerate the patch and post it one more time (again)? It seems we all agree the update is needed. I think this original patch still applies cleanly on at least 2.6.23-rc7. Drain up to 512 words from host/bridge FIFO on stuck DRQ HSM violation, rather than just getting s

Re: Stardom SATA HSM violation

2007-09-27 Thread Tejun Heo
Jeff Garzik wrote: > Tejun Heo wrote: >> Alan Cox wrote: I think there have been enough cases where this draining was necessary. IIRC, ata_piix was involved in those cases, right? If so, can you please submit a patch which applies this only to affected controllers? I don't fee

Re: Stardom SATA HSM violation

2007-09-27 Thread Jeff Garzik
Tejun Heo wrote: Alan Cox wrote: I think there have been enough cases where this draining was necessary. IIRC, ata_piix was involved in those cases, right? If so, can you please submit a patch which applies this only to affected controllers? I don't feel too confident about applying this to al

Re: Stardom SATA HSM violation

2007-09-27 Thread Tejun Heo
Alan Cox wrote: >> I think there have been enough cases where this draining was necessary. >> IIRC, ata_piix was involved in those cases, right? If so, can you >> please submit a patch which applies this only to affected controllers? >> I don't feel too confident about applying this to all SFF co

Re: Stardom SATA HSM violation

2007-09-27 Thread Alan Cox
> I think there have been enough cases where this draining was necessary. > IIRC, ata_piix was involved in those cases, right? If so, can you > please submit a patch which applies this only to affected controllers? > I don't feel too confident about applying this to all SFF controllers. Old IDE

Re: Stardom SATA HSM violation

2007-09-27 Thread Tejun Heo
Mark Lord wrote: > Tejun Heo wrote: >> Hello, >> >> Mark Lord wrote: >>> I reported a very similar bug back a few releases ago. >>> Anyone who wants to try it themselves, can do this with hdparm-7.7 (from >>> sourceforge): >>> >>>hdparm --drq-hsm-error /dev/sda >>> >>> Whether or not it hangs t

Re: HSM violation with ahci+WDC WD1600BEVS-22RST0

2007-09-24 Thread Maurizio Monge
No, I did not manage to improve it (it should NOT be a dangerous error BTW). I simply think that this issue is because of buggy firmware, so I posted to linux-ide a patch to blacklist this hard disk from using NCQ (because it is triggering spurious completions). I don't know what the "blacklisting pol

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-14 Thread Eamonn Hamilton
OK, On Fri, 2007-09-14 at 08:56 -0500, Bruce Allen wrote: ... > Eamonn: could you please build the latest version of smartmontools from > CVS HEAD source and see if the problem exists in that version? Then write > back. I don't think this will help but want to eliminate obvious things. > I

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-14 Thread Bruce Allen
2 and the ata_piix driver. I'm guessing this isn't good :( Anybody got any suggestions? The violations are: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 in res 50/00:f1:00:4f:c2/00:00:00:00:00/00:0

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-13 Thread Tejun Heo
n 0x2 frozen > > ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data > > 123392 in > > res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM > > violation) -- tejun

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-10 Thread Eamonn Hamilton
Hi Tejun, > > Please disable or upgrade smartd. > Thanks for that. I checked the disks and, sure enough, they were in an extended self-test; I aborted that and it's all back to normal. The only problem, however, is that the system is already running version 5.37 of the smartmontools package, whi

Re: HSM violation spew.

2007-09-08 Thread Dave Jones
On Fri, Sep 07, 2007 at 01:42:07PM +0900, Tejun Heo wrote: > Dave Jones wrote: > > scsi 2:0:0:0: Direct-Access ATA WDC WD3200AAJS-0 12.0 PQ: 0 ANSI: > > 5 > > This could have been truncated, please post the result of 'hdparm -I > /dev/sda'. Thanks. sda is a different model, the o

Re: HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-08 Thread Tejun Heo
ag 0 cdb 0x0 data > 123392 in > res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM > violation) Please disable or upgrade smartd. -- tejun

Re: Stardom SATA HSM violation

2007-09-07 Thread Mark Lord
Tejun Heo wrote: Hello, Mark Lord wrote: I reported a very similar bug back a few releases ago. Anyone who wants to try it themselves, can do this with hdparm-7.7 (from sourceforge): hdparm --drq-hsm-error /dev/sda Whether or not it hangs the machine does depend upon exactly which SATA LLD

Re: HSM violation spew.

2007-09-06 Thread Tejun Heo
Dave Jones wrote: > scsi 2:0:0:0: Direct-Access ATA WDC WD3200AAJS-0 12.0 PQ: 0 ANSI: 5 This could have been truncated, please post the result of 'hdparm -I /dev/sda'. Thanks. -- tejun

Re: Stardom SATA HSM violation

2007-09-06 Thread Tejun Heo
Hello, Mark Lord wrote: > I reported a very similar bug back a few releases ago. > Anyone who wants to try it themselves, can do this with hdparm-7.7 (from > sourceforge): > >hdparm --drq-hsm-error /dev/sda > > Whether or not it hangs the machine does depend upon exactly which SATA > LLD is

Re: Stardom SATA HSM violation

2007-09-06 Thread Tejun Heo
Bryan Woods wrote: > The full dmesg and hdparm -I command output are attached. > > I have received word from the vendor that the Stardom 2611 will do > RAID0 or 1 under windows, but only RAID1 under Linux. (Their manual > said it worked with Linux but failed to mention the RAID mode > restriction

Re: Stardom SATA HSM violation

2007-09-06 Thread Bryan Woods
e Barracuda 7200 10"s. Here's the device: >>> >>> >>> http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm >>> >>> During the install and at different points in the process I get an "HSM >>> violat

Re: Stardom SATA HSM violation

2007-09-05 Thread Mark Lord
xception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 0 res 58/00:01:00:00:00/00:00:00:00:00/40 Emask 0x2 (HSM violation) ata3: soft resetting port ata3.00: configured for UDMA/100 ata3: EH complete sd 2:0:0:0: [sda]
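
The first byte of a "res" line is the device status register, so the 58 above can be read bit by bit. The sketch below assumes the classic ATA status bit layout (BSY 0x80, DRDY 0x40, DF 0x20, DSC 0x10, DRQ 0x08, ERR 0x01); the helper names describe_status and drq_unexpected are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Classic ATA status register bits. */
enum {
    ST_BSY  = 0x80,  /* device busy */
    ST_DRDY = 0x40,  /* device ready */
    ST_DF   = 0x20,  /* device fault */
    ST_DSC  = 0x10,  /* seek complete */
    ST_DRQ  = 0x08,  /* data request: device wants to transfer data */
    ST_ERR  = 0x01   /* error */
};

/* Write the names of the set bits into buf, separated by spaces.
 * Returns the number of characters written (excluding the NUL). */
size_t describe_status(uint8_t status, char *buf, size_t len)
{
    static const struct { uint8_t bit; const char *name; } bits[] = {
        { ST_BSY, "BSY" }, { ST_DRDY, "DRDY" }, { ST_DF, "DF" },
        { ST_DSC, "DSC" }, { ST_DRQ, "DRQ" }, { ST_ERR, "ERR" },
    };
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof bits / sizeof bits[0]; i++) {
        if (!(status & bits[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;  /* output truncated; stop appending */
        used += (size_t)n;
    }
    return used;
}

/* DRQ left set after a command that was issued as a no-data command is
 * the "stuck DRQ" situation discussed in this thread. */
bool drq_unexpected(uint8_t status, bool data_expected)
{
    return (status & ST_DRQ) && !data_expected;
}
```

Status 0x58 decodes to DRDY, DSC and DRQ, while the 0x50 seen in most of the other reports is DRDY and DSC only. The command here was issued with "data 0", so the device still asking for a data transfer is what gets reported as the HSM violation.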

Re: Stardom SATA HSM violation

2007-09-05 Thread Andrew Morton
ng "stuck DRQ" host state machine error do_drq_hsm_error: Success ata status=0x58 ata error=0x00 ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 0 res 58/00:01:00:00:00/00:00:00:00:00/40 Em

Re: Stardom SATA HSM violation

2007-09-05 Thread Mark Lord
e: http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm During the install and at different points in the process I get an "HSM violation" and the system becomes unresponsive. It looks like a similar situation to: http://lkml.org/lkml/2007/6/6/195 Will more r

Re: Stardom SATA HSM violation

2007-09-05 Thread Andrew Morton
s one SATA drive. If it matters, the > >> underlying HDs are "Seagate Barracuda 7200 10"s. Here's the device: > >> > >> > >> http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm > >> > >> During the inst

HSM violation on bootup, ICH7 + ata_piix 2.6.22

2007-09-03 Thread Eamonn Hamilton
this isn't good :( Anybody got any suggestions? The violations are: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 in res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM violation) ata1:
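
Emask is a bitmask, so 0x202 carries more than the "HSM violation" text printed next to it. The sketch below splits a mask into named bits. The bit values are my recollection of the AC_ERR_* constants in include/linux/libata.h from kernels of this period and should be checked against the tree you are running; emask_names is a helper invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* AC_ERR_* bits as I recall them from libata.h of this era (assumption). */
enum {
    AC_ERR_DEV        = 1u << 0,   /* device reported error */
    AC_ERR_HSM        = 1u << 1,   /* host state machine violation */
    AC_ERR_TIMEOUT    = 1u << 2,
    AC_ERR_MEDIA      = 1u << 3,
    AC_ERR_ATA_BUS    = 1u << 4,
    AC_ERR_HOST_BUS   = 1u << 5,
    AC_ERR_SYSTEM     = 1u << 6,
    AC_ERR_INVALID    = 1u << 7,
    AC_ERR_OTHER      = 1u << 8,
    AC_ERR_NODEV_HINT = 1u << 9,   /* polling hinted that no device answered */
    AC_ERR_NCQ        = 1u << 10
};

/* Store the names of the bits set in emask into out[] (lowest bit first),
 * writing at most max entries. Returns the number of known bits that are set. */
size_t emask_names(unsigned emask, const char **out, size_t max)
{
    static const struct { unsigned bit; const char *name; } tbl[] = {
        { AC_ERR_DEV, "DEV" }, { AC_ERR_HSM, "HSM" },
        { AC_ERR_TIMEOUT, "TIMEOUT" }, { AC_ERR_MEDIA, "MEDIA" },
        { AC_ERR_ATA_BUS, "ATA_BUS" }, { AC_ERR_HOST_BUS, "HOST_BUS" },
        { AC_ERR_SYSTEM, "SYSTEM" }, { AC_ERR_INVALID, "INVALID" },
        { AC_ERR_OTHER, "OTHER" }, { AC_ERR_NODEV_HINT, "NODEV_HINT" },
        { AC_ERR_NCQ, "NCQ" },
    };
    size_t n = 0;

    assert(out != NULL || max == 0);
    for (size_t i = 0; i < sizeof tbl / sizeof tbl[0]; i++) {
        if (!(emask & tbl[i].bit))
            continue;
        if (n < max)
            out[n] = tbl[i].name;
        n++;
    }
    return n;
}
```

Under that assumption, 0x202 is HSM plus NODEV_HINT, and the plain 0x2 in the other reports is HSM alone.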

Re: Stardom SATA HSM violation

2007-09-03 Thread Tejun Heo
> >> http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm >> >> During the install and at different points in the process I get an "HSM >> violation" and the system becomes unresponsive. It looks like a similar >> situation to: >> >

HSM violation with ahci+WDC WD1600BEVS-22RST0

2007-09-02 Thread Maurizio Monge
tion 0x2 frozen ata1.00: spurious completions during NCQ issue=0x0 SAct=0x33 FIS=004040a1:0008 ata1.00: cmd 60/08:00:78:89:b7/00:00:0f:00:00/40 tag 0 cdb 0x0 data 4096 in res 40/00:24:5f:d4:8b/00:00:0a:00:00/40 Emask 0x2 (HSM violation) ata1.00: cmd 60/08:08:6f:d2:8b/00:00:0a:00:00/4

Re: HSM violation spew.

2007-08-29 Thread Dave Jones
On Wed, Aug 29, 2007 at 02:49:25PM -0400, Dave Jones wrote: > Just noticed this in dmesg.. > > ata3.00: exception Emask 0x2 SAct 0x1fffd SErr 0x0 action 0x2 frozen > ata3.00: spurious completions during NCQ issue=0x0 SAct=0x1fffd > FIS=004040a1:0004 There's a bunch of these that have be

HSM violation spew.

2007-08-29 Thread Dave Jones
:00:24:00:00/40 Emask 0x2 (HSM violation) ata3.00: cmd 61/10:10:b1:f4:09/00:00:24:00:00/40 tag 2 cdb 0x0 data 8192 out res 40/00:84:11:f8:09/00:00:24:00:00/40 Emask 0x2 (HSM violation) ata3.00: cmd 61/10:18:c9:f4:09/00:00:24:00:00/40 tag 3 cdb 0x0 data 8192 out res 40/00:84:11:f8:09

Re: Stardom SATA HSM violation

2007-08-26 Thread Michal Piotrowski
> During the install and at different points in the process I get an "HSM > violation" and the system becomes unresponsive. It looks like a similar > situation to: > > http://lkml.org/lkml/2007/6/6/195 > > Will more recent kernels work with this hardware (should I k

Re: hsm violation

2007-06-24 Thread Tejun Heo
Andrew Morton wrote: > That great spew of "set_level status: 0" is fairly annoying and useless. I don't know where those are coming from. It's not from libata. -- tejun

Re: hsm violation

2007-06-24 Thread Tejun Heo
Robert Hancock wrote: > Andrew Morton wrote: >> On Sun, 24 Jun 2007 14:32:22 +0200 Enrico Sardi <[EMAIL PROTECTED]> >> wrote: >>> [ 61.176000] ata1.00: exception Emask 0x2 SAct 0x2 SErr 0x0 action >>> 0x2 frozen >>> [ 61.176000] ata1.00: (spurious completions during NCQ issue=0x0 >>> SAct=0x2 F

Re: hsm violation

2007-06-24 Thread Robert Hancock
Andrew Morton wrote: On Sun, 24 Jun 2007 14:32:22 +0200 Enrico Sardi <[EMAIL PROTECTED]> wrote: [ 61.176000] ata1.00: exception Emask 0x2 SAct 0x2 SErr 0x0 action 0x2 frozen [ 61.176000] ata1.00: (spurious completions during NCQ issue=0x0 SAct=0x2 FIS=005040a1:0004) .. It's not obv

Re: hsm violation

2007-06-24 Thread Andrew Morton
04) > [ 61.176000] ata1.00: cmd 60/08:08:37:cc:00/00:00:0c:00:00/40 tag 1 > cdb 0x0 data 4096 in > [ 61.176000] res 50/00:08:27:3c:ed/00:00:0b:00:00/40 Emask > 0x2 (HSM violation) > [ 61.488000] ata1: soft resetting port > [ 61.66] ata1: SATA li

Re: libata fails to recover from HSM violation involving DRQ status

2007-05-10 Thread Mark Lord
Mark Lord wrote: Mark Lord wrote: I retested this again today on my new pure-SATA notebook with ata_piix. In this case, the DRQ drain is not necessary, but also doesn't harm anything. Tested it both ways. This is with a Hitachi HTS541612J9SA00 SATA drive. The original fault was on ata_piix

Re: libata fails to recover from HSM violation involving DRQ status

2007-05-10 Thread Mark Lord
ails due to HSM violation caused by stuck DRQ. Yeah, so far it's just PIO FROM DEVICE on a "SATA" device on ata_piix. It *may* be more widespread than that, but we'll have to test some others. I retested this again today on my new pure-SATA notebook with ata_piix. In thi

Re: libata fails to recover from HSM violation involving DRQ status

2007-05-01 Thread Mark Lord
Mark Lord wrote: Tejun Heo wrote: So, this is specific to SATA (the host side at least) piix && PIO READ, right? I think we can fit this code nicely into piix_sata_error_handler() if we make sure that it triggers under the right condition - after a PIO READ command fails due to HSM v

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-30 Thread Tejun Heo
n >>>>> ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 >>>>> res 58/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM >>>>> violation) >>>> Why do we not always put a '\n' in front of that last line ab

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-30 Thread Mark Lord
Emask 0x2 (HSM violation) Why do we not always put a '\n' in front of that last line above ?? Sometimes it seems to have it, and lots of times it does not have a '\n'. Weird. ## Test stuck DRQ on VIA-pata (ATAPI DVD/RW): ## Notice how the first "ata4.00:

sata_nv and smartctl -o/-S trigger HSM violation (2.6.21.1, 2.6.20)

2007-04-30 Thread Robin H. Johnson
mware Version: 5.01 ... From dmesg: ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd b0/d2:f1:00:4f:c2/00:00:00:00:00/00 tag 0 cdb 0x0 data 123392 in res 50/00:f1:00:4f:c2/00:00:00:00:00/00 Emask 0x202 (HSM violation) ata1: soft resetting port ata1: SATA link up

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-30 Thread Tejun Heo
Mark Lord wrote: > Tejun Heo wrote: >> >> Hmmm... that's very weird. I've never seen such problems. The report >> messages are printed in ata_eh_report() and both the cmd and res lines >> are printed by single invocation to printk(). Is the log captured using >> serial console? I think it could

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-30 Thread Mark Lord
issing*: res 58/00:02:00:00:02/00:00:00:00:00/40 Emask 0x2 (HSM violation) ata4: soft resetting port ata4.00: configured for UDMA/66 ata4: EH complete And in this case, the first line of diagnostics (the "cmd" line) is always missing. Why? Hmmm... that's very weird.

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Tejun Heo wrote: Hmmm... that's very weird. I've never seen such problems. The report messages are printed in ata_eh_report() and both the cmd and res lines are printed by single invocation to printk(). Is the log captured using serial console? I think it could be transmission error or buffe

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
TAPI DVD/RW): >> ## Notice how the first "ata4.00: cmd ..." line is *missing*: >> >> res 58/00:02:00:00:02/00:00:00:00:00/40 Emask 0x2 (HSM violation) >> ata4: soft resetting port >> ata4.00: configured for UDMA/66 >> ata4: EH complete > > And

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
Mark Lord wrote: > .. And here is another test of un-hacked 2.6.21, > this time for ata_piix with a pure PATA configuration. > Again, it passes with flying colours. Thanks a lot. I'd also like to try but I'm on the road and not bored enough (yet) to do that on my only working machine. It's good

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Mark Lord wrote: ## Test stuck DRQ on VIA-sata (disk): ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) Why do we not always

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
0:00:00/00:00:00:00:00/40 Emask 0x2 (HSM violation) ata1: soft resetting port ata1.00: configured for UDMA/100 ata1: EH complete SCSI device sda: 320173056 512-byte hdwr sectors (163929 MB) sda: Write Protect is off sda: Mode Sense: 00 3a 00 00 SCSI device sda: write cache: enabled, read cach

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
> IRQ 17 ## Test stuck DRQ on VIA-sata (disk): ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation) ata1: soft resetting

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Tejun Heo wrote: So, this is specific to SATA (the host side at least) piix && PIO READ, right? I think we can fit this code nicely into piix_sata_error_handler() if we make sure that it triggers under the right condition - after a PIO READ command fails due to HSM violation caused by s

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
cache: enabled, doesn't > support DPO or FUA > > So no draining, and all is well again. > Odds look pretty good that this is just a PIO thing. So, this is specific to SATA (the host side at least) piix && PIO READ, right? I think we can fit this code nicely into piix_sa

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
Mark Lord wrote: > Tejun Heo wrote: >> >> Anyways, can you try to hack it into ata_bmdma_error_handler() > > From grepping the code, I don't see how that function would ever > be called from ata_piix. ?? Yeah, I meant ata_bmdma_drive_eh(). You apparently have figured that out already. Sorry abo

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Ah.. one more thing, is this draining also needed after DMA commands or only after PIO commands? My drive doesn't do IDENTIFY_DMA, so I fed it a READ_DMA instead with "no data", and libata recovered without draining. More specifically, here's what happens for READ_DMA(1 sector) with "NON_DA

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Mark Lord wrote: Tejun Heo wrote: Tejun Heo wrote: .. Anyways, can you try to hack it into ata_bmdma_error_handler() and see whether it actually works? You can check for AC_ERR_HSM there and drain data port if DRQ is set. After HSM, ATA_NIEN is set and the port should be quiescent at that poi

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Tejun Heo wrote: Anyways, can you try to hack it into ata_bmdma_error_handler() From grepping the code, I don't see how that function would ever be called from ata_piix. ??

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Mark Lord
Tejun Heo wrote: Tejun Heo wrote: .. Anyways, can you try to hack it into ata_bmdma_error_handler() and see whether it actually works? You can check for AC_ERR_HSM there and drain data port if DRQ is set. After HSM, ATA_NIEN is set and the port should be quiescent at that point. Sure, I'll d

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-29 Thread Tejun Heo
Jeff Garzik wrote: > Tejun Heo wrote: >> and thus clear DRQ, right? Stuck DRQ after SRST seems odd to me. > > Unfortunately not odd on ata_piix, which can get stuck DRQ-on somewhere > deep inside its IDE emulation engine. And neither draining the FIFO nor > SRST nor a couple other tricks ever he

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Tejun Heo
Tejun Heo wrote: > Mark Lord wrote: >> Jeff Garzik wrote: >>> It's not really a good idea for SATA. The "FIFO" often co-emulated by >>> the SATA controller and SATA phy. You just want to kick SATA really >>> hard (i.e. bus reset and friends). >> Sure. So why don't we do that now? > > We do that

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Jeff Garzik
Tejun Heo wrote: and thus clear DRQ, right? Stuck DRQ after SRST seems odd to me. Unfortunately not odd on ata_piix, which can get stuck DRQ-on somewhere deep inside its IDE emulation engine. And neither draining the FIFO nor SRST nor a couple other tricks ever helped. The only thing that

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Tejun Heo
Mark Lord wrote: > Jeff Garzik wrote: >> >> It's not really a good idea for SATA. The "FIFO" often co-emulated by >> the SATA controller and SATA phy. You just want to kick SATA really >> hard (i.e. bus reset and friends). > > Sure. So why don't we do that now? We do that. It's just that ata_

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Tejun Heo
da: Mode Sense: 00 3a 00 00 >> SCSI device sda: write cache: enabled, read cache: enabled, doesn't >> support DPO or FUA >> ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen >> ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 >>

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Alan Cox
> > This one does need dealing with. It happens in the real world and the old > > IDE paths for this do get triggered and used now and then (we know this > > because bugs in them were found). All it takes is a device and a > > controller disagreeing about the length of a data transfer to get in a >

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
read cache: enabled, doesn't support DPO or FUA ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation) ata1: soft resetting port ata1.00:

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
Jeff Garzik wrote: It's not really a good idea for SATA. The "FIFO" often co-emulated by the SATA controller and SATA phy. You just want to kick SATA really hard (i.e. bus reset and friends). Sure. So why don't we do that now?

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Jeff Garzik
Alan Cox wrote: I am reluctant to do anything about this. This one does need dealing with. It happens in the real world and the old IDE paths for this do get triggered and used now and then (we know this because bugs in them were found). All it takes is a device and a controller disagreeing abo

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
Alan Cox wrote: I am reluctant to do anything about this. This one does need dealing with. It happens in the real world and the old IDE paths for this do get triggered and used now and then (we know this because bugs in them were found). All it takes is a device and a controller disagreeing abo

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Alan Cox
> I am reluctant to do anything about this. This one does need dealing with. It happens in the real world and the old IDE paths for this do get triggered and used now and then (we know this because bugs in them were found). All it takes is a device and a controller disagreeing about the length of

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Jeff Garzik
Mark Lord wrote: Actually, I'm not so sure that this problem hasn't *already* been posted to this very mailing list. http://lkml.org/lkml/2006/10/1/264 http://www.mail-archive.com/linux-ide@vger.kernel.org/msg05078.html ... What Tejun said at the end of that thread :) That one is a phy-level

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
Jeff Garzik wrote: Mark Lord wrote: .. I triggered this by accident, issuing an IDENTIFY command which incorrectly specified ATA_PROT_NODATA. My error, for sure, but libata never recovered from the "stuck DRQ bit" that resulted. .. Maybe we do need to recover from a stuck DRQ bit, but I'll wai

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Jeff Garzik
Mark Lord wrote: Tejun, While working on the new hdparm (version 7.0, released today), I ran into trouble when a buggy SG_IO/ATA_16 packet caused the libata EH to get confused. I triggered this by accident, issuing an IDENTIFY command which incorrectly specified ATA_PROT_NODATA. My error, for

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Alan Cox
> In the IDE driver, we had code to try and cope with stuck DRQ, > by just looping and reading from the data port a few times. > That could have been done better, but it worked a lot of the time, > back in those simpler days. It works very well. The current "old" IDE has some changes in the area b

Re: libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
Mark Lord wrote: Tejun, While working on the new hdparm (version 7.0, released today), I ran into trouble when a buggy SG_IO/ATA_16 packet caused the libata EH to get confused. I triggered this by accident, issuing an IDENTIFY command which incorrectly specified ATA_PROT_NODATA. My error, for

libata fails to recover from HSM violation involving DRQ status

2007-04-28 Thread Mark Lord
rite cache: enabled, read cache: enabled, doesn't support DPO or FUA ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen ata1.00: cmd ec/00:00:00:00:00/00:00:00:00:00/00 tag 0 cdb 0x0 data 0 res 58/00:00:00:00:00/00:00:00:00:00/00 Emask 0x2 (HSM violation) ata1: soft rese

Re: [PATCH 5/5] ahci: consider SDB FIS containing spurious NCQ completions HSM violation (regenerated)

2007-02-23 Thread Jeff Garzik
errors. Consider spurious NCQ completions HSM violation and freeze the port after it. EH will turn off NCQ after this happens several times. Eventually drives which show this behavior should be blacklisted for NCQ. Signed-off-by: Tejun Heo <[EMAIL PROTECTED]> --- Regenerated against the c

[PATCH 5/5] ahci: consider SDB FIS containing spurious NCQ completions HSM violation (regenerated)

2007-02-20 Thread Tejun Heo
spurious NCQ completions HSM violation and freeze the port after it. EH will turn off NCQ after this happens several times. Eventually drives which show this behavior should be blacklisted for NCQ. Signed-off-by: Tejun Heo <[EMAIL PROTECTED]> --- Regenerated against the current #up

Re: [PATCH 5/5] ahci: consider SDB FIS containing spurious NCQ completions HSM violation

2007-02-20 Thread Jeff Garzik
errors. Consider spurious NCQ completions HSM violation and freeze the port after it. EH will turn off NCQ after this happens several times. Eventually drives which show this behavior should be blacklisted for NCQ. Signed-off-by: Tejun Heo <[EMAIL PROTECTED]> --- drivers/ata/ahci.c

[PATCH 5/5] ahci: consider SDB FIS containing spurious NCQ completions HSM violation

2007-02-01 Thread Tejun Heo
spurious NCQ completions HSM violation and freeze the port after it. EH will turn off NCQ after this happens several times. Eventually drives which show this behavior should be blacklisted for NCQ. Signed-off-by: Tejun Heo <[EMAIL PROTECTED]> --- drivers/ata/ahci.c
