John,
I would generally concur with your statement, with the exception that you may
not have gotten R_ERR - you could have gotten an illegal link layer transition,
such as a SYNC or other unexpected primitive from the device.
With regard to retries, it would seem that the designer (and the spec) would
want to put a limit on either the number of retries or the time spent retrying.
I know of a design that does not retry and sets the P bit in the SError
register on R_ERR.
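
A minimal sketch of such a limit, combining a retry count with a time limit. Everything here is illustrative and not taken from the spec: the function name, the `transmit` callback (assumed to return True only when the FIS was accepted with R_OK), and the default limits are all assumptions.

```python
import time


def send_with_bounded_retries(transmit, max_retries=3, time_limit_s=1.0,
                              clock=time.monotonic):
    """Send a command FIS, retrying until it is accepted or a limit is hit.

    transmit: hypothetical callback, returns True if the FIS was accepted
              (R_OK) and False on R_ERR or any other link-level failure.
    Returns (accepted, attempts).
    """
    deadline = clock() + time_limit_s
    attempts = 0
    while True:
        attempts += 1
        if transmit():
            return True, attempts
        # Give up once either limit is reached; the caller then decides
        # what to report (for example, setting an error bit for software).
        if attempts > max_retries or clock() >= deadline:
            return False, attempts
```

With `max_retries=3` the FIS is transmitted at most four times (one initial attempt plus three retries), and the time limit stops the loop even if the count has not been exhausted.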
I have noticed that software reset is not going to work in many cases due to
the way the device state machine was specified. There are many cases where the
link is idle, but the device CL is in the middle of what it thinks is an
operation. If that operation is hung, it will not be looking at any FIS coming
in. It seems to me that we should specify in the spec that when a FIS arrives
in one of these states with SRST set, it should take effect.
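
A toy model of that proposed rule, to make the difference concrete. This is not the spec's device state machine: the class names, the state names, and the return values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class IncomingFis:
    """Hypothetical stand-in for a received register FIS."""
    srst: bool = False


class DeviceCommandLayer:
    """Toy model of a device command layer (CL); state names are illustrative."""

    def __init__(self):
        self.state = "IDLE"

    def begin_operation(self):
        # The CL believes an operation is in progress, even if the link is idle.
        self.state = "IN_OPERATION"

    def on_fis(self, fis):
        # Proposed rule: SRST takes effect in every state, including a hung one.
        if fis.srst:
            self.state = "SOFT_RESET"
            return "reset"
        if self.state != "IDLE":
            # A CL stuck in an operation does not look at other incoming FISes.
            return "ignored"
        self.state = "IN_OPERATION"
        return "accepted"
```

In this model a CL stuck in `IN_OPERATION` ignores an ordinary FIS but still honors one with SRST set, which is the behavior being asked for.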
Software timeouts are the name of the game in ATA, but
if SRST isn't going to do much in SATA, then we are left with hardware reset, a
pretty heavy tool.
Craig Stoops
Expert I/O
-----Original Message-----
From: John Masiewicz [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 09, 2003 12:07 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: [t13] HTCM1 and HTCM2 and command retries

Craig,

It seems to me that checking the FIS Status with error means R_ERR was received, and therefore also means that the command was NOT received by the device, possibly due to a transmission or reception error. It does not mean that the command was rejected. In this case, BSY was not set at the device, but it is still set at the host. A retry may be successful, so there is no need to clear it at the host unless there is some type of higher level timeout.

So in the state diagram, the transport handled the R_ERR by retrying the command FIS, or by inaction allowing it to be retried. This is implementation specific.

To my many questions about what happens in illegal operations or hangs by one side or the other, the typical response has been "ATA times-out, and that's how it works". Specific implementations may be much better.

I also believe that the retry algorithm is implementation dependent, and I agree with you that the spec does not give this guidance very well. In my opinion, the designer should try to eliminate "hangs" everywhere they might occur. You will notice that there is a transition that says SRST was written by the host. I think the intent was that the host timed out and wrote SRST to the Device Control register, which then SYNC terminated the retry process and sent a SRST via CONTROL FIS to the device from state HTI1 (mandatory transition).

It is not very clearly worded, but I think it is basically correct.

John Masiewicz
Western Digital

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]On Behalf Of Craig Stoops
Sent: Monday, October 06, 2003 10:55 AM
To: [EMAIL PROTECTED]
Subject: [t13] HTCM1 and HTCM2 and command retries

In HTCM2, if the link layer reports an R_ERR, it would appear that the state machine goes back to HTCM1 and retries the command.

Q1: I thought that retries were optional at this layer, and would be retried by software. But it looks mandatory in the spec.

Q2: In the case of either an illegal link transition or R_ERR, shouldn't the BSY bit be cleared (if not retrying) and some other bit be set in the shadow regs? It is not clear (and not just in this section) what bits should be set if the command was not successfully sent to the device.

Craig Stoops
Expert I/O