Title: Message
John,
 
I would generally concur with your statement, with the exception that you may not have gotten R_ERR - you could have gotten an illegal link layer transition, such as a SYNC or other unexpected primitive from the device.
 
With regard to retries, it would seem that the designer (and the spec) would want to put a limit either on the number of retries or on the time spent retrying. I know of a design that does not retry at all and sets the P bit in the SError register on R_ERR.
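
A minimal sketch of such a bound, in Python purely for illustration. Nothing here comes from the spec: send_command_fis, transmit, max_retries and deadline_s are hypothetical names, and treating every non-R_OK result (R_ERR, or an unexpected primitive such as SYNC) as a failed delivery is an assumption.

```python
import time

R_OK = "R_OK"
R_ERR = "R_ERR"


def send_command_fis(transmit, max_retries=3, deadline_s=1.0, clock=time.monotonic):
    """Call transmit() until it reports R_OK, or a retry/time limit is hit.

    transmit is a hypothetical callable returning the link-layer result of
    one command FIS transmission. Returns (outcome, attempts).
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        if transmit() == R_OK:
            return ("delivered", attempts)
        # Assumption: R_ERR and any unexpected primitive are handled alike.
        if attempts > max_retries:
            return ("retry_limit", attempts)
        if clock() - start >= deadline_s:
            return ("timeout", attempts)
```

With max_retries=0 this models the no-retry design: the first failure is reported upward immediately, and the caller decides which status bits to set.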
 
I have noticed that software reset is not going to work in many cases due to the way the device state machine was specified. There are many cases where the link is idle, but the device CL is in the middle of what it thinks is an operation. If that operation is hung, it will not be looking at any FIS coming in. It seems to me that we should specify in the spec that when a FIS arrives in one of these states, if SRST is set it should take effect.
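
The proposed behavior could be sketched roughly as follows, again in Python for illustration only. The state names ("IDLE", "HUNG", "DECODE", "SOFT_RESET") and the function device_on_fis are hypothetical and do not correspond to states in the spec; the only value taken from ATA is SRST as bit 2 of the Device Control register.

```python
SRST = 0x04  # SRST bit (bit 2) of the ATA Device Control register


def device_on_fis(state, is_control_fis, device_control):
    """Return the next device state when a Register FIS arrives.

    Hypothetical model: a Control FIS with SRST set is honored in every
    state, including one where the device is stuck mid-operation.
    """
    if is_control_fis and (device_control & SRST):
        return "SOFT_RESET"
    if state == "IDLE":
        return "DECODE"  # normal case: go on to process the FIS
    return state  # busy or hung: any other FIS is ignored
```

The point of checking SRST before looking at the current state is that the reset path no longer depends on the hung operation ever completing.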
 
Software timeouts are the name of the game in ATA, but if SRST isn't going to do much in SATA, then we are left with hardware reset, a pretty heavy tool.
 
Craig Stoops
Expert I/O
www.expertio.com
-----Original Message-----
From: John Masiewicz [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 09, 2003 12:07 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: [t13] HTCM1 and HTCM2 and command retries

Craig,
It seems to me that checking the FIS Status with error means R_ERR was received, and therefore also means that the command was NOT received by the device, possibly due to a transmission or reception error. It does not mean that the command was rejected. In this case, BSY was not set at the device, but it is still set at the host. A retry may be successful so there is no need to clear it at the host unless there is some type of higher level timeout.
 
So in the state diagram, the transport handled the R_ERR by retrying the command FIS, or by inaction allowing it to be retried. This is implementation specific.
 
To my many questions about what happens during illegal operations or hangs by one side or the other, the typical response has been "ATA times out, and that's how it works". Specific implementations may be much better.
 
I also believe that the retry algorithm is implementation dependent, and I agree with you that the spec does not give this guidance very well. In my opinion, the designer should try to eliminate "hangs" everywhere they might occur. You will notice that there is a transition that says SRST was written by the host. I think the intent was that the host timed-out and wrote SRST to the device Control Register, which then SYNC terminated the retry process and sent a SRST via CONTROL FIS to the device from state HTI1 (mandatory transition).  
 
It is not very clearly worded, but I think it is basically correct.
 
John Masiewicz
Western Digital
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]On Behalf Of Craig Stoops
Sent: Monday, October 06, 2003 10:55 AM
To: [EMAIL PROTECTED]
Subject: [t13] HTCM1 and HTCM2 and command retries

 
In HTCM2, if the link layer reports an R_ERR, it would appear that the state machine goes back to HTCM1 and retries the command.
 
Q1: I thought that retries were optional at this layer, and would be retried by software. But it looks mandatory in the spec.
 
Q2: In the case of either illegal link transition, or R_ERR, shouldn't the BSY bit be cleared (if not retrying) and some other bit be
      set in the shadow regs? It is not clear (and not just in this section) what bits should be set if the command was not successfully
      sent to the device.
 
Craig Stoops
Expert I/O
