This message is from the T13 list server.

Over the past few years, I have focused consistently on the connection between 
the host and the device.

One of my more painful lessons was to learn to code defensively against the 
possibility that the read/write engine will terminate quietly without reporting an 
error.

Time and again I have experienced this in development.  If I don't code defensively 
and I'm lucky, the device will hang when this happens: half the device is trying to 
move more bits than the other half.  If I'm unlucky, the request to finish and return 
good status will win.

I've even found it helpful to snoop the pins.  If the read/write engine is still 
clocking data - for example, if DRQ is still asserted for IDE PIO transfers - then the 
odds are good I should disbelieve its report of completion-without-error.
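
A minimal sketch of that kind of disbelief, assuming the host can read the ATA Status register right after the engine claims to be done.  The bit masks are the standard ATA Status bits; the function name believe_completion is hypothetical, made up only for illustration.

```python
# Standard ATA Status register bits.
BSY = 0x80  # device busy
DRQ = 0x08  # data request: device still wants to move data
ERR = 0x01  # error


def believe_completion(status: int) -> bool:
    """Return True only if the Status byte is consistent with a finished transfer.

    Hypothetical helper: `status` is the Status register value read after
    the read/write engine reported completion-without-error.
    """
    if status & ERR:
        return False  # an honest error report, nothing to second-guess
    if status & (BSY | DRQ):
        return False  # "done", yet the device is still busy or clocking data
    return True
```

For example, a status of 0x50 would be believed, while 0x58 (the same byte with DRQ still set) would not.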

Sometimes the issue is as simple as what "complete" means.  Yes, the write data has 
left the bus, but whoops, it's still in the FIFO.  Sometimes the issue is deeper.

I wonder how slow I should be to believe my experience is unique.  I'm disinclined to 
think that the people who built the ASICs I've used, or the people who wrote the 
read/write firmware, were unusually competent or incompetent.

Instead I'd guess that eliminating the last bug that causes 
premature-termination-without-error is as hard as eliminating any other class of bugs.

I know the people who worked here before me, who left me a legacy of an error 
classification system, reserved a bucket for premature-termination-without-error.  
This bucket also comes again and again to my attention because new test tools have a 
tradition of not properly checking for it.  Instead they wait until some later 
comparison to report, mysteriously, that the end of some stream of bytes was not 
written.
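
A rough sketch of the check such a tool could make at the moment of transfer, rather than at some later comparison, assuming the tool knows both the byte count it intended and the byte count actually moved.  The names ShortTransferError and check_transfer are hypothetical, invented for this example.

```python
class ShortTransferError(Exception):
    """The transfer ended early even though the device reported good status."""


def check_transfer(intended: int, transferred: int, status_ok: bool) -> None:
    """Raise unless the transfer both reported success and moved every byte.

    Hypothetical helper: `intended` and `transferred` are byte counts kept
    by the host, and `status_ok` is the device's own claim of success.
    """
    if not status_ok:
        raise OSError("device reported an error")
    if transferred != intended:
        # The premature-termination-without-error bucket.
        raise ShortTransferError(
            f"moved {transferred} of {intended} bytes, yet status was good"
        )
```

Called as check_transfer(512, 256, True), this raises ShortTransferError at once, instead of leaving a later data comparison to discover that the tail of the stream was never written.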

Pat LaVarre

>>> "Pat LaVarre" <[EMAIL PROTECTED]> 11/26/01 01:47PM >>>

> SHALL NEVER, NEVER, NEVER

To go and find something, there's nothing like shipping millions of chips programmed to 
look for it.

...

Only a host able to compare the count of bytes transferred to the count intended would 
correctly report that an error occurred.

Such is the difference between text and reality.    Pat LaVarre


Subscribe/Unsubscribe instructions can be found at www.t13.org.
