RE: [t13] a PIO example - noone should care

Pat LaVarre Mon, 17 Dec 2001 14:51:07 -0800

This message is from the T13 list server.


> "Mcgrath, Jim" <[EMAIL PROTECTED]> 12/17/01 12:29PM

> Why are you worried about this case at all?

Thank you again for your remarkably continued patience.

Granted, I began to focus on this before I understood that nothing but a device
unexpectedly cutting the transfer short introduces an inaccuracy into Atapi Dma byte counts.

I see now that, in Atapi UDma as in Atapi Pio, the sender and the receiver always both 
know how many "word"s of data clocked across the bus.  I see now that many receivers 
of clocks do agree out-of-band to complain if anything but the expected number of 
clocks arrive.

> if you want to complete the command successfully,
> you have to re issue the command anyway.

Yes for block writes.

No for the class of Scsi command blocks that require the device to swallow quietly the 
idea of moving less than the max transfer permitted by its interpretation of the 
command block.

> Once again, it is a difference at a low level of the protocol

Good to hear.  Maybe we are seeing the same thing now?

Do we both see unexpectedly short byte counts inaccurate by as much as X * 2 + 1, 
rising with burst rate?

> If you have a Bad Media problem the command ends in error.
> It makes no difference at all how many bytes were transferred
> ...
> irrelevant at the higher level protocol

How do we know changing the count of bytes transferred makes no difference?  This 
count passes all the way up thru the Windows stack, all the way back to the app?

Do we both agree now that Atapi Dma & Pio are not bug-for-bug compatible at the level 
of counting bytes?  We are visibly no longer bug-for-bug compatible?  Trace of device 
A differs from trace of device B.  If device B came second, device B is at fault.  
That's business?

How confidently do you mean to affirm this irrelevance?  Could we make a business of 
insuring people silly enough to worry about this?

Any time I control device and host, I'll choose to be bug-for-bug compatible because I 
can trivially extend the protocol to support this with an analogue of Scsi's 
IgnoreWideResidue message.  (More often than that, I can choose to use Dma for nothing 
but block transfers.)
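
A minimal sketch of the arithmetic such an analogue could support, assuming the device reports after the transfer how many of the clocked bytes the host should ignore.  The function and parameter names are hypothetical, not taken from any Ansi standard:

```python
WORD_BYTES = 2  # the Ide data bus clocks 16-bit words


def exact_byte_count(words_clocked: int, ignored_bytes: int) -> int:
    """Return the count of valid bytes, given a device-reported residue.

    ignored_bytes is a hypothetical value the device would send
    out-of-band, in the spirit of Scsi's IgnoreWideResidue.
    """
    clocked_bytes = words_clocked * WORD_BYTES
    if not 0 <= ignored_bytes <= clocked_bytes:
        raise ValueError("residue must lie between 0 and the bytes clocked")
    return clocked_bytes - ignored_bytes


# A device with 5 valid bytes clocks 3 words, then says to ignore 1 byte.
print(exact_byte_count(3, 1))  # 5
```

Both ends already agree on the count of words clocked, so the residue is the only extra thing the host needs in order to recover an exact byte count.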

Therefore the only time I'll not be bug-for-bug compatible is when I'm looking less 
closely - when I'm relabelling stuff other people build.  The only time when I might 
discover I care is when I won't be looking carefully.  Ouch.

Bad Media is common enough to matter yet rare enough to be "rude" i.e. to have not 
been tested much in the host.  I mean, after all, I find even test cases as trivial as 
Absent Media often haven't been tested much in the host.

> irrelevant at the higher level protocol
> (since in this case an error blows away everything anyway),
> and so is not an issue.

I agree worries come in sizes.

Personally, my preferred answer is to make byte counts exact: have the host claim to 
have allocated no more than exactly the correct count of bytes, move exactly that, say 
anything else is an error.
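
As a rough sketch of that policy on the host side, assuming the host can learn the true count of bytes moved (the names here are hypothetical):

```python
def check_exact_transfer(allocated: int, moved: int) -> None:
    """Raise unless the device moved exactly the bytes the host allocated."""
    if moved != allocated:
        kind = "short" if moved < allocated else "long"
        raise IOError(
            f"{kind} transfer: allocated {allocated} bytes, moved {moved}"
        )


check_exact_transfer(2048, 2048)      # exact count: passes silently
try:
    check_exact_transfer(2048, 2047)  # one byte short: treated as an error
except IOError as err:
    print(err)
```

The check is only as good as the value of moved, which is the point of asking for a host that can count bytes accurately.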

My trouble is, I have to remember Windows is not this careful.  To be this careful 
violates the interoperability rule of TalkLikeWindows.  Choose any one: the data 
integrity of a pr*prietary sealed box, or Windows compatibility.

But more generally, to require exactly correct byte counts, because by Atapi/Scsi I 
cannot trust the device to complain of a too-small count of bytes, I have to have an 
Ide host that can accurately count bytes, which means to use Atapi Dma I have to go 
beyond the standard Ansi protocol, which brings me here.

> My preferred answer ...

I'm asking for yet another evil option: an option, not a requirement, of exact byte 
counts.

I wish exact byte counts had been standard for AtapiDma in the beginning, same as in 
AtapiPio, but I for one wasn't paying attention back then.  Pio was enough of a mess 
on its own.

With the 1998/1999 generic UsbMass standard, I actually had a chance to vote for the 
simplifying answer of always deciding byte counts in advance, and I let that chance 
pass me by.

Insisting on exact byte counts is a good way of digging out those hosts/devices that 
are more or less harmlessly imprecise in their byte counts, like the Win95B that 
couldn't transfer odd byte counts.

> I literally cannot imagine a situation where this should cause a problem.

I think I've heard that Ata does not have this trouble, for the reasons that follow.

Do you agree that, to imagine a problem in the real world, you only have to let go of a 
few, maybe even just one, of these assumptions?

1) With Ata hard drives, errors are dramatically rare.  And write errors are 
dramatically more rare than read errors.

2) The host and device coordinate out-of-band that the block size shall always be x200 
(512).

3) The Ata command set (distinct from the Atapi command set) says the device owns the 
job of counting bytes.  Ata cannot express the idea of reading/writing zero blocks, or 
of reading/writing just as many bytes as fit without bothering to report exactly how 
many bytes did fit.

4) An Ata device that moves fewer bytes than it believes the command block asked to 
move owns the job of reporting an ERR.

5) An Ata host owns the job of passing back any ERR, or of retrying the whole command 
to try to make the ERR go away.

6) Ata is designed to be helpfully fragile: to incorporate a large Hamming distance 
between legitimate and illegitimate traffic.  Comparatively few people manufacture & 
plug in internal Ata cables, in comparison with how many people manufacture & plug in 
external (Scsi, parallel port, Usb, 1394, Ethernet, ...) cables.  A slightly corrupt 
command block rarely looks good enough to give a false appearance of correct function.
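
Assumptions 2, 4 and 5 together can be sketched as follows.  This is a toy model with hypothetical names, not a description of any real host or device:

```python
BLOCK_BYTES = 0x200  # assumption 2: block size agreed out-of-band


def device_status(blocks_requested: int, bytes_moved: int) -> str:
    """Assumption 4: the device reports ERR if it moved fewer bytes than asked."""
    return "ERR" if bytes_moved < blocks_requested * BLOCK_BYTES else "OK"


def host_run(command, retries: int = 1) -> str:
    """Assumption 5: retry the whole command, then pass back any remaining ERR."""
    status = "ERR"
    for _ in range(1 + retries):
        status = command()
        if status == "OK":
            break
    return status


# Two blocks requested; the first try moves 100 bytes, the retry moves all 1024.
attempts = iter([100, 1024])
print(host_run(lambda: device_status(2, next(attempts))))  # OK
```

Note that the host never counts bytes here: it trusts the device's ERR.  Let go of assumption 4 and nothing in this model notices a short transfer.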

> I literally cannot imagine a situation where this should cause a problem.

I think I hear device folk saying of course the host folk have this covered and host 
folk saying of course the device folk have this covered.

I'm sure you can imagine why hearing that from both sides does not inspire confidence 
in me.

> > Maybe I've been having trouble
> > distinguishing a claim that noone should care
> > from a claim that people can't reproduce my results.

I continue to have this trouble.

I've chosen to read your last reply as not explicitly discussing whether anyone can 
reproduce my observation of Ide Dma making unexpectedly short byte counts inaccurate by 
as much as X * 2 + 1, rising with burst rate.

As I imagine you've noticed by now, in this reply of mine, I've given you lots of 
opportunity to address that issue as well.

Pat LaVarre

Subscribe/Unsubscribe instructions can be found at www.t13.org.
