This message is from the T13 list server.


Pat,

You have indicated that in some DATA OUT cases you see a larger (and variable)
number of bytes being transferred than you would expect given the command
being executed.  This is clearly illegal host behavior - you appear to be
the only one unclear about this ("I have philosophical troubles with the
idea that when a disagreement over byte counts arises the hosts in question
are necessarily more broken than the device.").  You may have a problem in
this case, but no one else who has been participating in this thread does -
the ATA standard clearly states that the host is broken.  It also clearly
states that a device has to make up for it by ignoring the bytes - otherwise
it too is broken.

If you reported seeing a host sending out excess bytes AND THE DEVICE
BECOMING CONFUSED AND USING THEM then there might be a compelling real world
problem.  People would want to see your data and the like, but at least it
would not be dismissed as a non-event.  Right now it is difficult to get
people to agree that there is a problem since the ATA standard clearly
envisioned this issue when it required devices to ignore the excess bytes.
Undoubtedly a lot of people are filtering this input through their own
experience of a billion UDMA ports in use and not a single reported
incident of an actual problem.

As I mentioned in an earlier post, the odd byte issue is completely
different, and I think may have more traction with people.  In my case at
least I'm open minded since I know that:
    1) the existing ATA software stacks have no problem
       (even if they did, ATA cannot fix it as Hale pointed out); and
    2) a bridge (USB/Firewire/etc...) with a device driver or
       microprocessor has no problem (they are shipping today);

BUT a bridge without a driver or microprocessor, using a software stack other
than the one used by ATA drives today (e.g. USB), might have a
problem.  But I have not seen the data yet.

And if the odd byte issue did present a real problem, then I'd suggest the
solution of a byte counter is more than is needed.  Only a pad byte
indicator (1 bit) is needed.  The latter is an easier, less intrusive
solution to the problem, and would usually be preferred.  You usually get a
lot more resistance if you suggest a solution that is overkill for the
problem and forces people to do a lot more work.
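
A minimal sketch of why one bit is enough, assuming data moves on the bus in
whole 16-bit words so that only the last byte of the last word can be padding.
The function names are hypothetical and exist only for this illustration:

```python
def split_count(byte_count):
    """Turn an exact byte count into (words on the bus, pad bit).

    Assumption: transfers move whole 16-bit words, so an odd byte
    count needs one pad byte in the final word.
    """
    words = (byte_count + 1) // 2   # round up to whole words
    pad = byte_count % 2            # 1 if the final byte is padding
    return words, pad


def join_count(words, pad):
    """Recover the exact byte count from the word count and the pad bit."""
    return 2 * words - pad


# Round trip: the receiver already knows how many words it saw,
# so the single pad bit is all the extra information it needs.
for n in (0, 1, 2, 511, 512, 2049):
    assert join_count(*split_count(n)) == n
```

The receiver counts words for itself; the pad bit only settles whether the
last byte of the last word is data.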

Jim


-----Original Message-----
From: Pat LaVarre [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 30, 2002 11:05 AM
To: [EMAIL PROTECTED]
Subject: [t13] but we say the disagreeable hosts are broken



> Do you have real examples
> of hosts which behave this way ?

Yes.

Else I wouldn't be here.  I see the imprecision in AtapiDma byte counting as
a slowly growing horror.

> <[EMAIL PROTECTED]> 01/30/02 00:02 AM
> It seems you missed the point ...
> If the host attempts to send extra bytes,
> it is BROKEN.

I heard, but, I consciously chose not to quote, this SHOUT.

I hold little hope we can discuss this claim usefully, given our past
history here.  But not zero hope, hence this post.

I have philosophical troubles with the idea that when a disagreement over
byte counts arises the hosts in question are necessarily more broken than
the device.

As a device designer, who am I to say the host designer has more bugs than I
have?  All we can see in a bus trace is a disagreement over byte counts: we
cannot directly observe who is at fault, we can only infer blame.

"Broken on paper" isn't sufficient to motivate me to ignore an issue, given
the market power of the host software in question, and given the paper in
question is worded as if this problem does not commonly arise.

To ignore an issue, I need to hear broken and not fixable.  I need to hear
at the device level I can't make this the host's problem.  I got into this
issue when one Usb/Atapi bridge vendor told me Dma is not fixable, so just
use Pio.

Wrong.  Yes we can make AtapiDma count bytes as precisely as AtapiPio, just
not within the current standard.  In Atapi, aka Scsi over Ide, we left out
the IgnoreWideResidue feature, for Dma only.  Whoops, sorry, hello.

> Do you have real examples
> of hosts which behave this way ?

Yes.

I have _bus traces_ of this kind of thing.

I am accordingly fully persuaded that hosts and devices commonly disagree
over byte counts.  I'm working on demos of how easy such disagreement is to
provoke in all flavours of Windows.

Indeed I have real examples of disagreements that cannot be resolved by the
host: two or more devices that interpret precisely the same Cdb differently.

Sff actually encouraged this kind of disagreement for the ModeSense ops: Sff
flipped the sense of the Dbd bit of the op x5A ModeSense10 Cdb.  Talk about
rude.

What I don't have is a good way of characterising the actual distribution of
disagreement in the real world.

Take evidence like:

> http://www.torque.net/scsi/SCSI-2.4-HOWTO.html

> In the linux 2.4 kernel series there has been an increase in problems when
> the ide-scsi driver is used so that cdrecord can control ATAPI (IDE) cd
> writers.

> ... users have reported success with one of these two ... turns off DMA
> completely ... "multiword DMA mode 2".

How much of this is due to disagreements over byte count?

Clearly AtapiDma isn't as usable as AtapiPio.  Windows includes analogous
checkboxes.  A configuration checkbox is a public admission of design
failure: let's invite the often more clueless luser to make a choice we
failed to make reliably ourselves.

But why doesn't AtapiDma just plain work?

I hear Cd-rw burners commonly track device behaviour by op x12 Inquiry
string, in order to get past such disagreements.

But why doesn't AtapiDma just plain work?

Is it because of actual disagreement over byte counts?  If yes, then in what
part?

> the actual distribution of disagreement

Say H is the count the host wants to move, negative for in, positive for
out.  Say D is the count the device wants to move.

How often does sign(D) != sign(H)?  How often is abs(D) < abs(H)?  How often
is abs(H) < abs(D)?  I know these probabilities aren't zero: they're large
enough I get paid to fix these specific problems, but I have no idea
specifically what the probabilities are in general.
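
To make those three questions concrete, here is a rough sketch of how a tool
might sort each (H, D) pair into one of the cases above.  The function names
and the sample pairs are made up for illustration; they are not measurements:

```python
from collections import Counter


def sign(n):
    """Return -1, 0 or +1 according to the sign of n."""
    return (n > 0) - (n < 0)


def classify(h, d):
    """Classify one exchange.

    h: count the host wants to move, d: count the device wants to move,
    both negative for in and positive for out.
    """
    if h == d:
        return "agree"
    if sign(d) != sign(h):
        return "sign(D) != sign(H)"
    if abs(d) < abs(h):
        return "abs(D) < abs(H)"
    return "abs(H) < abs(D)"


# Hypothetical (H, D) pairs, invented purely to exercise the classifier.
pairs = [(512, 512), (512, 510), (-2048, -2050), (12, -12)]
tally = Counter(classify(h, d) for h, d in pairs)
for case, count in tally.items():
    print(case, count)
```

Dividing each tally by the number of exchanges observed would give the
probabilities I say I don't know.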

My idea of a demo is a software tool to let people provoke and measure such
disagreements.  If we could gather info on how often the actual devices
disagree with paper standards, we'd have a beginning at knowing how often
devices disagree with hosts.

> the actual distribution of disagreement

By the way, on my own, I don't much care what the distribution of
disagreement is.

I think the lower level protocol - AtapiPio and AtapiDma - owns the job of
establishing how many bytes were exchanged, no matter how good the upper
layers are or are not at coping when that count of bytes is not what was
expected.

I thought that was a widely accepted principle of network design ... maybe
not.

> I have _bus traces_

It's Hard to trigger an Atapi bus analyser on this kind of disagreement:
none of the Atapi protocols bother to tell the device where the host
expected to end the data transfer.

It's much less Hard to trigger a Usb bus analyser on this kind of
disagreement ... but then you have to ask yourself how representative this
Usb test is of direct-attach Atapi traffic.

I'm guessing the Usb traffic is representative.

Microsoft wrongly-on-paper filters a certain amount of traffic according
to complex plug 'n play heuristics.  The traffic can change if you offer
mode page 5.  The traffic can change if you declare a different Scsi command
set in Usb plug 'n play data.  The traffic can change if your x12 Inquiry
string is not "COMPAQ".

But the traffic remains similar enough that I have often reproduced with
direct-attach Atapi the phenomena first observed on Usb.  This kind of exercise
is necessary for me to give the problem to the Atapi people: I have to prove
that the commodity Usb/Atapi bridging is fully transparent.

> ... less Hard to trigger a Usb bus analyser ...

The one case that remains hard to trigger in Usb is what UsbMass terms Do <
Ho.  Given Ho = abs(H) when 0 < H else 0, given Do = abs(D) when 0 < D else
0, Do < Ho means the host tried to move more out than the device wanted.
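
Spelled out as a rough sketch, using the same signed H and D as before (the
function names are mine, chosen only for illustration):

```python
def out_count(n):
    """abs(n) when 0 < n else 0: the out-direction part of a signed count."""
    return abs(n) if 0 < n else 0


def do_less_than_ho(h, d):
    """True when the host tried to move more out than the device wanted."""
    ho = out_count(h)
    do = out_count(d)
    return do < ho


# Host wants 512 bytes out, device wants only 510 out: Do < Ho.
assert do_less_than_ho(512, 510)
# Host wants 512 out, device wants data in: Do is 0, so Do < Ho again.
assert do_less_than_ho(512, -512)
# Both sides want data in: Ho and Do are both 0, so the case does not arise.
assert not do_less_than_ho(-512, -510)
```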

Usb commonly buffers enough out traffic that this case is visible only as a
nonzero UsbMass dCSWDataResidue, not a Usb STALL.  That makes it hard to see.

It bothers me that the case hardest to see on Usb is the case that AtapiDma
handles least well ... but I can't see anything I could do to improve that
situation, except to spread the word on how to fix AtapiDma.

> ...

Clearer than mud, yet?

Hope so.    Pat LaVarre

Subscribe/Unsubscribe instructions can be found at www.t13.org.