...Well, is it generally agreed that "assume" in C9 should be understood in this way? The whole concept of communications etc standards is based on a principle that the communicating entities need know no more about one another than that they both follow the same standard. If the entities start making use of private knowledge, certainly as anything more than a hint for efficiency, then the standards scenario is being violated.
I would consider A and B to be different versions of the same process. I read the word assume to mean make an assumption without definite knowledge. If process B *knows* something is true it can exploit that knowledge. ...
OK. But, if we are talking about communications between sub-processes of a particular process, then we are talking about the internals of a process. And those internals are not subject to Unicode standardisation; and so it is invalid to argue that errors in Unicode cannot be corrected because of their impact on the internal encoding within a process.... If on the other hand it is receiving data from a process outside its control (owned by a third party perhaps) then it can't guess that the data have any particular characteristics. It is common for a process to be composed of sub-processes. If they can't exploit their knowledge of one another then you have serious problems. ...
And if we are not talking about sub-processes, we are talking about separate processes which communicate according to the Unicode standard. If these processes are relying on knowledge of one another not covered by the standard, they are not following the standard but using a private protocol similar to but not conformant to the standard.
... I would expect the operating system documentation to make very clear if the storage routines don't return what you gave them in the first place.
OK. But according to Unicode, if what is stored is a Unicode string and what is returned is canonically equivalent, these two are defined as identical, and a storage system is entitled to make whatever canonically equivalent changes it may choose to make. Of course that applies only if the storage system is presented with a Unicode string; if it is presented with a sequence of bytes or longer words, it should return that same sequence.
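To make the canonical equivalence point concrete, here is a minimal sketch in Python. The helper name canonically_equivalent and the sample strings are my own, chosen only for illustration, and comparing NFD forms is just one way to test equivalence. It shows two strings that differ as code point sequences, and as bytes, yet count as the same text under canonical equivalence.

```python
import unicodedata


def canonically_equivalent(a: str, b: str) -> bool:
    """Return True if a and b have the same NFD form.

    Hypothetical helper for illustration: two strings are canonically
    equivalent exactly when their canonical decompositions match.
    """
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# What the caller handed to storage: precomposed U+00E9.
stored = "caf\u00e9"
# What a normalising storage layer might hand back: "e" + U+0301.
returned = "cafe\u0301"

# Compared as code point sequences, the two differ.
same_code_points = stored == returned

# Compared as Unicode text, they are canonically equivalent.
same_text = canonically_equivalent(stored, returned)

# Compared as raw bytes (the "sequence of bytes" case), they also differ,
# so a byte-oriented store must not make this substitution.
stored_bytes = stored.encode("utf-8")
returned_bytes = returned.encode("utf-8")
```

A process that treats its data as Unicode strings should be prepared for same_text to hold while same_code_points does not; a process that needs the exact sequence back has to present it as bytes.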
Tim
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/

