> On 29/10/2003 15:07, John Cowan wrote:
>
> >Not necessarily. A process may check its input for normalization and
> >reject it if it is not normalized, and XML consumers are encouraged
> >(not required) to do so.
>
> This looks to me like a clear breach of C9, at least of the derived
> principle
>
> > no process can assume that another process will make a distinction
> > between two different, but canonical-equivalent character sequences.
>
> Another process may not be assumed to make a distinction between
> normalised and non-normalised forms and so may not be assumed to
> normalise, accurately or at all.

It's perfectly reasonable, if a specification calls for input to be in a particular normalisation form, for the process to reject input that isn't. In requiring a particular normalisation form you are adding a requirement on the data beyond those entailed by saying the data is in Unicode, which is no different from adding a requirement that particular characters be given particular meanings above the semantics they have as Unicode characters (e.g. XML does this with < and >). This extra requirement is supplied "on top of" C9, and is encountered before C9 comes into play. Similarly there is no *assumption* about the treatment of canonical-equivalent character sequences; rather there is a specification prescribing the use of NFC and allowing processes to reject non-NFC data.
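
To make the distinction concrete, here is a rough sketch of the kind of check I mean, in Python with the standard unicodedata module. The names require_nfc and NotNormalizedError are made up for illustration; they are not taken from XML, the Unicode Standard, or any library. The point is only that the process tests its input against the specification's extra requirement and rejects it, without assuming anything about what some other process did.

```python
import unicodedata


class NotNormalizedError(ValueError):
    """Hypothetical error for input that is not in the required form."""


def require_nfc(text: str) -> str:
    """Return text unchanged if it is already in NFC; otherwise reject it.

    The check never normalises on the sender's behalf. It only compares
    the input with its own NFC form.
    """
    if unicodedata.normalize("NFC", text) != text:
        raise NotNormalizedError("input is not in Normalization Form C")
    return text


composed = "caf\u00e9"     # ends in U+00E9, the precomposed letter
decomposed = "cafe\u0301"  # ends in "e" followed by U+0301 combining acute

require_nfc(composed)      # accepted: already in NFC

try:
    require_nfc(decomposed)
except NotNormalizedError:
    pass                   # rejected: canonically equivalent, but not NFC
```

The two strings are canonically equivalent, and nothing here interprets them differently. The second is turned away for failing a well-formedness rule of the (assumed) specification, in the same way an XML parser turns away a stray < in character data.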