At 2:32 AM +0900 1/30/02, Soobok Lee wrote:
>
>In Stringprep section 6.3,
>
>" Newer stored string -- Suppose that an requesting application is using
> oldVersion and the stored string was created using a profile that uses
> newVersion. Because the requesting application passed through any
> unassigned code points, the user can query on stored strings that use
> code points in newVersion. No stored strings can have code points that
> are unassigned in newVersion, since that is illegal. In this case, the
> querying application has to enter the unassigned code points in the
> correct order, and has to use unassigned code points that would make it
> through both the mapping and the normalization steps."
>
> The old querying application using oldVersion stringprep *cannot* predict
>
> 1) the correct order of combining sequences of newly assigned characters
>
> 2) the normalized & casefolded form of newly assigned characters.

Correct, but it does not need to, unless the Unicode Consortium goes against their promises about how they will handle normalization of newly-assigned characters. No new characters will be casefolded.

> This choices are often IME(platform) dependent and enforcing such
> normalized character inputs are beyond end users' choices and capabilities
> and also are out of control of independant application authors

Agree. Fortunately, this is unneeded.

> Non-Unicode legacy IMEs have been producing their own preferred character sequences
> for new scripts and their 1-1 mapping to new Unicode points may produce unnormalized outputs
> and they will be fed into old blind applications and make troubles and confusions and failtures.

This makes no sense because new scripts don't have decompositions.

> Therefore, the last sentence in the cited paragraph does not make sense, IMHO.

It makes sense if you trust the Unicode Consortium to stick to its word about newly-assigned characters. If you don't, all of Section 6 is useless, not just this subsection.

--Paul Hoffman, Director
--Internet Mail Consortium
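
For anyone who wants to try the "newer stored string" case by hand, here is a rough sketch in Python. It is an illustration under assumptions, not a stringprep implementation: it covers only the normalization step (no mapping tables, no prohibited-output check), it uses Python's bundled Unicode 3.2 database (`unicodedata.ucd_3_2_0`) as a stand-in for oldVersion and the interpreter's current database as a stand-in for newVersion, and the helper names `old_prep` and `new_prep` are made up for this example. The sample code point U+0221 is listed as unassigned in the Unicode 3.2 tables and was assigned later with no decomposition.

```python
import stringprep
import unicodedata

# Stand-in for "oldVersion": the Unicode 3.2 database shipped with Python.
old_ucd = unicodedata.ucd_3_2_0


def old_prep(text):
    """Hypothetical oldVersion query step: NFKC using Unicode 3.2 data.

    Unassigned code points are passed through, as a querying
    application is expected to do.
    """
    return old_ucd.normalize("NFKC", text)


def new_prep(text):
    """Hypothetical newVersion step: NFKC using the current database."""
    return unicodedata.normalize("NFKC", text)


ch = "\u0221"  # unassigned in Unicode 3.2, assigned in a later version

# True when the code point is unassigned according to the 3.2 tables.
is_unassigned_in_old = stringprep.in_table_a1(ch)

old_result = old_prep(ch)
new_result = new_prep(ch)

print(is_unassigned_in_old, old_result == new_result)
```

If the two results agree for a code point, a query prepared by the old application matches a string stored by the new one. The sketch checks a single code point only; whether this holds in general rests on the Unicode Consortium's stated policy for newly-assigned characters.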
