On 5/6/16 8:40 PM, John C Klensin wrote:
(sorry... earlier copy sent from wrong address)
I'm sorry that it has taken me 4 months to reply to this message! :(
--On Friday, May 06, 2016 15:54 +0900 "Martin J. Dürst"
<[email protected]> wrote:
Hello Peter,
On 2016/05/05 07:43, Peter Saint-Andre wrote:
I suggested that we add some text about this to 7564bis. Here
is a proposed paragraph for insertion in §5.2.3
("Case-Mapping Rule"):
The Unicode toCaseFold() operation defined by the Unicode
Default Case Folding algorithm is most appropriate when an
application needs to compare two strings. When an
application merely wishes to convert uppercase and
titlecase code points to the lowercase equivalents while
preserving lowercase code points, the Unicode toLower()
operation is more appropriate and is less likely to
violate the "Principle of Least Astonishment". Therefore,
application developers are advised to carefully consider
whether they truly need to use the toCaseFold() operation
in a given situation, or whether the toLower() operation
would be more appropriate than the toCaseFold() operation.
Suggestions for improvement are welcome, especially from
John. (E.g., we might want to more explicitly call out
comparison vs. other contexts in the normative text elsewhere
in §5.2.3).
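To make the distinction in the proposed paragraph concrete, here is a minimal sketch in Python. It assumes that Python's built-in str.lower() and str.casefold() are acceptable stand-ins for the Unicode toLower() and toCaseFold() operations; the helper names and sample strings are illustrative only and come from neither the draft nor this thread.

```python
# Illustrative sketch only: str.lower() and str.casefold() are assumed to
# approximate the Unicode toLower() and toCaseFold() operations.

def to_lower(s: str) -> str:
    """Map uppercase and titlecase code points to lowercase; keep lowercase as-is."""
    return s.lower()


def to_case_fold(s: str) -> str:
    """Apply full case folding, intended for caseless comparison."""
    return s.casefold()


# Hypothetical sample inputs chosen to show where the two operations differ.
samples = [
    "Stra\u00dfe",  # contains LATIN SMALL LETTER SHARP S (already lowercase)
    "\u03c2",       # GREEK SMALL LETTER FINAL SIGMA (already lowercase)
    "\u01c5",       # titlecase digraph DZ WITH CARON
]

for s in samples:
    print(repr(s), "lower:", repr(to_lower(s)), "casefold:", repr(to_case_fold(s)))
```

In this sketch, lowercasing leaves the sharp s and the final sigma untouched, while case folding rewrites both even though they were already lowercase. That rewriting is the kind of surprise the "Principle of Least Astonishment" wording is meant to warn about.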
I think 'compare' should be changed to 'search'. That's the
prototypical use case for CaseFold.
Hmm. If we have to choose, I think I prefer "compare". I just
looked at the subsections on "Default Case Folding" and "Default
Caseless Matching" in Section 3.13 of TUS 8.0 and it says a lot
about comparison and nothing about search. Recommended
compromise: Make the relevant sentence fragment read "most
appropriate when an application needs to compare two strings
such as in search operations."
+1
I'd still prefer to denounce toCaseFold completely, especially
where identifiers are concerned. It just has far too much
potential for being destructive and creating false results
(either positive or negative) when the language context is
unknown. People/designers/implementers who are not prepared to
understand those issues and their implications should really not
be using the thing.
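The concern about false results when the language context is unknown can be sketched as follows. This is a hedged illustration in Python, assuming str.casefold() approximates the default (language-independent) toCaseFold(); the helper name and the word pairs are hypothetical examples, not taken from the draft.

```python
# Illustrative sketch only: default case folding knows nothing about language.

def caseless_match(a: str, b: str) -> bool:
    """Compare two strings after default full case folding."""
    return a.casefold() == b.casefold()


# Possible false positive: in German, "Masse" and "Maße" are different words,
# yet both fold to the same string because sharp s folds to "ss".
german_collision = caseless_match("MASSE", "Ma\u00dfe")

# Possible false negative: in Turkish, the lowercase of "I" is dotless
# "\u0131", but default folding maps "I" to "i" and leaves "\u0131" unchanged,
# so a Turkish user's upper- and lowercase spellings do not match.
turkish_miss = caseless_match("ISPARTA", "\u0131sparta")

print("German pair matches:", german_collision)
print("Turkish pair matches:", turkish_miss)
```

Whether either outcome counts as an error depends on what users of a given application expect, which is the judgment the thread asks designers to make before choosing case folding.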
Also, the language in the "Therefore" sentence is somewhat
convoluted. It's unclear which alternative this text prefers.
I suggest that if we want to put the two alternatives on an
equal footing (i.e. make sure the application designer thinks
carefully), then a more parallel sentence structure, avoiding
words such as "carefully", "truly", and "would", would be more
appropriate. What about:
Therefore, application developers are advised to carefully
consider whether toCaseFold() or toLower() is more appropriate.
For the reasons above, I'm not sure that an even footing is
appropriate. I'd rather have the guidance be closer to "use
toLowerCase, which your users are likely to understand, unless
you need CaseFolding for some particular reason and understand
its implications"
That is indeed more consistent with the concerns you have expressed
within the working group.
Going in that direction will require some adjustments to 7613bis and
7700bis, but the changes should be straightforward. I will do that
before publishing the -02 versions (hopefully today).
Peter