On 5/6/16 8:40 PM, John C Klensin wrote:
(sorry... earlier copy sent from wrong address)

I'm sorry that it has taken me 4 months to reply to this message! :(

--On Friday, May 06, 2016 15:54 +0900 "Martin J. Dürst"
<[email protected]> wrote:

Hello Peter,

On 2016/05/05 07:43, Peter Saint-Andre wrote:

I suggested that we add some text about this to 7564bis. Here
is a proposed paragraph for insertion in §5.2.3
("Case-Mapping Rule"):

   The Unicode toCaseFold() operation defined by the Unicode
   Default Case Folding algorithm is most appropriate when an
   application needs to compare two strings.  When an
   application merely wishes to convert uppercase and
   titlecase code points to the lowercase equivalents while
   preserving lowercase code points, the Unicode toLower()
   operation is more appropriate and is less likely to
   violate the "Principle of Least Astonishment".  Therefore,
   application developers are advised to carefully consider
   whether they truly need to use the toCaseFold() operation
   in a given situation, or whether the toLower() operation
   would be more appropriate than the toCaseFold() operation.
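To make the distinction in the proposed paragraph concrete, here is a minimal sketch in Python. It assumes that Python's built-in str.lower() and str.casefold() are acceptable stand-ins for toLower() and toCaseFold(); the helper name and the sample strings are illustrative and are not taken from the draft.

```python
# Sketch only: str.lower() and str.casefold() stand in for the Unicode
# toLower() and toCaseFold() operations. The helper name is hypothetical.

def lower_and_fold(s: str) -> tuple[str, str]:
    """Return (lowercased, case-folded) forms of s for side-by-side comparison."""
    return s.lower(), s.casefold()


if __name__ == "__main__":
    # "ß" (U+00DF) and "ς" (U+03C2) are already lowercase, so lowercasing
    # preserves them, while case folding rewrites them.
    for sample in ["Hello", "ß", "ς"]:
        lowered, folded = lower_and_fold(sample)
        print(f"{sample!r}: lower={lowered!r} casefold={folded!r}")
```

For plain ASCII input the two results agree. They diverge for code points that are already lowercase but still have a case folding, which is where the "Principle of Least Astonishment" concern comes in.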

Suggestions for improvement are welcome, especially from
John. (E.g., we might want to more explicitly call out
comparison vs. other contexts in the normative text elsewhere
in §5.2.3).

I think 'compare' should be changed to 'search'. That's the
prototypical use case for CaseFold.

Hmm.  If we have to choose, I think I prefer "compare".  I just
looked at the subsections on "Default Case Folding" and "Default
Caseless Matching" in Section 3.13 of TUS 8.0 and it says a lot
about comparison and nothing about search.   Recommended
compromise:  Make the relevant sentence fragment read "most
appropriate when an application needs to compare two strings
such as in search operations."

+1

I'd still prefer to denounce toCaseFold completely, especially
where identifiers are concerned.  It just has far too much
potential for being destructive and creating false results
(either positive or negative) when the language context is
unknown.  People/designers/implementers who are not prepared to
understand those issues and their implications should really not
be using the thing.
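As a rough illustration of the false-result concern, the sketch below compares strings by case folding alone, again assuming Python's str.casefold() as a stand-in for toCaseFold(). The function name and the word pairs are hypothetical examples, not text from the draft or from TUS.

```python
# Sketch only: a naive caseless comparison with no language context and
# no normalization step. The function name is hypothetical.

def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings using default case folding only."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    # Possible false positive: two distinct German words fold to the
    # same string because "ß" folds to "ss".
    print(caseless_equal("Maße", "Masse"))

    # Possible false negative for a Turkish user: "İ" (U+0130) folds to
    # "i" followed by U+0307, so it does not match a plain "i".
    print(caseless_equal("\u0130stanbul", "istanbul"))
```

Whether either result counts as wrong depends on the language of the user, which is exactly the context an identifier comparison usually lacks.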

Also, the language in the "Therefore" sentence is somewhat
convoluted. It's unclear which alternative this text prefers.
I suggest that if we want to put the two alternatives on an
equal footing (i.e. make sure the application designer thinks
carefully), then a more parallel sentence structure, avoiding
words such as "carefully", "truly", and "would", would be more
appropriate. What about:

    Therefore, application developers are advised to carefully
    consider whether toCaseFold() or toLower() is more appropriate.

For the reasons above, I'm not sure that an even footing is
appropriate.  I'd rather have the guidance be closer to "use
toLowerCase, which your users are likely to understand, unless
you need CaseFolding for some particular reason and understand
its implications"

That is indeed more consistent with the concerns you have expressed within the working group.

Going in that direction will require some adjustments to 7613bis and 7700bis, but the changes should be straightforward. I will do that before publishing the -02 versions (hopefully today).

Peter


_______________________________________________
precis mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/precis
