On 9/3/14, 11:07 AM, Dan Chiba wrote:
Hi Andrew,
On 9/2/2014 11:46 PM, Andrew Sullivan wrote:
On Tue, Sep 02, 2014 at 05:42:28PM -0700, Dan Chiba wrote:
usernames "fussball" and "fußball" should be treated as equal for case
insensitive matching, or as distinct for case sensitive matching.
I don't see any case in those two strings. What am I missing?
My bad, both of them are lowercase. My intent was to present two
distinct strings that are evaluated as equal in case insensitive
matching. "fussball" and "Fussball" would be good examples.
But anyway, it's not clear to me that the usernames "fussball" and
"fußball" should in fact always and everywhere be treated as equal.
In my understanding that is not always the case.
For my understanding is that the use of ß is very much a localization
issue: the ligature is never used in Swiss written German. One could
argue that it'd be extremely unwise to permit distinct user
registrations that are otherwise identical except for ss and ß
(certainly I would). But if we think that's a good reason to insist
on a mapping, then that would also seem to be a good reason to say,
"No case folding, period." Indeed, that's one of the big differences
between IDNA2003 and IDNA2008; IDNA2008 violates a deep tradition in
the DNS in this way, basically because we couldn't make the mapping
work the way we'd like.
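To make the two string pairs in this exchange concrete, here is a minimal sketch using only Python's built-in string methods. It is not an implementation of any PRECIS profile or of IDNA; the three helper names are made up for illustration. It shows that whether "fussball" and "fußball" match depends entirely on which mapping the comparison applies.

```python
# Illustrative only: these helpers are hypothetical and do not implement
# any PRECIS profile. They contrast three common comparison strategies.

def matches_case_sensitive(a: str, b: str) -> bool:
    """Compare code point for code point, with no mapping at all."""
    return a == b

def matches_lowercase(a: str, b: str) -> bool:
    """Compare after simple lowercasing (U+00DF stays as it is)."""
    return a.lower() == b.lower()

def matches_casefold(a: str, b: str) -> bool:
    """Compare after full Unicode case folding (U+00DF folds to 'ss')."""
    return a.casefold() == b.casefold()

pairs = [("fussball", "Fussball"), ("fussball", "fu\u00dfball")]
for a, b in pairs:
    print(
        f"{a!r} vs {b!r}: "
        f"exact={matches_case_sensitive(a, b)} "
        f"lower={matches_lowercase(a, b)} "
        f"casefold={matches_casefold(a, b)}"
    )
```

Under these assumptions, "fussball" and "Fussball" match under both lowercasing and case folding, while "fussball" and "fußball" match only under full case folding.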
I used the ß character as a case in which distinct strings are to be
evaluated equal after precis preparation. I think it is a good idea to
disallow confusable characters like this. I do not know the background
of IDNA well, but I think there could be a precis profile for it.
Disallowing confusable characters is a laudable goal, but very difficult
in practice. Section 9.5 of the precis-framework document covers these
issues in some detail. Here's one of my favorite examples:
https://stpeter.im/journal/1420.html
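Separately from the linked example, here is a small sketch of why mapping alone does not settle the confusables problem. It assumes a naive comparison key built from NFKC normalization plus case folding (the function name `naive_key` is hypothetical, and this is not a PRECIS implementation). Compatibility variants such as fullwidth letters collapse to the same key, but a visually similar letter from another script does not.

```python
import unicodedata

def naive_key(s: str) -> str:
    """Hypothetical comparison key: NFKC normalization, then case folding."""
    return unicodedata.normalize("NFKC", s).casefold()

latin = "fussball"
# Same word written with fullwidth Latin letters (U+FF41..U+FF5A range).
fullwidth = "\uff46\uff55\uff53\uff53\uff42\uff41\uff4c\uff4c"
# Same word with CYRILLIC SMALL LETTER A (U+0430) in place of Latin 'a'.
mixed_script = "fussb\u0430ll"

# Fullwidth forms have compatibility decompositions, so they collapse.
print(naive_key(fullwidth) == naive_key(latin))
# The Cyrillic letter has no such mapping, so the strings stay distinct
# even though many fonts render them identically.
print(naive_key(mixed_script) == naive_key(latin))
```

Catching the second case would require something beyond normalization and case mapping, such as script-mixing checks or a confusables table, and each of those is a policy choice in its own right.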
username string matching. If the profile doesn't fully specify the spec but
it's up to the deployment/service policies, I would like the components to
be able to identify the policies precisely.
While I agree with that in principle, it sounds like an
interoperability and implementation nightmare (which issue is, I
think, what got us the current compromise text).
I am concerned about interoperability problems that may arise from
ambiguity in the profile spec. Right now, an implementation of
UsernameIdentifierClass may use arbitrary character mappings. I agree
the compromise was required, but I think it has introduced these problems
while partially resolving the other requirement, which is to match the
various current practices of mapping some characters in username handling.
As far as I can see, the only alternative to the compromise we've
brokered here is (my understanding of) what John suggests: lock down the
base string classes, effectively disallow profiles, and tell existing
applications to conform to the new rules. If they want exceptions, then
they don't have to use PRECIS.
In my opinion, that would be slightly at odds with the spirit of how the
PRECIS WG was formed, since all of the Stringprep "customers" in the
room at the NewPrep BoF had existing profiles and wanted to figure out
how to use modern versions of Unicode while supporting as many of their
existing strings/identifiers as possible. If we now tell those customers
that we can't meet their needs, I don't think they will be particularly
happy. But, as John also says, this is about pleasing (or at least
minimizing confusion for) end users.
Peter