Re: [precis] usernames in PRECIS and http-auth

Peter Saint-Andre Mon, 24 Mar 2014 20:36:28 -0700

On 3/14/14, 8:48 PM, Yutaka OIWA wrote:

Dear Peter and all PRECIS related members,


Thank you very much, and I'm really sorry that our definition
had a serious mistake which has confused many of you.
I'll talk with the co-authors (especially Nemoto-san) again and
revise the document as soon as possible,
so that we can restart the next steps as fast as possible.

# That's why our specification mentions
# the mapping for spaces, which are not included by the current definition.

When I talked with Peter and Julian, we mentioned
other character classes during the talk, and at that time
I should have noticed that I had possibly referred to the incorrect class.

A wider question: why do you think that the definition of an HTTP
username needs to be so loose? Do you define things this way to be
backward compatible with existing implementations, or do you really
think that this is a best practice? I'm truly curious. (And I wonder
if we even want to call this construct a "username"...)

I think both reasons apply.

1: For backward compatibility, we need to keep all ASCII "printable"
     (U+0020 - U+007E, including SP) characters as is, as well as
     Latin-1 printable (U+00A1 - U+00FF, except SHY) be independent.


I think "saslprepbis" does this.
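As a rough illustration of the ranges named in point 1, here is a minimal Python sketch. The function name `in_legacy_range` is hypothetical and does not come from "saslprepbis" or any PRECIS draft; it only encodes the code point ranges as stated above (ASCII printable including SP, plus Latin-1 printable without SHY).

```python
def in_legacy_range(s):
    """Return True if every character of s is ASCII printable
    (U+0020 - U+007E) or Latin-1 printable (U+00A1 - U+00FF),
    excluding SOFT HYPHEN (U+00AD).

    Hypothetical helper for illustration only; not a PRECIS profile.
    """
    for ch in s:
        cp = ord(ch)
        ascii_printable = 0x20 <= cp <= 0x7E
        latin1_printable = 0xA1 <= cp <= 0xFF and cp != 0xAD
        if not (ascii_printable or latin1_printable):
            return False
    return True


if __name__ == "__main__":
    for candidate in ["yoiwa", "Yutaka OIWA", "caf\u00e9", "a\u00adb", "a\tb"]:
        print(repr(candidate), in_legacy_range(candidate))
```

Note that NBSP (U+00A0) falls outside both ranges as written, so a check like this would reject it rather than map it.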

2: For more semantic reasons, HTTP authentication will be a vehicle
     for many different kinds of existing application frameworks, including
     IMs, Web mails, social network, and others.
     It should be able to accept all kinds of "user name" formats,
     for example a simple "user ID" (yoiwa), user "Name" (Yutaka OIWA),
     a mail address ([email protected]),
     Social ID formats (@yoiwa or =yoiwa),


Here again, I think that "saslprepbis" handles those.

and many others.


Here we might have some disagreement.

If we think that http-auth needs to allow just about any string as a "userid" (to use the term from RFC 2617 - I don't want to call it a username), then even a PRECIS profile as loose as draft-ietf-precis-nickname is too strict for your purposes.


However, I question whether those purposes are justified.

Do we really expect that an actual userid in http-auth might be any of the following?


"Y                     u            taka  O   i    w     a      "
"♖♘♗♕♔♗♘♖" (i.e., the back line of white chess pieces)
"mycatisa  bby" (where those spaces are actually a tab)

and so on.
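To make it concrete why strings like these are awkward, the following Python sketch lists the Unicode general categories each one uses. This is only an inspection aid built on the standard `unicodedata` module, not the PRECIS IdentifierClass algorithm; the helper name `categories` is an assumption for illustration, and the tab in the third string is written as `\t`.

```python
import unicodedata

# The three example userids from the message; spacing in the first one
# is approximate, and the tab in the third is written explicitly.
examples = [
    "Y                     u            taka  O   i    w     a      ",
    "\u2656\u2658\u2657\u2655\u2654\u2657\u2658\u2656",  # white chess pieces
    "mycatisa\tbby",
]


def categories(s):
    """Return the set of Unicode general categories used in s."""
    return {unicodedata.category(ch) for ch in s}


if __name__ == "__main__":
    for s in examples:
        print(repr(s), sorted(categories(s)))
```

The first string mixes letters with space separators (Zs), the second consists entirely of symbols (So), and the third contains a control character (Cc).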

Do we have evidence from existing applications that we need to support strings of characters like those as userids in HTTP authentication? Or are we being way too liberal in what we accept?

     Unlike SASL or XMPP, which have their own semantics in the framework,
     the authentication names in HTTP must be semanticless,
     unstructured strings, to which each application that uses Web/HTTP
     can later add meaningful semantics.

     We are not likely to collect all possible use cases of
     IDs which are to be used with HTTP (including future uses) and
     then take a union set of these,
     so instead we're defining a "grand-father" ID notation,
     expecting that all ID string use-cases are likely to be subsets of it.

My concern is that this "grandfather set" is every Unicode character, and if we allow that then we're really not providing any kind of helpful guidance to application developers.

One fundamental assumption underlying the PRECIS work is that it is our responsibility as internationalization experts to prevent application developers from shooting themselves in the foot. In particular, the IdentifierClass in the PRECIS framework tries to provide a safe subset of characters, and the username construct in "saslprepbis" profiles the IdentifierClass so that application developers can avoid trouble. As far as I can see, what you are proposing would invite such trouble, and I'm not comfortable with that.

     At the same time, defining it as just "UTF-8" causes user confusion
     and an interoperability mess over possible "visibly-same" strings,
     so we must take care of that side of string preparation with PRECIS.
     For example, NBSP and other spaces should be replaced.
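The space replacement mentioned above could be sketched in Python as follows. This assumes that "other spaces" means every character in Unicode general category Zs (space separators), and that each is replaced with U+0020; the function name `map_spaces` is hypothetical, and this is one possible reading rather than the rule from any particular draft.

```python
import unicodedata


def map_spaces(s):
    """Replace every space separator (general category Zs), such as
    NBSP (U+00A0) or IDEOGRAPHIC SPACE (U+3000), with U+0020 SPACE.

    Assumption for illustration: 'other spaces' == category Zs.
    Tabs and other control characters are left untouched.
    """
    return "".join(
        " " if unicodedata.category(ch) == "Zs" else ch
        for ch in s
    )


if __name__ == "__main__":
    original = "Iam\u00a0cey"
    mapped = map_spaces(original)
    print(repr(original), "->", repr(mapped))
```

Under this reading, two strings that differ only in which kind of space they use compare equal after mapping, which addresses the "visibly-same" concern for spaces but not for confusable characters in general.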


Hmm. So you are saying that if a user inputs the following string:

"Iam&nbsp;cey"

Then "&nbsp;" would be replaced with U+0020 (SPACE) as follows:

"Iam cey"

"Just UTF-8" is just as useless for internationalization as "just TLS" is for security. I definitely agree that we need something more than "just UTF-8", which is why we've put so much work into PRECIS. Although we cannot solve the problem of confusable characters, we can define some string classes that "first do no harm" (is there a Hippocratic Oath for i18n?). So far, I do not think that what you are proposing does no harm, in fact I think it is actively harmful to allow such a wide range of Unicode characters into userids.


Peter
