Re: [precis] review of codepoint table in draft-ietf-precis-framework-17

John C Klensin Fri, 29 Aug 2014 14:04:33 -0700


--On Friday, August 29, 2014 14:31 -0600 Peter Saint-Andre
<[email protected]> wrote:


> On 8/29/14, 12:38 PM, John C Klensin wrote:
>> 
>> 
>> --On Friday, August 29, 2014 17:01 +0000 "Joe Hildebrand
>> (jhildebr)" <[email protected]> wrote:
>> 
>>> ...
>>> I agree we want one set over time. But I thought that we had
>>> agreed in  Toronto that we were going to try to get Precis
>>> right, then look at  removing the tables from IDNA201x,
>>> including Precis by reference in those  documents.
>> 
>> Joe,
>> 
>> I did not come away from Toronto very sure about what had been
>> agreed.  The Etherpad minutes
>> (http://etherpad.tools.ietf.org:9000/p/notes-ietf-90-precis?useMonospaceFont=true
>> to be sure I am looking at the right
>> thing) don't help much and I'm far more interested in Peter's
>> interpretation as it will show up in the next draft than I am
>> in mine.
> 
> My interpretation is that sections 7.1-7.10 of the
> precis-framework document will be changed to reference
> sections 2.1-2.10 of RFC 5892, not copy text from it.
> 
> This gives us one set of rules ("A" through "J" from RFC 5892,
> "K" through "R" for PRECIS).

That definitely works for me.

> This does not give us one registry for such rules. My
> understanding of the discussion in Toronto comports with
> Joe's: it would be good to figure out a way to have one
> rulespace / tablespace, but that's not something we can do
> immediately since it necessitates updates to IDNA2008 (at
> least from the registration perspective).

Maybe.  Maybe we can figure out a way around that.  

My big concern was eliminating the perception that there were
different rules with same names or that the rules could diverge
from each other as Unicode, IDNA, and PRECIS evolved.  It isn't
an ideal solution (see below for discussion of better ones), but
I imagine that it would be fairly easy and non-intrusive to
insert a note in the IDNA Rule Registry that said, "for non-IDNA
(PRECIS) uses, there are additional rules 'K' through 'R' which
appear in ... Registry".  With a similar one in the
PRECIS-related rules registry pointing back to the IDNA one, I
think we have a 95% solution and that should be good enough.
Cross references like that are pretty routine and not normative
enough to require major standards changes.  IIRC, at least some
of them have gone in because someone has suggested that one
would be helpful and appropriate ADs have nodded approvingly --
basically zero procedure.  The PRECIS user who actually needed
to look at the rule registry would have to look in two places,
but it seems to me that is a fairly small price to pay to avoid
the "divergence" and "different experts making different
decisions" possibilities or the appearance of it.

We could probably also do something more elegant and
unified-looking if we were really careful to preserve links and
cross-references that various entities think they have been
promised would be kept completely stable.  For example, having a
single "Unicode derived property rules registry" with part 1 (or
"subregistry 1") being IDNA and rules A-J and part 2 being the
additional PRECIS rules might do it at the price of a heading and
a few blank lines.    More generally, probably the whole idea of
promising stable URLs for various IANA registries and/or their
elements was a bad one and we ought to finish URNbis and then
redesign things with at least one level of indirection.  But
that, fortunately, is neither a problem we need to solve today
nor is it on the PRECIS charter.
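
To make the "one registry, two parts" idea concrete, here is a
minimal sketch in Python.  It is an illustration only, not a
proposal for how IANA would structure anything: the part names,
the REGISTRY layout, and both helper functions are hypothetical.
The only things taken from this thread are the rule letters, "A"
through "J" for IDNA and "K" through "R" for the PRECIS additions.

```python
# Hypothetical layout: a single registry with two parts.
# Only the rule letters come from the discussion; the names are invented.
REGISTRY = {
    "part 1 (IDNA)": list("ABCDEFGHIJ"),
    "part 2 (PRECIS additions)": list("KLMNOPQR"),
}


def find_part(rule, registry=REGISTRY):
    """Return the name of the part that defines `rule`, or None."""
    for part, rules in registry.items():
        if rule in rules:
            return part
    return None


def overlapping_rules(registry=REGISTRY):
    """Return rule letters defined in more than one part.

    An empty result means no rule name is defined twice, which is
    the "no overlapping rulespaces" property discussed above.
    """
    seen = set()
    dupes = set()
    for rules in registry.values():
        for rule in rules:
            if rule in seen:
                dupes.add(rule)
            else:
                seen.add(rule)
    return sorted(dupes)


if __name__ == "__main__":
    print(find_part("C"))
    print(find_part("Q"))
    print(overlapping_rules())
```

The overlap check is the part that matters here: as long as it
comes back empty, a reader who looks up a rule letter finds
exactly one definition, whichever part it lives in.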

> My interpretation is also that we will remove the codepoint
> table from the precis-framework document, since the algorithms
> govern how code points are treated.

Good.

>> None of that really interacts with the intent of my note,
>> which was to stress the importance of moving swiftly to a
>> single set of rules and tables.  That is an important
>> distinction relative to what you said above: at least in the
>> context of what I understood of the Toronto discussions, it
>> didn't make much difference whether the single set of rules
>> and tables was the IDNA ones, the PRECIS ones, or a new
>> synthesis that both referenced.  What is important is that we
>> have a fairly firm commitment to make the combination, not a
>> "probably over time" or "look at it in the future" situation.
>> If we don't have a clear plan and a sufficient commitment of
>> resources that Pete (at least) is convinced that the plan
>> will be executed, then my pre-Toronto concerns are unchanged
>> and the compromises reached there are meaningless.
> 
> As I see it, creating one rulespace and tablespace would
> require a new synthesis that in PRECIS terms treats IDNA2008
> as defining a base string class ("NameClass"?) alongside the
> IdentifierClass and FreeformClass. This is the modern
> equivalent of what we had under stringprep: NamePrep for
> domain names and other classes for other kinds of strings.

Perhaps.  But my primary concern has never been "one rulespace";
it has been that we clearly do not have overlapping but
potentially inconsistent rulespaces.  I think your strategy
accomplishes that.

>> That wasn't what caused my note.  Again, I've been waiting for
>> Peter because I don't think it is worth commenting on what I
>> think might be in a draft that has not been posted.   What did
>> prompt me to write it is that, in addition to the small bugs
>> Takahiro found, three things have happened (or gotten more on
>> my radar) since Toronto.  I was planning to raise them, if
>> still relevant, only after Peter got the Framework spec
>> posted, but here goes:
>> 
>> (1) WHATWG and W3C are charging ahead with an "Encoding"
>> specification that will apparently be normatively referenced
>> from HTML5.  With luck, an update to/ replacement for the very
>> important "Charmod" spec will be right behind it.  The
>> recommendations of those two documents are different from the
>> proposed PRECIS ones.  One way of looking at those specs in
>> the PRECIS context is that they represent yet another profile
>> and that, if two or three (or more if IDNA is counted) are
>> acceptable, then one more doesn't make much difference.  At
>> the other extreme, some people have taken the position about
>> the W3C Charmod and Encoding specs that the web, web
>> browsers, web applications, and web access to other protocols
>> so dominates the Internet that any conflicting specifications
>> are irrelevant (or various worse words).  If the latter view
>> is correct, then the PRECIS effort is, to a considerable
>> degree, a waste of time and, worse, a source of more
>> confusion, at least for any protocol that might be accessed
>> via a URI or from a web page.   My own view is that reality
>> lies somewhere between those positions, but my ability to
>> predict the near-term future is notoriously bad.
> 
> If the PRECIS effort is a waste of time, we might as well
> recognize our sunk costs and scuttle the effort RIGHT NOW. I
> am sure that everyone on this list has plenty of other things
> they could be doing.

Concur.  While I'd like to see a lot less flexibility about, or
implied encouragement of, multiple profiles, I have never
believed that PRECIS is a waste of time and don't believe it
now.  

> Unfortunately, the alternative appears to be using stringprep
> + Unicode 3.2 forever, and that doesn't seem viable.

Concur.  In retrospect, I think the PRECIS work should have been
organized differently, but it is too late for that and doesn't
reduce the relevance of the work.

> Is there an alternative I'm missing? Use some emerging
> web-encoding rules for everything on the Internet?


To be polite, I don't think so and almost completely disagree
with the model that says that because the web is important, the
present and past practices of the web browser vendors and page
authors, no matter how objectively unfortunate, get to set the
rules.  I think it is bad for the web, bad for the Internet, and
that it will eventually turn out to be bad for those vendors.  But
what do I know?

>> (2)  There has been a heated discussion on the IDNA list.  It
>> involves both rather small issues (the treatment of a single
>> code point that is new to Unicode 7.0 and perhaps a few code
>> points that are perceived as "like" it) and a very large one.
>> The latter involves two questions that might ultimately be the
>> same one: (a) Whether some of the fundamental assumptions that
>> were made in designing the rule structure for IDNA,
>> assumptions about code point assignment and Unicode
>> evolution, were incorrect and, if so, whether IDNA2008 itself
>> is workable.  (b) Whether some fundamental assumptions made
>> in IDNA about the nature and implications of
>> normalization, especially normalization prior to comparison,
>> are correct and, whether they are correct or not, whether
>> normalization as it actually works is appropriate for
>> contexts that involve short strings and no language
>> information or requirements to conform to the rules of a
>> single language.  Those assumptions are fundamental enough to
>> the IDNA design and the parts of the PRECIS design that are
>> derived from the IDNA rule structure that, if there is a
>> problem with IDNA, there is probably (almost certainly) a
>> problem with PRECIS too.
> 
> Yes, PRECIS is downstream from IDNA in those ways.
> 
>> (3) Independent of anything going on in the IETF, I've gotten
>> another reminder about something I mentioned briefly during
>> the meeting in Toronto.  There is a sufficient cluster of
>> inter-organizational and political issues surrounding IDNA
>> that any attempt to open it and change its definitional
>> method, even if we assured people that the changes had no
>> actual effect on anything other than the definitional method,
>> could cause some very unpleasant reactions, some of which
>> might be harmful to the IETF.  I'm just not sure that
>> "eventually conform IDNA to PRECIS" is really practical,
>> especially since one of the arguments that would certainly be
>> made would compare the in-depth review in multiple
>> communities that IDNA2008 received and its present deployment
>> compared to those for PRECIS documents.
> 
> Would updating the registries to have one rulespace &
> tablespace entail modifications to the rules, or would it just
> be a matter of bookkeeping by the IANA? (Likely something in
> between.)

I hope pretty much the former, but it would need to be done
really carefully.  See semi-snide comment about URNBIS above :-)

    john

_______________________________________________
precis mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/precis
