Re: [precis] review of codepoint table in draft-ietf-precis-framework-17

Peter Saint-Andre Fri, 29 Aug 2014 14:10:57 -0700

On 8/29/14, 3:03 PM, John C Klensin wrote:



--On Friday, August 29, 2014 14:31 -0600 Peter Saint-Andre
<[email protected]> wrote:

On 8/29/14, 12:38 PM, John C Klensin wrote:



--On Friday, August 29, 2014 17:01 +0000 "Joe Hildebrand
(jhildebr)" <[email protected]> wrote:

...
I agree we want one set over time. But I thought that we had
agreed in  Toronto that we were going to try to get Precis
right, then look at  removing the tables from IDNA201x,
including Precis by reference in those  documents.


Joe,

I did not come away from Toronto very sure about what had been
agreed.  The Etherpad minutes
(http://etherpad.tools.ietf.org:9000/p/notes-ietf-90-precis?useMonospaceFont=true
to be sure I am looking at the right
thing) don't help much and I'm far more interested in Peter's
interpretation as it will show up in the next draft than I am
in mine.


My interpretation is that sections 7.1-7.10 of the
precis-framework document will be changed to reference
sections 2.1-2.10 of RFC 5892, not copy text from RFC 5892.

This gives us one set of rules ("A" through "J" from RFC 5892,
"K" through "R" for PRECIS).


That definitely works for me.

This does not give us one registry for such rules. My
understanding of the discussion in Toronto comports with
Joe's: it would be good to figure out a way to have one
rulespace / tablespace, but that's not something we can do
immediately since it necessitates updates to IDNA2008 (at
least from the registration perspective).


Maybe.  Maybe we can figure out a way around that.

My big concern was eliminating the perception that there were
different rules with same names or that the rules could diverge
from each other as Unicode, IDNA, and PRECIS evolved.  It isn't
an ideal solution (see below for discussion of better ones), but
I imagine that it would be fairly easy and non-intrusive to
insert a note in the IDNA Rule Registry that said, "for non-IDNA
(PRECIS) uses, there are additional rules "K" through "R" which
appear in ... Registry".  With a similar one in the
PRECIS-related rules registry pointing back to the IDNA one, I
think we have a 95% solution and that should be good enough.

I completely agree that we absolutely do not want to have different
rules with the same names, and I'm sorry if the precis-framework
document as it stands today gave that impression.

Cross references like that are pretty routine and not normative
enough to require major standards changes.  IIR, at least some
of them have gone in because someone has suggested that one
would be helpful and appropriate ADs have nodded approvingly --
basically zero procedure.  The PRECIS user who actually needed
to look at the rule registry would have to look in two places,
but it seems to me that is a fairly small price to pay to avoid
the "divergence" and "different experts making different
decisions" possibilities or the appearance of it.


Yes, indeed.

We could probably also do something more elegant and
unified-looking if we were really careful to preserve links and
cross-references that various entities think they have been
promised would be kept completely stable.  For example, having a
single "Unicode derived property rules registry" with part 1 (or
"subregistry 1") being IDNA and rules A-J and part 2 being the
additional PRECIS rules might do it at the price of a heading and
a few blank lines.  More generally, probably the whole idea of
promising stable URLs for various IANA registries and/or their
elements was a bad one and we ought to finish URNbis and then
redesign things with at least one level of indirection.  But
that, fortunately, is neither a problem we need to solve today
nor is it on the PRECIS charter.

My interpretation is also that we will remove the codepoint
table from the precis-framework document, since the algorithms
govern how code points are treated.


Good.

None of that really interacts with the intent of my note,
which was to stress the importance of moving swiftly to a
single set of rules and tables.  That is an important
distinction relative to what you said above: at least in the
context of what I understood of the Toronto discussions, it
didn't make much difference whether the single set of rules
and tables was the IDNA ones, the PRECIS ones, or a new
synthesis that both referenced.  What is important is that we
have a fairly firm commitment to make the combination, not a
"probably over time" or "look at it in the future" situation.
If we don't have a clear plan and a sufficient commitment of
resources that Pete (at least) is convinced that the plan
will be executed, then my pre-Toronto concerns are unchanged
and the compromises reached there are meaningless.


As I see it, creating one rulespace and tablespace would
require a new synthesis that in PRECIS terms treats IDNA2008
as defining a base string class ("NameClass"?) alongside the
IdentifierClass and FreeformClass. This is the modern
equivalent of what we had under stringprep: NamePrep for
domain names and other classes for other kinds of strings.


Perhaps.  But my primary concern has never been "one rulespace";
it has been that we clearly do not have overlapping but
potentially inconsistent rulespaces.  I think your strategy
accomplishes that.


Great.

That wasn't what caused my note.  Again, I've been waiting for
Peter because I don't think it is worth commenting on what I
think might be in a draft that has not been posted.   What did
prompt me to write it is that, in addition to the small bugs
Takahiro found, three things have happened (or gotten more on
my radar) since Toronto.  I was planning to raise them, if
still relevant, only after Peter got the Framework spec
posted, but here goes:

(1) WHATWG and W3C are charging ahead with an "Encoding"
specification that will apparently be normatively referenced
from HTML5.  With luck, an update to/ replacement for the very
important "Charmod" spec will be right behind it.  The
recommendations of those two documents are different from the
proposed PRECIS ones.  One way of looking at those specs in
the PRECIS context is that they represent yet another profile
and that, if two or three (or more if IDNA is counted) are
acceptable, then one more doesn't make much difference.  At
the other extreme, some people have taken the position about
the W3C Charmod and Encoding specs that the web, web
browsers, web applications, and web access to other protocols
so dominates the Internet that any conflicting specifications
are irrelevant (or various worse words).  If the latter view
is correct, then the PRECIS effort is, to a considerable
degree, a waste of time and, worse, a source of more
confusion, at least for any protocol that might be accessed
via a URI or from a web page.   My own view is that reality
lies somewhere between those positions, but my ability to
predict the near-term future is notoriously bad.


If the PRECIS effort is a waste of time, we might as well
recognize our sunk costs and scuttle the effort RIGHT NOW. I
am sure that everyone on this list has plenty of other things
they could be doing.


Concur.  While I'd like to see a lot less flexibility about, or
implied encouragement of, multiple profiles, I have never
believed that PRECIS is a waste of time and don't believe it
now.

That's good, because I was just about to send the following message to
this discussion list:

"I am PENCILS DOWN (no more work) on all PRECIS and PRECIS-related
specifications until we figure this out."

Unfortunately, the alternative appears to be using stringprep
+ Unicode 3.2 forever, and doesn't seem viable.


Concur.  In retrospect, I think the PRECIS work should have been
organized differently, but it is too late for that and doesn't
reduce the relevance of the work.

Is there an alternative I'm missing? Use some emerging
web-encoding rules for everything on the Internet?



To be polite, I don't think so and almost completely disagree
with the model that says that because the web is important, the
present and past practices of the web browser vendors and page
authors, no matter how objectively unfortunate, get to set the
rules.  I think it is bad for the web, bad for the Internet, and
that it will eventually turn out to be bad for those vendors.  But
what do I know?


More than me, that's for sure.

(2)  There has been a heated discussion on the IDNA list.  It
involves both rather small issues (the treatment of a single
code point that is new to Unicode 7.0 and perhaps a few code
points that are perceived as "like" it) and a very large one.
The latter involves two questions that might ultimately be the
same one: (1) Whether some of the fundamental assumptions that
were made in designing the rule structure for IDNA,
assumptions about code point assignment and Unicode
evolution, were incorrect and, if so, whether IDNA2008 itself
is workable.  (2) Whether some fundamental assumptions made
in IDNA about the nature and implications of
normalization, especially normalization prior to comparison,
are correct and, whether they are correct or not, whether
normalization as it actually works is appropriate for
contexts that involve short strings and no language
information or requirements to conform to the rules of a
single language.  Those assumptions are fundamental enough to
the IDNA design and the parts of the PRECIS design that are
derived from the IDNA rule structure that, if there is a
problem with IDNA, there is probably (almost certainly) a
problem with PRECIS too.


Yes, PRECIS is downstream from IDNA in those ways.

(3) Independent of anything going on in the IETF, I've gotten
another reminder about something I mentioned briefly during
the meeting in Toronto.  There is a sufficient cluster of
inter-organizational and political issues surrounding IDNA
that any attempt to open it and change its definitional
method, even if we assured people that the changes had no
actual effect on anything other than the definitional method,
could cause some very unpleasant reactions, some of which
might be harmful to the IETF.  I'm just not sure that
"eventually conform IDNA to PRECIS" is really practical,
especially since one of the arguments that would certainly be
made would compare the in-depth review in multiple
communities that IDNA2008 received and its present deployment
compared to those for PRECIS documents.


Would updating the registries to have one rulespace &
tablespace entail modifications to the rules, or would it just
be a matter of bookkeeping by the IANA? (Likely something in
between.)


I hope pretty much the former, but it would need to be done
really carefully.  See semi-snide comment about URNBIS above :-)


Don't confuse me, URNBIS is what I work on next week. ;-)

Peter


_______________________________________________
precis mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/precis
