I was selected as the General Area Review Team (Gen-ART) reviewer for this specification
(for background on Gen-ART, please see
http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
Summary:
Review Comments: This document is almost ready for publication as a Proposed
Standard. I have a small number of nittish comments (more than editorial),
but if the authors agree, I believe any of these changes could be RFC Editor
notes. The ones I'd really like to see Brian look closely at are in 3.2,
4.2.1, and 4.2.2.
Thanks,
Spencer
3.2. Wildcards
Spencer: two minor concerns with the following text:
(1) I'm not sure how the first two sentences work together. Does the first
sentence say "there can only be one wildcard character in the string a
client uses to select a collation", or does "a wildcard" mean something
besides "one wildcard"? The second sentence is my greater confusion, because
I'm reading the first sentence as saying that "aa*aa*" would NOT be OK,
because it has more than one wildcard character, and reading the second
sentence as saying that "aa**aa" would NOT be OK, because it has adjacent
wildcard characters, but it's NOT OK anyway, because it has more than one
wildcard character (whether adjacent or not). Please clue me in.
(2) I would love to see a sentence explaining why the third sentence is
"SHOULD NOT use wildcards" and not "MUST NOT use wildcards". To be honest,
I'm trying to understand why this restriction exists at all (at either
SHOULD NOT or MUST NOT strength), but the absence of any rationale for the
SHOULD NOT doesn't help me with this, and I expect that a rationale would help.
And why is "the server SHOULD select the collation" a SHOULD, and not a
MUST? Mumble.
The string a client uses to select a collation MAY contain a wildcard
("*") character which matches zero or more collation-chars. Wildcard
characters MUST NOT be adjacent. Clients which support disconnected
operation SHOULD NOT use wildcards to select a collation, but clients
which provide collation operations only when connected to the server
MAY use wildcards. If the wildcard string matches multiple
collations, the server SHOULD select the collation with the broadest
scope (preferably international scope), the most recent table
versions and the greatest number of supported operations.
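To make question (1) concrete, here is a minimal Python sketch of one possible reading of the quoted text: several wildcards are allowed as long as no two are adjacent. The function name match_collations and the list of collation names are hypothetical, and "*" is treated as matching any characters rather than strictly collation-chars. The server-side choice among multiple matches is not modeled.

```python
import re

def match_collations(pattern, available):
    """Return the names in `available` matched by the wildcard `pattern`.

    Assumed reading: "aa*aa*" is acceptable, "aa**aa" is not.
    """
    if "**" in pattern:
        raise ValueError("adjacent wildcards are not allowed")
    # Each "*" matches zero or more characters; everything else is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return [name for name in available if re.fullmatch(regex, name)]

# Illustrative names only.
names = ["i;ascii-casemap", "i;octet"]
print(match_collations("i;ascii-*", names))
```

If the draft instead intends at most one wildcard per string, the check would reject any pattern containing more than one "*", and the MUST NOT on adjacency would be redundant.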
3.3. Ordering Direction
Spencer: this is at the edge of a nit, but "collation-order" and
"collation-sel" haven't been introduced previously, and I'm having to guess
that "sel" is short for "selection", or something. Mumble.
When used as a protocol element for ordering, the collation name MAY
be prefixed by either "+" or "-" to explicitly specify an ordering
direction. As mentioned previously, "+" has no effect on the
ordering function, while "-" negates the result of the ordering
function. In general, collation-order is used when a client requests
a collation, and collation-sel is used when the server informs the
client of the selected collation.
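As a small illustration of the direction prefix described in the quoted text, the following Python sketch splits an optional "+" or "-" from a collation name and applies it to an ordering result. The function names are hypothetical, and representing the ordering result as an integer (-1, 0, or 1) is an assumption made for this sketch, not something the quoted text states.

```python
def parse_direction(name):
    """Split an optional "+" or "-" prefix from a collation name.

    Returns (sign, bare_name); sign is -1 only for a "-" prefix.
    """
    if name[:1] in ("+", "-"):
        return (-1 if name[0] == "-" else 1), name[1:]
    return 1, name

def apply_direction(sign, ordering_result):
    """"+" leaves the ordering result unchanged; "-" negates it."""
    return sign * ordering_result

sign, bare = parse_direction("-i;octet")
print(sign, bare, apply_direction(sign, 1))
```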
4.2.1. Equality
Spencer: I'm confused here (note the trend :-). Is the following text
saying, "MAY return either "error" or "no-match" if the input strings are
not valid character strings ..."? The current text doesn't seem to say what
happens when the input strings aren't valid and the equality function
doesn't return "error", which is only a MAY strength ("so don't be surprised
when your server does this").
The equality function always returns "match" or "no-match" when
supplied valid input, and MAY return "error" if the input strings are
not valid character strings or violate other collation constraints.
4.2.2. Substring
Spencer: the following text requiring the ending offset seems inconsistent
with 5.2, which (as I understand it) allows either the ending offset OR the
length to be returned. If they ARE inconsistent, I'd much rather see 4.2.2
prevail, because I don't feel good about telling application developers that
sometimes they may get (10, 15) that means "six characters/octets long" and
other times they may get (10, 15) which means "15 characters/octets long".
Application protocols MAY return position information for substring
matches. If this is done, the position information SHOULD include
both the starting offset and the ending offset in the string.
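The ambiguity I am worried about can be shown with a short Python sketch. Both helper names are hypothetical, and the sketch assumes zero-based offsets and an inclusive ending offset, which is the reading under which (10, 15) means six characters.

```python
def find_with_end(haystack, needle):
    """Return (start, inclusive_end) of the first match, or None."""
    start = haystack.find(needle)
    if start < 0:
        return None
    return (start, start + len(needle) - 1)

def find_with_length(haystack, needle):
    """Return (start, length) of the first match, or None."""
    start = haystack.find(needle)
    if start < 0:
        return None
    return (start, len(needle))

# Two different matches produce the same pair of numbers.
six_long = find_with_end("x" * 10 + "abcdef", "abcdef")
fifteen_long = find_with_length("x" * 10 + "y" * 15, "y" * 15)
print(six_long, fifteen_long)
```

A client that receives the pair cannot tell which convention produced it unless the specification fixes one.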
4.3. Internal Canonicalization Algorithm
Spencer: I don't believe that "The output of the canonicalization algorithm
MAY have no meaning to a human" should be an upper-case MAY; it is not a
requirement.
A collation specification MUST describe the internal canonicalization
algorithm. This algorithm can be applied to individual strings and
the result strings can be stored to potentially optimize future
comparison operations. A collation MAY specify that the
canonicalization algorithm is the identity function. The output of
the canonicalization algorithm MAY have no meaning to a human.
7.1. Collation Registration Procedure
Spencer: I'm not trying to change existing practice, but the IESG is having
enough fun reviewing appeals these days that if the appeal track started
with the APPS area directors, I'm sure that the other ADs would be thrilled.
:-(
The IETF will create a mailing list, [EMAIL PROTECTED], which can be
used for public discussion of collation proposals prior to
registration. Use of the mailing list is encouraged but not
required. The actual registration procedure will not begin until the
completed registration template is sent to [EMAIL PROTECTED]. The IESG
will appoint a designated expert who will monitor the
[EMAIL PROTECTED] mailing list and review registrations forwarded
from IANA. The designated expert is expected to tell IANA and the
submitter of the registration within two weeks whether the
registration is approved, approved with minor changes, or rejected
with cause. When a registration is rejected with cause, it can be
re-submitted if the concerns listed in the cause are addressed.
Decisions made by the designated expert can be appealed to the IESG
and subsequently follow the normal appeals procedure for IESG
decisions.
9.2.1. ASCII Casemap Collation Description
Spencer: the following text really clarified the text describing ACAP and
Sieve previously - use this sentence in that section as well?
For historical reasons, in the context of ACAP and Sieve, the name
"i;ascii-casemap" is a synonym for this collation.
9.5.1. Octet Collation Description
Spencer: Ouch! Is there a less ambiguous naming set than "first string" and
"second string"? I'm almost sure I've also used programming languages that
thought the first string was the search target, so it took me a second to
grok that the second string was the search target. If I'm the only one who
is confused, that's not a problem.
The substring function returns "match" if the first string is the
empty string, or if there exists a substring of the second string of
length equal to the length of the first string which would result in
a "match" result from the equality function. Otherwise the substring
function returns "no-match".
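For what it is worth, here is how I read the quoted substring rule, as a minimal Python sketch. The function names are hypothetical, and treating the octet equality function as a byte-for-byte comparison is an assumption of the sketch. The first argument is the string being searched for and the second is the string being searched.

```python
def octet_equal(a: bytes, b: bytes) -> str:
    """Assumed equality function: byte-for-byte comparison."""
    return "match" if a == b else "no-match"

def octet_substring(first: bytes, second: bytes) -> str:
    """Return "match" if `first` occurs somewhere in `second`."""
    if first == b"":
        return "match"
    n = len(first)
    # Try every substring of `second` with the same length as `first`.
    for i in range(len(second) - n + 1):
        if octet_equal(second[i:i + n], first) == "match":
            return "match"
    return "no-match"

print(octet_substring(b"ell", b"hello"))
print(octet_substring(b"hello", b"ell"))
```

Names such as "pattern" and "target" in place of "first string" and "second string" would make the argument order harder to get backwards.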