I do remember now seeing that section in RFC 7564 and thinking that the
other profiles contradicted it. Seems like a good change to me!
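
For what it's worth, the point made further down the thread, that the
encoding only affects how you get from a string to code points and back,
can be sketched roughly in JavaScript. This is only an illustration under
some assumptions: the function names are hypothetical and not taken from
any PRECIS library, the control-character check is a toy stand-in rather
than a real PRECIS rule, and the UTF-8 path assumes a runtime that
provides TextDecoder.

```javascript
// Hypothetical helper names; not from any PRECIS implementation.

// Decode a native JavaScript (UTF-16) string into Unicode code points.
// The string iterator walks by code point, so surrogate pairs are combined.
function codePointsFromString(str) {
  return Array.from(str, (ch) => ch.codePointAt(0));
}

// Decode UTF-8 bytes into the same kind of code point array.
// Assumes the runtime provides TextDecoder.
function codePointsFromUtf8(bytes) {
  const str = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  return codePointsFromString(str);
}

// Toy stand-in for an encoding-independent rule: reject C0 controls.
// Real PRECIS preparation is far more involved than this.
function hasNoC0Controls(codePoints) {
  return codePoints.every((cp) => cp > 0x1f);
}

// The same text ("café" plus a space and U+1F600) from two encodings.
const fromUtf16 = codePointsFromString("caf\u00e9 \u{1F600}");
const fromUtf8 = codePointsFromUtf8(
  new Uint8Array([0x63, 0x61, 0x66, 0xc3, 0xa9, 0x20, 0xf0, 0x9f, 0x98, 0x80])
);
```

Both paths produce the same array of code points, so anything that works
on code points (like the toy check here) never needs to know which
encoding the input arrived in.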

On Tue, 6 Sep 2016 at 07:47 Peter Saint-Andre <[email protected]> wrote:

> On 9/4/16 6:34 PM, Peter Saint-Andre wrote:
> > On 9/4/16 5:30 PM, Erin Millard wrote:
> >>     >>> * §2.2 Specifies that UTF-8 MUST be used as the encoding; do we
> >>     >>> really want to limit this to UTF-8 only? Is this for comparison
> >>     >>> purposes? Then again, 99.99% of the time UTF-8 is what you should
> >>     >>> be using anyways, so I'm not sure that it matters.
> >>     >>
> >>     >> UTF-8 is your friend, and everything in PRECIS is UTF-8.
> >>     >
> >>     > PRECIS is mostly encoding agnostic; implementations might favor a
> >>     > specific encoding, but I don't think anything in the spec
> >>     > specifically *needs* UTF-8. That being said, there are so few
> >>     > reasons to use anything other than UTF-8 that I don't think it
> >>     > really matters, it was just curious to me that some of the PRECIS
> >>     > related specs called out UTF-8 and some didn't.
> >>
> >>     I thought they all did, but will double-check.
> >>
> >>
> >> This actually became a bigger issue when attempting to implement PRECIS
> >> prepare in JavaScript for the browser. JavaScript doesn't have native
> >> UTF-8 support, so this meant the extra bloat of bringing in a UTF-8
> >> library.
> >>
> >> It didn't make a lot of sense to me either, since all the encoding
> >> affects is how you go from string to code points, and vice versa. It had
> >> no effect on the rest of my implementation. I could absolutely be
> >> missing something, but compared to how focused the rest of the spec is,
> >> the UTF-8 requirement seemed like an afterthought.
> >>
> >> Can anyone explain which parts of PRECIS are actually predicated on the
> >> original string being encoded in UTF-8?
> >
> > Are we perhaps getting confused between the encoding that is sent over
> > the wire and the encoding that is used within the processing application?
> >
> > In general, we in the IETF prefer to send UTF-8 over the wire. However,
> > it's true that this is a matter for the "using protocol" (e.g., I
> > distinctly recall an extremely long thread in the XMPP WG years ago
> > about whether to support only UTF-8 or to give clients and servers the
> > ability to also use UTF-16 - and "UTF-8 only" won that debate). Given
> > that some protocols or other technologies that use PRECIS might use
> > UTF-16 or give applications the ability to choose an encoding, you're
> > probably right that it makes sense to relax the rule for PRECIS itself.
> >
> > I'll think about this some more and propose some text.
>
> As promised, I've thought about it further and I agree that specifying
> an encoding of UTF-8 is not really appropriate in 7613bis and 7700bis.
> In fact, RFC 7564 (the PRECIS framework) states the following in §13.1:
>
>     Although strings that are consumed in PRECIS-based application
>     protocols are often encoded using UTF-8 [RFC3629], the exact encoding
>     is a matter for the application protocol that uses PRECIS, not for
>     the PRECIS framework.
>
> Thus, for instance, it's fine for RFC 7622, which defines the address
> format in XMPP, to specify an encoding of UTF-8, but not for 7613bis or
> 7700bis to do so.
>
> I notice that RFC 5890 (for IDNA) has text like this
>
>     o  A "U-label" is an IDNA-valid string of Unicode characters, in
>        Normalization Form C (NFC) and including at least one non-ASCII
>        character, expressed in a standard Unicode Encoding Form (such as
>        UTF-8).
>
> Text similar to that might be best for 7613bis and 7700bis.
>
> Peter
>
>