I do remember now seeing that section in RFC 7564 and thinking that the other profiles contradicted it. Seems like a good change to me!
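
To illustrate the point made in the quoted thread below, that the encoding only affects how you get from a string to code points and back, here is a rough TypeScript sketch. It assumes the input is an ordinary JavaScript string (UTF-16 code units), and the function names toCodePoints and fromCodePoints are made up for this example; they are not taken from any PRECIS library. No UTF-8 step is involved at any point.

```typescript
// Hypothetical helpers for illustration only; not from any PRECIS library.

// Decode a JavaScript string (UTF-16 code units) into Unicode code points.
function toCodePoints(input: string): number[] {
  const out: number[] = [];
  for (let i = 0; i < input.length; i++) {
    const hi = input.charCodeAt(i);
    // A high surrogate followed by a low surrogate encodes one code point
    // above U+FFFF.
    if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < input.length) {
      const lo = input.charCodeAt(i + 1);
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        out.push((hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000);
        i++; // skip the low surrogate we just consumed
        continue;
      }
    }
    // BMP code unit. A lone surrogate is passed through unchanged here;
    // a real implementation would presumably reject it.
    out.push(hi);
  }
  return out;
}

// Encode Unicode code points back into a JavaScript string.
function fromCodePoints(codePoints: number[]): string {
  let result = "";
  for (const cp of codePoints) {
    if (cp > 0xffff) {
      const offset = cp - 0x10000;
      result += String.fromCharCode(
        0xd800 + (offset >> 10),
        0xdc00 + (offset & 0x3ff)
      );
    } else {
      result += String.fromCharCode(cp);
    }
  }
  return result;
}
```

Any PRECIS rules would then operate on the number[] in between; whether the bytes on the wire were UTF-8 or UTF-16 is decided before toCodePoints and after fromCodePoints, which seems to match what §13.1 of RFC 7564 says.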

On Tue, 6 Sep 2016 at 07:47 Peter Saint-Andre <[email protected]> wrote:

> On 9/4/16 6:34 PM, Peter Saint-Andre wrote:
> > On 9/4/16 5:30 PM, Erin Millard wrote:
> >> >>> * §2.2 Specifies that UTF-8 MUST be used as the encoding; do we
> >> >>> really want to limit this to UTF-8 only? Is this for comparison
> >> >>> purposes? Then again, 99.99% of the time UTF-8 is what you should
> >> >>> be using anyways, so I'm not sure that it matters.
> >> >>
> >> >> UTF-8 is your friend, and everything in PRECIS is UTF-8.
> >> >
> >> > PRECIS is mostly encoding agnostic; implementations might favor a
> >> > specific encoding, but I don't think anything in the spec
> >> > specifically *needs* UTF-8. That being said, there are so few
> >> > reasons to use anything other than UTF-8 that I don't think it
> >> > really matters, it was just curious to me that some of the PRECIS
> >> > related specs called out UTF-8 and some didn't.
> >>
> >> I thought they all did, but will double-check.
> >>
> >>
> >> This actually became a bigger issue when attempting to implement PRECIS
> >> prepare in JavaScript for the browser. JavaScript doesn't have native
> >> UTF-8 support, so this meant the extra bloat of bringing in a UTF-8
> >> library.
> >>
> >> It didn't make a lot of sense to me either, since all the encoding
> >> affects is how you go from string to code points, and vice versa. It had
> >> no effect on the rest of my implementation. I could absolutely be
> >> missing something, but compared to how focused the rest of the spec is,
> >> the UTF-8 requirement seemed like an afterthought.
> >>
> >> Can anyone explain which parts of PRECIS are actually predicated on the
> >> original string being encoded in UTF-8?
> >
> > Are we perhaps getting confused between the encoding that is sent over
> > the wire and the encoding that is used within the processing application?
> >
> > In general, we in the IETF prefer to send UTF-8 over the wire. However,
> > it's true that this is a matter for the "using protocol" (e.g., I
> > distinctly recall an extremely long thread in the XMPP WG years ago
> > about whether to support only UTF-8 or to give clients and servers the
> > ability to also use UTF-16 - and "UTF-8 only" won that debate). Given
> > that some protocols or other technologies that use PRECIS might use
> > UTF-16 or give applications the ability to choose an encoding, you're
> > probably right that it makes sense to relax the rule for PRECIS itself.
> >
> > I'll think about this some more and propose some text.
>
> As promised, I've thought about it further and I agree that specifying
> an encoding of UTF-8 is not really appropriate in 7613bis and 7700bis.
> In fact, RFC 7564 (the PRECIS framework) states the following in §13.1:
>
>    Although strings that are consumed in PRECIS-based application
>    protocols are often encoded using UTF-8 [RFC3629], the exact encoding
>    is a matter for the application protocol that uses PRECIS, not for
>    the PRECIS framework.
>
> Thus, for instance, it's fine for RFC 7622, which defines the address
> format in XMPP, to specify an encoding of UTF-8, but not for 7613bis or
> 7700bis to do so.
>
> I notice that RFC 5890 (for IDNA) has text like this
>
>    o  A "U-label" is an IDNA-valid string of Unicode characters, in
>       Normalization Form C (NFC) and including at least one non-ASCII
>       character, expressed in a standard Unicode Encoding Form (such as
>       UTF-8).
>
> Text similar to that might be best for 7613bis and 7700bis.
>
> Peter
