On Wed, Nov 16, 2016 at 10:01:10PM +0900, Job Snijders wrote:
> Unless some implementors make significant arguments along the lines of
> "we CANNOT implement this Shutdown Communication functionality, SOLELY
> because of utf8 and lack of representation filtering capabilities", i'd
> water down the utf8 requirement to 7bit ascii (because in the end its
> better to have 'something' than nothing). Another line of argumentation
> against utf8 would be if major security concerns are articulated.

In general, IESG comments will push any user-displayable string to UTF-8
anyway, so I wouldn't stress over this being the requirement.  It's pretty
much an IETF-wide expectation these days.

I think your doc is the first one that I've seen bothering to cite the
Unicode considerations documents.  Ideally that'd be a single reference
pointing to one document that covers them all, but I don't think I've seen
such an IETF document.

> I hope to capture in the draft that an implementation can choose which
> characters of the Shutdown Communication they represent in the syslog or
> 'show bgp neighbor xxx' output. For instance, I'd recommend to squash
> all newline/newpage/newfeed/newparagraph style chars and make sure that
> the Communication is represented on a single line. I don't have the
> proper words for the draft to express that (yet).

Again, perhaps too much to tackle in this document.

A portion of what you're interested in is covered under the control
characters section:

https://en.wikipedia.org/wiki/Unicode_control_characters

If you try to get too normative you're going to spend a huge amount of text
trying to close all of the holes.
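
To make the single-line idea concrete, here is a minimal sketch of what an implementation might do before writing the Shutdown Communication to syslog or 'show' output. The function name `to_single_line` and the choice of Unicode general categories (Cc, Zl, Zp) are assumptions for illustration, not something the draft specifies; format characters (Cf) are deliberately left alone here because some scripts and emoji sequences rely on them.

```python
import unicodedata

# Hypothetical helper: not taken from the draft or any implementation.
# Cc covers C0/C1 controls (newline, carriage return, form feed, tab, ...);
# Zl and Zp are the Unicode line and paragraph separators (U+2028, U+2029).
SQUASHED_CATEGORIES = {"Cc", "Zl", "Zp"}


def to_single_line(message: str) -> str:
    """Return the message with line-breaking and control characters squashed."""
    replaced = "".join(
        " " if unicodedata.category(ch) in SQUASHED_CATEGORIES else ch
        for ch in message
    )
    # str.split() with no argument splits on runs of whitespace, so joining
    # with a single space collapses the gaps left by the replacement above.
    return " ".join(replaced.split())
```

For example, `to_single_line("maint\r\nwindow\u2028ticket\x0c 42")` yields one line with single spaces between the words. This is only one possible policy; it does not try to close every hole.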

> Also I don't mind if an implementation consciously chooses to only
> represent 7bit ASCII. That should be an implementor decision. They can
> upgrade later. In theory the protocol spec shouldn't be delayed or
> obstructed due to an implementor's current internationalisation
> capabilities (which can change over time).

ASCII is conformant UTF-8, which is one of the nice properties of that
encoding.  What tends to be problematic in many people's implementations is
emitting Latin-1 or similar 8-bit encodings as if they were valid UTF-8,
which they are not.
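
A small sketch of that distinction, using Python's strict UTF-8 decoder as the validity check. The helper name `is_valid_utf8` and the sample strings are made up for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


# Pure 7-bit ASCII: every byte is below 0x80, so it is also valid UTF-8.
ascii_bytes = "Upgrading router".encode("ascii")

# Latin-1 encodes "ü" as the single byte 0xFC, which can never appear
# in well-formed UTF-8.
latin1_bytes = "Wartung für Router".encode("latin-1")

# The same text encoded properly as UTF-8 uses two bytes for "ü".
utf8_bytes = "Wartung für Router".encode("utf-8")
```

Here `is_valid_utf8(ascii_bytes)` and `is_valid_utf8(utf8_bytes)` are true, while `is_valid_utf8(latin1_bytes)` is false.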

The better question is what the general expected behavior is when things
cannot be displayed.  Some of that will depend on the i18n capabilities of
someone's implementation.
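
One possible fallback, sketched under the assumption that the receiving side can only display 7-bit ASCII: substitute invalid input bytes with U+FFFD and escape anything outside ASCII instead of dropping it. The function name `render_for_ascii_terminal` is hypothetical, and other policies (dropping, replacing with '?') are equally defensible.

```python
def render_for_ascii_terminal(data: bytes) -> str:
    """Render received bytes on a display that only handles 7-bit ASCII."""
    # Bytes that are not well-formed UTF-8 become U+FFFD rather than
    # raising an error.
    text = data.decode("utf-8", errors="replace")
    # Non-ASCII characters are shown as escapes such as \xfc or \ufffd,
    # so the operator can still see that something was there.
    return text.encode("ascii", errors="backslashreplace").decode("ascii")
```

With this policy a valid UTF-8 "für" is shown as `f\xfcr`, and a stray Latin-1 byte shows up as `\ufffd`.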


-- Jeff

_______________________________________________
GROW mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/grow
