On Wed, Nov 16, 2016 at 10:01:10PM +0900, Job Snijders wrote:
> Unless some implementors make significant arguments along the lines of
> "we CANNOT implement this Shutdown Communication functionality, SOLELY
> because of utf8 and lack of representation filtering capabilities", i'd
> water down the utf8 requirement to 7bit ascii (because in the end its
> better to have 'something' than nothing). Another line of argumentation
> against utf8 would be if major security concerns are articulated.

In general, IESG comments will push any user-displayable string to UTF-8
anyway, so I wouldn't stress over this being the requirement. It's pretty
much an IETF-wide expectation these days.

I think your doc is the first one that I've seen bothering to cite the
Unicode considerations documents. Ideally that'd be a pointer to somewhere
else in a single ref, but I don't think I've seen such an IETF document.

> I hope to capture in the draft that an implementation can choose which
> characters of the Shutdown Communication they represent in the syslog or
> 'show bgp neighbor xxx' output. For instance, I'd recommend to squash
> all newline/newpage/newfeed/newparagraph style chars and make sure that
> the Communication is represented on a single line. I don't have the
> proper words for the draft to express that (yet).

Again, perhaps too much to tackle in this document. A portion of what
you're interested in is covered under the control characters section:
https://en.wikipedia.org/wiki/Unicode_control_characters

If you try to get too normative you're going to spend a huge amount of text
trying to close all of the holes.

> Also I don't mind if an implementation consciously chooses to only
> represent 7bit ASCII. That should be an implementor decision. They can
> upgrade later. In theory the protocol spec shouldn't be delayed or
> obstructed due to an implementor's current internationalisation
> capabilities (which can change over time).

ASCII is conformant UTF-8, which is one of the nice properties of that
encoding. What tends to be problematic in many people's implementations is
emitting Latin-1 or similar encodings that are 8-bit as if they're valid
UTF-8, which they're not.

The better question is what the general expected behavior is when things
cannot be displayed. Some of that will depend on the i18n capabilities of
someone's implementation.

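To make the display side concrete, here is a rough sketch (Python, purely
illustrative) of one way an implementation might render a received
Shutdown Communication on a single line. The function name
render_single_line and the ascii_only flag are hypothetical, not taken from
the draft or from any implementation. It assumes the message arrives as raw
bytes, decodes them strictly as UTF-8 so that Latin-1 bytes show up as
U+FFFD rather than being passed through, and squashes control characters
and line/paragraph separators to spaces. Format characters such as bidi
overrides are deliberately left alone here; that is one of the holes a
fuller treatment would need to close.

```python
import unicodedata


def render_single_line(raw: bytes, ascii_only: bool = False) -> str:
    """Hypothetical helper: make a received message safe for one-line display."""
    # Strict UTF-8 decoding: bytes that are not valid UTF-8 (for example
    # Latin-1 sent as if it were UTF-8) become U+FFFD instead of leaking out.
    text = raw.decode("utf-8", errors="replace")

    out = []
    for ch in text:
        cat = unicodedata.category(ch)
        if cat in ("Cc", "Zl", "Zp"):
            # Cc = control characters (LF, CR, FF, ...),
            # Zl/Zp = Unicode line and paragraph separators.
            out.append(" ")
        elif ascii_only and ord(ch) > 0x7F:
            # An implementation that only displays 7-bit ASCII substitutes
            # a placeholder for everything else.
            out.append("?")
        else:
            out.append(ch)

    # Collapse runs of whitespace so the result is a single tidy line.
    return " ".join("".join(out).split())


if __name__ == "__main__":
    print(render_single_line(b"maintenance\r\nstarts\x0c now"))
    print(render_single_line("caf\u00e9".encode("latin-1")))
```

What to substitute for undisplayable characters ("?" above) is exactly the
open question about expected behavior; the sketch just picks one answer.
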
-- Jeff

_______________________________________________
GROW mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/grow
