On 6/29/2014 11:44 AM, Koji Ishii wrote:
Surrogate code points, private-use characters, and control characters are not 
given the Default_Ignorable_Code_Point property. To avoid security problems, 
such characters or code points, when not interpreted and not displayable by 
normal rendering, should be displayed in fallback rendering with a fallback 
glyph
By looking at this, my questions are as follows:

1. Should control characters that browsers do not interpret be displayed in 
fallback rendering?
2. Should private-use characters (U+E000..U+F8FF, U+F0000..U+FFFFD, U+100000..U+10FFFD) 
without glyphs be displayed in fallback rendering?

The answer to these two questions is probably yes, from what I understand of 
the text quoted above,

Displaying a fallback rendering alerts the user that something is present that would normally not be visible.

However, these are not the only invisible characters, and many should not (must not) be rendered, ever (except in diagnostic modes). So, it is a bit unclear to me what precisely this recommendation buys you, as it is incomplete.

The recommendation is prefixed with "To avoid security problems,...". If this is taken to mean that it should apply in contexts that require strict attention to security issues, then it probably defines a minimum of what should be done, and other measures need to be taken in addition.
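
As a rough illustration of which code points the quoted passage names, here is a minimal Python sketch. It assumes that the three classes map onto the General_Category values Cc (control), Co (private use), and Cs (surrogate); the names FALLBACK_CATEGORIES and fallback_candidate are hypothetical and come from neither the standard nor any library. It does not check Default_Ignorable_Code_Point, and it does not check whether a font has a glyph or whether the application interprets the character.

```python
import unicodedata

# Assumption: the three classes in the quoted passage correspond to these
# General_Category values. The labels are illustrative only.
FALLBACK_CATEGORIES = {
    "Cc": "control",
    "Co": "private-use",
    "Cs": "surrogate",
}

def fallback_candidate(cp):
    """Return a label if cp is in one of the three named classes, else None."""
    return FALLBACK_CATEGORIES.get(unicodedata.category(chr(cp)))

for cp in (0x0007, 0xE000, 0xF0000, 0xD800, 0x0041, 0x200B):
    print(f"U+{cp:04X}", fallback_candidate(cp))
```

U+200B ZERO WIDTH SPACE has category Cf and so falls outside all three classes, which is one example of an invisible character that the recommendation does not cover.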

but things get harder the more I think:

3. When the above text says “surrogate code points”, does that mean everything 
outside the BMP? It reads so to me, but I’m surprised that characters in the BMP 
and outside the BMP have such differences, so I’m doubting my English skill.

No, those would be supplementary code points. Surrogates are values that are intended to be used in pairs as code units in UTF-16. Ill-formed data may contain unpaired values; those are referred to as surrogate code points.
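
To make the distinction concrete, here is a small Python sketch. The helper names (is_surrogate, is_supplementary, unpaired_surrogates) are hypothetical and chosen only for illustration. It assumes the input is a list of 16-bit UTF-16 code units given as integers.

```python
def is_surrogate(cp):
    # Surrogate code points occupy U+D800..U+DFFF, inside the BMP.
    return 0xD800 <= cp <= 0xDFFF

def is_supplementary(cp):
    # Supplementary code points are everything outside the BMP.
    return 0x10000 <= cp <= 0x10FFFF

def unpaired_surrogates(units):
    """Return indices of surrogate code units not part of a well-formed pair."""
    bad = []
    i = 0
    while i < len(units):
        u = units[i]
        has_next = i + 1 < len(units)
        # A well-formed pair is a high surrogate followed by a low surrogate.
        if 0xD800 <= u <= 0xDBFF and has_next and 0xDC00 <= units[i + 1] <= 0xDFFF:
            i += 2
            continue
        if is_surrogate(u):
            bad.append(i)
        i += 1
    return bad

# U+1F600 is a supplementary code point; UTF-16 encodes it as a surrogate pair.
data = "\U0001F600".encode("utf-16-be")
pair = [int.from_bytes(data[0:2], "big"), int.from_bytes(data[2:4], "big")]
print([hex(u) for u in pair], unpaired_surrogates(pair))
print(unpaired_surrogates([0x41, 0xD800, 0x42]))
```

The well-formed pair produces no unpaired indices, while the lone 0xD800 in the second call is reported. Only that second case is what the quoted text means by a surrogate code point appearing in data.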

4. Should every code point that is not given the Default_Ignorable_Code_Point 
property, and that has neither an interpretation nor a glyph, be displayed in 
fallback rendering? I could not find such a statement in the Unicode spec, but 
there are some people who believe so.
5. Is there anything else Unicode recommends to display in fallback rendering, 
or not to display? This must be RTFM, but pointing out where to read would be 
appreciated.
