Why multiple pairs?
If the intent is just to mark, within the encoded text, those sequences that
are not interpretable as plain text alone because they mix characters
from an upper-layer syntax with characters from a pictographic script, a
single pair of format controls would be enough.
William_J_G Overington wrote:
Would it be a good idea to define a new block of characters within
Unicode/10646 such that characters would be encoded in pairs, possibly
with visible glyphs as context-specific markup brackets?
[...]
I am thinking that this would mean that where some
On Wednesday 28 November 2012, Doug Ewell d...@ewellic.org wrote:
William_J_G Overington wjgo underscore 10009 at btinternet dot com wrote:
For example, there is my research on communication through the language
barrier...
No, stop right there. This is an excellent example of
2012/11/28 Doug Ewell d...@ewellic.org
Using the PUA to extend Unicode substantially beyond what a character
encoding standard is supposed to be, and (especially) expecting others
to adopt that non-character PUA usage, or expecting it to be ipso facto
a step toward formal encoding, is
William_J_G Overington wjgo underscore 10009 at btinternet dot com
wrote:
Do NOT try to make this system conceptually part of Unicode.
Well, consider please the following example, from a simulation, of the
text of a plain text email.
Margaret Gattenford
[...]
Embedding these items
Yes I know and I was clear about this that this was not in scope of the
current standard encoding policy
Which however still does not prevent another upward-compatible standard to
emerge using another encoding policy (e.g. for encoding glyphs or corporate
logos, in an Internet-based registry, with
Philippe Verdy verdy underscore p at wanadoo dot fr wrote:
Date: Wed, 28 Nov 2012 09:04:41 +0100
Yes I know and I was clear about this that this was not in scope of
the current standard encoding policy
Which however still does not prevent another upward-compatible
standard to emerge using
Philippe is (apparently) referring to higher-level protocols for markup of
hieroglyphic text. See, e.g., Table 14-10 and Figure 14-2, p. 489 in Section
14.18, Egyptian Hieroglyphs in TUS 6.2:
http://www.unicode.org/versions/Unicode6.2.0/ch14.pdf
Similar kinds of higher-level protocols are
Well, first, it is 17 planes (or have we switched to using hexadecimal
numbers on the Unicode list already?)
Second, of course this is in connection with UTF-16. I wasn't involved
when UTF-16 was created, but it must have become clear that 2^16 (^
denotes exponentiation (to the power of))
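
For readers following the arithmetic, the link between UTF-16 and the figure of 17 planes can be sketched as below. This is only an illustration based on the published UTF-16 surrogate ranges; the helper name to_surrogate_pair is made up for this example and is not taken from the thread.

```python
# Sketch: why UTF-16 surrogate pairs lead to exactly 17 planes.
# The constants are the standard UTF-16 ranges; the function name is hypothetical.

BMP_SIZE = 2 ** 16            # code points in one plane
HIGH_START = 0xD800           # first high (lead) surrogate
LOW_START = 0xDC00            # first low (trail) surrogate
SURROGATES_PER_HALF = 2 ** 10 # 1024 high and 1024 low surrogates

# Each (high, low) pair addresses one supplementary code point.
supplementary = SURROGATES_PER_HALF * SURROGATES_PER_HALF  # 2^20
total_code_points = BMP_SIZE + supplementary
planes = total_code_points // BMP_SIZE


def to_surrogate_pair(cp):
    """Split a supplementary code point into its UTF-16 (high, low) pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


if __name__ == "__main__":
    print(planes)                                   # number of planes
    print([hex(u) for u in to_surrogate_pair(0x10FFFF)])
```

The 2^20 pairs cover 16 supplementary planes, which together with the BMP gives the 17 planes mentioned above.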
That's a valid computation if the extension was limited to use only
2-surrogate encodings for supplementary planes.
If we could use 3-surrogate encodings, you'd need
3*2^n surrogates
to encode
2^(3*n)
new codepoints.
With n=10 (like today), this requires a total of 3072 surrogates, and you
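
The counting in this message can be checked with a few lines of Python. This is a sketch under one assumption not spelled out in the snippet: each of the k positions in a sequence draws from its own disjoint set of 2^n surrogates. The function name surrogate_budget is invented for the illustration.

```python
# Sketch of the surrogate arithmetic quoted above.
# Assumption: a k-surrogate sequence uses k disjoint sets of 2^n surrogates,
# one set per position. The function name is hypothetical.

def surrogate_budget(n, k):
    """Return (surrogates reserved, code points addressable)."""
    surrogates = k * 2 ** n      # k sets of 2^n surrogates each
    code_points = 2 ** (k * n)   # one choice of 2^n per position
    return surrogates, code_points


if __name__ == "__main__":
    for k in (2, 3):
        reserved, reachable = surrogate_budget(10, k)
        print(f"k={k}: {reserved} surrogates encode {reachable} code points")
```

With n=10 and k=2 this gives the 2048 surrogates and 2^20 supplementary code points of UTF-16 as it exists; with k=3 it gives the 3072 surrogates cited in the message, addressing 2^30 code points.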
Note that the **current bet** that the existing 17 planes will be sufficient
is valid only if there's no international desire to encode something other
than just what is in the current focus of Unicode.
Say (for example) that the WIPO absolutely wants to encode corporate logos.
Or ISO or the IETF
On Tuesday 27 November 2012, Philippe Verdy verd...@wanadoo.fr wrote:
This is not complicated to parse in the forward direction, but in the
backward direction, it means that when you see the final low surrogate, you
still need to roll back to the previous position: it can only be a
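
As a point of comparison, backward stepping in UTF-16 as it exists today needs to look at no more than one extra code unit. The sketch below covers only that standard two-unit case, not the three-surrogate scheme under discussion, whose details are not given in this snippet. The function names are hypothetical.

```python
# Sketch: stepping backward over one code point in standard UTF-16.
# This is NOT the three-surrogate proposal; it shows the existing case only.
# Function names are hypothetical.

def is_high(unit):
    return 0xD800 <= unit <= 0xDBFF


def is_low(unit):
    return 0xDC00 <= unit <= 0xDFFF


def previous_code_point_start(units, i):
    """Given index i just past a code point, return the index where it starts."""
    if not 0 < i <= len(units):
        raise ValueError("index out of range")
    j = i - 1
    # A low surrogate preceded by a high surrogate forms one pair.
    if is_low(units[j]) and j > 0 and is_high(units[j - 1]):
        return j - 1
    # Anything else (BMP unit or unpaired surrogate) is a single unit.
    return j


if __name__ == "__main__":
    text = [0x41, 0xD83D, 0xDE00, 0x42]  # 'A', U+1F600, 'B'
    print(previous_code_point_start(text, 3))
```

Because high and low surrogates occupy disjoint ranges, the code unit itself tells the parser which half of a pair it is looking at, so one step back is always enough.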
There isn't an actual problem here which needs a solution, satisfactory or
otherwise. The persistence of the "17 planes may not be enough" meme on this
list is an interesting phenomenon in itself, but has no practical impact on any
of the actual ongoing work on maintenance of the encoding
To this, my mother would say: "Why keep it simple when we can make it
complicated?"
Regards, Martin.
On 2012/11/27 21:01, Philippe Verdy wrote:
That's a valid computation if the extension was limited to use only
2-surrogate encodings for supplementary planes.
If we could use 3-surrogate
Philippe Verdy wrote:
And it will still remain enough place in the remaining planes to
define later a few more surrogates of a new type, if really needed for
a future, upward compatible, standard if it ever comes to reality —
such as having an open registry of corporate logos or glyph designs,