Norbert echoes my thoughts perfectly:
Using a Unicode escape for non-textual data seems like abuse to me -
Unicode is a character encoding standard. For Unicode, anything beyond six
hex digits is excessive.
Allen, what use cases for using Unicode escapes / strings for non-textual
data did you
On Tue, Jan 24, 2012 at 5:14 PM, Allen Wirfs-Brock
al...@wirfs-brock.com wrote:
The current 16-bit character strings are sometimes used to store non-Unicode
binary data and can be used with non-Unicode character encodings with up to
16-bit chars. 21 bits is sufficient for Unicode but perhaps is
On Jan 24, 2012, at 11:45 PM, Norbert Lindenberg wrote:
I don't see the standard allowing character encodings other than UTF-16 in
strings. Section 8.4 says "When a String contains actual textual data, each
element is considered to be a single UTF-16 code unit." This aligns with
other
On Wed, Jan 25, 2012 at 12:46 PM, Allen Wirfs-Brock
al...@wirfs-brock.com wrote:
Arbitrary 16-bit values can be placed in a String using either
String.fromCharCode (15.5.3.2) or the \u notation in string literals.
Neither of these enforces a requirement that individual String elements are
The current 16-bit character strings are sometimes used to store non-Unicode
binary data and can be used with non-Unicode character encodings with up to
16-bit chars. 21 bits is sufficient for Unicode but perhaps is not enough
for other useful encodings. 32-bit seems like a plausible unit.
You can't use \u10FFFF as syntax, because that could be \u10FF followed by
literal FF. A better syntax is \u{...}, with 1 to 6 digits, values from 0
.. 10FFFF.
Mark
*— Il meglio è l’inimico del bene —*
[https://plus.google.com/114199149796022210033]
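The ambiguity Mark describes can be checked directly. The following is a minimal sketch, assuming an engine that implements ECMAScript 2015 or later, where a braced \u{...} escape was eventually standardized (at the time of this thread it was only a proposal). The variable names are illustrative and do not come from the thread.

```javascript
// A four-digit \u escape consumes exactly four hex digits, so the
// trailing "FF" is read as two ordinary characters.
const fourDigit = "\u10FFFF";
console.log(fourDigit.length);                 // 3
console.log(fourDigit === "\u10FF" + "FF");    // true

// The braced form delimits the digits explicitly, so there is no ambiguity.
// U+10FFFF lies outside the BMP and is stored as two UTF-16 code units.
const braced = "\u{10FFFF}";
console.log(braced.length);                        // 2
console.log(braced.codePointAt(0).toString(16));   // 10ffff
```

The braces are what make a variable digit count safe: the parser knows where the escape ends without guessing.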
On Wed, Jan 25, 2012 at 10:59,
(oh, and I agree with your other points)
Mark
On Wed, Jan 25, 2012 at 11:11, Mark Davis ☕ m...@macchiato.com wrote:
You can't use \u10FFFF as syntax, because that could be \u10FF followed by
literal
Mark--
Of course. Sorry. That should have been \U10FFFF is equivalent to
\udbff\udfff, with a capital U, or \u{10FFFF} is equivalent to \udbff\udfff.
--Rich
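The pair \udbff\udfff mentioned here is what UTF-16 surrogate arithmetic produces for the highest code point, U+10FFFF. The sketch below works through that arithmetic. The helper name toSurrogatePair is hypothetical and is not part of any proposal in this thread.

```javascript
// Hypothetical helper: split a supplementary code point (U+10000..U+10FFFF)
// into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError("expected a code point in U+10000..U+10FFFF");
  }
  const offset = codePoint - 0x10000;       // 20-bit value
  const high = 0xD800 + (offset >> 10);     // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);    // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x10FFFF);
console.log(high.toString(16), low.toString(16));                 // dbff dfff
console.log(String.fromCharCode(high, low) === "\udbff\udfff");   // true
```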
On Jan 25, 2012, at 11:11 AM, Mark Davis ☕ wrote:
You can't use \u10FFFF as syntax, because that could be \u10FF followed by
literal
On Jan 25, 2012, at 9:54 AM, John Tamplin wrote:
On Wed, Jan 25, 2012 at 12:46 PM, Allen Wirfs-Brock al...@wirfs-brock.com
wrote:
Arbitrary 16-bit values can be placed in a String using either
String.fromCharCode (15.5.3.2) or the \u notation in string literals.
Neither of these
On Wed, Jan 25, 2012 at 2:33 PM, Allen Wirfs-Brock al...@wirfs-brock.com wrote:
It isn't clear from your source code what encoding issues you have
actually identified. I suspect that you are talking about what happens
when an external resource (an application/javascript file) which may be in
On Jan 25, 2012, at 11:37 AM, John Tamplin wrote:
On Wed, Jan 25, 2012 at 2:33 PM, Allen Wirfs-Brock al...@wirfs-brock.com
wrote:
It isn't clear from your source code what encoding issues you have actually
identified. I suspect that you are talking about what happens when an
external
On Jan 25, 2012, at 10:59 AM, Gillam, Richard wrote:
The current 16-bit character strings are sometimes used to store non-Unicode
binary data and can be used with non-Unicode character encodings with up to
16-bit chars. 21 bits is sufficient for Unicode but perhaps is not enough
for other
On Wed, Jan 25, 2012 at 2:55 PM, Allen Wirfs-Brock al...@wirfs-brock.com wrote:
The primary intent of the proposal was to extend ES Strings to support a
uniform representation of all Unicode characters, including non-BMP. That means
that any Unicode character should occupy exactly one element
On Jan 25, 2012, at 12:25 PM, John Tamplin wrote:
On Wed, Jan 25, 2012 at 2:55 PM, Allen Wirfs-Brock al...@wirfs-brock.com
wrote:
The primary intent of the proposal was to extend ES Strings to support a
uniform representation of all Unicode characters, including non-BMP. That means
that any
On Tue, Jan 24, 2012 at 12:33 PM, Allen Wirfs-Brock
al...@wirfs-brock.com wrote:
Note that this proposal isn't currently under consideration for inclusion
in ES.next, but the answer to your question is below
[...]
Just as the current definition of string specifies that a String is a
sequence
On Jan 24, 2012, at 2:11 PM, Mark S. Miller wrote:
On Tue, Jan 24, 2012 at 12:33 PM, Allen Wirfs-Brock al...@wirfs-brock.com
wrote:
Note that this proposal isn't currently under consideration for inclusion in
ES.next, but the answer to your question is below
[...]
Just as the current
I don't see the standard allowing character encodings other than UTF-16 in
strings. Section 8.4 says "When a String contains actual textual data, each
element is considered to be a single UTF-16 code unit." This aligns with other
normative references to UTF-16 in sections 2, 6, and 15.1.3.
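The two positions in this discussion can both be observed in a running engine: string elements behave as UTF-16 code units for textual data, yet nothing stops a program from storing a 16-bit value that is not valid text on its own. A small sketch follows, with illustrative variable names that are not from the thread.

```javascript
// U+1D11E (musical G clef) written as a UTF-16 surrogate pair:
// one character, but two String elements.
const gClef = "\uD834\uDD1E";
console.log(gClef.length);                        // 2
console.log(gClef.charCodeAt(0).toString(16));    // d834
console.log(gClef.charCodeAt(1).toString(16));    // dd1e

// A lone surrogate is not well-formed UTF-16 text, but both
// String.fromCharCode and the \u escape accept it without complaint.
const lone = String.fromCharCode(0xD800);
console.log(lone.length);            // 1
console.log(lone === "\uD800");      // true
```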
http://wiki.ecmascript.org/doku.php?id=strawman:support_full_unicode_in_strings#unicode_escape_sequences
states:
To address this issue, a new form of UnicodeEscapeSequence is added that is
explicitly tagged as containing a variable number (up to 8) of hex digits.
The new definition is: