David Starner wrote:
Yes, I too wonder why the Web people would choose decimal for
their Unicode references, [...]
Probably because (one of) the most widespread browsers did not support
hexadecimal entities until fairly recently.
_ Marco
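To illustrate the two forms being compared, here is a minimal Python sketch showing that a decimal and a hexadecimal numeric character reference name the same code point. The sample character is an assumption chosen for illustration and does not come from the thread.

```python
import html

# U+0634 ARABIC LETTER SHEEN, chosen only as a sample character.
code_point = 0x0634

# Decimal form: the code point written in base 10.
decimal_ref = f"&#{code_point};"

# Hexadecimal form: lines up directly with the U+XXXX notation.
hex_ref = f"&#x{code_point:04X};"

# Both references decode to the same character.
decoded_decimal = html.unescape(decimal_ref)
decoded_hex = html.unescape(hex_ref)

print(decimal_ref, hex_ref, decoded_decimal == decoded_hex)
```

The hexadecimal form can be read straight off a Unicode code chart, while the decimal form needs a base conversion first.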
Roozbeh Pournader wrote:
...U+29C8.
But it's not a math symbol (from that document, it seems that Squared
Square is a binary operator). It's a bullet. Unify?
No one else has responded to the list yet about this.
According to the 3.0 book, under Character Properties,
4.9 Mathematical
Peter Constable wrote:
It seems to me that you are still missing the point I'm making.
end quote
Peter Constable then quoted part of a sentence that I had written.
For example, in everyday use of the English language, if I write the word
horse then you have a knowledge of what that word means
On Thu, 26 Apr 2001, Marco Cimarosti wrote:
As a second thought, using codes higher than 0x0FFF is even safer,
because it also accounts for the fact that, theoretically, ISO 10646 uses 31
bits.
But this was the Unicode mailing list ;)
Of course, all this is only possible for
On Thu, 26 Apr 2001, Marco Cimarosti wrote:
A. Intentional private use for non-exchanged data [...]
I agree that little or no coordination is needed for case A. If PUA
codepoints remain totally internal to an application, there is going to be
no interchange problem at all, as far as
Roozbeh Pournader wrote:
On Thu, 26 Apr 2001, Marco Cimarosti wrote:
A. Intentional private use for non-exchanged data [...]
I have some objection. One should not use PUA codes for
internal purposes [...]
If only a few internal ones needed, use noncharacters like the ones in
Roozbeh Pournader wrote:
Surrogating the noncharacters in the FDD0..FDEF range works
for internally 16-bit apps.
OK. But you won't implement Devanagari rendering with 32 glyphs...
At the risk of being the victim of the first digital autodafé, I will add
that codes DC00..DFFF (Low Surrogate)
On Thu, 26 Apr 2001, Marco Cimarosti wrote:
OK. But you won't implement Devanagari rendering with 32 glyphs...
If you want to render Devanagari, please use some other mechanisms, not
simple character streams. Change the internals to 32-bit, or keep other
things with the codepoints. Yes,
Matitiahu Allouche (Mati) has prepared a document Guidelines
of a Logical User Interface for Editing Bidirectional Text.
It is being discussed at the SII.
With his kind permission, I have placed it at
http://www.qsm.co.il/Hebrew/logicUI22.htm
Thanks for the link! Matitiahu's document is
-Original Message-
From: Marco Cimarosti [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 26, 2001 7:55 PM
Matitiahu Allouche (Mati) has prepared a document Guidelines
of a Logical User Interface for Editing Bidirectional Text.
It is being discussed at the SII.
With his
On Thu, Apr 26, 2001 at 09:16:42AM -0700, Paul Deuter wrote:
I am wondering if there isn't a need for the Unicode Spec to also
dictate a way of encoding Unicode in an ASCII stream. Perhaps
the %u is already that and I am just ignorant. Another
alternative would be to use the U+
Paul Deuter wrote:
I am wondering if there isn't a need for the Unicode Spec to also
dictate a way of encoding Unicode in an ASCII stream. Perhaps
How many more ways do we need?
To be 8-bit-friendly, we have UTF-8.
To get everything into ASCII characters, we have UTF-7.
W3C specifies to use
From: William Overington [EMAIL PROTECTED]
I have updated my suggestion. Here is the latest version for discussion.
Let's consider the fact that what you are looking for is summarized at the
end of your message: I hope to gain fairly widespread agreement within the
unicode user community. I
Based on the responses, I guess my original question/problem was not
very well written.
UTF-7 won't work because it cannot be distinguished from ASCII without
something that identifies it as UTF-7.
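The point about UTF-7 can be sketched in Python as follows. This is only an illustration using Python's built-in "utf-7" codec and a sample string of my own choosing: the encoded bytes are all ASCII, so nothing in the stream itself marks it as UTF-7.

```python
# Sample text with one non-ASCII character (an assumption for illustration).
text = "caf\u00e9"

utf7_bytes = text.encode("utf-7")

# Every byte is in the ASCII range, so the stream also decodes as plain ASCII.
as_ascii = utf7_bytes.decode("ascii")

# Only a receiver that already knows the stream is UTF-7 recovers the text.
round_trip = utf7_bytes.decode("utf-7")

print(utf7_bytes, as_ascii != text, round_trip == text)
```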
The %XX idea does not work because this is already in use by lots of
software
to encode many
On 04/26/2001 06:14:21 PM William Overington wrote:
Peter Constable asks If I write chat, do you know what I mean?.
Hmm, let me ponder! :-)
Is it possible that you are referring to the answer that an Australian
numismatist might give if asked what is the bird on the reverse of a
British
On 04/27/2001 03:23:36 AM unicode-bounce wrote:
From: William Overington [EMAIL PROTECTED]
I have updated my suggestion. Here is the latest version for
discussion.
Let's consider the fact that what you are looking for is summarized at the
end of your message: I hope to gain fairly widespread
Pollard has been smiled upon from the divine realm. There is now a
mailing list set up to discuss it and its encoding in Unicode.
Instructions follow:
Your list is ready. [EMAIL PROTECTED]. People can subscribe by the
usual method: send a blank message to [EMAIL PROTECTED] and say
Paul,
It sounds like you want URLs in UTF-8 and data in some code page. The HTTP
header protocol only allows a single charset specification. Even if you
pass UTF-8 URLs the browser should not handle them properly unless you also
have the data in UTF-8 as well.
If you send pages in UTF-8 you
William Overington wrote:
I have updated my suggestion. Here is the latest version for discussion.
...
Specific protocols to use with such tagging can be devised.
...
The suggestion is open for discussion and I hope to gain fairly widespread
agreement within the unicode user community.
And
W3C specifies to use %-encoded UTF-8 for URLs.
I think that's an overstatement.
Neither the W3C nor the IETF make such a specification.
http://www.w3.org/TR/charmod/#sec-URIs
contains many ambiguities, conflicts with XML and HTTP,
and is not yet a recommendation.
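For readers unfamiliar with the convention being debated, here is a minimal Python sketch of %-encoding over UTF-8 using the standard urllib.parse module. The sample string is an assumption, and nothing here implies that any specification mandates this encoding. The last step shows why the escapes alone do not identify the character set.

```python
from urllib.parse import quote, unquote

# Sample Hebrew word, chosen only for illustration.
text = "\u05e9\u05dc\u05d5\u05dd"

# quote() converts to UTF-8 by default, then escapes each byte as %XX.
encoded = quote(text, safe="")

# Decoding with the same assumption (UTF-8) recovers the text.
decoded = unquote(encoded)

# The same escapes read under another charset yield different text,
# because %XX carries bytes, not a charset label.
misread = unquote(encoded, encoding="iso-8859-1")

print(encoded, decoded == text, misread == text)
```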
At 11:28 01/04/26 -0700, Markus Scherer wrote:
Paul Deuter wrote:
I am wondering if there isn't a need for the Unicode Spec to also
dictate a way of encoding Unicode in an ASCII stream. Perhaps
How many more ways do we need?
To be 8-bit-friendly, we have UTF-8.
To get everything into ASCII
Hello Paul,
At 19:41 01/04/25 -0700, Paul Deuter wrote:
I am struggling to figure out the correct method for encoding Unicode
characters in the
query string portion of a URL.
There is a W3C spec that says the Unicode character should be converted to
UTF-8 and
then each byte should be encoded as
At 15:02 01/04/26 -0700, Paul Deuter wrote:
Based on the responses, I guess my original question/problem was not
very well written.
The %XX idea does not work because this is already in use by lots of
software
to encode many different character sets. So again we need something that
identifies
Hello Mike,
At 19:09 01/04/26 -0600, Mike Brown wrote:
W3C specifies to use %-encoded UTF-8 for URLs.
I think that's an overstatement.
Neither the W3C nor the IETF make such a specification.
True. Neither W3C nor IETF make such a general statement,
because we can't just remove the about 10
Wm Seán Glen asked:
Couldn't one just embed the glyphs that aren't specified by Unicode along
with the text?
end quote
Yes one could, in a file such as a Word document file where the format of
the Word file can handle the embedding of illustrations.
However, if one is using a plain unicode
I have updated my suggestion. Here is the latest version for discussion.
Let there exist the idea that there is U+12 (PUA INTERPRETATION TAG) and
a set of private use area tag characters (U+100020 to U+10007F) all of
which code points are in the upper private use area.
May I suggest that
Thanks Addison.
I appreciate that the UTF-8 solution is the right one.
However we must acknowledge that this right solution does not
appear to be implemented anywhere. And I have come to the
conclusion that it also will not be.
The reason is the one that you mentioned: because the %XX
On Mon, 23 Apr 2001, Mike Brown wrote:
A char corresponds to a Unicode value -- a UTF-16 code value, which could
either represent a Unicode character or one half of a surrogate pair. In the
latter case, it would take a sequence of two chars to make one Unicode
character. It is my
28 matches