Doug Ewell wrote:
When 137,468 private-use characters aren't enough?
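For reference, the 137,468 figure can be reproduced from the three private-use ranges in the Unicode Standard. This is only a small arithmetic sketch; the names `PRIVATE_USE_RANGES` and `count_private_use` are made up for illustration.

```python
# Private-use ranges in the Unicode Standard (inclusive bounds).
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
]


def count_private_use(ranges=PRIVATE_USE_RANGES):
    """Return the total number of code points covered by the given ranges."""
    return sum(high - low + 1 for low, high in ranges)


total = count_private_use()
print(total)  # 6,400 + 65,534 + 65,534 = 137468
```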
In my opinion, a base character plus tag sequence has the potential to
be used for many large scale applications for the future.
A base character plus tag sequence encoding has the advantage over a
Private Use Area encoding
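As a rough illustration of what "a base character plus tag sequence" looks like at the code point level, here is a minimal sketch. It assumes the same shape as the existing emoji subdivision flags: a base character, then tag characters from U+E0020..U+E007E (which mirror printable ASCII), then CANCEL TAG U+E007F. The helper names are hypothetical, and nothing here says how any proposed application would assign meaning to the tags.

```python
TAG_OFFSET = 0xE0000   # tag character for ASCII c is U+E0000 + ord(c)
CANCEL_TAG = 0xE007F   # terminates a tag sequence


def make_tag_sequence(base: str, tag: str) -> str:
    """Build base + tag characters + CANCEL TAG (hypothetical helper)."""
    if len(base) != 1:
        raise ValueError("base must be a single code point")
    for ch in tag:
        if not 0x20 <= ord(ch) <= 0x7E:
            raise ValueError("tag must be printable ASCII")
    tags = "".join(chr(TAG_OFFSET + ord(ch)) for ch in tag)
    return base + tags + chr(CANCEL_TAG)


def read_tag_sequence(seq: str) -> tuple[str, str]:
    """Split a sequence built by make_tag_sequence back into (base, tag)."""
    if len(seq) < 2 or ord(seq[-1]) != CANCEL_TAG:
        raise ValueError("sequence does not end with CANCEL TAG")
    base, tags = seq[0], seq[1:-1]
    for ch in tags:
        if not 0xE0020 <= ord(ch) <= 0xE007E:
            raise ValueError("unexpected character inside tag sequence")
    return base, "".join(chr(ord(ch) - TAG_OFFSET) for ch in tags)


# The flag of Scotland is encoded this way: U+1F3F4 plus the tag "gbsct".
scotland = make_tag_sequence("\U0001F3F4", "gbsct")
print([f"U+{ord(c):04X}" for c in scotland])
```

Unlike a Private Use Area code point, the sequence carries its own label in plain ASCII-derived tags, so `read_tag_sequence(scotland)` recovers `("\U0001F3F4", "gbsct")` without an external agreement about what a given code point means.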
On 23/03/2020 03:56, Markus Scherer via Unicode wrote:
> On Sat, Mar 21, 2020 at 12:35 PM Doug Ewell via Unicode
> wrote:
>
>> I thought the whole premise of GB18030 was that it was Unicode mapped into
>> a GB2312 framework. What characters exist in GB18030 that don't exist in
>> Unicode, and
On Sat, Mar 21, 2020 at 12:35 PM Doug Ewell via Unicode
wrote:
> I thought the whole premise of GB18030 was that it was Unicode mapped into
> a GB2312 framework. What characters exist in GB18030 that don't exist in
> Unicode, and have they been proposed for Unicode yet, and why was none of
> the
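To make the "Unicode mapped into a GB2312 framework" idea concrete, here is a small sketch using Python's built-in `gb18030` codec. It only shows how that codec encodes a few Unicode code points (one-, two- and four-byte forms) and that they round-trip. It does not settle the question asked above, namely whether some GB18030 characters lack a Unicode counterpart.

```python
def gb18030_hex(ch: str) -> str:
    """Return the GB18030 encoding of a single character as hex digits."""
    return ch.encode("gb18030").hex()


samples = [
    "A",            # ASCII stays one byte
    "\u4e2d",       # a GB2312 character, two bytes
    "\u0080",       # first code point in the four-byte range
    "\U00010000",   # first supplementary code point
    "\U0010FFFF",   # last Unicode code point
]

for ch in samples:
    encoded = ch.encode("gb18030")
    # Decoding what we just encoded should give the same code point back.
    assert encoded.decode("gb18030") == ch
    print(f"U+{ord(ch):04X} -> {encoded.hex()}")
```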
On Sat, 21 Mar 2020 13:33:18 -0600
Doug Ewell via Unicode wrote:
> Eli Zaretskii wrote:
> > Emacs uses some of that for supporting charsets that cannot be
> > mapped into Unicode. GB18030 is one example of such charsets. The
> > internal representation of characters in Emacs is UTF-8, so it
Eli Zaretskii wrote:
>> When 137,468 private-use characters aren't enough?
>
> Why is that relevant to the issue at hand?
You're right. I did ask what the uses of non-standard UTF-8 were, and you gave
me an example.
> I don't remember off hand, but last time I looked at GB18030, there
> were a
On 2020-03-21, Eli Zaretskii via Unicode wrote:
>> Date: Sat, 21 Mar 2020 11:13:40 -0600
>> From: Doug Ewell via Unicode
>>
>> Adam Borowski wrote:
>>
>> > Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF
>> > or U+11000..U+7FFF (or possibly even up to 2³⁶ or 2⁴²),
> From: "Doug Ewell"
> Cc:
> Date: Sat, 21 Mar 2020 13:33:18 -0600
>
> > Emacs uses some of that for supporting charsets that cannot be mapped
> > into Unicode. GB18030 is one example of such charsets. The internal
> > representation of characters in Emacs is UTF-8, so it uses 5-byte
> >
Eli Zaretskii wrote:
>>> Also, UTF-8 can carry more than Unicode -- for example,
>>> U+D800..U+DFFF or U+11000..U+7FFF (or possibly even up to 2³⁶ or
>>> 2⁴²), which has its uses but is not well-formed Unicode.
>>
>> I'd be interested in your elaboration on what these uses are.
>
> Emacs uses
> Date: Sat, 21 Mar 2020 11:13:40 -0600
> From: Doug Ewell via Unicode
>
> Adam Borowski wrote:
>
> > Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF
> > or U+11000..U+7FFF (or possibly even up to 2³⁶ or 2⁴²), which has
> > its uses but is not well-formed Unicode.
>
Adam Borowski wrote:
> Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF
> or U+11000..U+7FFF (or possibly even up to 2³⁶ or 2⁴²), which has
> its uses but is not well-formed Unicode.
I'd be interested in your elaboration on what these uses are.
--
Doug Ewell |
On 20/03/2020 23:41, Adam Borowski via Unicode wrote:
> Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF or
> U+11000..U+7FFF (or possibly even up to 2³⁶ or 2⁴²), which has its uses
> but is not well-formed Unicode.
This would definitely no longer be UTF-8! Martin.
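The disagreement is easier to see in bytes. The sketch below assumes the quoted range is meant as U+110000..U+7FFFFFFF and implements the original 1-to-6-byte UTF-8 bit layout (as in RFC 2279), with no checks for surrogates or for values above U+10FFFF. The function names are hypothetical, and it does not cover the 2³⁶ or 2⁴² extensions mentioned above or whatever internal layout Emacs actually uses. Python's strict decoder is used as the test for "well-formed".

```python
def encode_extended_utf8(cp: int) -> bytes:
    """Encode an integer with the 1..6-byte UTF-8 layout, without validity checks."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    if cp < 0x80:
        return bytes([cp])
    # (payload bits, total length, lead-byte marker) for each multi-byte form
    forms = ((11, 2, 0xC0), (16, 3, 0xE0), (21, 4, 0xF0),
             (26, 5, 0xF8), (31, 6, 0xFC))
    for bits, length, lead in forms:
        if cp < (1 << bits):
            break
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (cp & 0x3F))  # low six bits go in a trail byte
        cp >>= 6
    return bytes([lead | cp] + tail[::-1])


def is_well_formed_utf8(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


for cp in (0x20AC, 0x10FFFF, 0xD800, 0x110000, 0x7FFFFFFF):
    data = encode_extended_utf8(cp)
    print(f"{cp:#x}: {data.hex(' ')} well-formed={is_well_formed_utf8(data)}")
```

For U+20AC and U+10FFFF the output matches standard UTF-8. The surrogate 0xD800 and everything from 0x110000 upward use the same bit pattern but are rejected by a conforming decoder, which is the sense in which such data "can be carried" yet is no longer UTF-8.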
data format.
> > CSV is a text data format.
> >
> > Question #1: Is the binaryness/textness of a data format a
> > property?
> >
> > Question #2: If the answer to Question #1 is yes, then what is the
> > name of this binaryness/textness property?
I'd sug
On Fri, Mar 20, 2020 at 07:22:45AM -0700, J Decker via Unicode wrote:
> On Fri, Mar 20, 2020 at 5:48 AM Adam Borowski via Unicode <
> > For example, most Unix-heads will tell you that UTF16LE is a binary rather
> > than text format. Microsoft employees and some members of this list will
> >
> JPEG is a binary data format.
> > CSV is a text data format.
> >
> > Question #1: Is the binaryness/textness of a data format a property?
> >
> > Question #2: If the answer to Question #1 is yes, then what is the name
> > of this binaryness/textness pro
On Fri, Mar 20, 2020 at 12:21:26PM +, Costello, Roger L. via Unicode wrote:
> [Definition] Property: an attribute, quality, or characteristic of something.
>
> JPEG is a binary data format.
> CSV is a text data format.
>
> Question #1: Is the binaryness/textness of a data
#1: Yes.
#2: [ my suggestion ] File type category
A.D.
-Original Message-
From: Unicode On Behalf Of Costello, Roger L.
via Unicode
Sent: Friday, 20 March 2020 13:21
To: unicode@unicode.org
Subject: Is the binaryness/textness of a data format a property?
Hello Data
Hello Data Format Experts!
[Definition] Property: an attribute, quality, or characteristic of something.
JPEG is a binary data format.
CSV is a text data format.
Question #1: Is the binaryness/textness of a data format a property?
Question #2: If the answer to Question #1 is yes, then what