Base character plus tag sequences (from RE: Is the binaryness/textness of a data format a property?)

2020-03-23 Thread wjgo_10...@btinternet.com via Unicode
Doug Ewell wrote: When 137,468 private-use characters aren't enough? In my opinion, a base character plus tag sequence has the potential to be used for many large scale applications for the future. A base character plus tag sequence encoding has the advantage over a Private Use Area encoding

Re: Is the binaryness/textness of a data format a property?

2020-03-22 Thread Martin J . Dürst via Unicode
On 23/03/2020 03:56, Markus Scherer via Unicode wrote: > On Sat, Mar 21, 2020 at 12:35 PM Doug Ewell via Unicode > wrote: > >> I thought the whole premise of GB18030 was that it was Unicode mapped into >> a GB2312 framework. What characters exist in GB18030 that don't exist in >> Unicode, and

Re: Is the binaryness/textness of a data format a property?

2020-03-22 Thread Markus Scherer via Unicode
On Sat, Mar 21, 2020 at 12:35 PM Doug Ewell via Unicode wrote: > I thought the whole premise of GB18030 was that it was Unicode mapped into > a GB2312 framework. What characters exist in GB18030 that don't exist in > Unicode, and have they been proposed for Unicode yet, and why was none of > the

Re: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Richard Wordingham via Unicode
On Sat, 21 Mar 2020 13:33:18 -0600 Doug Ewell via Unicode wrote: > Eli Zaretskii wrote: > > Emacs uses some of that for supporting charsets that cannot be > > mapped into Unicode. GB18030 is one example of such charsets. The > > internal representation of characters in Emacs is UTF-8, so it

RE: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Doug Ewell via Unicode
Eli Zaretskii wrote: >> When 137,468 private-use characters aren't enough? > > Why is that relevant to the issue at hand? You're right. I did ask what the uses of non-standard UTF-8 were, and you gave me an example. > I don't remember off hand, but last time I looked at GB18030, there > were a

Re: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Julian Bradfield via Unicode
On 2020-03-21, Eli Zaretskii via Unicode wrote: >> Date: Sat, 21 Mar 2020 11:13:40 -0600 >> From: Doug Ewell via Unicode >> >> Adam Borowski wrote: >> >> > Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF >> > or U+110000..U+7FFFFFFF (or possibly even up to 2³⁶ or 2⁴²),

Re: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Eli Zaretskii via Unicode
> From: "Doug Ewell" > Cc: > Date: Sat, 21 Mar 2020 13:33:18 -0600 > > > Emacs uses some of that for supporting charsets that cannot be mapped > > into Unicode. GB18030 is one example of such charsets. The internal > > representation of characters in Emacs is UTF-8, so it uses 5-byte > >

RE: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Doug Ewell via Unicode
Eli Zaretskii wrote: >>> Also, UTF-8 can carry more than Unicode -- for example, >>> U+D800..U+DFFF or U+110000..U+7FFFFFFF (or possibly even up to 2³⁶ or >>> 2⁴²), which has its uses but is not well-formed Unicode. >> >> I'd be interested in your elaboration on what these uses are. > > Emacs uses

Re: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Eli Zaretskii via Unicode
> Date: Sat, 21 Mar 2020 11:13:40 -0600 > From: Doug Ewell via Unicode > > Adam Borowski wrote: > > > Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF > > or U+110000..U+7FFFFFFF (or possibly even up to 2³⁶ or 2⁴²), which has > > its uses but is not well-formed Unicode. >

Re: Is the binaryness/textness of a data format a property?

2020-03-21 Thread Doug Ewell via Unicode
Adam Borowski wrote: > Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF > or U+110000..U+7FFFFFFF (or possibly even up to 2³⁶ or 2⁴²), which has > its uses but is not well-formed Unicode. I'd be interested in your elaboration on what these uses are. -- Doug Ewell |

Re: Is the binaryness/textness of a data format a property?

2020-03-20 Thread Martin J . Dürst via Unicode
On 20/03/2020 23:41, Adam Borowski via Unicode wrote: > Also, UTF-8 can carry more than Unicode -- for example, U+D800..U+DFFF or > U+110000..U+7FFFFFFF (or possibly even up to 2³⁶ or 2⁴²), which has its uses > but is not well-formed Unicode. This would definitely no longer be UTF-8! Martin.
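
The point made in the reply above can be checked with a small sketch. It assumes CPython's built-in strict "utf-8" codec as a stand-in for a conforming decoder; the helper name `strict_utf8_accepts` and the sample byte strings are illustrative and do not come from the thread. The byte strings follow the UTF-8 bit layout for a surrogate, for a code point above U+10FFFF, and for a 5-byte form, and a strict decoder is expected to reject all three.

```python
def strict_utf8_accepts(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Illustrative byte strings (hypothetical names, chosen for this sketch).
LAST_VALID_10FFFF = b"\xf4\x8f\xbf\xbf"      # U+10FFFF, highest Unicode code point
SURROGATE_D800 = b"\xed\xa0\x80"             # bit layout of U+D800 (a surrogate)
ABOVE_MAX_110000 = b"\xf4\x90\x80\x80"       # bit layout of U+110000
FIVE_BYTE_200000 = b"\xf8\x88\x80\x80\x80"   # 5-byte form, bit layout of U+200000

samples = [
    ("U+10FFFF", LAST_VALID_10FFFF),
    ("U+D800", SURROGATE_D800),
    ("U+110000", ABOVE_MAX_110000),
    ("5-byte form", FIVE_BYTE_200000),
]

for label, raw in samples:
    verdict = "accepted" if strict_utf8_accepts(raw) else "rejected"
    print(f"{label}: {raw.hex(' ')} -> {verdict}")
```

Python also offers a non-strict `surrogatepass` error handler that lets the surrogate bytes through, which is one way such extended byte streams get handled in practice, but the result is then no longer well-formed UTF-8.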

Re: Is the binaryness/textness of a data format a property?

2020-03-20 Thread Richard Wordingham via Unicode
data format. > > CSV is a text data format. > > > > Question #1: Is the binaryness/textness of a data format a > > property? > > > > Question #2: If the answer to Question #1 is yes, then what is the > > name of this binaryness/textness property? I'd sug

Re: Is the binaryness/textness of a data format a property?

2020-03-20 Thread Adam Borowski via Unicode
On Fri, Mar 20, 2020 at 07:22:45AM -0700, J Decker via Unicode wrote: > On Fri, Mar 20, 2020 at 5:48 AM Adam Borowski via Unicode < > > For example, most Unix-heads will tell you that UTF16LE is a binary rather > > than text format. Microsoft employees and some members of this list will > >

Re: Is the binaryness/textness of a data format a property?

2020-03-20 Thread J Decker via Unicode
> JPEG is a binary data format. > > CSV is a text data format. > > > > Question #1: Is the binaryness/textness of a data format a property? > > > > Question #2: If the answer to Question #1 is yes, then what is the name > of > > this binaryness/textness pro

Re: Is the binaryness/textness of a data format a property?

2020-03-20 Thread Adam Borowski via Unicode
On Fri, Mar 20, 2020 at 12:21:26PM +0000, Costello, Roger L. via Unicode wrote: > [Definition] Property: an attribute, quality, or characteristic of something. > > JPEG is a binary data format. > CSV is a text data format. > > Question #1: Is the binaryness/textness of a data

AW: Is the binaryness/textness of a data format a property?

2020-03-20 Thread Dreiheller, Albrecht via Unicode
#1: Yes. #2: [ my suggestion ] File type category A.D. -Original Message- From: Unicode On Behalf Of Costello, Roger L. via Unicode Sent: Friday, 20 March 2020 13:21 To: unicode@unicode.org Subject: Is the binaryness/textness of a data format a property? Hello Data

Is the binaryness/textness of a data format a property?

2020-03-20 Thread Costello, Roger L. via Unicode
Hello Data Format Experts! [Definition] Property: an attribute, quality, or characteristic of something. JPEG is a binary data format. CSV is a text data format. Question #1: Is the binaryness/textness of a data format a property? Question #2: If the answer to Question #1 is yes, then what