Re: Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Mark Davis ☕️ via Unicode
Filed the following, thanks Richard. CLDR-13445 Release link for "latest" goes to zip file On Tue, Dec 3, 2019 at 2:31 AM Richard

Re: Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Vishvas Vasuki
On Tue, Dec 3, 2019 at 7:28 AM Markus Scherer wrote: > > The subtag I would use for IAST seems to be: >> sa-Latn-t-sa-m0-iast (https://r12a.github.io/app-subtags/ is unable to >> confirm that the extension >> >> t-sa-m0-iast is
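As a rough illustration of how the tag quoted above breaks down, here is a minimal Python sketch. It assumes a simplified reading of BCP 47 and extension T (a language subtag, an optional four-letter script subtag, then a `t` singleton followed by a source tag and letter-digit field keys). The function name `split_tag` and the result layout are hypothetical, and the sketch does not check whether `iast` is actually a registered `m0` value.

```python
def split_tag(tag: str) -> dict:
    """Very rough split of a BCP 47 tag; not a full validating parser."""
    subtags = tag.split("-")
    result = {"language": subtags[0], "script": None,
              "t_source": [], "t_fields": {}}
    i = 1
    # A script subtag is four letters, e.g. "Latn".
    if i < len(subtags) and len(subtags[i]) == 4 and subtags[i].isalpha():
        result["script"] = subtags[i]
        i += 1
    # Skip anything else (region, variants) until the "t" singleton.
    while i < len(subtags) and subtags[i].lower() != "t":
        i += 1
    i += 1  # step past "t" (or past the end if there is no extension)
    key = None
    while i < len(subtags):
        s = subtags[i]
        if len(s) == 1:
            break  # another singleton starts a different extension
        if len(s) == 2 and s[0].isalpha() and s[1].isdigit():
            key = s  # field key such as "m0"
            result["t_fields"][key] = []
        elif key is None:
            result["t_source"].append(s)  # source language tag
        else:
            result["t_fields"][key].append(s)
        i += 1
    return result


parsed = split_tag("sa-Latn-t-sa-m0-iast")
plain = split_tag("sa-Latn")
```

Here `parsed` separates the content tag (`sa`, `Latn`) from the transform source (`sa`) and the mechanism field (`m0` with value `iast`), while `plain` has no extension data at all.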

Re: A neat description of encoding characters

2019-12-02 Thread Mark E. Shoulson via Unicode
On 12/2/19 7:01 AM, Costello, Roger L. via Unicode wrote: >From the book titled "Computer Power and Human Reason" by Joseph Weizenbaum, p.74-75 It's a reasonably good explanation of binary numbers and "encoding" in a more usual sense than we use it here in Unicode-land.  Actually makes for

Re: Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Richard Wordingham via Unicode
On Tue, 3 Dec 2019 01:27:39 + Richard Wordingham wrote: > On Mon, 2 Dec 2019 09:09:02 -0800 > Markus Scherer via Unicode wrote: > > > On Mon, Dec 2, 2019 at 8:42 AM Roozbeh Pournader via Unicode < > > unicode@unicode.org> wrote: > > > > > You don't need an ISO 15924 script code.

Re: A neat description of encoding characters

2019-12-02 Thread James Kass via Unicode
On 2019-12-03 12:59 AM, Richard Wordingham via Unicode wrote: On Mon, 2 Dec 2019 12:01:52 + "Costello, Roger L. via Unicode" wrote: From the book titled "Computer Power and Human Reason" by Joseph Weizenbaum, p.74-75 Suppose that the alphabet with which we wish to concern ourselves

Re: Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Markus Scherer via Unicode
On Mon, Dec 2, 2019 at 5:47 PM विश्वासो वासुकिजः (Vishvas Vasuki) via Unicode wrote: > But that says that the definitions are at >> > >> https://github.com/unicode-org/cldr/releases/tag/latest/common/bcp47/transform.xml >> , >> but all one currently gets from that is an error message 'XML

Re: Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Vishvas Vasuki
On Tue, Dec 3, 2019 at 6:59 AM Richard Wordingham via Unicode < unicode@unicode.org> wrote: > > > You don't need an ISO 15924 script code. You need to think in terms > > > of BCP 47. Sanskrit in Latin would be sa-Latn. > > > > > > > Right! > > > > Now, if you want to distinguish the different

Re: Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Richard Wordingham via Unicode
On Mon, 2 Dec 2019 09:09:02 -0800 Markus Scherer via Unicode wrote: > On Mon, Dec 2, 2019 at 8:42 AM Roozbeh Pournader via Unicode < > unicode@unicode.org> wrote: > > > You don't need an ISO 15924 script code. You need to think in terms > > of BCP 47. Sanskrit in Latin would be sa-Latn. > >

Re: A neat description of encoding characters

2019-12-02 Thread Richard Wordingham via Unicode
On Mon, 2 Dec 2019 12:01:52 + "Costello, Roger L. via Unicode" wrote: > From the book titled "Computer Power and Human Reason" by Joseph > Weizenbaum, p.74-75 > > Suppose that the alphabet with which we wish to concern ourselves > consists of 256 distinct symbols... Why should I wish to

Re: Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Markus Scherer via Unicode
On Mon, Dec 2, 2019 at 8:42 AM Roozbeh Pournader via Unicode < unicode@unicode.org> wrote: > You don't need an ISO 15924 script code. You need to think in terms of BCP > 47. Sanskrit in Latin would be sa-Latn. > Right! Now, if you want to distinguish the different transcription systems for >

Re: Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Roozbeh Pournader via Unicode
You don't need an ISO 15924 script code. You need to think in terms of BCP 47. Sanskrit in Latin would be sa-Latn. Now, if you want to distinguish the different transcription systems for writing Sanskrit in Latin, you can apply to register a BCP 47 variant. There is also BCP 47 extension T, which

Re: A neat description of encoding characters

2019-12-02 Thread James Tauber via Unicode
Indeed. Unicode separates: (1) selecting a character repertoire; (2) assigning each character a numerical character code; (3) choosing an encoding form to represent those character codes as code units (made up of bytes). (2) and (3) are not conflated. James On Mon, Dec 2, 2019 at 9:54 AM 梁海
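The separation between steps (2) and (3) in the message above can be seen in a short Python sketch. The sample character is an assumption chosen for illustration (a letter used in IAST); it does not come from the thread.

```python
# Illustrative character (an assumption, not from the thread):
# LATIN SMALL LETTER A WITH MACRON.
ch = "\u0101"

# (2) The character code is just an integer, independent of any byte layout.
code_point = ord(ch)


def code_units(text: str, encoding: str, width: int) -> list:
    """Group the encoded bytes into code units of `width` bytes each."""
    raw = text.encode(encoding)
    return [int.from_bytes(raw[i:i + width], "big")
            for i in range(0, len(raw), width)]


# (3) Different encoding forms represent the same integer as different
# sequences of code units.
utf8_units = code_units(ch, "utf-8", 1)       # 8-bit code units
utf16_units = code_units(ch, "utf-16-be", 2)  # 16-bit code units
utf32_units = code_units(ch, "utf-32-be", 4)  # 32-bit code units

print(hex(code_point), [hex(u) for u in utf8_units],
      [hex(u) for u in utf16_units], [hex(u) for u in utf32_units])
```

The code point stays the same in all three cases; only the code-unit sequence changes with the encoding form.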

Proposal to add Roman transliteration schemes to ISO 15924.

2019-12-02 Thread Vishvas Vasuki
bcc: as an FYI - plz respond on the unicode mailing list as needed. namaste! Sanskrit has traditionally been written in a variety of scripts ranging from Sharada to Grantha. In the past two centuries, it has been written in Latin based scripts as well (please see

Re: A neat description of encoding characters

2019-12-02 Thread 梁海 Liang Hai via Unicode
Grrr… It’s an okayish analog for binary numbers, but not really relevant to character encoding. Encoded characters are just assigned integers, which could in turn be represented in any base. The binary nature of computers’ way of storing numbers does not have much to do with how character

A neat description of encoding characters

2019-12-02 Thread Costello, Roger L. via Unicode
From the book titled "Computer Power and Human Reason" by Joseph Weizenbaum, p.74-75 Suppose that the alphabet with which we wish to concern ourselves consists of 256 distinct symbols. Imagine that we have a deck of 256 cards, each of which has a distinct symbol of our alphabet printed on