Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Garth Wallace
On Thu, Oct 6, 2016 at 2:28 PM, Ken Whistler wrote: > > On 10/6/2016 12:44 PM, Garth Wallace wrote: > > Some representatives of the WFCC have proposed alternate arrangements that > assume there will be a need for bitwise operations to convert between the > existing chess

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Oren Watson
Except that it states at the very start of that file "this file should not be parsed for machine-readable information." On Fri, Oct 7, 2016 at 6:41 PM, Andrew West wrote: > On 7 October 2016 at 23:31, Doug Ewell wrote: > > > > Well, "treacherous" is

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Andrew West
On 7 October 2016 at 23:31, Doug Ewell wrote: > > Well, "treacherous" is right. I'd hesitate to trust an algorithm to > recognize PLANCK CONSTANT as the character name that logically fits > between MATHEMATICAL ITALIC SMALL G and MATHEMATICAL ITALIC SMALL I. Well, it could be

RE: Bit arithmetic on Unicode characters?

2016-10-07 Thread Doug Ewell
Andrew West wrote: > Well, it could be picked up from that most treacherous of Unicode data > files http://www.unicode.org/Public/UNIDATA/NamesList.txt Even then, you have: ... 1D454 MATHEMATICAL ITALIC SMALL G # <font> 0067 latin small letter g 1D455 <reserved> x (planck constant - 210E)
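The hole at 1D455 is the usual example of why offset arithmetic on the Mathematical Alphanumeric Symbols is fragile. What follows is a minimal Python sketch, assuming the standard-library unicodedata module as a stand-in for a table built from the UCD. The helper names italic_by_offset and plain_from_math are hypothetical and do not come from any tool mentioned in this thread.

```python
import unicodedata

ITALIC_SMALL_A = 0x1D44E  # MATHEMATICAL ITALIC SMALL A


def italic_by_offset(ch):
    """Naive arithmetic: assumes the italic small letters are contiguous."""
    return chr(ITALIC_SMALL_A + ord(ch) - ord("a"))


def plain_from_math(ch):
    """Table lookup: follow the <font> compatibility decomposition, if any."""
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<font> "):
        # The decomposition looks like "<font> 0067": take the hex code point.
        return chr(int(decomp.split()[1], 16))
    return ch


# Arithmetic works for g but lands on the reserved code point for h.
print(hex(ord(italic_by_offset("g"))), unicodedata.category(italic_by_offset("g")))
print(hex(ord(italic_by_offset("h"))), unicodedata.category(italic_by_offset("h")))

# The decomposition data maps both the regular letter and PLANCK CONSTANT back.
print(plain_from_math("\U0001D454"), plain_from_math("\u210E"))
```

The lookup direction (math letter to plain letter) needs no knowledge of the holes. Going the other way still needs an exception table or a reverse map built from the same decomposition data.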

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Doug Ewell
Richard Wordingham wrote: >> I can't find anything in the UCD that distinguishes one "font >> variant" from another (UnicodeData.txt shown as an example): > > It's in that most treacherous of properties, the character's name. Well, "treacherous" is right. I'd hesitate to trust an algorithm to

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Richard Wordingham
On Fri, 07 Oct 2016 09:06:31 -0700 "Doug Ewell" wrote: > Richard Wordingham wrote: > > Perhaps there is just enough information in the UCD to allow > > exhaustive, automated tests. > I can't find anything in the UCD that distinguishes one "font variant" > from another

Re: font-encoded hacks

2016-10-07 Thread Andrew Cunningham
Hi Neil, I tend to prefer referring to them as Pseudo-Unicode solutions, rather than hacked fonts or ad hoc fonts, and differentiating them from legacy or 8-bit solutions. My preferred approach would be to treat them as a separate encoding. But I doubt that will likely happen. It doesn't help

Re: font-encoded hacks

2016-10-07 Thread Andrew Cunningham
Hi Mark, The converters would be interesting to see, and would be personally useful to me. But the type of keyboard layouts and input frameworks reflected in CLDR have limited bearing on issues related to the uptake of Unicode for Myanmar script. Andrew On 7 Oct 2016 17:54, "Mark Davis ☕️"

Re: font-encoded hacks

2016-10-07 Thread Andrew Cunningham
Hi Denis, In some ways, it was easier. But looking at each language, the issues seem to have a slightly different slant. Sgaw Karen is interesting in comparison to Burmese. There is some use of the hacked Zwekabin font by bloggers, but most content, and key media still use 8-bit fonts.

Re: font-encoded hacks

2016-10-07 Thread Andrew Cunningham
On 7 Oct 2016 17:08, "Martin J. Dürst" wrote: > > Hello Andrew, > > > On 2016/10/07 11:11, Andrew Cunningham wrote: >> >> Considering the mess that adhoc fonts create. What is the best way forward? > > > That's very clear: Use Unicode. > LOL, thanks Martin. That has been

Re: Fwd: Why incomplete subscript/superscript alphabet ?

2016-10-07 Thread Oren Watson
Hmm... "filling in Latin alphabet encoding gaps without clear use cases" is exactly what was done for the blackboard bold letters. I scarcely think that a use case was submitted for every one of the blackboard bold etc letters in the mathematical set; merely the use of blackboard bold for a

Re: Fwd: Why incomplete subscript/superscript alphabet ?

2016-10-07 Thread Ken Whistler
On 10/7/2016 11:25 AM, Oren Watson wrote: Would it be appropriate to submit an omnibus proposal for encoding all remaining English letters in subscript, small caps, and superscript in the SMP for the purpose of not arbitrarily constraining the use of Unicode for new linguistic theories and

RE: Why incomplete subscript/superscript alphabet ?

2016-10-07 Thread Doug Ewell
Oren Watson wrote: > Would it be appropriate to submit an omnibus proposal for encoding all > remaining English letters in subscript, small caps, and superscript in > the SMP for the purpose of not arbitrarily constraining the use of > Unicode for new linguistic theories and ideas, similar to the

Re: Why incomplete subscript/superscript alphabet ?

2016-10-07 Thread Michael Everson
On 7 Oct 2016, at 19:25, Oren Watson wrote: > > Would it be appropriate to submit an omnibus proposal for encoding all > remaining English letters in subscript, small caps, and superscript in the > SMP for the purpose of not arbitrarily constraining the use of Unicode

Fwd: Why incomplete subscript/superscript alphabet ?

2016-10-07 Thread Oren Watson
Would it be appropriate to submit an omnibus proposal for encoding all remaining English letters in subscript, small caps, and superscript in the SMP for the purpose of not arbitrarily constraining the use of Unicode for new linguistic theories and ideas, similar to the mathematical characters?

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Hans Åberg
> On 7 Oct 2016, at 18:06, Doug Ewell wrote: > I can't find anything in the UCD that distinguishes one "font variant" > from another (UnicodeData.txt shown as an example): > > 1D400;MATHEMATICAL BOLD CAPITAL A;Lu;0;L;<font> 0041;;;;N;;;;; > 1D434;MATHEMATICAL ITALIC CAPITAL

Re: Why incomplete subscript/superscript alphabet ?

2016-10-07 Thread Doug Ewell
Marcel Schneider wrote: > According to my hypothesis and while waiting, I believe that > the intent of the gap kept in the superscript lowercase range, > is to maintain a limitation to the performance of plain text. > I don't see very well how to apply Hanlon's razor here, because > there seems

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Doug Ewell
Richard Wordingham wrote: > Yes, it's a trade-off. The application I had in mind is converting > between mathematical letter variants and their 'plain' forms. Long-time list members might remember a Windows utility I wrote to convert between normal Unicode text and Mathematical Alphanumeric

Re: font-encoded hacks

2016-10-07 Thread Neil Harris
On 07/10/16 07:42, Denis Jacquerye wrote: In many cases people resort to these hacks because it is an easier short-term solution. All they have to do is use a specific font. They don't have to switch or find and install a keyboard layout and they don't have to upgrade to an OS that supports their

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Hans Åberg
> On 7 Oct 2016, at 09:27, Garth Wallace wrote: > > Unicode doesn't really address chess piece properties like white/black beyond > naming conventions. From the formal point of view, Unicode only assigns character numbers (code points), which get a binary representation

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Garth Wallace
On Thu, Oct 6, 2016 at 5:42 PM, Shawn Steele wrote: > Presumably a table-based approach would merely require rerunning the > table-building script from the UCD when new versions were released. > For casing, sure, but that's not really relevant in this context, since

Re: Bit arithmetic on Unicode characters?

2016-10-07 Thread Richard Wordingham
On Thu, 6 Oct 2016 21:18:15 -0400 Oren Watson wrote: > On Thu, Oct 6, 2016 at 8:28 PM, Richard Wordingham < > richard.wording...@ntlworld.com> wrote: > > Yes, it's a trade-off. The application I had in mind is converting > > between mathematical letter variants and their

Re: font-encoded hacks

2016-10-07 Thread Mark Davis ☕️
We do provide data for keyboard mappings in CLDR ( http://unicode.org/cldr/charts/latest/keyboards/index.html). There are some further pieces we need to put into place. 1. Provide a bulk uploader that applies our sanity-checking tests for a proposed keyboard mapping, and provides real-time

Re: font-encoded hacks

2016-10-07 Thread Denis Jacquerye
In many cases people resort to these hacks because it is an easier short-term solution. All they have to do is use a specific font. They don't have to switch or find and install a keyboard layout and they don't have to upgrade to an OS that supports their script with Unicode properly. Because of

Re: font-encoded hacks

2016-10-07 Thread Martin J. Dürst
Hello Andrew, On 2016/10/07 11:11, Andrew Cunningham wrote: Considering the mess that adhoc fonts create. What is the best way forward? That's very clear: Use Unicode. Zwekabin, Mon, Zawgyi, and Zawgyi-Tai and their ilk? Most government translations I am seeing in Australia for Burmese are