Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Janusz S. Bien
Quote/Cytat - William_J_G Overington (Mon 13 Mar 2017 12:24:13 PM CET): Prof. Janusz S. Bień wrote: Just yet another reason for introducing the notion of textel? I opine that it would be a good idea to introduce several new words, of which textel would be

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Janusz S. Bien
Quote/Cytat - Richard Wordingham (Sun 12 Mar 2017 09:10:22 PM CET): On Sun, 12 Mar 2017 20:02:28 +0100 "Janusz S. Bien" wrote: If the basic notion has to be referred in a cumbersome way as "extended grapheme cluster" then it is easier

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread William_J_G Overington
Prof. Janusz S. Bień wrote: > Just yet another reason for introducing the notion of textel? I opine that it would be a good idea to introduce several new words, of which textel would be one, with each such new word having a precisely-defined meaning so that in precise discussions of

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Asmus Freytag
On 3/13/2017 3:31 AM, Janusz S. Bien wrote: Just yet another reason for introducing the notion of textel? The main difference between "textel" and "pixel" is that the unit of processing /displaying text is not uniform and fixed,

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Janusz S. Bien
Quote/Cytat - Asmus Freytag (Mon 13 Mar 2017 06:00:08 PM CET): [...] This (or similar) scenarios indicate the impossibility to come to a single, universal definition of a "textel" -- the main reason why this term is of lower utility than "pixel". I agree that it is

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Manish Goregaokar
Do you have examples of AA being split that way (and further reading)? I think I'm aware of what you're talking about, but would love to read more about it. -Manish On Mon, Mar 13, 2017 at 2:47 PM, Richard Wordingham wrote: > On Mon, 13 Mar 2017 23:10:11 +0200 >

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Richard Wordingham
On Mon, 13 Mar 2017 15:26:00 -0700 Manish Goregaokar wrote: > Do you have examples of AA being split that way (and further reading)? > I think I'm aware of what you're talking about, but would love to read > more about it. Just googling for the three words 'Sanskrit',

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Janusz S. Bien
Quote/Cytat - J Decker (Mon 13 Mar 2017 06:55:18 PM CET): texel looks to be defined as a graphic element already. TEXture ELement. I'm aware of it, but homonymy/polysemy is something we have to live with. I think there is no risk of confusing texture elements with text

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread J Decker
I liked the Go implementation of a character type - a rune type - which is a code point, and strings that return runes by index. https://blog.golang.org/strings Doesn't solve the problem for composited code points though... texel looks to be defined as a graphic element already. TEXture
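A minimal sketch of the limitation mentioned above, written in Python because its `str` type is indexed by code point and so behaves like the rune-per-index model being described. (In Go itself, as far as the linked blog post explains, `s[i]` yields a byte and `for ... range` yields runes.) The sample string and variable names are illustrative assumptions, not anything from the thread.

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points,
# but a reader perceives a single character.
text = "e\u0301"

# Indexing by code point exposes the two parts separately.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)                 # ['U+0065', 'U+0301']
print(len(text))                   # 2 code points
print(len(text.encode("utf-8")))   # 3 bytes in UTF-8

# Taking "the first character" by index silently drops the accent.
first = text[0]
print(first == "e")                # True

# NFC normalization happens to merge this particular pair into U+00E9,
# but many combining sequences have no precomposed form.
composed = unicodedata.normalize("NFC", text)
print(len(composed))               # 1
```

So code point indexing is finer-grained than bytes but still coarser than what a user would count as one character, which is the gap the "extended grapheme cluster" discussion in this thread is about.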

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Richard Wordingham
On Mon, 13 Mar 2017 20:20:25 -0400 "Mark E. Shoulson" wrote: > Sanskrit external vowel sandhi is comparatively > straightforward (compared to consonant sandhi), and it frequently > loses information. A *or* AA plus I is E; A *or* AA plus U is O (you > need A + O to get AU).

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Mark E. Shoulson
A word ending in A *or* AA preceding a word beginning in A *or* AA will all coalesce to a single AA in Sanskrit. That's four possibilities, and that doesn't count a word ending in a consonant preceding a word beginning in AA, which would be written the same. My memory is rusty, so I should
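The loss of information described here can be sketched as a toy lookup table. It encodes only the vowel rules quoted in this thread (A or AA plus A or AA gives AA; A or AA plus I gives E; A or AA plus U gives O) and is an illustration under that assumption, not a model of Sanskrit sandhi. The names `SANDHI`, `join` and `possible_splits` are hypothetical.

```python
# Toy table: (final vowel of first word, initial vowel of second word) -> result.
# Built only from the rules quoted in the thread.
SANDHI = {}
for final in ("A", "AA"):
    for initial in ("A", "AA"):
        SANDHI[(final, initial)] = "AA"
    SANDHI[(final, "I")] = "E"
    SANDHI[(final, "U")] = "O"


def join(final, initial):
    """Return the coalesced vowel for a word boundary."""
    return SANDHI[(final, initial)]


def possible_splits(result):
    """Return every (final, initial) pair that could have produced result."""
    return sorted(pair for pair, out in SANDHI.items() if out == result)


print(join("A", "I"))         # E
print(possible_splits("AA"))  # [('A', 'A'), ('A', 'AA'), ('AA', 'A'), ('AA', 'AA')]
print(possible_splits("E"))   # [('A', 'I'), ('AA', 'I')]
```

Joining is a function, but splitting is not: a written AA maps back to four candidate pairs within this table alone, before counting the consonant-final case mentioned above.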

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Richard Wordingham
On Mon, 13 Mar 2017 19:18:00 + Alastair Houghton wrote: > IMO, returning code points by index is a mistake. It over-emphasises > the importance of the code point, which helps to continue the notion > in some developers’ minds that code points are somehow

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Alastair Houghton
On 13 Mar 2017, at 17:55, J Decker wrote: > > I liked the Go implementation of character type - a rune type - which is a > codepoint. and strings that return runes from by index. > https://blog.golang.org/strings IMO, returning code points by index is a mistake. It

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Khaled Hosny
On Mon, Mar 13, 2017 at 07:18:00PM +, Alastair Houghton wrote: > On 13 Mar 2017, at 17:55, J Decker wrote: > > > > I liked the Go implementation of character type - a rune type - which is a > > codepoint. and strings that return runes from by index. > >

Re: "A Programmer's Introduction to Unicode"

2017-03-13 Thread Richard Wordingham
On Mon, 13 Mar 2017 23:10:11 +0200 Khaled Hosny wrote: > But there are many text operations that require access to Unicode code > points. Take for example text layout, as mapping characters to glyphs > and back has to operate on code points. The idea that you never need >