Re: Another take on the English apostrophe in Unicode

2015-06-11 Thread Bill Poser
To add a factor that I think hasn't been mentioned, there are languages in which apostrophe is used both as a letter by itself and as part of a complex letter. Most of the native languages of British Columbia write glottalized consonants as C+', e.g. t' for an ejective alveolar stop, and many use

Re: Another take on the English apostrophe in Unicode

2015-06-11 Thread Bill Poser
of a quoted sentence. If you really want to cite a single English word terminated by an elision apostrophe, the single quotes won't be usable and you'll use chevrons like in this ‹demo’› and not single or double quotes which are difficult to discriminate. 2015-06-11 19:47 GMT+02:00 Bill Poser billpos

Re: Unicode of Death

2015-05-28 Thread Bill Poser
No doubt the evil Unicode Consortium is in league with the Trilateral Commission, the Elders of Zion, and the folks at NASA who faked the moon landing :) On Thu, May 28, 2015 at 7:53 AM, Doug Ewell d...@ewellic.org wrote: Unicode is in the news today as some folks with waaay too much time on

Re: Swift

2014-06-05 Thread Bill Poser
A few years ago there was a company in Australia that was developing a multilingual language called Protium Blue. The lead was someone named Diarmuid Pigott. As far as I can tell, the project has come to an end, but one can still find bits about the project, e.g. this:

Re: Websites in Hindi

2014-03-02 Thread Bill Poser
In my experience the problem with Hindi web sites is that many of them used encodings other than Unicode, frequently encodings designed for a particular font. Some fonts did not use anything like a normal encoding. We encountered a newspaper that used a font with 8,000-some glyphs each representing

Re: What to backup after corruption of code units?

2013-08-27 Thread Bill Poser
backup in this context refers to moving to previous bytes in order to find the boundary between the previous, valid character, and the corrupted character that you have encountered. In other words if you have a string consisting of N bytes and at byte K you determine that the current sequence of
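A minimal sketch of the backing-up step described above, assuming the data is UTF-8 (the snippet does not name the encoding). The function name `backup_to_boundary` is hypothetical. It relies only on the fact that UTF-8 continuation bytes have the bit pattern 10xxxxxx, so stepping backwards over them reaches the lead byte of the sequence containing byte K.

```python
def backup_to_boundary(data: bytes, k: int) -> int:
    """Return the index of the lead byte of the sequence containing byte k.

    Assumes UTF-8: continuation bytes match the pattern 10xxxxxx (0x80..0xBF).
    """
    i = k
    while i > 0 and (data[i] & 0xC0) == 0x80:
        i -= 1  # still inside a multi-byte sequence, keep moving back
    return i


text = "a\u00e9\u20ac".encode("utf-8")  # b'a\xc3\xa9\xe2\x82\xac'
print(backup_to_boundary(text, 5))  # 3: lead byte of the three-byte sequence
print(backup_to_boundary(text, 2))  # 1: lead byte of the two-byte sequence
```

Everything before the returned index belongs to earlier characters, so decoding can restart there or the damaged sequence can be replaced from that point on.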

Re: pIqaD in actual use

2013-02-20 Thread Bill Poser
If we assume that television sitcoms reflect reality, one can find native speakers of Klingon via their synagogues. :) On Wed, Feb 20, 2013 at 5:26 PM, Phil Carter phil.car...@yahoo.com wrote: From: Richard Wordingham richard.wording...@ntlworld.com To: Unicode Mailing List

Re: Navajo

2013-02-13 Thread Bill Poser
I am familiar with written Navajo. The writing system is almost pure ASCII. The only additional characters needed are combining acute accent for high tone, combining ogonek for nasalization, and the upper- and lower-case barred l's for the voiceless lateral fricative. The list below looks complete
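As a rough illustration of the inventory described above, the following sketch checks that a string uses only ASCII plus the four additional characters. Mapping the description to U+0301, U+0328, U+0141 and U+0142 is my reading of the message, and the function name `unexpected_chars` is hypothetical. Text is decomposed to NFD first so that precomposed accented vowels are reduced to a base letter plus combining marks.

```python
import unicodedata

# Assumed code points for the characters named in the message:
# combining acute, combining ogonek, upper- and lower-case barred l.
NAVAJO_EXTRAS = {"\u0301", "\u0328", "\u0141", "\u0142"}


def unexpected_chars(text):
    """List code points that are neither ASCII nor one of the four extras."""
    decomposed = unicodedata.normalize("NFD", text)
    return [
        f"U+{ord(c):04X}"
        for c in decomposed
        if ord(c) > 0x7F and c not in NAVAJO_EXTRAS
    ]


print(unexpected_chars("Din\u00e9 bizaad"))               # []
print(unexpected_chars("\u0142\u012f\u0301\u012f\u0301'"))  # []
print(unexpected_chars("ma\u00f1ana"))                    # ['U+0303']
```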

Re: Destruction in Timbuktu

2013-01-31 Thread Bill Poser
It is indeed sad. However, I have now seen reports that most of the manuscripts were stored elsewhere and that it is believed that most of the collection has survived. I hope this is true. http://allafrica.com/stories/201301301130.html On Thu, Jan 31, 2013 at 1:07 PM, Ed Trager

Re: End of story character

2013-01-24 Thread Bill Poser
There's also the venerable U+0003 end of text. It has the virtue (?) of having no associated glyph and so can be realized however one likes. On Thu, Jan 24, 2013 at 4:41 PM, Richard Wordingham richard.wording...@ntlworld.com wrote: On Thu, 24 Jan 2013 20:05:41 -0300 Andrés Sanhueza

Re: Why is endianness relevant when storing data on disks but not when in memory?

2013-01-05 Thread Bill Poser
Endian-ness of data stored in memory is relevant but only if you are working at a very low level. Suppose that you have UTF-32 data stored as unsigned C integers. On pretty much any modern computer, each codepoint will occupy four 8-bit bytes. So long as you deal with that data via C, as unsigned
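To make the point concrete, here is a small sketch (in Python rather than C, and with an arbitrarily chosen code point) showing that one 32-bit value has two possible byte layouts. As an integer the value is the same either way; the difference appears only once the individual bytes are inspected or written out.

```python
import struct

codepoint = 0x1F600  # arbitrary example value

little = struct.pack("<I", codepoint)  # least significant byte first
big = struct.pack(">I", codepoint)     # most significant byte first

print(little.hex())  # 00f60100
print(big.hex())     # 0001f600

# Reading the bytes back with the matching byte order recovers the same integer.
print(struct.unpack("<I", little)[0] == struct.unpack(">I", big)[0])  # True
```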

Re: Tool to convert characters to character names

2012-12-19 Thread Bill Poser
If by online you mean on the web then this isn't what you want, but the uniname utility in my unidesc package converts characters to names. I haven't yet updated the data but will soon. http://billposer.org/Software/unidesc.html On Wed, Dec 19, 2012 at 9:03 PM, Martin J. Dürst
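For readers without the package at hand, the same kind of lookup can be sketched with Python's standard `unicodedata` module. This is not the uniname utility and makes no claim about its output format; the function name `describe` is hypothetical.

```python
import unicodedata


def describe(text):
    """Return one 'U+XXXX NAME' line per character in text."""
    lines = []
    for ch in text:
        # Characters without a name in the database get a placeholder.
        name = unicodedata.name(ch, "<no name>")
        lines.append(f"U+{ord(ch):04X} {name}")
    return lines


for line in describe("\u0142\u0301"):
    print(line)
# U+0142 LATIN SMALL LETTER L WITH STROKE
# U+0301 COMBINING ACUTE ACCENT
```

The names come from whatever version of the Unicode Character Database the Python build ships with, so very recent characters may be reported as unnamed.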

Re: problem with combining diacritcs in HTML5

2012-10-09 Thread Bill Poser
No, I was contrasting the behaviour of s followed by U+0332, for which there is no precomposed letter, with U+1E95, which is the precomposed equivalent of z followed by U+0332. On Tue, Oct 9, 2012 at 10:13 AM, Andreas Prilop prilop4...@trashmail.net wrote: On Sat, 6 Oct 2012, Bill Poser wrote

Re: problem with combining diacritcs in HTML5

2012-10-09 Thread Bill Poser
Yes, precisely. It's the combining behaviour that matters, not the distinction between the two slightly different low lines. On Tue, Oct 9, 2012 at 10:51 AM, Jukka K. Korpela jkorp...@cs.tut.fi wrote: 2012-10-09 20:32, Bill Poser wrote: No, I was contrasting the behaviour of s followed by U
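The "two slightly different low lines" can be seen in normalization. In the Unicode data, U+1E95 decomposes to z followed by U+0331 COMBINING MACRON BELOW, not U+0332 COMBINING LOW LINE, so only the former composes under NFC. The sketch below uses Python's standard `unicodedata` module; the helper name `nfc_codepoints` is hypothetical.

```python
import unicodedata


def nfc_codepoints(s):
    """Return the code points of s after NFC normalization."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", s)]


print(nfc_codepoints("z\u0331"))  # ['U+1E95']
print(nfc_codepoints("z\u0332"))  # ['U+007A', 'U+0332']
print(nfc_codepoints("s\u0332"))  # ['U+0073', 'U+0332']
```

In the last two cases the text stays a base letter plus a combining mark, so correct display depends entirely on the renderer stacking the mark under the letter.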

Re: problem with combining diacritcs in HTML5

2012-10-07 Thread Bill Poser
of G-d. :) On Sun, Oct 7, 2012 at 1:51 AM, Michael Everson ever...@evertype.com wrote: On 7 Oct 2012, at 08:37, Jukka K. Korpela wrote: 2012-10-07 8:38, Bill Poser wrote: I have a web page that writes into an HTML5 textarea via the JavaScript DOM interface. U+0332 COMBINING LOW LINE

problem with combining diacritcs in HTML5

2012-10-06 Thread Bill Poser
I have a web page that writes into an HTML5 textarea via the JavaScript DOM interface. U+0332 COMBINING LOW LINE is incorrectly rendered as a spacing low line in both Mozilla Firefox and Google Chrome, which is peculiar since they use different rendering engines. Characters with a combining low

Re: texteditors that can process and save in different encodings

2012-10-04 Thread Bill Poser
Another editor that can read and save in a variety of encodings is vim, the gussied-up successor to the Unix vi editor: http://www.vim.org It is available for MS Windows, Mac OS X, Linux, and a variety of other systems.

Re: Compiling a list of Semitic transliteration characters

2012-09-05 Thread Bill Poser
It is also at least logically possible for there to be transliterations from Semitic writing systems to non-Roman writing systems. I'm not aware of such a thing, but one can imagine, for example, Russian work using a Cyrillic-based transliteration. Even if such things are not in scholarly use, I

Re: [unicode] Re: Canadian aboriginal syllabics in vertical writing mode

2012-05-02 Thread Bill Poser
In the case of the Carrier syllabics, I have never seen an example of vertical text so there is no native usage to go by. However, as others have said, rotated text is very difficult to read because of the role of orientation. It's true that the small characters provide evidence as to the

A new character to encode from the Onion? :)

2012-04-30 Thread Bill Poser
Digital typography has reached *The Onion*: http://www.theonion.com/articles/errant-keystroke-produces-character-never-before-s,28030/ .

Re: Civil suit; ftp shutdown; mailing list shutdown

2011-10-07 Thread Bill Poser
There's a discussion of the lawsuit on Slashdot: http://yro.slashdot.org/story/11/10/06/1743226/civil-suit-filed-involving-the-time-zone-database On Thu, Oct 6, 2011 at 10:14 PM, Martin J. Dürst due...@it.aoyama.ac.jp wrote: [By accident, I sent this only to Ken first; he recommended I send it to

Re: searching for PUA characters

2011-08-25 Thread Bill Poser
On Thu, Aug 25, 2011 at 1:17 PM, Lorna Priest lorna_pri...@sil.org wrote: The recent discussion on PUA characters reminded me of a question I've had. I am wondering if anyone has a tool whereby we could search for all documents on a local computer (or server) that use PUA codepoints. I suppose
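One way such a search could be sketched, assuming the documents are plain text files readable as UTF-8 (word-processor formats would need their own extraction step). The function names are hypothetical. Private-use characters are identified by their general category `Co`, which covers U+E000..U+F8FF and the two supplementary private-use planes.

```python
from pathlib import Path
import unicodedata


def pua_codepoints(text):
    """Return the sorted private-use code points found in text."""
    found = {ord(c) for c in text if unicodedata.category(c) == "Co"}
    return [f"U+{cp:04X}" for cp in sorted(found)]


def files_with_pua(root):
    """Yield (path, code points) for files under root containing PUA characters.

    Assumes UTF-8; bytes that do not decode are skipped rather than reported.
    """
    for path in Path(root).rglob("*"):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits = pua_codepoints(text)
            if hits:
                yield path, hits


print(pua_codepoints("a\ue000b\U000F0000"))  # ['U+E000', 'U+F0000']
```

Calling `list(files_with_pua("some/directory"))` would then give the files to inspect, with the directory path being whatever tree is to be searched.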

Re: Unifon

2011-06-28 Thread Bill Poser
Unifon was used at one point to write several languages in northern California, so it has seen practical application. I'm not sure how much material was published in this form. I don't think that any of these tribes is still using Unifon.

Re: Unifon

2011-06-28 Thread Bill Poser
=Hupaeric_displayStartCount=1_pageLabel=ERICSearchResultERICExtSearch_SearchType_0=kw None of the more recent material in Hupa is in Unifon. On Tue, Jun 28, 2011 at 11:05 AM, Jean-François Colson j...@colson.eu wrote: On 28/06/11 19:22, Bill Poser wrote: Unifon was used at one point to write several

Re: Unifon

2011-06-28 Thread Bill Poser
Here is a document by Bennett that describes the use of Unifon for Hupa, Tolowa, Yurok and Karok: http://eric.ed.gov/ERICWebPortal/contentdelivery/servlet/ERICServlet?accno=ED310889 On Tue, Jun 28, 2011 at 11:05 AM, Jean-François Colson j...@colson.eu wrote: On 28/06/11 19:22, Bill Poser wrote

Re: Application that displays CJK text in Normalization Form D

2010-11-13 Thread Bill Poser
On Sat, Nov 13, 2010 at 4:46 PM, Jim Monty jim.mo...@yahoo.com wrote: Is there even a single software application that properly displays CJK text in Normalization Form D? I just tried your examples in Yudit (http://www.yudit.org) and they seem to work: the NFD text looks the same as the NFC

Re: ? Reasonable to propose stability policy on numeric type = decimal

2010-07-24 Thread Bill Poser
On Sat, Jul 24, 2010 at 1:00 PM, Michael Everson ever...@evertype.com wrote: Digits can be scattered randomly about the code space and it wouldn't make any difference. Having written a library for performing conversions between Unicode strings and numbers, I disagree. While it is not all that

Fwd: ? Reasonable to propose stability policy on numeric type = decimal

2010-07-24 Thread Bill Poser
-- Forwarded message -- From: Bill Poser billpos...@gmail.com Date: Sat, Jul 24, 2010 at 6:02 PM Subject: Re: ? Reasonable to propose stability policy on numeric type = decimal To: Michael Everson ever...@evertype.com On Sat, Jul 24, 2010 at 4:25 PM, Michael Everson ever

Re: ? Reasonable to propose stability policy on numeric type = decimal

2010-07-24 Thread Bill Poser
Bill, Michael is no programmer, hence he doesn't have first-hand understanding why programmers distinguish between character set mapping (normally requiring look-up tables) and digit conversion (normally done by offset calculations). That said, there are enough programmers on the committees
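The offset calculation mentioned above can be sketched as follows. It works only because the ten digits of a script occupy consecutive code points in ascending order, which is the property under discussion in this thread. The function name `digits_to_int` is hypothetical, and Devanagari is used purely as an example.

```python
def digits_to_int(text, zero):
    """Convert a string of digits from one script to an int by offset from its zero."""
    total = 0
    for ch in text:
        value = ord(ch) - ord(zero)  # valid only for contiguous digit blocks
        if not 0 <= value <= 9:
            raise ValueError(f"{ch!r} is not a digit in the block starting at {zero!r}")
        total = total * 10 + value
    return total


DEVANAGARI_ZERO = "\u0966"
print(digits_to_int("\u096A\u0968", DEVANAGARI_ZERO))  # 42
print(digits_to_int("2015", "0"))                      # 2015
```

If digits were scattered about the code space, the subtraction would have to be replaced by a per-character table lookup, such as the one behind `unicodedata.digit`.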

Re: Indian Rupee Sign (U+20B9) proposal - copyright/licencing issue

2010-07-20 Thread Bill Poser
A quick check of the Indian government web site indicates that the Government of India does claim copyright in government works (unlike the US federal government), so under Indian law an explicit license may be necessary.