To add a factor that I think hasn't been mentioned, there are languages in
which apostrophe is used both as a letter by itself and as part of a
complex letter. Most of the native languages of British Columbia write
glottalized consonants as C+', e.g. t' for an ejective alveolar stop, and
many use
of a quoted
sentence. If you really want to cite a single English word terminated by an
elision apostrophe, the single quotes won't be usable and you'll use
chevrons like in this ‹demo’› and not single or double quotes which are
difficult to discriminate.
2015-06-11 19:47 GMT+02:00 Bill Poser billpos
No doubt the evil Unicode Consortium is in league with the Trilateral
Commission, the Elders of Zion, and the folks at NASA who faked the moon
landing :)
On Thu, May 28, 2015 at 7:53 AM, Doug Ewell d...@ewellic.org wrote:
Unicode is in the news today as some folks with waaay too much time on
A few years ago there was a company in Australia that was developing a
multilingual language called Protium Blue. The lead was someone named
Diarmuid Pigott. As far as I can tell, the project has come to an end, but
one can still find bits about the project, e.g. this:
In my experience the problem with Hindi web sites is that many of them used
encodings other than Unicode, frequently encodings designed for a particular
font. Some fonts did not use anything like a normal encoding. We
encountered a newspaper that used a font with 8,000-some glyphs each
representing
backup in this context refers to moving to previous bytes in order to
find the boundary between the previous, valid character, and the corrupted
character that you have encountered. In other words if you have a string
consisting of N bytes and at byte K you determine that the current sequence
of
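A minimal sketch of the backing-up step described above, assuming the data is UTF-8 (the snippet does not name the encoding) and using a hypothetical helper name. In UTF-8, continuation bytes have the bit pattern 10xxxxxx, so stepping back over them from byte K reaches the lead byte of the sequence that contains K:

```python
def backup_to_boundary(data: bytes, k: int) -> int:
    """Return the index of the first byte of the sequence containing byte k.

    Hypothetical helper for illustration; assumes UTF-8 input.
    """
    i = k
    # A UTF-8 sequence has at most three continuation bytes (10xxxxxx),
    # so never step back more than three positions.
    while i > 0 and k - i < 3 and (data[i] & 0xC0) == 0x80:
        i -= 1
    return i


# "a" is 1 byte, "é" is 2 bytes, "€" is 3 bytes in UTF-8.
sample = "a\u00e9\u20ac".encode("utf-8")
print(backup_to_boundary(sample, 5))  # last byte of the euro sign
```

Everything before the returned index belongs to earlier characters, so a decoder can resume or report the error from that position.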
If we assume that television sitcoms reflect reality, one can find native
speakers of Klingon via their synagogues. :)
On Wed, Feb 20, 2013 at 5:26 PM, Phil Carter phil.car...@yahoo.com wrote:
From: Richard Wordingham richard.wording...@ntlworld.com
To: Unicode Mailing List
I am familiar with written Navajo. The writing system is almost pure ASCII.
The only additional characters needed are combining acute accent for high
tone, combining ogonek for nasalization, and the upper- and lower-case
barred l's for the voiceless lateral fricative. The list below looks
complete
It is indeed sad. However, I have now seen reports that most of the
manuscripts were stored elsewhere and that it is believed that most of the
collection has survived. I hope this is true.
http://allafrica.com/stories/201301301130.html
On Thu, Jan 31, 2013 at 1:07 PM, Ed Trager
There's also the venerable U+0003 end of text. It has the virtue (?) of
having no associated glyph and so can be realized however one likes.
On Thu, Jan 24, 2013 at 4:41 PM, Richard Wordingham
richard.wording...@ntlworld.com wrote:
On Thu, 24 Jan 2013 20:05:41 -0300
Andrés Sanhueza
Endian-ness of data stored in memory is relevant but only if you are
working at a very low level. Suppose that you have UTF-32 data stored as
unsigned C integers. On pretty much any modern computer, each codepoint
will occupy four 8-bit bytes. So long as you deal with that data via C, as
unsigned
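To illustrate the point in Python rather than C (a sketch; the function name is made up for this example): the code point values are the same integers on any machine, and byte order only shows up once those integers are laid out as bytes.

```python
import struct


def pack_utf32(codepoints, byteorder):
    """Serialize code points as 32-bit unsigned integers.

    byteorder is "<" for little-endian or ">" for big-endian
    (struct module format prefixes).
    """
    return struct.pack(f"{byteorder}{len(codepoints)}I", *codepoints)


cps = [0x41, 0x1F600]         # the integers are the same on any machine
le = pack_utf32(cps, "<")     # bytes as laid out on a little-endian machine
be = pack_utf32(cps, ">")     # bytes as laid out on a big-endian machine
print(le.hex(" "))
print(be.hex(" "))
```

Code that only ever handles the integers never sees the difference; code that reads or writes the raw bytes (files, network, casts to byte pointers) does.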
If by online you mean on the web then this isn't what you want, but
the uniname utility in my unidesc package converts characters to names. I
haven't yet updated the data but will soon.
http://billposer.org/Software/unidesc.html
On Wed, Dec 19, 2012 at 9:03 PM, Martin J. Dürst
No, I was contrasting the behaviour of s followed by U+0332, for which
there is no precomposed letter, with U+1E95, which is the precomposed
equivalent of z followed by U+0332.
On Tue, Oct 9, 2012 at 10:13 AM, Andreas Prilop prilop4...@trashmail.net wrote:
On Sat, 6 Oct 2012, Bill Poser wrote
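The precomposed/decomposed contrast can be checked with Python's unicodedata module; this is only a sketch, and the helper name is invented here. One caveat: in the Unicode Character Database the canonical decomposition of U+1E95 is z followed by U+0331 COMBINING MACRON BELOW, not U+0332 COMBINING LOW LINE, so NFC composes only the U+0331 sequence; with s, or with U+0332, the sequence stays as two code points.

```python
import unicodedata


def nfc_codepoints(text):
    """Return the code points of text after NFC normalization, as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


print(nfc_codepoints("z\u0331"))  # z + COMBINING MACRON BELOW
print(nfc_codepoints("z\u0332"))  # z + COMBINING LOW LINE
print(nfc_codepoints("s\u0332"))  # s + COMBINING LOW LINE
```

Only the first call yields a single precomposed code point; the other two sequences have no precomposed form and must be rendered by combining.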
Yes, precisely. It's the combining behaviour that matters, not the
distinction between the two slightly different low lines.
On Tue, Oct 9, 2012 at 10:51 AM, Jukka K. Korpela jkorp...@cs.tut.fi wrote:
2012-10-09 20:32, Bill Poser wrote:
No, I was contrasting the behaviour of s followed by U
of G-d. :)
On Sun, Oct 7, 2012 at 1:51 AM, Michael Everson ever...@evertype.com wrote:
On 7 Oct 2012, at 08:37, Jukka K. Korpela wrote:
2012-10-07 8:38, Bill Poser wrote:
I have a web page that writes into an HTML5 textarea via the JavaScript
DOM interface. U+0332 COMBINING LOW LINE
I have a web page that writes into an HTML5 textarea via the JavaScript DOM
interface. U+0332 COMBINING LOW LINE is incorrectly rendered as a spacing
low line in both Mozilla Firefox and Google Chrome, which is peculiar since
they use different rendering agents. Characters with a combining low
Another editor that can read and save in a variety of encodings is vim, the
gussied-up successor to the Unix vi editor: http://www.vim.org
It is available for MS Windows, Mac OS X, Linux, and a variety of other
systems.
It is also at least logically possible for there to be transliterations
from Semitic writing systems to non-Roman writing systems. I'm not aware of
such a thing, but one can imagine, for example, Russian work using a
Cyrillic-based transliteration. Even if such things are not in scholarly
use, I
In the case of the Carrier syllabics, I have never seen an example of
vertical text so there is no native usage to go by. However, as others have
said, rotated text is very difficult to read because of the role of
orientation. It's true that the small characters provide evidence as to the
Digital typography has reached *The Onion*:
http://www.theonion.com/articles/errant-keystroke-produces-character-never-before-s,28030/
There's a discussion of the lawsuit on Slashdot:
http://yro.slashdot.org/story/11/10/06/1743226/civil-suit-filed-involving-the-time-zone-database
On Thu, Oct 6, 2011 at 10:14 PM, Martin J. Dürst
due...@it.aoyama.ac.jp wrote:
[By accident, I sent this only to Ken first; he recommended I send it to
On Thu, Aug 25, 2011 at 1:17 PM, Lorna Priest lorna_pri...@sil.org wrote:
The recent discussion on PUA characters reminded me of a question I've had.
I am wondering if anyone has a tool whereby we could search for all
documents on a local computer (or server) that use PUA codepoints. I suppose
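One possible shape for such a tool, as a rough sketch only: it assumes the documents are plain-text files readable as UTF-8 (word-processor formats would need their own extraction step), and all function names are invented for this example. The three Private Use ranges are U+E000..U+F8FF, U+F0000..U+FFFFD, and U+100000..U+10FFFD.

```python
from pathlib import Path

# The BMP Private Use Area plus the two supplementary private use planes.
PUA_RANGES = ((0xE000, 0xF8FF), (0xF0000, 0xFFFFD), (0x100000, 0x10FFFD))


def is_pua(ch):
    """True if the character is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def pua_codepoints(text):
    """Return the sorted, distinct PUA code points that occur in text."""
    return sorted({ord(ch) for ch in text if is_pua(ch)})


def find_pua_files(root, encoding="utf-8"):
    """Map each readable text file under root to the PUA code points it uses."""
    hits = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding=encoding)
        except (UnicodeDecodeError, OSError):
            continue  # skip binary files and files in other encodings
        found = pua_codepoints(text)
        if found:
            hits[path] = found
    return hits
```

Calling find_pua_files(".") would report, per file, which PUA code points appear, which is usually enough to decide which documents need attention.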
Unifon was used at one point to write several languages in northern
California, so it has seen practical application. I'm not sure how much
material was published in this form. I don't think that any of these tribes
is still using Unifon.
=Hupaeric_displayStartCount=1_pageLabel=ERICSearchResultERICExtSearch_SearchType_0=kw
None of the more recent material in Hupa is in Unifon.
On Tue, Jun 28, 2011 at 11:05 AM, Jean-François Colson j...@colson.eu wrote:
On 28/06/11 19:22, Bill Poser wrote:
Unifon was used at one point to write several
Here is a document by Bennett that describes the use of Unifon for Hupa,
Tolowa, Yurok and Karok:
http://eric.ed.gov/ERICWebPortal/contentdelivery/servlet/ERICServlet?accno=ED310889
On Tue, Jun 28, 2011 at 11:05 AM, Jean-François Colson j...@colson.eu wrote:
On 28/06/11 19:22, Bill Poser wrote
On Sat, Nov 13, 2010 at 4:46 PM, Jim Monty jim.mo...@yahoo.com wrote:
Is there even a single software application that properly displays CJK text
in Normalization Form D?
I just tried your examples in Yudit (http://www.yudit.org) and they seem to
work: the NFD text looks the same as the NFC
On Sat, Jul 24, 2010 at 1:00 PM, Michael Everson ever...@evertype.com wrote:
Digits can be scattered randomly about the code space and it wouldn't make
any difference.
Having written a library for performing conversions between Unicode
strings and numbers, I disagree. While it is not all that
-- Forwarded message --
From: Bill Poser billpos...@gmail.com
Date: Sat, Jul 24, 2010 at 6:02 PM
Subject: Re: ? Reasonable to propose stability policy on numeric type = decimal
To: Michael Everson ever...@evertype.com
On Sat, Jul 24, 2010 at 4:25 PM, Michael Everson ever
Bill,
Michael is no programmer, hence he doesn't have first-hand understanding of why
programmers distinguish between character set mapping (normally requiring
look-up tables) and digit conversion (normally done by offset calculations).
That said, there are enough programmers on the committees
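The distinction can be sketched as follows. This is an illustration only, not the library mentioned in the thread; Devanagari is picked arbitrarily as an example script, and the function names are invented. When a script's digits 0-9 are contiguous, conversion is a subtraction from the code point of its zero; a table lookup (here the Unicode Character Database via unicodedata.decimal) works wherever the digits are placed, but needs the table.

```python
import unicodedata

DEVANAGARI_ZERO = 0x0966  # U+0966 DEVANAGARI DIGIT ZERO; digits run to U+096F


def devanagari_to_int(s):
    """Offset calculation: relies on the ten digits being contiguous."""
    value = 0
    for ch in s:
        d = ord(ch) - DEVANAGARI_ZERO
        if not 0 <= d <= 9:
            raise ValueError(f"not a Devanagari digit: {ch!r}")
        value = value * 10 + d
    return value


def table_to_int(s):
    """Table lookup: works for any decimal digit, wherever it is encoded."""
    value = 0
    for ch in s:
        # unicodedata.decimal raises ValueError for non-decimal characters.
        value = value * 10 + unicodedata.decimal(ch)
    return value


print(devanagari_to_int("\u0967\u0968\u0969"))
print(table_to_int("\u0967\u0968\u0969"))
```

Both calls give the same number for Devanagari input; only the second would keep working if a script's digits were scattered about the code space.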
A quick check of the Indian government web site indicates that the
Government of India does claim copyright in government works (unlike the US
federal government), so under Indian law an explicit license may be
necessary.