Re: U+0140

2004-04-19 Thread Asmus Freytag
At 03:49 PM 4/19/2004, Kenneth Whistler wrote: The Unicode Standard is not prescriptive about rendering, beyond the basics required to simply ensure correct mapping of textual content into streams of characters. If one font vendor wants to have a raised glyph for the MIDDLE DOT and another wants to

Re: Downloading UCD 4.0.0

2004-04-19 Thread Doug Ewell
I wrote: > I think the answer depends on what Theo really wants. He asked about > downloading the data files for 4.0.0, but before that he mentioned > downloading "the latest version," which is not 4.0.0 but 4.0.1. Reading Theo's question again, I see that he was talking about having downloaded

Unihan.txt and the four dictionary sorting algorithm

2004-04-19 Thread Ernest Cline
While I would expect the answer to my question to be true, one never knows what lurks in the heart of data files. Unihan.txt contains at least two properties for each of the four dictionaries used in the sorting algorithm. One property contains only characters that are actually in the dictionary

Re: Downloading UCD 4.0.0

2004-04-19 Thread Doug Ewell
Theo Veenker wrote: > Until now I always downloaded the latest version of the UCD > and worked with that. Now I want to download the UCD files for > 4.0.0 again. I know it is all in http://www.unicode.org/Public/4.0-Update/, > ... > Do I really need to find out and download all unchanged fil

Re: U+0140

2004-04-19 Thread John Hudson
Peter Constable wrote: And if... someone finds a well documented script in which a true middle dot and an x-height dot are used contrastively, That would be a somewhat surprising and not-to-be-recommended design for a writing system. Not to be completely ruled out, though. But we can probably wait

Re: Diacritic Property and Philippine Viramas

2004-04-19 Thread Kenneth Whistler
Ernest Cline asked: > Is there a reason for the lack of the Diacritic property on > the Tagalog and Hanunoo virama characters (U+1714 > and U+1734)? Human fallibility? > All of the other virama characters (i.e., > those of combining class 9) have this property and it > seems appropriate based o

Re: U+0140

2004-04-19 Thread Kenneth Whistler
Peter Kirk continued this... > On 19/04/2004 13:03, Kenneth Whistler wrote: > > >... Those other middle dots give > >people textual representation alternatives now, if they need to make > >distinctions, and textual rendering alternatives, if they need to make > >middle dots which display with sli

RE: U+0140

2004-04-19 Thread Peter Constable
> And if... someone finds a well documented script > in which a true middle dot and an x-height dot are used contrastively, That would be a somewhat surprising and not-to-be-recommended design for a writing system. Not to be completely ruled out, though. But we can probably wait to cross that enco

RE: Web Form: Subj: Unicode conversion- Microsoft Visual C++ compiler

2004-04-19 Thread Rick Cameron
It may be even simpler than that: U+0427 may have appeared in his message in UTF-8 because of his mail client. It could be that he's asking how to convert from an int holding the number 1063 to a wchar_t holding U+0427. The answer to this question is: int charValue = 1063; wchar
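A minimal sketch of the integer-to-wchar_t conversion described above, not the code from the truncated message. The helper name bmp_code_point_to_wchar is hypothetical, and the sketch assumes the value lies in the Basic Multilingual Plane and is not a surrogate, so that it fits in a single wchar_t even where wchar_t is 16 bits wide (as with Visual C++).

```cpp
#include <cassert>

// Hypothetical helper (not from the original message): converts an integer
// code point to a single wchar_t.
// Assumption: the value is in the range 0..0xFFFF and is not a surrogate,
// so one wchar_t can hold it even when wchar_t is only 16 bits wide.
wchar_t bmp_code_point_to_wchar(int code_point) {
    assert(code_point >= 0 && code_point <= 0xFFFF);      // BMP only
    assert(code_point < 0xD800 || code_point > 0xDFFF);   // no surrogates
    return static_cast<wchar_t>(code_point);
}
```

Calling bmp_code_point_to_wchar(1063) yields the wchar_t with value 0x0427, since 1063 decimal is 0x427. Code points above U+FFFF would need a surrogate pair on a 16-bit wchar_t platform and are deliberately out of scope here.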

Re: U+0140

2004-04-19 Thread Peter Kirk
On 19/04/2004 13:03, Kenneth Whistler wrote: ... Those other middle dots give people textual representation alternatives now, if they need to make distinctions, and textual rendering alternatives, if they need to make middle dots which display with slightly different heights, sizes, or spacings, d

Re: Web Form: Subj: Unicode conversion- Microsoft Visual C++ compiler

2004-04-19 Thread Kenneth Whistler
I think this was just a confused way of asking how to convert UTF-16 into UTF-8: U+0427 is the Unicode encoded character. 0x0427 is the UTF-16 character encoding form for it. 0xD0 0xA7 is the UTF-8 character encoding form for it. Mino, sample code for how to do this is available at: http://www
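As an illustration of the byte values quoted above, here is a minimal sketch of encoding one code point as UTF-8. It is not the sample code the message points to; the function name encode_utf8 is hypothetical, and returning an empty string for invalid input is an assumption made for brevity.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encodes one Unicode scalar value as UTF-8.
// Assumption: invalid input (a surrogate, or a value above U+10FFFF)
// is signalled by returning an empty string.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
        return out;  // not a Unicode scalar value
    }
    if (cp < 0x80) {                       // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {             // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

For U+0427 the two-byte branch applies: 0x427 >> 6 is 0x10, giving the lead byte 0xD0, and 0x427 & 0x3F is 0x27, giving the trail byte 0xA7. A UTF-16 input in the BMP, outside the surrogate range, can be passed in directly, because the code unit equals the code point there.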

Re: U+0140

2004-04-19 Thread Kenneth Whistler
John Hudson responded to Michael Everson: > Michael Everson wrote: > > >> This would make the mid-dot too high. The top dot of the colon usually > >> sits toward the top of the x-height; the *mid*-dot should sit lower, > > John, I just don't believe you. I don't believe that in all the history

Re: Web Form: Subj: Unicode conversion- Microsoft Visual C++ compiler

2004-04-19 Thread Raymond Mercier
Mino, This is not at all clear: the character U+0427 is Ч in the Cyrillic block, and what does this have to do with the two characters Ð and §, which are U+00D0 and U+00A7? Are you wondering how to store 0x0427 in a binary file? Or what? Raymond Mercier > > Contact: [EMAIL PROTECTED] > > Rep
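One plausible link between U+0427 and the pair U+00D0, U+00A7 is that the UTF-8 bytes for U+0427 (0xD0 0xA7) were displayed as if they were ISO 8859-1. The sketch below shows that misreading; the function name misread_as_latin1 is hypothetical, and whether this is what actually happened in the original question is an assumption.

```cpp
#include <string>

// Hypothetical helper: treats every byte as one ISO 8859-1 (Latin-1)
// character. In Latin-1 the byte value equals the code point, so this is
// what a client produces when it displays UTF-8 bytes as Latin-1.
std::u32string misread_as_latin1(const std::string& bytes) {
    std::u32string out;
    for (unsigned char b : bytes) {
        out += static_cast<char32_t>(b);  // byte value becomes the code point
    }
    return out;
}
```

Feeding it the two bytes 0xD0 0xA7 gives the two code points U+00D0 and U+00A7 instead of the single intended character U+0427.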

Re: JIS X 0213: 2000 AMD-1 and Unihan.txt

2004-04-19 Thread John Jenkins
Yes, it's reasonable. In fact, the data have already been added, but this was done just too late for inclusion in the 4.0.1 release. On Apr 19, 2004, at 12:23 PM, Ernest Cline wrote: Would it be reasonable to expect that data concerning the ten characters added to JIS X 0213 by Amendment 1 will

FW: Web Form: Subj: Unicode conversion- Microsoft Visual C++ compiler

2004-04-19 Thread Magda Danish (Unicode)
Mino, I am sending your question to the Unicode public email list http://www.unicode.org/consortium/distlist.html for a possible answer from one of the list subscribers. Regards, --- Magda Danish Sr. Administrative Director The Unicode Consortium 650-693-3921 [EMAIL PR

Re: Downloading UCD 4.0.0

2004-04-19 Thread Asmus Freytag
At 08:42 AM 4/19/2004, Theo Veenker wrote: Hi, Until now I always downloaded the latest version of the UCD and worked with that. Now I want to download the UCD files for 4.0.0 again. I know it is all in http://www.unicode.org/Public/4.0-Update/, but in http://www.unicode.org/ucd/ I read this:

Re: Downloading UCD 4.0.0

2004-04-19 Thread Kenneth Whistler
Theo Veenker asked: > Until now I always downloaded the latest version of the UCD > and worked with that. Now I want to download the UCD files for > 4.0.0 again. I know it is all in http://www.unicode.org/Public/4.0-Update/, That is an incorrect assumption. > but in http://www.unicode.org/u

JIS X 0213: 2000 AMD-1 and Unihan.txt

2004-04-19 Thread Ernest Cline
Would it be reasonable to expect that data concerning the ten characters added to JIS X 0213 by Amendment 1 will make it into the next version of Unihan.txt? I'm presuming that this is official since ISO-IR-233, which updates ISO-IR-228, was released on 13 April. [Relevant data from ISO-IR-233]

Downloading UCD 4.0.0

2004-04-19 Thread Theo Veenker
Hi, Until now I always downloaded the latest version of the UCD and worked with that. Now I want to download the UCD files for 4.0.0 again. I know it is all in http://www.unicode.org/Public/4.0-Update/, but in http://www.unicode.org/ucd/ I read this: "The complete set of all files for a given

Re: U+0140

2004-04-19 Thread Adam Twardoch
From: "John Hudson" <[EMAIL PROTECTED]> > 'Careful hairsplitting' always takes place when people care about typography. How very true. On one hand, there's people who put a cedilla under "a" when typesetting Polish, on the other hand, there's people who adjust the vertical position of hyphens whe