Re: Classification of U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK

2003-06-05 Thread Philippe Verdy
My opinion is that it can be viewed, depending on its application, as a letter (for some transliteration purpose), or as a diacritic (for some other transliterations). But in reality it is mostly a letter modifier. For UCA, it sorts mostly like the base letter that it modifies, and UCA gives
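
The "letter modifier" reading can be checked against the Unicode Character Database. The sketch below is an editorial illustration, not part of the original message; it assumes Python's standard `unicodedata` module, and the variable names are chosen only for this example.

```python
import unicodedata

# U+30FC, the character under discussion in this thread.
mark = "\u30fc"

name = unicodedata.name(mark)          # formal character name
category = unicodedata.category(mark)  # "Lm" means Letter, modifier
is_alpha = mark.isalpha()              # Python treats every L* category as alphabetic

print(name, category, is_alpha)
```

The general category alone does not settle how the character should be transliterated or collated; it only shows how the standard classifies it.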

Re: Tamazight/berber language : How to send mail, write word documents ....

2003-06-06 Thread Philippe Verdy
From: Azzedine Ait Khelifa [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, June 05, 2003 4:55 PM Subject: Tamazight/berber language : How to send mail, write word documents Hello all, I need help about tamazight(berber) language. All letters of tamazight alphabet are into

Re: Tamazight/berber language : How to send mail, write word documents ....

2003-06-06 Thread Philippe Verdy
Wow, thanks for this cool tool! Now I can edit international text for most European languages directly from my extended custom French keyboard (including the missing OE and AE ligatures that I have long wanted on the French keyboard)... (I changed the degree sign to a ring diacritic dead key,

Re: Classification of Alphabetic characters (was: Hiragana/Katakana sound marks)

2003-06-06 Thread Philippe Verdy
From: Mark Davis [EMAIL PROTECTED] This is not an oversight. As I said, many characters are not Alphabetic and are still part of words. Examples include that character and many others. As a simple case, can't is a word in English, although the apostrophe is not alphabetic. There are many,

Re: UNESCO standard keyboards? (Re: Tamazight/berber language : ....)

2003-06-06 Thread Philippe Verdy
From: [EMAIL PROTECTED] Don Osborn wrote on 06/05/2003 07:34:29 PM: There is probably some existing standard for keyboard mappings, promoted by UNESCO and published in an ISO standard. If there were such a thing (for Tamazight or any other African language) I'd be very interested

Re: Tamazight/berber language : How to send mail, write word documents ....

2003-06-06 Thread Philippe Verdy
on, or users are already trained to use and switch from the French keyboard and the Arabic keyboard. Adding Tifinagh to the French keyboard is then quite simple and natural. -- Philippe. - Original Message - From: "Marco Cimarosti" [EMAIL PROTECTED] To: "'Philippe Verdy'" [

Re: IPA Null Consonant

2003-05-27 Thread Philippe Verdy
From: [EMAIL PROTECTED] I would NOT recommend using a math symbol for this. Especially considering the above. The CAPITAL O WITH STROKE (Ø) is probably better. It is not better. If anything might be better, it would be a digit zero from a font that has a slash through it. In the past,

Re: Dutch IJ, again

2003-05-27 Thread Philippe Verdy
From: Mark Davis [EMAIL PROTECTED] From: Anto'nio Martins-Tuva'lkin [EMAIL PROTECTED] On 2003.05.25, 00:00, Philippe Verdy [EMAIL PROTECTED] wrote: even if the Dutch language considers it as a single letter, in a way similar to the Spanish ch I see one major difference: When you apply

Re: Dutch IJ, again

2003-05-27 Thread Philippe Verdy
From: Anto'nio Martins-Tuva'lkin [EMAIL PROTECTED] On 2003.05.25, 00:00, Philippe Verdy [EMAIL PROTECTED] wrote: even if the Dutch language considers it as a single letter, in a way similar to the Spanish ch I see one major difference: When you apply extra wide inter-char distance, you

Re: IPA Null Consonant

2003-05-27 Thread Philippe Verdy
From: [EMAIL PROTECTED] Philippe Verdy wrote on 05/27/2003 11:50:39 AM: Don't speak about overwriting sequences using Backspace in Unicode! I wasn't; I was talking about typewriters, though the comparable thing was done in the era of Wordstar and daisy wheel / dot matrix printers. I

Re: javascript and unicode

2003-05-27 Thread Philippe Verdy
From: Markus Scherer [EMAIL PROTECTED] Paul Hastings wrote: would it be correct to say that javascript natively supports unicode? ECMAScript, of which JavaScript and JScript are implementations, is defined on 16-bit Unicode scripts and using 16-bit Unicode strings. In other words, the
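
What "16-bit Unicode strings" means in practice is that string length is counted in UTF-16 code units, so a character outside the Basic Multilingual Plane occupies two units. The following is an editorial sketch in Python (not JavaScript); the helper `utf16_code_units` is a hypothetical name introduced only for illustration.

```python
def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to store text as UTF-16."""
    # Each UTF-16 code unit is two bytes; the "-le" variant adds no byte order mark.
    return len(text.encode("utf-16-le")) // 2

bmp_letter = "A"          # U+0041, inside the Basic Multilingual Plane
clef = "\U0001D11E"       # U+1D11E MUSICAL SYMBOL G CLEF, outside the BMP

print(utf16_code_units(bmp_letter))  # one code unit
print(utf16_code_units(clef))        # two code units (a surrogate pair)
print(len(clef))                     # Python itself counts code points: 1
```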

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-27 Thread Philippe Verdy
A logo with a yellow or light blue or pale green background would be more appealing on various bright backgrounds. I also think that the grey logo is too dark and difficult to read, and the pink logo is quite strange. The red of the checkmark should contrast more by using a saturated color, and

Re: Dutch IJ, again

2003-05-28 Thread Philippe Verdy
From: Pim Blokland [EMAIL PROTECTED] To: Unicode mailing list [EMAIL PROTECTED] Sent: Wednesday, May 28, 2003 11:45 AM Subject: Re: Dutch IJ, again Philippe Verdy schreef: i+j is a single combined Dutch ij character only if its not followed by a vowel This is not true; where did you get

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-29 Thread Philippe Verdy
From: Marco Cimarosti [EMAIL PROTECTED] Yes, you are right. I never heard the word savvy before this morning. Savvy is better understood in this context as aware, than archaic or informal in your English-Italian dictionary. It means the author of the website that uses this logo has considered

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-29 Thread Philippe Verdy
From: Theodore H. Smith [EMAIL PROTECTED] Why not put up a call for Unicode logos? Instead of asking for an inhouse one to be made, I'm sure you'd get more logos offered than you could know what to do with. At the worst, you could have a design to learn from. Some of my logos were made

More savvy logos

2003-05-29 Thread Philippe Verdy
I don't know if an attachment here will work, but these are two other alternate logos which look more appealing with a tiny 3D button effect, the Unicode red and white UNi logo (and visible trademark symbol), and the word Savvy in Blue (and a green check mark), or a variant using the term

Re: When do you use U+2024 ONE DOT LEADER instead of U+002E FULL STOP?

2003-05-29 Thread Philippe Verdy
From: Karl Pentzlin [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Wednesday, May 28, 2003 9:59 PM Subject: When do you use U+2024 ONE DOT LEADER instead of U+002E FULL STOP? When do you use U+2024 ONE DOT LEADER instead of U+002E FULL STOP? Is there a difference of appearance in high quality

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-29 Thread Philippe Verdy
From: [EMAIL PROTECTED] On 28/05/2003 13:56:47 Philippe Verdy wrote: My question is more related to the requirements to display such a logo. After all, one could use this logo on a web site that uses a standardized encoding like ISO-8859-1 Why would you think that when the logo page

Re: Shift-JIS/Unicode mapping in JAVA

2003-05-29 Thread Philippe Verdy
Most probably, Sun upgraded its tables from ICU, and ICU had this bug, which did not exist in their prior tables for MS-CP932. So the source of the data may now be different, or there may be an alias problem in the MS-CP932 encoding name. Submit this bug to Sun, (and probably also to IBM's ICU),

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-29 Thread Philippe Verdy
From: Tom Gewecke [EMAIL PROTECTED] I wonder about this. The Unicode FAQ makes the point that some browsers will not display NCR's unless the charset is UTF-8. It does seem logical that, NCR's or not, a page with the logo should be in one of the three standard Unicode Encoding Forms, UTF-8,

Re: Shift-JIS/Unicode mapping in JAVA

2003-05-29 Thread Philippe Verdy
From: Kazuhiro Kazama [EMAIL PROTECTED] From: Jane Liu [EMAIL PROTECTED] Subject: Shift-JIS/Unicode mapping in JAVA Date: Wed, 28 May 2003 12:36:39 -0700 (PDT) Message-ID: [EMAIL PROTECTED] I am running a JAVA program on Japanese Windows 2000 system, looking at the Unicode conversion of

Re: More savvy logos

2003-05-29 Thread Philippe Verdy
From: Michael Everson [EMAIL PROTECTED] Both logos are around 800 bytes, and 16 colors (using the web palette), with a bit antialiasing. Garish galore, I would say. Purplize the red somewhat? Per your request, these button logos use a darker red (same dimensions as before). (The source

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-29 Thread Philippe Verdy
From: Marco Cimarosti [EMAIL PROTECTED] As this comes from a Unicode official, I guess we should simply accept it... Nevertheless, I wonder whether displaying the Unicode *logo* per se has the same legal implication as displaying a *banner* which contains the Unicode logo. I note that the

Re: Unicode 4.0 in ICU demos

2003-05-30 Thread Philippe Verdy
From: Roozbeh Pournader [EMAIL PROTECTED] On Mon, 28 Apr 2003, Mark Davis wrote: BTW, the ICU demos have been all upgraded to Unicode 4.0, on http://oss.software.ibm.com/icu/demo/. They include: [...] IDNA Demo This simple demo performs IDNA transformations as described in RFC
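
For readers unfamiliar with what an IDNA transformation does, here is a small editorial sketch. It does not use the ICU demo mentioned above; it assumes Python's built-in `idna` codec, which its documentation describes as an implementation of RFC 3490, and the domain name is a made-up example.

```python
# Convert an internationalized domain name to its ASCII-compatible form and back.
unicode_name = "bücher.example"

ascii_name = unicode_name.encode("idna")   # each label is converted separately
round_trip = ascii_name.decode("idna")     # back to the Unicode form

print(ascii_name)
print(round_trip)
```

Only the label that contains non-ASCII characters receives the `xn--` prefix; plain ASCII labels pass through unchanged.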

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-30 Thread Philippe Verdy
From: Carl W. Brown [EMAIL PROTECTED] It looks to me like UNCODE. Has the UN taken a role in globalization? Maybe the web page has no scripting but is still savvy. Wrong! You strip the very visible dot from the i letter, you also refuse to see that there's a ligature between the U and N.

Re: book end or enclosing characters in most languages?

2003-05-30 Thread Philippe Verdy
From: Ben Dougall [EMAIL PROTECTED] On Wednesday, May 28, 2003, at 06:59 pm, Otto Stolz wrote: PS. In these two languages, the quote-marks are paired thusly: en_US: U+201C ... U+201D, and U+2018 ... U+2019 de_DE: U+201E ... U+201C, and U+201A ... U+2018 are they the right way
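
The pairings quoted above can be written down as a small lookup table. This is an editorial sketch; the `QUOTE_PAIRS` table and the `quote` helper are hypothetical names, and only the two locales listed in the message are covered.

```python
# (opening, closing) pairs per locale: first-level quotes, then second-level quotes.
QUOTE_PAIRS = {
    "en_US": (("\u201c", "\u201d"), ("\u2018", "\u2019")),
    "de_DE": (("\u201e", "\u201c"), ("\u201a", "\u2018")),
}

def quote(text: str, locale: str, level: int = 0) -> str:
    """Wrap text in the quotation marks of the given locale and nesting level."""
    opening, closing = QUOTE_PAIRS[locale][level]
    return f"{opening}{text}{closing}"

print(quote("Hello", "en_US"))
print(quote("Hallo", "de_DE"))
print(quote("Hallo", "de_DE", level=1))
```

Note that U+201C is the opening mark in en_US but the closing mark in de_DE, which is why a character cannot be classified as "opening" or "closing" without knowing the language.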

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-30 Thread Philippe Verdy
From: [EMAIL PROTECTED] there are still (even more) browsers that do not display UTF-8 correctly... who still use very often a browser that supports some form their national encoding (SJIS, GB2312, Big5, KSC5601), sometimes with ISO2022-* but shamefully do not decode UTF-8 properly (even

Re: Shift-JIS/Unicode mapping in JAVA

2003-05-30 Thread Philippe Verdy
Don't use Windows-31J, it is an encoding name alias that is not used by Microsoft for its 932 codepage! So it would cause problems with other compliant JVMs. Better use CP932 which seems to be the canonical name used by Sun in its reference implementation, or windows-932 documented in the
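
Why the exact encoding name matters can be shown with a single byte pair. The sketch below is an editorial illustration in Python rather than Java, so the codec names (`cp932`, `shift_jis`, `ms932`) are Python's and are not claimed to match any JVM's alias table. The byte pair 0x81 0x60 is the well-known "wave dash" case, chosen here as an assumption; the thread does not say which character was affected.

```python
import codecs

wave_bytes = b"\x81\x60"

# The same two bytes decode to different code points under the two mappings.
as_cp932 = wave_bytes.decode("cp932")          # Microsoft code page 932 mapping
as_shift_jis = wave_bytes.decode("shift_jis")  # JIS X 0208 based mapping

print(f"cp932     -> U+{ord(as_cp932):04X}")
print(f"shift_jis -> U+{ord(as_shift_jis):04X}")

# Aliases resolve to one canonical codec name, which is what a program should log.
print(codecs.lookup("ms932").name)
```

A converter that silently switches between these two tables will appear to "change" its Shift-JIS mapping even though each table is internally consistent.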

Re: Emailing logos to the list

2003-05-30 Thread Philippe Verdy
From: Theodore H. Smith [EMAIL PROTECTED] I'm not sure what other people experience, but I see a note saying the attachment was (quite correctly I think) removed from the email, and instead just lists the name and format of the attachment. I'm on the digest format. You may see the GIF

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-05-30 Thread Philippe Verdy
Edward H Trager wrote: John Hudson wrote: John Cowan wrote: Netscape 4.x is dead. I wish it were. Monitoring the web traffic at one of the sites I'm involved with, I am dismayed to see that more than 5% of visitors are using Netscape 4.7. Lots of organizations may have reasons like

Re: book end or enclosing characters in most languages?

2003-05-30 Thread Philippe Verdy
From: Ben Dougall [EMAIL PROTECTED] On Thursday, May 29, 2003, at 02:10 pm, Philippe Verdy wrote: Interestingly, the French first-level quotation marks use what we call chevrons (double angle brackets). However there are some typographical considerations that common fonts forget

Re: [Not OT] localized names of the Unicode Control characters

2003-05-30 Thread Philippe Verdy
From: Patrick Andries [EMAIL PROTECTED] From: Philippe Verdy ([EMAIL PROTECTED]) Microsoft displays these French translations for character names. There are however some strange translations that lack a common formal format that allows easier searching for related characters. I would

Re: Announcement: New Unicode Savvy Logo

2003-05-30 Thread Philippe Verdy
- Original Message - From: William Overington [EMAIL PROTECTED] To: Magda Danish (Unicode) [EMAIL PROTECTED]; [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Friday, May 30, 2003 10:20 AM Subject: Re: Announcement: New Unicode Savvy Logo Now that Mark Davis has made a statement in the

Re: Rare extinct latin letters

2003-05-31 Thread Philippe Verdy
From: [EMAIL PROTECTED] Patrick Andries on 05/29/2003 06:15:10 PM: Could letters like « l molle » (http://pages.infinit.net/hapax/abcmeigret.jpg ) or long-tailed A (between O and P in Baïf's alphabet http://pages.infinit.net/hapax/abcbaif.jpg), letters which I believe cannot be

Re: When do you use U+2024 ONE DOT LEADER instead of U+002E FULL STOP?

2003-05-31 Thread Philippe Verdy
From: John Cowan [EMAIL PROTECTED] Ben Dougall scripsit: why is it not categorised as white space then? or is it? doesn't look like it is to me, but i'm not sure how to actually find out for sure. Well, um, it's not white: there is a dot in it. Not really, in many applications it will
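
One way to "find out for sure" is to query the character properties directly. The following editorial sketch assumes Python's standard `unicodedata` module; the `describe` helper is a hypothetical name introduced for this example.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect the properties relevant to the white-space question."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),            # Po = Punctuation, other
        "is_space": ch.isspace(),                        # white-space test
        "decomposition": unicodedata.decomposition(ch),  # empty string if none
        "nfkc": unicodedata.normalize("NFKC", ch),       # compatibility-normalized form
    }

leader = describe("\u2024")     # ONE DOT LEADER
full_stop = describe("\u002e")  # FULL STOP

print(leader)
print(full_stop)
```

The leader is punctuation, not white space, and compatibility normalization folds it to FULL STOP, which is consistent with treating it as a presentation variant of the ordinary dot.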

Re: Announcement: New Unicode Savvy Logo

2003-05-31 Thread Philippe Verdy
From: Carl W. Brown [EMAIL PROTECTED] Private Use Areas are by definition not interoperable and clearly not designed to be used on the web. Their use in a page to display text clearly does not qualify, as it requires proprietary fonts to display them. People use special fonts all the

Re: When do you use U+2024 ONE DOT LEADER instead of U+002E FULL STOP?

2003-05-31 Thread Philippe Verdy
From: Jim Allan [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Friday, May 30, 2003 8:05 PM Subject: Re: When do you use U+2024 ONE DOT LEADER instead of U+002E FULL STOP? John Cowan posted: Not really, in many applications it will translate in one or more dots just to create a dotted line

Re: When do you use U+2024 ONE DOT LEADER instead of U+002E FULL STOP?

2003-05-31 Thread Philippe Verdy
From: Kenneth Whistler [EMAIL PROTECTED] That last fact should be taken as a hint that for most purposes, manual leaders should just be sequences of FULL STOP characters (as you will see, for instance in the plain text representations of Internet Drafts or RFCs, for example). But in any rich

Re: When do you use U+2024 ONE DOT LEADER instead of U+002E FULL STOP?

2003-05-31 Thread Philippe Verdy
From: Kenneth Whistler [EMAIL PROTECTED] Philippe Verdy continued: What surprises me the most in the Unicode spec is that it both says that its purpose is to create arbitrary length of leaders As in plain text, as can be seen in Table of Content listings in many RFCs, for example

Re: [OT] Unicode filename problems

2003-06-01 Thread Philippe Verdy
Zip files should have no problems to contain files with UTF-8 names. In fact the encoding allows it, and the only reason why you can't do it is the limitation of the ZIP tool you use which blindly uses only the encoding of the filesystem from which the file is created. Use the jar zip tool
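
The claim that the limitation lies in the tool rather than in the container can be tried out with any ZIP library that lets the caller supply the entry name. The editorial sketch below uses Python's standard `zipfile` module instead of the jar tool; `roundtrip_name` is a hypothetical helper. It also relies on general purpose bit 11 (0x800), which later revisions of the ZIP specification define to mark UTF-8 entry names, so it reflects current practice rather than the state of the format when this message was written.

```python
import io
import zipfile

UTF8_NAME_FLAG = 0x800  # general purpose bit 11: entry name is UTF-8

def roundtrip_name(name: str) -> tuple[str, bool]:
    """Write one entry under the given name, read it back, report name and flag."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(name, b"payload")
    with zipfile.ZipFile(buffer) as archive:
        info = archive.infolist()[0]
        return info.filename, bool(info.flag_bits & UTF8_NAME_FLAG)

print(roundtrip_name("plain.txt"))
print(roundtrip_name("r\u00e9sum\u00e9.txt"))
```

The name is stored independently of the file system the archive was created on; only non-ASCII names need the flag.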

Re: Not snazzy (was: New Unicode Savvy Logo)

2003-06-01 Thread Philippe Verdy
From: Marion Gunn [EMAIL PROTECTED] Ar 17:51 +0200 2003/05/29, Philippe Verdy entre sur son clavier: I would prefer to say that Netscape 4.0 is dead, but Netscape 4.7x is not (I D'accord. (With the above I'd have to agree.) see no reason why users should continue to use versions before 4.7

Re: Unicode filename problems

2003-06-01 Thread Philippe Verdy
From: Raymond Mercier [EMAIL PROTECTED] Well, you would expect that, since Win9* and WinNT/2000/XP differ fundamentally regarding unicode compliance. Probably true for the filesystem level, but certainly not for the file index stored in a ZIP file where there's no reason why it should not

Re: Language Tag Registrations

2003-06-01 Thread Philippe Verdy
From: Marion Gunn [EMAIL PROTECTED] What, then, is the code for the English of 'Northern Ireland'? (GB+NI=UK.) Since Ulster, as IANA [EMAIL PROTECTED] knows, is divided by an international border, is the logical reply 'encode Ulster English separately for each side of the border'? Is Basque

Re: Fw: Unicode filename problems

2003-06-02 Thread Philippe Verdy
From: Raymond Mercier [EMAIL PROTECTED] At 00:11 01/06/2003 +0200, you wrote: but certainly not for the file index stored in a ZIP file where there's no reason why it should not contain correctly encoded and portable UTF-8 names Doesn't one have to know the binary format of a Zip file to be

Re: Fw: Unicode filename problems

2003-06-03 Thread Philippe Verdy
Note the following ambiguity in the ZIP file format specification: [QUOTE] file name: (Variable) The name of the file, with optional relative path. The path stored should not contain a drive or device letter, or a leading slash. All slashes should be forward slashes '/' as opposed to backwards

Re: Rare extinct latin letters

2003-06-04 Thread Philippe Verdy
-specific and context-dependent, as it obeys a convention, not a strict definition. -- Philippe. - Original Message - From: Kent Karlsson [EMAIL PROTECTED] To: 'Philippe Verdy' [EMAIL PROTECTED] Sent: Tuesday, June 03, 2003 4:16 PM Subject: RE: Rare extinct latin letters (offline

Re: Address of ISO 3166 mailing list

2003-06-04 Thread Philippe Verdy
From: Marion Gunn [EMAIL PROTECTED] I ask the patience of the Unicode and IETF-L moderators for now posting on their lists this request for contact details for the ISO 3166 mailing lists (if any). Context: Ireland advisability of reserving 'EI' tag for cited usage (baggage-handling at

Re: Rare extinct latin letters

2003-06-04 Thread Philippe Verdy
From: Kent Karlsson [EMAIL PROTECTED] Sorry, maybe I was choosing the wrong diacritic (I was confused by its name, and I should have verified in the charts). Isn't U+0316 COMBINING HORN (combining class 216) what I wanted to use? Let me cut my reply short: no. ... script which

Re: Rare extinct latin letters

2003-06-04 Thread Philippe Verdy
From: John Hudson [EMAIL PROTECTED] At 06:39 AM 6/3/2003, [EMAIL PROTECTED] wrote: Philippe Verdy [EMAIL PROTECTED] wrote on 06/03/2003 07:25:46 AM: How do you consider the existing hook diacritic ? If you're talking about U+0309 COMBINING HOOK ABOVE, I don't think it normally

Re: Encoding converion through JDBC

2003-06-05 Thread Philippe Verdy
In all major databases, the native encodings of the OS, of the database when it was created, of the networking protocol, of the SQL queries and results, and of the client application are all independent. When JDBC connects to a database, it gets a lot of environment information from the server

Re: Encoding converion through JDBC

2003-06-05 Thread Philippe Verdy
) Kaplan [EMAIL PROTECTED] To: Philippe Verdy [EMAIL PROTECTED] Sent: Wednesday, June 04, 2003 4:36 PM Subject: Re: Encoding converion through JDBC From: Philippe Verdy [EMAIL PROTECTED] Phillipe, you went on for quite a while and I admit most of the things you talked about are not thing about

Re: IPA Null Consonant

2003-06-05 Thread Philippe Verdy
From: [EMAIL PROTECTED] Jim Allen wrote on 05/30/2003 09:38:12 AM: See also http://www.usefulcontent.org/docs/manuals/REC-MathML2-20010221/isoamso.html for some mathml characters and their unicode encodings. The character empty is encoded as U+2205 plus the variation selector

About the stenographic writing system

2003-06-07 Thread Philippe Verdy
For some references, look at this page (which displays a table of symbols): http://www.archivesnationales.culture.gouv.fr/camt/fr/se/fiche4/fiche4-1.html It describes the Prévost-Delaunay Method (from the official web site of the French National Archives). An example of text is on:

Re: Letterforms based on p

2003-06-07 Thread Philippe Verdy
From: Lukas Pietsch [EMAIL PROTECTED] I was hoping to find someone who had additional evidence for this character. I happened to come across it the other day in a modern printed edition of 17th- to 19th century handwritten English letters (Miller, Kerby A., Arnold Schrier, Bruce D. Boling,

Re: Shorthand

2003-06-07 Thread Philippe Verdy
From: Tom Gewecke [EMAIL PROTECTED] There are interesting signs and symbols in this script, which could still have their use today for other applications than live transcriptions. I have a couple of books (published in 1972 and 1974) which describe the system (in business environments), and

Re: Shorthand

2003-06-08 Thread Philippe Verdy
From: Tom Gewecke [EMAIL PROTECTED] http://www.unicode.org/roadmaps/smp/ Thanks for pointing out this block. But will it be enough to support at least the most well-known variants that were (and sometimes are still) taught ? It seems doubtful, given the huge number of systems out there. Many

Re: Caron / Hacek?

2003-06-12 Thread Philippe Verdy
From: Pim Blokland [EMAIL PROTECTED] António Martins-Tuválkin schreef: [quoting Radovan Garabik] In fact, the apostrophe form is used because there is a lack of convenient space to put carons over tall letters d,t,l, whereas there is no problem with n,e,r. Funny you should bring this

Re: Roman numerals in non-latin text

2003-06-12 Thread Philippe Verdy
Pim Blokland [EMAIL PROTECTED] wrote: No. Encoded like that it may *look* like a roman three, but two of those are definitely not correct. Only U+2162 or its compatibility decomposition, U+0049 U+0049 U+0049 should be used. The other two are bad coding, just as using greek Iotas or
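
The relationship between U+2162 and the three-letter sequence is a compatibility decomposition, which can be verified mechanically. This editorial sketch assumes Python's standard `unicodedata` module; the variable names are illustrative only.

```python
import unicodedata

roman_three = "\u2162"  # ROMAN NUMERAL THREE

name = unicodedata.name(roman_three)
compat = unicodedata.normalize("NFKD", roman_three)    # compatibility decomposition
canonical = unicodedata.normalize("NFD", roman_three)  # canonical decomposition

print(name)
print([f"U+{ord(ch):04X}" for ch in compat])
print(canonical == roman_three)  # canonical normalization leaves it alone
```

Because the mapping is only a compatibility one, the two spellings compare equal under NFKC/NFKD but stay distinct under NFC/NFD.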

Re: [OT] No more IE for Mac

2003-06-14 Thread Philippe Verdy
From: Roozbeh Pournader [EMAIL PROTECTED] For those who were worried when is Microsoft going to implement good Unicode support for Mac OS's IE, there is now an answer: Never. Read it yourself: http://news.com.com/2100-1045_3-1017126.html It's great news. It will force websites to stop

Re: [OT] No more IE for Mac

2003-06-14 Thread Philippe Verdy
From: Michael (michka) Kaplan [EMAIL PROTECTED] From: Philippe Verdy [EMAIL PROTECTED] This is an equal opportunity forum intended for discussion of issues relative to Unicode, an industrial consortium that includes (among many others) the companies you are talking about. Excessive anti-ANYONE

Re: [OT] IE support for standards (was: No more IE for Mac)

2003-06-16 Thread Philippe Verdy
From: Doug Ewell [EMAIL PROTECTED] Philippe Verdy verdy_p at wanadoo dot fr wrote: It's great news. It will force websites to stop using Microsoft specific features and caveats, and adopt the real standards. ... If web sites start using the real standards, people will upgrade

Re: Looking for two mathematical characters

2003-06-16 Thread Philippe Verdy
From: Patrick Andries [EMAIL PROTECTED] I'm looking for two mathematical characters. 2) An angle operator (combining mark ?) looking like this _| , where a ) n| a) n occurrences of a a means a ) n| obviously a should all be

Re: Looking for two mathematical characters

2003-06-16 Thread Philippe Verdy
(combining mark ?) looking like this _| , where a ) n| a) n occurrences of a a means a ) n| obviously a should all be written on a single line. And Philippe Verdy responded (after a long mathematical analysis

Re: Arabic script web site hosting solution for all platforms

2003-06-17 Thread Philippe Verdy
Excessive cross-posting to multiple newsgroups, forums and list servers is considered bulk (and also contrary to netiquette). As this message is targeting too large an audience and is off topic, and is also a commercial ad, I can say that bulk+unsolicited makes it fully qualifiable as SPAM.

Re: Arabic script web site hosting solution for all platforms

2003-06-18 Thread Philippe Verdy
From: Theodore H. Smith [EMAIL PROTECTED] Excessive cross-posting to multiple newsgroups, forums and list servers is considered bulk (and also contrary to netiquette). As this message is targeting too large an audience and is off topic, and is also a commercial ad, I can say that

Re: Problem with Arial Unicode MS font for BOLD/ITALICS in PDF

2003-06-21 Thread Philippe Verdy
From: Michael Everson [EMAIL PROTECTED] At 16:45 -0700 2003-06-20, Richard Cook wrote: Of course, in pop e-print, nearly everything that can be done to a character is done ... including Bold-Ital-Outline-Shadow ... Hey, there's no reason only Latin typography should be filled with

Re: Classification of U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK

2003-06-21 Thread Philippe Verdy
From: Allen Haaheim [EMAIL PROTECTED] Phillippe, Sorry to reopen a (closed?) case. The below look like loose ends to me. I thought it was closed too. Well I can reply, but I will just give my opinion after reading translations to Japanese performed by other people, and hearing their comments.

Re: Major Defect in Combining classes of Tibetan Vowels

2003-06-21 Thread Philippe Verdy
From: Christopher John Fynn [EMAIL PROTECTED] So following normal Tibetan Dzongkha input and spelling rules the relative ordering of these characters should be: A. 0F71 (CCV=129) B. 0F74 (CCV=132) C. 0F72, 0F7A, 0F7B, 0F7C, 0F7D, 0F80 (CCV=130) D. 0F7E, (CCV=0) 0F82, 0F83 (CCV=230)
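
The combining class values cited in the list above can be read straight from the Unicode Character Database. The sketch below is an editorial addition assuming Python's standard `unicodedata` module; the `GROUPS` table simply restates the A to D grouping from the message.

```python
import unicodedata

# Code points grouped as in the message (A to D = desired relative order).
GROUPS = {
    "A": [0x0F71],
    "B": [0x0F74],
    "C": [0x0F72, 0x0F7A, 0x0F7B, 0x0F7C, 0x0F7D, 0x0F80],
    "D": [0x0F7E, 0x0F82, 0x0F83],
}

# Canonical combining class of every code point, per group.
classes = {
    label: [unicodedata.combining(chr(cp)) for cp in code_points]
    for label, code_points in GROUPS.items()
}

for label, values in classes.items():
    print(label, values)
```

The mismatch the message complains about is visible in the output: group B (class 132) is wanted before group C (class 130), but canonical ordering sorts lower classes first.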

Re: Revised N2586R

2003-06-23 Thread Philippe Verdy
On Monday, June 23, 2003 2:54 PM, Michael Everson [EMAIL PROTECTED] wrote: It wouldn't be hard to provide a comparable descriptive paragraph that began with an image of the Stars and Stripes, but I don't think we'd want to encode the US flag as a character. That would be a logo. Most

Re: Revised N2586R

2003-06-23 Thread Philippe Verdy
On Monday, June 23, 2003 10:17 PM, Michael Everson [EMAIL PROTECTED] wrote: There doesn't seem to be a NUT SYMBOL used to warn that products contain nuts, though there are many, many references to Sainsbury's (a British supermarket chain) labelling their peanuts Warning: Contains Nuts. What

Re: Revised N2586R

2003-06-24 Thread Philippe Verdy
On Tuesday, June 24, 2003 7:41 AM, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: Michael Everson wrote on 06/23/2003 07:54:13 AM: Similarly, the fleur-de-lis is a well-known named symbol which can be used to represent a number of things. In text? I've seen it on flags, on license plates, on

Re: Revised N2586R

2003-06-24 Thread Philippe Verdy
On Tuesday, June 24, 2003 6:30 PM, Rick McGowan [EMAIL PROTECTED] wrote: U+2668 HOT SPRINGS is pleasant, but it's a lot less motivated -- to my mind -- than the DO NOT LITTER SIGN. Huh? The Hotspring sign appears in running text all the time -- in Japanese travel brochures, for example.

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-25 Thread Philippe Verdy
On Wednesday, June 25, 2003 4:31 PM, Andrew C. West [EMAIL PROTECTED] wrote: On Wed, 25 Jun 2003 15:05:26 +0400, Valeriy E. Ushakov wrote: What I'm suggesting is that although cui 0F45, 0F74, 0F72 and ciu 0F45, 0F72, 0F74 should be rendered identically, the logical ordering of the codepoints
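
The practical consequence for the two sequences quoted above is that normalization erases the difference in logical order. This editorial sketch assumes Python's standard `unicodedata` module; `cui` and `ciu` are named after the transliterations used in the message.

```python
import unicodedata

cui = "\u0f45\u0f74\u0f72"  # base + vowel sign U (ccc 132) + vowel sign I (ccc 130)
ciu = "\u0f45\u0f72\u0f74"  # base + vowel sign I (ccc 130) + vowel sign U (ccc 132)

# Canonical ordering sorts adjacent combining marks by ascending combining class.
nfd_cui = unicodedata.normalize("NFD", cui)
nfd_ciu = unicodedata.normalize("NFD", ciu)

print([f"{ord(ch):04X}" for ch in nfd_cui])
print(nfd_cui == nfd_ciu)
```

Any distinction carried only by the order of these two vowel signs is therefore lost as soon as the text passes through a normalizing process.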

Re: Revised N2586R

2003-06-25 Thread Philippe Verdy
On Wednesday, June 25, 2003 6:11 PM, Michael Everson [EMAIL PROTECTED] wrote: At 08:44 -0700 2003-06-25, Doug Ewell wrote: If it's true that either the UTC or WG2 has formally approved the character, for a future version of Unicode or a future amendment to 10646, then I don't see any

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-25 Thread Philippe Verdy
On Wednesday, June 25, 2003 6:13 PM, Mark Davis [EMAIL PROTECTED] wrote: Michael Everson wrote: [EMAIL PROTECTED] wrote: Christopher John Fynn wrote: Any suggestions as to how to create a standardized work around for these incorrect values? Propose new characters, and

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-25 Thread Philippe Verdy
From: Michael (michka) Kaplan [EMAIL PROTECTED] From: Michael (michka) Kaplan [EMAIL PROTECTED] From: Andrew C. West [EMAIL PROTECTED] What I'm suggesting is that although cui 0F45, 0F74, 0F72 and ciu 0F45, 0F72, 0F74 should be rendered identically, the logical ordering of the

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-25 Thread Philippe Verdy
On Wednesday, June 25, 2003 8:14 PM, Peter Lofting [EMAIL PROTECTED] wrote: At 7:41 PM +0200 6/25/03, Philippe Verdy wrote: If there are real distinct semantics that were abusively unified by the canonicalization, the only safe way would be to create a second character that would have

Re: Major Defect in Combining Classes of Tibetan Vowels

2003-06-25 Thread Philippe Verdy
On Thursday, June 26, 2003 1:04 AM, Andrew C. West [EMAIL PROTECTED] wrote: On Wed, 25 Jun 2003 13:41:27 -0700 (PDT), Kenneth Whistler wrote: Peter asked: How can things that are visually indistinguishable be lexically different? chat (en) chat (fr) And if Unicode

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Philippe Verdy
On Thursday, June 26, 2003 11:50 AM, Andrew C. West [EMAIL PROTECTED] wrote: On Wed, 25 Jun 2003 21:58:28 -0700, Elisha Berns wrote: Some weeks back there were a number of postings about software for viewing Unicode Ranges in TrueType fonts and I had a few questions about that. Most

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Philippe Verdy
On Thursday, June 26, 2003 2:26 PM, Philippe Verdy [EMAIL PROTECTED] wrote: I forgot also the probably better function from the Uniscribe library, which processes strings through a language-dependent shaping algorithm, and can determine appropriate glyph substitution, or use custom composite

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Philippe Verdy
On Thursday, June 26, 2003 4:13 PM, Andrew C. West [EMAIL PROTECTED] wrote: On Thu, 26 Jun 2003 14:26:13 +0200, Philippe Verdy wrote: Isn't there a work-around with the following function (quote from Microsoft MSDN): (with the caveat that you first need to allocate and fill a Unicode

Re: Question about Unicode Ranges in TrueType fonts

2003-06-26 Thread Philippe Verdy
On Thursday, June 26, 2003 8:16 PM, Elisha Berns [EMAIL PROTECTED] wrote: It would appear from your answer that even after implementing the algorithm to search the Unicode block coverage of a font, the actual comparison data, that is which blocks to compare and how many code points, is totally

Re: Biblical Hebrew (U+034F Combining Grapheme Joiner works)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 3:54 AM, Kenneth Whistler [EMAIL PROTECTED] wrote: John, At 03:36 PM 6/26/2003, Kenneth Whistler wrote: Why is making use of the existing behavior of existing characters a groanable kludge, if it has the desired effect and makes the required distinctions

About combining classes

2003-06-27 Thread Philippe Verdy
When I just look at the history of combining classes, they did not exist in the first Unicode standard, and they still don't exist in ISO10646 as well. This was a technology developed by IBM and offered for free to the community to allow a simplified management of encoded texts, and it has long

Plain-text search algorithms: normalization, decomposition, case mapping, word breaks

2003-06-27 Thread Philippe Verdy
In order to implement a plain-text search algorithm, in a language neutral way that would still work with all scripts, I am looking for advice on how this can be done safely (notably for automated search engines), to allow searching for text matching some basic encoding styles. My first
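
The subject line names four ingredients: normalization, decomposition, case mapping and word breaks. A minimal editorial sketch of the first three follows; it is not the algorithm proposed in the message (which is cut off here) but one common folding, and the helper names `search_key` and `contains` are hypothetical. Word breaking is deliberately left out.

```python
import unicodedata

def search_key(text: str) -> str:
    """Fold text to a loose key: compatibility-decompose, drop marks, case-fold."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop characters with a non-zero canonical combining class (accents etc.).
    stripped = "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)
    # Case folding can itself expand characters, so normalize once more.
    return unicodedata.normalize("NFKC", stripped.casefold())

def contains(haystack: str, needle: str) -> bool:
    """Substring test on the folded keys of both strings."""
    return search_key(needle) in search_key(haystack)

print(search_key("\u00c9l\u00e9phant"))   # accented, capitalized input
print(search_key("\ufb01n"))              # starts with the fi ligature U+FB01
print(contains("Un \u00c9L\u00c9PHANT rose", "elephant"))
```

Stripping every combining mark is too aggressive for scripts where the marks carry distinctive meaning, so a real implementation would make that step depend on language or script.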

Re: [cowan: Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)]

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 1:29 PM, John Cowan [EMAIL PROTECTED] wrote: Michael Everson scripsit: Change the character classes in Unicode 4.1, and they *might* decide to freeze support at, say, Unicode 3.0. Or they may simply opt to define their *OWN* normalization standard, distinct from

Re: Plain-text search algorithms: normalization, decomposition, case mapping, word breaks

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 3:36 PM, Jony Rosenne [EMAIL PROTECTED] wrote: For Hebrew and Arabic, add a step: Find the root, remove prefixes, suffixes and other grammatical artifacts and obtain the base form of the word. Removing common suffixes is a separate issue (this requires unification of

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 3:23 PM, Karljürgen Feuerherm [EMAIL PROTECTED] wrote: At 04:22 -0500 2003-06-27, [EMAIL PROTECTED] wrote: Now, Q: I take it the combining classes are linked to the script, rather than say to a dialect--e.g. one can't define BH as a separate dialect from MH with its

Re: Plain-text search algorithms: normalization, decomposition, case mapping, word breaks

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 4:44 PM, Ben Dougall [EMAIL PROTECTED] wrote: i'm a bit confused. i thought that this type of thing was already pretty well covered by the various unicode resources? (i guess there's a strong chance not, if you're asking this question). I'm not discussing how

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 4:40 PM, John Cowan [EMAIL PROTECTED] wrote: Not so. Sometimes stability is more important than correctness. Very well answered. I don't see why we need to sacrifice stability when correcting something. As the error is not in ISO 10646, it is definitely not reasonable

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 5:05 PM, Michael Everson [EMAIL PROTECTED] wrote: At 10:40 -0400 2003-06-27, John Cowan wrote: Karljürgen Feuerherm scripsit: 1. Everyone is more or less agreed that the present combining class rules as they apply to BH contain mistakes. The clearly

Re: Biblical Hebrew (Was: Major Defect in Combining Classes of Tibetan Vowels)

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 5:53 PM, Karljürgen Feuerherm [EMAIL PROTECTED] wrote: And in any case this should NOT muck things up which aren't broken, like MH. Not breaking Modern Hebrew means not changing the combining classes of the characters it uses. Adding a distinct set for Traditional

Fw: Biblical Hebrew: possible solution for XML

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 6:01 PM, Philippe Verdy [EMAIL PROTECTED] wrote: Given that XML will require normalization for texts identified as being Unicode encoded (UTF-8 and others), couldn't a document be labelled so that the normalization step is removed from the XML processing, using an ISO

Re: Biblical Hebrew

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 10:28 PM, John Hudson [EMAIL PROTECTED] wrote: I don't think it would break any modern Hebrew document, because it is not in any way essential to modern Hebrew that the vowels have fixed position combining classes as in Unicode. That is part of the frustration: the

Re: Unicode Public Review Issues update

2003-06-27 Thread Philippe Verdy
On Friday, June 27, 2003 10:29 PM, Rick McGowan [EMAIL PROTECTED] wrote: The Unicode Technical Committee has posted a new issue for public review and comment. Details are on the following web page: http://www.unicode.org/review/ Briefly, the new issue is: Issue #11 Soft Dotted

Re: Biblical Hebrew

2003-06-27 Thread Philippe Verdy
On Saturday, June 28, 2003 1:15 AM, Kenneth Whistler [EMAIL PROTECTED] wrote: Philippe Verdy said: I understand the frustration: if Unicode had not attempted to define combining classes, which were not necessary to Unicode, all existing combining characters would have been given a CC=0

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-06-30 Thread Philippe Verdy
On Monday, June 30, 2003 1:58 PM, Pim Blokland [EMAIL PROTECTED] wrote: Philippe Verdy schreef: Interesting issue for the Latin Small ij Ligature (U+0133): Normally the Soft_Dotted is supposed to make one dot disappear when there's an additional diacritic above, but many applications may

Re: Soft-dotted (was: RE: Unicode Public Review Issues update)

2003-06-30 Thread Philippe Verdy
On Monday, June 30, 2003 1:33 PM, Kent Karlsson [EMAIL PROTECTED] wrote: Or would this require using a diaeresis instead centered above the digraph? Probably. But are there any examples of this in use (ever, not necessarily Unicode encoded, or at all digitally encoded)? If that kind of

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-06-30 Thread Philippe Verdy
On Monday, June 30, 2003 9:13 PM, James H. Cloos Jr. [EMAIL PROTECTED] wrote: So if you want two dots and an acute use ij, U+0308, U+0301: Of course a given font's diaeresis will often not line up with the stems of its ij, and a custom one should be used instead. Or features and/or ligs as

Re: Accented ij ligatures (was: Unicode Public Review Issues update)

2003-07-01 Thread Philippe Verdy
On Tuesday, July 01, 2003 1:55 PM, Kent Karlsson [EMAIL PROTECTED] wrote: My feeling is that the proposed Public Review document should exclude the ij ligature, waiting for the decision about the new dotless-ij ligature approved in the first rounds by UTC and waiting for approval by ISO
