FW: New version of TR29:

2002-08-20 Thread Marco Cimarosti
[Resending a message I sent during the Unicode List downtime] -Original Message- From: Marco Cimarosti Sent: Monday, August 19, 2002 7:03 PM To: 'Eric Muller'; [EMAIL PROTECTED] Subject: RE: New version of TR29: Eric Muller wrote: Your definition of LatinVowel is problematic. Is Y

FW: New version of TR29:

2002-08-20 Thread Marco Cimarosti
[Resending a message I sent during the Unicode List downtime] -Original Message- From: Marco Cimarosti Sent: Monday, August 19, 2002 12:55 PM To: 'Philipp Reichmuth' Cc: [EMAIL PROTECTED] Subject: RE: New version of TR29: Philipp Reichmuth wrote: MC Consonants [j] and [w] have the

again cjk character look-up

2002-08-20 Thread Zhang Weiwu
This mail message is in UTF-8, as the mail header indicated. I'm sorry to post a similar question again. Last time when I questioned about ideograph look-up someone gave me a link to something like CJK indexer. Would this kind man give me the link again? I forgot it. Is there any website

Re: FW: New version of TR29:

2002-08-20 Thread John Cowan
Marco Cimarosti scripsit: The issue is making the error window as narrow as possible. My assumption is that common words such as c', d', j', l', n', qu', s', t' or v' are more common than edge cases like prud'homme. How about this heuristic: Break after an apostrophe that is the second

RE: FW: New version of TR29:

2002-08-20 Thread Marco Cimarosti
John Cowan wrote: Marco Cimarosti scripsit: The issue is making the error window as narrow as possible. My assumption is that common words such as c', d', j', l', n', qu', s', t' or v' are more common than edge cases like prud'homme. How about this heuristic: Break after an

RE: FW: New version of TR29:

2002-08-20 Thread Addison Phillips [wM]
How about I'll or it's. Regards, Addison -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On Behalf Of John Cowan Sent: Tuesday, August 20, 2002 4:40 AM To: Marco Cimarosti Cc: '[EMAIL PROTECTED]' Subject: Re: FW: New version of TR29: Marco Cimarosti

Re: The mystery of Edwin U+1E9A

2002-08-20 Thread Philipp Reichmuth
Semitic transliteration practice, if I recall correctly. RM It is common enough in transcribing Hebrew and Arabic. A single character a with a half-ring to the upper right or on top of it? What would it stand for in Arabic transliteration, as opposed to separate characters a and half-ring?

Re: FW: New version of TR29:

2002-08-20 Thread Philipp Reichmuth
JC Break after an apostrophe that is the second or third letter in the JC word. Do not break after apostrophes that come later. JC This neatly handles (I think) all the English we'll, we've, it's split. must've or should've don't split. Isn't, don't and doesn't don't, either. Are these
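A minimal Python sketch of the heuristic quoted above ("break after an apostrophe that is the second or third letter in the word"). The function name `split_at_apostrophe` is hypothetical, and the sketch assumes that only the first apostrophe in a word is examined and that both U+0027 and U+2019 count as apostrophes; neither assumption comes from the thread.

```python
# Characters treated as apostrophes (an assumption made for this sketch).
APOSTROPHES = {"'", "\u2019"}


def split_at_apostrophe(word: str) -> list[str]:
    """Split after the first apostrophe only if it is the 2nd or 3rd character."""
    for index, char in enumerate(word):
        if char in APOSTROPHES:
            # index 1 is the second character, index 2 the third.
            if index in (1, 2) and index + 1 < len(word):
                return [word[: index + 1], word[index + 1 :]]
            # The apostrophe comes first, comes later, or ends the word: no break.
            return [word]
    return [word]


for sample in ["we'll", "it's", "l'amico", "qu'il", "must've", "don't", "prud'homme"]:
    print(sample, "->", split_at_apostrophe(sample))
```

Under these assumptions, K'ang-hsi is also split after "K'", which is the kind of counterexample raised later in the thread.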

Re: again cjk character look-up

2002-08-20 Thread David Possin
You can go to http://groups.yahoo.com/group/unicode/ to search the archives. Dave - Original Message - From: Zhang Weiwu [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Saturday, August 17, 2002 8:34 PM Subject: again cjk character look-up This mail message is in UTF-8, as the mail

Re: FW: New version of TR29:

2002-08-20 Thread Doug Ewell
John Cowan jcowan at reutershealth dot com wrote: How about this heuristic: Break after an apostrophe that is the second or third letter in the word. Do not break after apostrophes that come later. This neatly handles (I think) all the English, Italian, and Esperanto cases, and a good

Re: FW: New version of TR29:

2002-08-20 Thread Radovan Garabik
On Tue, Aug 20, 2002 at 07:39:38AM -0400, John Cowan wrote: Marco Cimarosti scripsit: The issue is making the error window as narrow as possible. My assumption is that common words such as c', d', j', l', n', qu', s', t' or v' are more common than edge cases like prud'homme. How

Re: again cjk character look-up

2002-08-20 Thread John Jenkins
On Saturday, August 17, 2002, at 06:34 PM, Zhang Weiwu wrote: This time I'm looking for an ideograph with 角 on the left and 羊 on the right. It is read 'xiè' in Pinyin. This character doesn't seem to exist in CJK and CJK Ext A. You mean U+89E7 (觧)? BTW, the Mandarin readings that Unihan has
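The archived copy of this message showed the two components as `½Ç` and `Ñò` and the reading as `xi¨¨`. A hedged sketch of how such text can be recovered, assuming the original bytes were GB2312 and were displayed as Latin-1 (the helper name `undo_mojibake` is hypothetical):

```python
import unicodedata


def undo_mojibake(garbled: str, real_encoding: str = "gb2312") -> str:
    """Re-encode text as Latin-1 bytes, then decode those bytes with the intended encoding."""
    return garbled.encode("latin-1").decode(real_encoding)


print(undo_mojibake("½Ç"))    # left-hand component
print(undo_mojibake("Ñò"))    # right-hand component
print(undo_mojibake("xi¨¨"))  # Pinyin reading

# The code point suggested in the reply can be inspected directly.
print(unicodedata.name(chr(0x89E7)))
```

This only works when the mis-decoding was lossless, as Latin-1 is for every byte value; other intermediate encodings may have discarded bytes.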

Re: FW: New version of TR29:

2002-08-20 Thread Andrew C. West
On Tue, 20 August 2002, John Cowan wrote: How about this heuristic: Break after an apostrophe that is the second or third letter in the word. Do not break after apostrophes that come later. This neatly handles (I think) all the English, Italian, and Esperanto cases, and a good many of

RE: FW: New version of TR29:

2002-08-20 Thread Marco Cimarosti
Philipp Reichmuth wrote: MC O'zbek would not split, because the apostrophe is not followed by a, MC e, i, o, u or y. G'iyosaddin would (sorry for the silly word, it's the middle name of a medieval poet, but it's the first thing that came into my mind, and g' is not such a rare
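The condition described in this snippet (split only when the apostrophe is followed by a, e, i, o, u or y) can be sketched as below. This covers only the part of the rule visible here; the full proposal is not shown in this listing. The function name and the choice to accept U+2019 alongside U+0027 are assumptions.

```python
# Letters that trigger a break when they follow the apostrophe (from the snippet).
VOWELS = set("aeiouy")
# Characters treated as apostrophes (an assumption made for this sketch).
APOSTROPHES = {"'", "\u2019"}


def breaks_after_apostrophe(word: str) -> bool:
    """Return True if the first apostrophe in `word` is followed by a listed vowel."""
    for index, char in enumerate(word):
        if char in APOSTROPHES:
            following = word[index + 1 : index + 2]  # empty if apostrophe is last
            return following.lower() in VOWELS and following != ""
    return False


print(breaks_after_apostrophe("O'zbek"))       # apostrophe followed by z
print(breaks_after_apostrophe("G'iyosaddin"))  # apostrophe followed by i
```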

RE: FW: New version of TR29:

2002-08-20 Thread Marco Cimarosti
Philipp Reichmuth wrote: JC Break after an apostrophe that is the second or third letter in the JC word. Do not break after apostrophes that come later. JC This neatly handles (I think) all the English we'll, we've, it's split. must've or should've don't split. Isn't, don't and

RE: FW: New version of TR29:

2002-08-20 Thread Marco Cimarosti
Andrew C. West wrote: On Tue, 20 August 2002, John Cowan wrote: How about this heuristic: Break after an apostrophe that is the second or third letter in the word. Do not break after apostrophes that come later. This neatly handles (I think) all the English, Italian, and Esperanto

Re: FW: New version of TR29:

2002-08-20 Thread Doug Ewell
Andrew C. West andrewcwest at alumni dot princeton dot edu wrote: Does not work with K'ang-hsi or Ch'ien-lung, or apostrophes used in IPA and other systems of phonetic transcription. Seems to me that one apostrophe is not enough - how about a NON- BREAKING APOSTROPHE for cases like

Proposal for DUTR#29

2002-08-20 Thread Marco Cimarosti
The deadline is here... Find attached the last version of my proposal for a better handling of apostrophe in DUTR#29. All the criticism I received was very valuable, and most of it has been included in the proposal, in one form or another. Thanks. _ Marco Proposal to accommodate French

Re: FW: New version of TR29:

2002-08-20 Thread Philipp Reichmuth
G'iyosaddin would (sorry for the silly word, it's the middle name of a medieval poet, but it's the first thing that came into my mind, and g' is not such a rare combination in Uzbek that this is the only case). MC It depends on what you mean by sensibly. You can't expect the result to be

Re: FW: New version of TR29:

2002-08-20 Thread John Cowan
Doug Ewell scripsit: As enticing as it sounds, disunifying it would not solve the problem; it would simply move it from the text boundaries category to the legacy data conversion category. Somewhat off the topic: What I've never understood is why Unicode is so adamant that the ' of English

Re: FW: New version of TR29:

2002-08-20 Thread Andrew C. West
On Tue, 20 August 2002, John Cowan wrote: It has no sound, but neither does Romance "h"; both exist as a marker of etymology. But in fact the apostrophe may have a sound in dialectal English, where it is used to represent a medial or final glottal stop (e.g. a drin' a wa'er for a

Re: FW: New version of TR29:

2002-08-20 Thread Michael Everson
At 10:10 -0700 2002-08-20, Andrew C. West wrote: On Tue, 20 August 2002, John Cowan wrote: It has no sound, but neither does Romance "h"; both exist as a marker of etymology. But in fact the apostrophe may have a sound in dialectal English, where it is used to represent a medial or

Re: FW: New version of TR29:

2002-08-20 Thread Martin Heijdra
Just FYI (I have not been following this thread): officially in the bibliographic community, there is NO apostrophe in Wade-Giles K'ang-hsi, or in Korean aspirated characters; it's an ayn (02BB AYN / MODIFIER LETTER TURNED COMMA). The apostrophe is only used between syllables. Of course,

Re: FW: New version of TR29:

2002-08-20 Thread Michael Everson
At 12:46 -0400 2002-08-20, John Cowan wrote: Doug Ewell scripsit: As enticing as it sounds, disunifying it would not solve the problem; it would simply move it from the text boundaries category to the legacy data conversion category. Somewhat off the topic: What I've never understood is

Re: FW: New version of TR29:

2002-08-20 Thread James E. Agenbroad
On Tue, 20 Aug 2002, Andrew C. West wrote: On Tue, 20 August 2002, John Cowan wrote: It has no sound, but neither does Romance "h"; both exist as a marker of etymology. But in fact the apostrophe may have a sound in dialectal English, where it is used to represent a

Re: FW: New version of TR29:

2002-08-20 Thread Marion Gunn
Arsa Doug Ewell: John Cowan jcowan at reutershealth dot com wrote: How about this heuristic: Break after an apostrophe that is the second or third letter in the word. Do not break after apostrophes that come later. This neatly handles (I think) all the English, Italian, and Esperanto

Re: FW: New version of TR29:

2002-08-20 Thread John D. Burger
John Cowan wrote: What I've never understood is why Unicode is so adamant that the ' of English words is a punctuation mark, not a letter; why when disambiguating U+0027, English apostrophe is to be mapped to U+2019 and not U+02BC. It's true that historically isn't is derived from is not,

Re: FW: New version of TR29:

2002-08-20 Thread John Cowan
Michael Everson scripsit: U+02BC will have a place in the alphabet and affect sorting in languages like Hawai'ian. U+2019 doesn't. The former is used as a letter; the latter is used as a mark of punctuation. IIRC, practical San orthography ignores click letters for sorting purposes though
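The letter-versus-punctuation distinction discussed here is visible in the Unicode general categories of the three code points. A short illustration using Python's standard `unicodedata` module (the helper name `describe` is hypothetical):

```python
import unicodedata


def describe(code_point: int) -> tuple[str, str]:
    """Return the Unicode character name and general category for a code point."""
    char = chr(code_point)
    return unicodedata.name(char), unicodedata.category(char)


for cp in (0x0027, 0x2019, 0x02BC):
    name, category = describe(cp)
    # str.isalpha() is True for letter categories, including Lm (modifier letter).
    print(f"U+{cp:04X}  {category}  isalpha={chr(cp).isalpha()}  {name}")
```

U+02BC is reported as a modifier letter (Lm), while U+0027 and U+2019 fall in punctuation categories, so category-based word segmentation treats them differently.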

Re: FW: New version of TR29:

2002-08-20 Thread James E. Agenbroad
On Tue, 20 Aug 2002, Michael Everson wrote: At 10:10 -0700 2002-08-20, Andrew C. West wrote: On Tue, 20 August 2002, John Cowan wrote: It has no sound, but neither does Romance "h"; both exist as a marker of etymology. But in fact the apostrophe may have a sound in dialectal

Re: FW: New version of TR29:

2002-08-20 Thread Marion Gunn
ps. In re the 'ornamental' use of the apostrophe to anglicize Irish surnames, I believe that practice to be unique to English (viz., inserting an apostrophe where nothing is omitted, and it does not function as a punctuation mark). Am I wrong, or is what I call the English practice actually

Re: FW: New version of TR29:

2002-08-20 Thread Michael Everson
At 13:58 -0400 2002-08-20, James E. Agenbroad wrote: There is also fo'c'sle, the abridged version of forecastle. :-) There are no glottal stops there. -- Michael Everson *** Everson Typography *** http://www.evertype.com

Re: FW: New version of TR29:

2002-08-20 Thread Michael Everson
John, Marco, The practice of writing anglicized Irish names in English orthography is rather interesting. Original Ó Briain is written O'Brien where the apostrophe both mimics the original acute, and incidentally functions as a kind of hyphen, showing that the two parts of the name are

Re: FW: New version of TR29:

2002-08-20 Thread Michael Everson
At 20:07 +0100 2002-08-20, Marion Gunn wrote: In re the 'ornamental' use of the apostrophe to anglicize Irish surnames, I believe that practice to be unique to English (viz., inserting an apostrophe where nothing is omitted, and it does not function as a punctuation mark). It's not ornamental

RE: FW: New version of TR29:

2002-08-20 Thread jarkko.hietaniemi
Then there are two uses of apostrophes in quoting: within secondary quotation marks Urk. I meant within quotation marks as secondary quotation marks.

RE: FW: New version of TR29:

2002-08-20 Thread jarkko.hietaniemi
As another datapoint the following details the use of the apostrophe in Finnish http://www.cs.tut.fi/~jkorpela/kielikello/merkit.html#heittomerkki It's in Finnish :-) so allow me to summarize: (1) if consonant gradation (the change or even elision) of consonants would cause two same vowels

Re: FW: New version of TR29:

2002-08-20 Thread John Cowan
Michael Everson scripsit: [T]he OED notes that the prefix has been variously written: Macdonald, MacDonald, McDonald, MᶜDonald, M'Donald. I can't say I've seen the last one in any text more recent than the 18th century, but it is certainly indicative of the use of apostrophe as a

Re: New version of TR29

2002-08-20 Thread Stefan Persson
Some notes about Japanese transcription: な - na に - ni ぬ - nu ね - ne の - no んあ - n'a んい - n'i んう - n'u んえ - n'e んお - n'o I'm not sure if this could cause any problems, though. Stefan
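The distinction in these notes (a syllabic n followed by a vowel is written with an apostrophe, so that it is not read as na, ni, nu, ne, no) can be sketched as follows. The table covers only the kana in the message, and the names `KANA_TO_ROMAJI` and `romanize` are hypothetical.

```python
# Only the kana mentioned in the message; not a complete romanization table.
KANA_TO_ROMAJI = {
    "な": "na", "に": "ni", "ぬ": "nu", "ね": "ne", "の": "no",
    "あ": "a", "い": "i", "う": "u", "え": "e", "お": "o",
    "ん": "n",
}
VOWEL_KANA = set("あいうえお")


def romanize(kana: str) -> str:
    """Romanize kana, inserting an apostrophe after syllabic n before a vowel."""
    parts = []
    for index, char in enumerate(kana):
        parts.append(KANA_TO_ROMAJI[char])
        next_char = kana[index + 1 : index + 2]
        if char == "ん" and next_char in VOWEL_KANA and next_char != "":
            parts.append("'")
    return "".join(parts)


for text in ["な", "んあ", "に", "んい"]:
    print(text, "->", romanize(text))
```

In the romanized forms the apostrophe is the second character of the word, so a rule that breaks after an early apostrophe would separate the n from its following vowel.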

[OT] cover sheet standards - summary

2002-08-20 Thread Janusz S. Bie
The summary is very short: I got *no* answer to my query. Perhaps some of you may suggest a more responsive forum? Best regards Janusz On 12 Aug 2002 [EMAIL PROTECTED] (Janusz S. Bie) wrote: I am interested in world-wide use of cover sheet standards. I would like to trace down the first

Re: [OT] cover sheet standards - summary

2002-08-20 Thread Michael (michka) Kaplan
From: Janusz S. Bie [EMAIL PROTECTED] The summary is very short: I got *no* answer to my query. Well, it *is* a question that really has nothing to do with Unicode, a character encoding standard. Perhaps some of you may suggest a more responsive forum? Well, something on topic would likely

Re: FW: New version of TR29:

2002-08-20 Thread Doug Ewell
John D. Burger john at mitre dot org wrote: Thus, I would have an issue with the argument that the apostrophe is merely part of the spelling of the word hill's. There is no such word. I think the folks who make Science Diet dog and cat food might disagree: http://www.hillspet.com More

Re: FW: New version of TR29:

2002-08-20 Thread Peter_Constable
On 08/20/2002 08:32:53 PM John D. Burger wrote: Thus, I would have an issue with the argument that the apostrophe is merely part of the spelling of the word hill's. There is no such word. There certainly is such a wordform; you have used it in your illustrative sentence. It is a phonological