Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: John Cowan [EMAIL PROTECTED] Peter Kirk scripsit: On 13/08/2003 11:09, Philippe Verdy wrote: ... For this reason, defective combining sequences (combining characters without a leading base character) should be forbidden (invalid for XML). If there is even the remotest

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
- Original Message - From: Jon Hanna [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, August 14, 2003 1:49 PM Subject: RE: Questions on ZWNBS - for line initial holam plus alef I do agree: a XML document could require the use at some place of a given attribute or element. If

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
OK, it's safe, but it is a misuse of Unicode. As space plus combining character is a unit in Unicode, it should be treated as a unit by higher level protocols. If higher level protocols are allowed to do arbitrary things within Unicode units, there is no end to the possible confusion. See for

Re: Handwritten EURO sign

2003-08-14 Thread Michael Everson
At 23:35 +0200 2003-08-05, Pim Blokland wrote: I have absolutely no idea what you are talking about. You are lucky not having to put up with bad English like five euro and six cent, living in the Netherlands and speaking Dutch as you do. See http://www.evertype.com/standards/euro if you wish to

RE: Conflicting principles

2003-08-14 Thread Kent Karlsson
Anyway, John J, what code are we talking about that has to work from the positions of the combining marks back to the underlying representation? Are you talking about OCR? No, the issue is more how to start from a base form and work forward to encompass the whole series of

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: Peter Kirk [EMAIL PROTECTED] I note that there is no line break opportunity in space, NBSP. But is there one after the space in space, RLM, NBSP? If so, RLM, NBSP, combining character has a third advantage, that it gives the right line break opportunity when this sequence is word

Re: Handwritten EURO sign (off topic?)

2003-08-14 Thread Peter Kirk
On 14/08/2003 09:54, Michael Everson wrote: Lepton in Greek was accepted from the beginning. Leptó pl leptá. The same word as the original widow's mite (Mark 12:42). Probably worth even less now! -- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/

Diacriticals and descents in upper case (was: Re: Caron / Hacek?)

2003-08-14 Thread Anto'nio Martins-Tuva'lkin
On 2003.06.12, 18:38, Philippe Verdy [EMAIL PROTECTED] wrote: Capital letters simply don't use ascents or descents, and thus they occupy a *smaller* space than the lowercase letters. Some upper case letters commonly (i.e. in some typical fonts) have descents, especially, though not only, in

Re: Unicode 4.0 is online at last!

2003-08-14 Thread Rick McGowan
Peter Kirk suggested... Interesting and a little embarrassing that Unicode's own documentation is not Unicode compatible! I don't think it's very embarrassing... The Unicode consortium after all doesn't produce book editing and typesetting software, we use other peoples' software. I think

Re: Handwritten EURO sign (off topic?)

2003-08-14 Thread Patrick Andries
- Message d'origine - De: Marco Cimarosti [EMAIL PROTECTED] Anto'nio Martins-Tuva'lkin wrote: After all the euro is a common currency and its figures should be written in a common way. Why? Very good question. Multilingual countries like Belgium or Canada already were or are

Re: Compatibility decompositions

2003-08-14 Thread Kenneth Whistler
John Cowan asked: I realize that existing compatibility decompositions are a rag-bag, especially those marked with the generic compat tag rather than one of the specific tags such as font, initial, or super. I wonder what principles, if any, can be enunciated for giving a newly introduced

RE: Pre-orders of The Unicode Standard, Version 4.0

2003-08-14 Thread Magda Danish (Unicode)
-Original Message- From: John Cowan [mailto:[EMAIL PROTECTED] Sent: Thursday, August 14, 2003 10:20 AM To: Magda Danish (Unicode) Cc: Unicode Core List; [EMAIL PROTECTED] Subject: Re: Pre-orders of The Unicode Standard, Version 4.0 Thanks. Is the Unicode Consortium in any way

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Mark Davis
Peter, in XML you really don't want to use attributes for any general text; there are too many restrictions on the content. For example, we never put translatable text into them. Attributes should really be treated more like sequences of symbols, with a constrained syntax. This is also not in

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: Peter Kirk [EMAIL PROTECTED] There is some potential for real trouble here, if one process outputs an NMTOKEN starting with a combining character preceded by a separating space, or something else which is changed into a space, and another process takes the new space plus combining

Re: Unicode 4.0 is online at last!

2003-08-14 Thread Peter Kirk
On 11/08/2003 17:37, Kenneth Whistler wrote: Well, I've been promising that good things would come to those who wait. ;-) At last, the Unicode website has been updated with the online chapters for Unicode 4.0. See: http://www.unicode.org/versions/Unicode4.0.0/ Or just go to the Unicode 4.0 link

Re: Pre-orders of The Unicode Standard, Version 4.0

2003-08-14 Thread John Cowan
Magda Danish (Unicode) scripsit: To order, please use the book order form at http://www.unicode.org/book/bookform.html Thanks. Is the Unicode Consortium in any way benefited (or disadvantaged) if non-members order through it rather than through Amazon or BN? -- John Cowan [EMAIL

Re: Colourful scripts and Aramaic

2003-08-14 Thread Michael Everson
At 13:12 -0700 2003-08-07, Peter Kirk wrote: Well, it seems to me that in the case of the Aramaic proposal we don't even have that. We have an archaic version of the script which is now used mainly for Hebrew, and which many scholars still call Aramaic (in distinction from paleo-Hebrew)

Unicode 4.0.1 Beta period now starting

2003-08-14 Thread Rick McGowan
The beta period for Unicode 4.0.1 has now started. Detailed information is available on the beta page: http://www.unicode.org/versions/beta.html Beta versions of Unicode 4.0.1 data files are now available for public comment here: http://www.unicode.org/Public/4.0-Update1/

[hebrew] Re: Roadmap---Mandaic, Early Aramaic, Samaritan

2003-08-14 Thread Michael Everson
Elaine, I really, really, really don't have time to debug your dissatisfaction with the use of the word Aramaic in the Roadmaps. This is NOT something anyone is working actively on right now. When a proposal comes forth, there will be evidence in it that can be picked at. In actuality, one

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kenneth Whistler
Peter responded to Mark: On 05/08/2003 14:40, Mark Davis wrote: Where did you get the notion that space is not a base character? And base characters include those that are not control or format characters. Space is neither one. The standard specifically states in a number of places that
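Mark Davis's point that SPACE is neither a control nor a format character can be checked against the General_Category property. The following is a small sketch, assuming Python 3 and whatever Unicode version its standard `unicodedata` module ships with (later than the 4.0 under discussion, though these particular values have not changed):

```python
import unicodedata

# General_Category values: Zs = space separator, Cf = format,
# Cc = control, Mn = nonspacing mark.
categories = {
    "SPACE": unicodedata.category("\u0020"),
    "NO-BREAK SPACE": unicodedata.category("\u00A0"),
    "ZERO WIDTH JOINER": unicodedata.category("\u200D"),
    "LINE FEED": unicodedata.category("\u000A"),
    "COMBINING ACUTE ACCENT": unicodedata.category("\u0301"),
}

for name, category in categories.items():
    print(f"{name}: {category}")
```

SPACE and NO-BREAK SPACE both report Zs, which is distinct from the Cc and Cf categories, so they qualify as bases for a following nonspacing mark.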

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
the solution with SPACE is really tricky due to the special treatment of SPACE notably in HTML, SGML, XML I disagree. There are a few different things that happen with whitespace in such technologies. Some of these only apply to elements that do not allow any character data apart from

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Doug Ewell
Peter Kirk peter dot r dot kirk at ntlworld dot com wrote: Point taken. But when different fonts and rendering engines give different results because the standard is unclear or ambiguous, that is a matter for the discussion here. And when conforming fonts and rendering engines fail to give

RE: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kent Karlsson
there is no such thing as NFD decompositions. Sorry for the confusion. Still even with a NFKD decomposition, And there is no such thing as NFKD decomposition either. It goes as follows, in steps: 1. Canonical and compatibility decomposition mappings (one-step), and canonical classes.

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Noah Levitt
According to the docs at http://www.microsoft.com/typography/otfntdev/indicot/other.htm, uniscribe renders combining marks in isolation when they are applied to SPACE + ZWJ. (Without the ZWJ, it uses a dotted circle.) Perhaps this is an acceptable solution to the people calling for a new

RE: Newbie Question - what are all those duplicated characters FOR?

2003-08-14 Thread Jill . Ramonsky
Ah, now you're making assumptions about me which are not, in fact, valid. I'm not quite sure exactly what you mean by the text, but I own a copy of The Unicode Standard Version 3.0 and have read it pretty much in entirety. I have also read almost everything I could find on the unicode.org web

Re: Conflicting principles

2003-08-14 Thread Peter Kirk
On 07/08/2003 13:57, John Cowan wrote: Kent Karlsson scripsit: 4) Encode the vowel signs as combining characters, after the base characters they logically follow. Consider them as double [width] combining characters, that happen to have no ink above/below the character they apply to,

Re: AL32UTF8 Vs UTF8

2003-08-14 Thread John Cowan
Jay Chandru scripsit: I wanted to know the differences between AL32UTF8 and UTF8. My database (Oracle) will be in AL32UTF8 format. Will the applications that require multibyte characters work as they are functioning in UTF8 format? The Oracle UTF8 format is really CESU-8, whereas the
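The snippet is cut off before it characterizes AL32UTF8, so the following sketch only contrasts CESU-8 with standard UTF-8 for one supplementary character. It assumes Python 3; the helper name `cesu8_encode` is hypothetical and written for illustration, not an Oracle API.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8: each UTF-16 code unit is encoded on its own."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Supplementary character: split into a UTF-16 surrogate pair,
            # then encode each surrogate as a separate 3-byte sequence.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            for unit in (high, low):
                out += chr(unit).encode("utf-8", "surrogatepass")
        else:
            # BMP characters are encoded identically in UTF-8 and CESU-8.
            out += ch.encode("utf-8")
    return bytes(out)


sample = "\U00010400"  # DESERET CAPITAL LETTER LONG I, outside the BMP
utf8_bytes = sample.encode("utf-8")
cesu8_bytes = cesu8_encode(sample)

print(utf8_bytes.hex(" "))   # 4 bytes in standard UTF-8
print(cesu8_bytes.hex(" "))  # 6 bytes in CESU-8
```

The two encodings differ only for characters outside the BMP, which is why data restricted to the BMP round-trips identically under either.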

Re: Conflicting principles

2003-08-14 Thread Michael Everson
At 01:18 +0200 2003-08-09, Philippe Verdy wrote: Such break in a middle of a multiple width diacritic exist in some notations, and are not considered horrible typography. Just look at musical notations where a upper horizontal parenthesis is used to group some elements [...] Music setting is

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Curtis Clark
on 2003-08-06 15:24 Doug Ewell wrote: I'm not a typographer (intelligent or otherwise), but I'm having a tough time seeing how Section 2.10 *requires* fonts and rendering engines to give a space-plus-combining-diacritic combination the exact minimum width of the diacritic alone, or to leave equal

24th Unicode Conference - Last week to $SAVE with early-bird rates!

2003-08-14 Thread Tex Texin
REGISTER THIS WEEK AND SAVE ON EARLY-BIRD CONFERENCE AND HOTEL RATES! Are you falling behind? Version 4.0 of the Unicode Standard is here! Software and Web applications can now support more languages with greater

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Philippe Verdy
On Sunday, August 10, 2003 9:30 AM, Mark Davis [EMAIL PROTECTED] wrote: As for oe-ligature, the French representative to WG3 (or its predecessor) said that France could live without it. Even worse; the story I heard was that the committee had planned from the start to have and in

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Jon Hanna scripsit: If this is not the case (I'm not entirely sure this bans what XML does with spaces) then all we would need is a change so that rather than a de facto ban on space+combining within names and nmtokens we would have an explicit ban on the same; then we'd all be happy, except

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 05/08/2003 14:40, Mark Davis wrote: Where did you get the notion that space is not a base character? And base characters include those that are not control or format characters. Space is neither one. The standard specifically states in a number of places that to exhibit a combining mark in

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Michael Everson
At 01:30 +0200 2003-08-10, Philippe Verdy wrote: Whatever you think, the SPACE+diacritic is still a hack, and certainly not a canonical equivalent (including for its properties), of the existing spacing diacritics, which also do not fit all usages because they are symbols. It is the formally

RE: Conflicting principles

2003-08-14 Thread Jon Hanna
what code are we talking about that has to work from the positions of the combining marks back to the underlying representation? Such code is not just common and widespread, it is practically ubiquitous. The principle of base characters always coming first is used: Whenever you need to

RE: Does Unicode 3.1 take care of all characters of 'Hong Kong Supplimentary Character Set - 2001' (HKSCS-2001) ?

2003-08-14 Thread Kent Karlsson
Aren't the replies about Unicode 3.2 (or maybe 4.0) rather than 3.1? 1651 - Supplimentary Plane 2 - \2e80 - \u2f00 Plane 2 covers U+20000 to U+2FFFF, and is not in the BMP (= Plane 0). /kent k
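Plane membership follows directly from the code point value: each plane holds 0x10000 code points, so the plane number is the code point shifted right by 16 bits. A minimal sketch, assuming Python 3 (the function name `plane_of` is hypothetical):

```python
def plane_of(code_point: int) -> int:
    """Return the Unicode plane (0 to 16) that contains code_point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    # Each plane spans 0x10000 code points.
    return code_point >> 16


for cp in (0x2E80, 0x2F00, 0x20000, 0x2FFFF):
    print(f"U+{cp:04X} is in plane {plane_of(cp)}")
```

Under this arithmetic the range quoted in the message, 2E80 to 2F00, falls in Plane 0 (the BMP), not Plane 2.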

Re: Display of Isolated Nonspacing Marks (problems with UAX#29)

2003-08-14 Thread Peter Kirk
On 10/08/2003 18:44, Doug Ewell wrote: Has it occurred to anyone yet that the very *concept* of spacing diacritics is a hack? Spacing diacritics are used to conduct a sort of meta-discussion about characters, as in A base character o is combined with an acute accent to create . They are not

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 08/08/2003 09:54, Jim Allan wrote: ... It certainly makes sense that in the case of space characters that have a defined width that this width is innate to the definition of the character and in such a case should take precedence over the width of the normally non-spacing combining

Unicode 4.0 is online at last!

2003-08-14 Thread Lisa Moore
My congratulations to Ken, Julie, and Eric! For those who might not know, this trio (especially Eric with the online bit) get our unadulterated love and appreciation...Lots of difficulties on the road to online Unicode 4.0 :-) !! Lisa - Forwarded by Lisa Moore/Santa Teresa/IBM on

Re: Roadmap---Mandaic, Early Aramaic, Samaritan

2003-08-14 Thread Michael Everson
Elaine, I disagree with you. Just because Semitic languages *can* be represented in the Hebrew script does not mean that every script is just a font variant of the Hebrew script. There are genetic relationships of the development of the scripts which are involved in our analysis so far.

Re: Handwritten EURO sign (off topic?)

2003-08-14 Thread Stefan Persson
James H. Cloos Jr. wrote: Anto'nio == Anto'nio Martins-Tuva'lkin [EMAIL PROTECTED] writes: Anto'nio (Let alone the validity of things Anto'nio like k, c etc.) I'm sure things like m, k, M and even G will come into use, though I expect more will use them in front of the digits. I certainly use

Re: Conflicting principles

2003-08-14 Thread Kenneth Whistler
John Cowan asked: I would like to ask the old farts^W^Wrespected elders of the UTC which principle they consider more important, abstractly speaking: the principle that combining marks always follow their base characters (a typographical principle), or that text is stored, with a few minor

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: Jon Hanna [EMAIL PROTECTED] I was saying that it wouldn't be sensible to begin a line with a combining diacritic, since that combining diacritic would be combining with a newline character which it's difficult to think of any possible sensible meaning for. A newline is a control with

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kenneth Whistler
Ted Hopp asked: I believe that reasonable people might reasonably conclude from factoids 1 and 2 that SPACE is indeed a format character. Reasonable, but evidently wrong. Explanation, please? I provided the text deconstruction in my last email, but to continue, the confusion arises from the

Re: Handwritten EURO sign (off topic?)

2003-08-14 Thread Michael Everson
At 00:52 +0100 2003-08-14, Anto'nio Martins-Tuva'lkin wrote: Using the cent sign is mostly US specific and the symbol is not recognized as such in most European countries, so the cent sign is bound directly to the dollar. If the dollar sign can be used for currencies other than the USD,

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jim Allan
Ken Whistler posted: Of course a standard which mandates space folding is also within its rights to mandate, for example, the non-use of nonspacing marks applied to SPACE characters. It can simply rule out such sequences as valid for its context, in which case the problem goes away. And for such

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kenneth Whistler
Peter Kirk wrote: I think this may be a Peter mistake. I meant to refer to spacing diacritics. Sorry. It is certainly highly inappropriate for spacing diacritics to be considered word boundaries. Why? It is entirely dependent on the orthography and conventions involved. There is probably

Re: Conflicting principles

2003-08-14 Thread Peter Kirk
On 06/08/2003 14:04, John Jenkins wrote: Speaking purely as an old fart, I'd say the former. We already break the latter principle in Thai and Lao, and having to be prepared to scan either forward or backward from a base character in order to find its combining marks would add overhead to a lot
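The overhead argument rests on the fact that, when marks always follow their base, a single forward pass is enough to group text into combining sequences. The following is a simplified sketch of that pass, assuming Python 3; the function name `combining_sequences` is hypothetical, and this is not the full UAX #29 grapheme cluster algorithm.

```python
import unicodedata


def combining_sequences(text: str) -> list[str]:
    """Group text into base-plus-marks sequences with one forward scan."""
    sequences: list[str] = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and sequences:
            # A mark attaches to the sequence that was opened most recently.
            sequences[-1] += ch
        else:
            # A base character, or a mark with no base before it
            # (a defective combining sequence), opens a new sequence.
            sequences.append(ch)
    return sequences


print(combining_sequences("e\u0301a"))   # base + acute, then a bare base
print(combining_sequences(" \u0301"))    # SPACE acting as the base
print(combining_sequences("\u0301a"))    # defective sequence at the start
```

If marks could also precede their base, the loop would need lookahead or a second pass to decide which neighbor each mark belongs to.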

Valid encodings

2003-08-14 Thread Jony Rosenne
We need an official Unicode Lint. Jony -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Verdy Sent: Thursday, August 07, 2003 4:28 PM To: [EMAIL PROTECTED] Subject: SPAM: Re: Questions on ZWNBS - for line initial holam plus alef On

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 08/08/2003 08:54, Philippe Verdy wrote: ... Could there be another codepoint assigned that has these properties: 20CF;ZERO WIDTH SYMBOL;Sk;0;ON;compat 0020N; i.e. being considered symbolic, not a whitespace, with combining class 0 (not combining), and used as an explicit base for a

Re: Which ancestral links

2003-08-14 Thread Raymond Mercier
Indeed, pardon my haste, that was a matter of an addition to the Syriac script. For a comparison of the various scripts used for Sogdian, http://iranianlanguages.com/midiranian/sogdian.htm#Alphabet Raymond - Original Message - From: Michael Everson [EMAIL PROTECTED] To: [EMAIL

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 08/08/2003 13:56, Thomas M. Widmann wrote: Peter Kirk [EMAIL PROTECTED] writes: On 08/08/2003 08:54, Philippe Verdy wrote: ... Could there be another codepoint assigned that has these properties: 20CF;ZERO WIDTH SYMBOL;Sk;0;ON;compat 0020N; [...] But I'm not sure

Re: Roadmap---Mandaic, Early Aramaic, Samaritan

2003-08-14 Thread Kenneth Whistler
Elaine Keown responded to Michael: I really, really, really don't have time to debug your dissatisfaction with the use of the word Aramaic in the Roadmaps. This is NOT something anyone is working actively on right now. When a I'm not writing about nomenclature---not the point at all. I'm

Re: Assume everything on this list is ignored (was Re: Newbie Question - what are all those duplicated characters FO R?)

2003-08-14 Thread John Cowan
Mark Davis scripsit: I repeat again. Nothing on this list has any guarantee that it will be seen by anyone in the UTC. If you want to submit a FAQ question that's great -- and I strongly encourage it. But please use: http://www.unicode.org/reporting.html to make sure it is tracked. Hearing

Note about CGJ in current MS implementation

2003-08-14 Thread John Hudson
A note for those interested in how CGJ may be used in font lookups: In the current MS implementation (Office 2002, Wordpad, etc.) if CGJ is inserted immediately after a space character it breaks RTL directionality. So for the time being at least, any use of CGJ to affect rendering in Biblical

RE: Conflicting principles

2003-08-14 Thread ekeown
Madison Hi, Only two people asked me what else exists in the complete Hebrew character set, but maybe others care. The significant points here are that there are other pointing systems to be combined with base letters and that there are manuscripts that have TWO pointing systems

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Peter Kirk
On 05/08/2003 09:42, Jim Allan wrote: Peter Kirk posted: If I want to do this, should I explicitly encode a dotted circle, or should I encode nothing and expect the font to generate the dotted circle, as it often does? I think that practice of a font or application automatically inserting a

IETF, W3 ....?

2003-08-14 Thread ekeown
Elaine Keown still in Madison Dear John Cowan and Peter Kirk: Could you possibly explain to me why these other organizations---IETF and W3-- are apparently concerned about character properties, to the point where apparently they also have a hand in deciding what will happen

Aramaic scripts

2003-08-14 Thread Raymond Mercier
There are omissions in Michael Everson's chart in http://www.dkuug.dk/jtc1/sc2/wg2/docs/n2311.pdf The chart was based on Semitic languages, although purporting to be about scripts. After all Greek and Latin also derive from the same family of scripts, as we all learn from page 1 of

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Jon Hanna
3) In attribute values that have a declared type other than CDATA, multiple spaces are compressed to a single space, and leading and trailing spaces are removed. After this is done, there can be no spaces in attributes of type ID, IDREF, ENTITY, NMTOKEN, NOTATION, or enumerated
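The space handling described in point 3 can be sketched as follows, assuming Python 3. The function name `normalize_tokenized_attribute` is hypothetical, and the sketch covers only the final step for non-CDATA attribute types (collapsing and trimming U+0020); it assumes tabs and line breaks in the value have already been replaced by spaces.

```python
def normalize_tokenized_attribute(value: str) -> str:
    """Collapse runs of U+0020 and drop leading and trailing spaces."""
    # Splitting on the literal space leaves empty strings for each extra
    # space; dropping them and re-joining yields single separators.
    return " ".join(part for part in value.split(" ") if part)


print(repr(normalize_tokenized_attribute("  one   two ")))
# A leading space that was meant as the base of a combining mark is
# stripped like any other leading space:
print(repr(normalize_tokenized_attribute(" \u0301token")))
```

The second call shows the concern raised in this thread: the normalization treats the space purely as a separator, so a space plus combining mark at the start of a token loses its space.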

Re: IETF, W3 ....?

2003-08-14 Thread John Cowan
[EMAIL PROTECTED] scripsit: Could you possibly explain to me why these other organizations---IETF and W3-- are apparently concerned about character properties, to the point where apparently they also have a hand in deciding what will happen with Hebrew? For a long time, I thought that

RE: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Kent Karlsson
The NFD decompositions of spacing marks is already defined as a SPACE plus a non-spacing combining character. Philippe, please! Those are *compatibility* decompositions. The normal form NFD only uses *canonical* decompositions. And there is no such thing as NFD decompositions. /kent
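Kent Karlsson's distinction can be observed with U+00B4 ACUTE ACCENT, a spacing mark whose only decomposition mapping is a compatibility one. A short sketch, assuming Python 3 and its standard `unicodedata` module:

```python
import unicodedata

acute = "\u00B4"  # ACUTE ACCENT (spacing)

# The raw mapping is tagged <compat>, so it is not canonical.
mapping = unicodedata.decomposition(acute)

# NFD applies canonical mappings only; NFKD also applies compatibility ones.
nfd = unicodedata.normalize("NFD", acute)
nfkd = unicodedata.normalize("NFKD", acute)

print(mapping)
print([f"U+{ord(c):04X}" for c in nfd])
print([f"U+{ord(c):04X}" for c in nfkd])
```

NFD leaves the character untouched, while NFKD yields SPACE followed by U+0301 COMBINING ACUTE ACCENT.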

AL32UTF8 Vs UTF8

2003-08-14 Thread Jay Chandru
Greetings, We are using Oracle9i with application tier as 11i. I wanted to know the differences between AL32UTF8 and UTF8. My database (Oracle) will be in AL32UTF8 format. Will the applications that require multibyte characters work as they are functioning in UTF8 format? Would be great if

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 06/08/2003 03:38, Kent Karlsson wrote: Kenneth Whistler wrote: Kent Karlsson said: I see no particular *technical* problem with using WJ, though. In contrast to the suggestion of using CGJ (re. another problem) anywhere else but at the end of a combining sequence. CGJ

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Saturday, August 09, 2003 12:49 AM, Michael Everson [EMAIL PROTECTED] wrote: At 14:22 -0700 2003-08-08, Kenneth Whistler wrote: Philippe, you are tilting at windmills, here. There is no chance that the UTC is going to consider such a character, in my assessment, let alone give it the

Re: Conflicting principles

2003-08-14 Thread Philippe Verdy
On Friday, August 08, 2003 9:16 PM, Peter Kirk [EMAIL PROTECTED] wrote: On 07/08/2003 13:57, John Cowan wrote: ... But an immediate problem comes to mind: what if there is a line break between the two base characters? What if there is a line break between the two characters joined by a

Re: Handwritten EURO sign

2003-08-14 Thread Pim Blokland
Michael Everson schreef: More horrifying is the idiotic euro is immune to grammar error which continues to be broadcast daily by our television and radio stations, all because people with power lacked the moral courage to say oops, yeah, that was the wrong interpretation of the Directive

Re: Pigpen/Masonic/Poundex

2003-08-14 Thread Michael Everson
At 18:49 +0200 2003-08-08, Chris Jacobs wrote: This seems to be a clear difference from colorful scripts, where I think there is an agreement about which glyph represents which sound. So I think the analogy between pigpen and colorful scripts does not hold. Two gifs on two websites does not

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Hudson
At 05:27 PM 8/8/2003, Kenneth Whistler wrote: Because the mechanism for doing so -- application to SPACE or to NBSP -- has been specified by the standard for a decade now. True enough, but I'm also a bit concerned about this mechanism because white space characters are another pesky thing that

RE: Conflicting principles

2003-08-14 Thread Michael Everson
Ken's point of course is that however bizarre the backing store for Sindarin and English Tengwar modes may be, combining characters per se must follow their base characters no matter what. -- Michael Everson * * Everson Typography * * http://www.evertype.com

Re: Conflicting principles

2003-08-14 Thread Philippe Verdy
On Thursday, August 07, 2003 11:29 PM, Michael Everson [EMAIL PROTECTED] wrote: Ken's point of course is that however bizarre the backing store for Sindarin and English Tengwar modes may be, combining characters per se must follow their base characters no matter what. Even if that breaks the

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: John Cowan [EMAIL PROTECTED] Peter Kirk scripsit: So far so good, but when I get to an accent with no predefined spacing variant, I have a problem! No you don't. If you want to say Seagull is the diacritic used to represent linguolabial sounds in the IPA, then you just encode

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Philippe Verdy
On Wednesday, August 06, 2003 11:48 PM, Peter Kirk [EMAIL PROTECTED] wrote: OK, what kind of markup should I use, in any well-known markup language, to ensure that an isolated diacritic is centred in the space between the words before and after it? In plain text, I think that this encoding:

RE: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Jony Rosenne
I would like to point out that with all due respect, how particular fonts or rendering engines behave is only marginally relevant to the Unicode list. I think that we should deal only with the Unicode specification. A particular implementation or many implementations may not behave as expected,

Re: Handwritten EURO sign

2003-08-14 Thread Michael Everson
At 08:55 -0700 2003-08-05, Doug Ewell wrote: The original legislative attempt to dictate the exact proportions (and even color) of the euro sign, regardless of the font in use, was just silly. That is very old history, as detailed on my website

RE: Assume everything on this list is ignored

2003-08-14 Thread Jill . Ramonsky
Isn't the very notion of submit[ting] a FAQ question a contradiction in terms? Surely, one merely ASKS a question. If enough people ask the same question, we may then classify it as frequently asked. It's like this. Newbies want to find things out. So they read books, and look around on the web.

The relation between Unicode and ISO/IEC 10646

2003-08-14 Thread Jony Rosenne
As far as I know, there are many topics not covered by ISO, for example bi-directional behavior. Jony -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of souravm Sent: Tuesday, August 12, 2003 8:40 AM To: unicode Subject:

RE: Conflicting principles

2003-08-14 Thread Kent Karlsson
Collation isn't really based on combining sequences (even though UTS 10 specifies a certain spanning over non-blocking (combining) This is a very ignorant question: where in your public documentation are these issues discussed? ... I still don't understand even what happens with basic

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 13/08/2003 15:54, Jony Rosenne wrote: Suggested but not accepted. I am inherently suspicious when pressure is being exerted to decide complex and difficult questions in a hurry. Jony Jony, I am not trying to hurry anything. I am putting a lot of time and effort into trying to reach proper

Re: Display of Isolated Nonspacing Marks (problems with UAX#29)

2003-08-14 Thread Doug Ewell
Philippe Verdy verdy_p at wanadoo dot fr wrote: Note that these two ZW and SP classes of characters are *normative*. Another proof that SPACE+diacritics is really a hack causing lots of problems in the Unicode main standard and its standard annexes. Has it occurred to anyone yet that the very

RE: ADO, SQL-Server and VB6

2003-08-14 Thread Jon Hanna
I might be able to help. Two questions: 1. How firmly have you tracked down the point at which this conversion happens? 2. What is the datatype in the database? (text BLOB?, ntext BLOB? varchar?)

RE: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Kent Karlsson
Michael wrote: The Name Police reject this utterly. ZERO WIDTH cannot have an expanding dynamic width. Then what about ZERO WIDTH SPACE, which, according to TUS3, p. 238, can grow to have a visible width when justified? And it has the NamesList comment: * nominally zero width, but may

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 16:06, Mark Davis wrote: Some of this seems to be in reference to an earlier contention that Text Boundaries (inc. Lines) break between the space and the non-spacing mark. I think this was attributed to Philippe. [This may not be true: I don't actually read his email, because the

Unicode Technical Note added

2003-08-14 Thread Rick McGowan
A new Unicode Technical Note on Deterministic Sorting is now available: http://www.unicode.org/notes/tn9/ Unicode Technical Notes provide for the publication of information that may be of interest to implementers or readers of the Unicode Standard, or to users of programs which

Roadmap-Mandaic, Early Aram., Samarit Alternative Mel Gibson

2003-08-14 Thread ekeown
Elaine Keown still in Madison WISC Hello, Responding again to the deep interest in Aramaic expressed on the list, I am writing with a suggested preliminary Alternative or possibly Countercultural version of the Roadmap and a New, Improved Acronym for EUSAS (Egyptian,

Roadmap-Mandaic, Early Aram., Samarit Alternative Mel Gibson

2003-08-14 Thread Michael Everson
I think we will keep the Roadmap as it is for the time being. -- Michael Everson * * Everson Typography * * http://www.evertype.com

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
From: Kenneth Whistler [EMAIL PROTECTED] It is perfectly reasonable, as I see it, to consider the SPACE in a SPACE, NSM sequence to be: a. significant b. part of the characters in a document that are not markup (at least in the cases we are talking about, since the problem is

Re: [A12n-Collab] Creating fonts for Akan language

2003-08-14 Thread John Hudson
At 12:27 AM 8/7/2003, [EMAIL PROTECTED] wrote: My desire is to create (make) a set of fonts for the Akan language for Windows 2000 to begin with. I have been able to create a crude version for my own use but I know that the people of Ghana would be very happy to be able to install a

Which ancestral links

2003-08-14 Thread John Clews
In message [EMAIL PROTECTED] Michael Everson writes: Re: Colourful scripts and Aramaic This is nearly off topic, but I'd be glad of any clarifications, or references that anybody has. In message [EMAIL PROTECTED] Michael Everson wrote in response to Peter Kirk, with a clarification I agree with

Unicode 4.0 is online at last!

2003-08-14 Thread Kenneth Whistler
Well, I've been promising that good things would come to those who wait. ;-) At last, the Unicode website has been updated with the online chapters for Unicode 4.0. See: http://www.unicode.org/versions/Unicode4.0.0/ Or just go to the Unicode 4.0 link from the home page. Enjoy. --Ken P.S.

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
- Original Message - From: Peter Kirk [EMAIL PROTECTED] To: Jon Hanna [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Wednesday, August 13, 2003 3:05 PM Subject: Re: Questions on ZWNBS - for line initial holam plus alef On 13/08/2003 04:44, Jon Hanna wrote: No, the safe thing to do

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 11/08/2003 06:59, Jon Hanna wrote: There are only two theoretical problems that I can see here, the first is that a whitespace character other than space gets converted to space by attribute value normalisation, and that this changes the meaning of the text in some way. This could only occur

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 13/08/2003 04:44, Jon Hanna wrote: No, the safe thing to do (and the thing that is done) is to treat the space as a space ignoring the fact that the NMTOKEN contains a combining character, this is even safer than your suggestion since it can't mis-identify the combining properties of a

Re: Display of Isolated Nonspacing Marks (was Re: Questions on ZWNBS...)

2003-08-14 Thread Jim Allan
Philippe Verdy posted: Could ZWS+combining diacritic be the best solution for isolated diacritics in text? From http://www.unicode.org/book/ch04.pdf: * Such characters may be large enough to affect the placement of their base character relative to preceding and succeeding base characters.

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
- Original Message - From: Doug Ewell [EMAIL PROTECTED] To: Unicode Mailing List [EMAIL PROTECTED] Cc: Peter Kirk [EMAIL PROTECTED]; Kenneth Whistler [EMAIL PROTECTED] Sent: Monday, August 11, 2003 5:39 PM Subject: Re: Questions on ZWNBS - for line initial holam plus alef Peter Kirk

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Philippe Verdy
On Monday, August 11, 2003 12:27 AM, Kenneth Whistler [EMAIL PROTECTED] wrote: A point I keep trying to make, but which often gets overlooked by people trying to code Unicode mechanisms for dealing with edge cases, is that the design goal of the Unicode Standard is, and always has been, to

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread John Cowan
Peter Kirk scripsit: On 13/08/2003 11:09, Philippe Verdy wrote: ... For this reason, defective combining sequences (combining characters without a leading base character) should be forbidden (invalid for XML). If there is even the remotest possibility of this happening, we need to

RE: AL32UTF8 Vs UTF8

2003-08-14 Thread Carl W. Brown
Jay, Oracle's UTF-8 is not really a valid encoding. It encodes surrogates as if they were characters. They kept the old Unicode 2.x code that only supports the BMP to provide sort key compatibility for clients who never upgraded to Unicode 3.0 support and are using 16-bit character encoding

Re: Questions on ZWNBS - for line initial holam plus alef

2003-08-14 Thread Peter Kirk
On 12/08/2003 20:28, John Cowan wrote: Peter Kirk scripsit: 2) In attribute values, LF, CR, and TAB characters are normalized to spaces. Not relevant here. This would be relevant if it is legal for the character after LF, CR, and TAB to be a combining mark. Is this legal? In this

Pre-orders of The Unicode Standard, Version 4.0

2003-08-14 Thread Magda Danish \(Unicode\)
Dear Unicode and Unicore List Subscribers, The release of the Unicode Standard, Version 4.0 is right around the corner. There is still time to place your individual or group orders and to get the book sent to you directly from the publisher, fresh off the press. Anyone placing bulk orders is
