Re: [unicode] Re: (TC304.2313) AND/OR: antediluvian views

2000-06-24 Thread Asmus Freytag
At 10:35 AM 6/14/00 -0800, you wrote: At 09:57 AM 06/13/2000 -0800, Otto Stolz wrote: Off-topic Am 2000-06-13 um 17:49 h hat Alain geschrieben: [Having pictograms everywhere] is much lighter than having to provide indications, say, in 12 languages (most common example: toilets). Watch out

Re: UTF-8N?

2000-06-26 Thread Asmus Freytag
At 05:29 AM 6/23/00 -0800, [EMAIL PROTECTED] wrote: Yes. The Unicode Standard will deprecate the use of U+FEFF (Note: not U+FFFE) as a zero-width non-breaking space (despite its formal name). And U+FEFF should *only* be used as a byte order mark and/or signature. (That is already ambiguous

Re: Plane 14 language tags

2000-07-02 Thread Asmus Freytag
At 06:31 AM 6/29/00 -0800, you wrote: Thanks to all for your comments. Has anyone actually used these tags yet? Maybe we should postpone these tags for a while until we get a louder answer to your question, Doug. Once coded, here forever. A./

Re: Should furigana be considered part of plain text?

2000-07-05 Thread Asmus Freytag
At 09:16 AM 7/2/00 -0800, Doug Ewell wrote: The problem with the phrase "plain text ceases to be plain if you decide that layout information needs to be encoded" is the word "layout." In the broadest sense, line and paragraph separation could be considered "layout," and nobody would suggest

Re: Bug in TR 19, and fancy HTML in TR's

2000-07-08 Thread Asmus Freytag
Doug's point is well taken. It's been the editorial committee's policy to make sure that TR's can be accessed from a wide variety of browsers. If limiting the range of formatting that is to be used in TR's makes a real difference to people in the implementers community, then that is something

Re: Names of planes, and request for sneak preview

2000-07-11 Thread Asmus Freytag
At 12:18 PM 7/11/00 -0800, [EMAIL PROTECTED] wrote: What about F? I was told that there are 0x10 possible characters? Oh, by the way, if 12 is a dozen and 144 is a gross, what are 16 and 256? There are 0x10 - 34 possible characters! All code values ending in 0xFFFE and Ox do

Re: Euro character in ISO

2000-07-11 Thread Asmus Freytag
At 01:25 PM 7/11/00 -0800, Leon Spencer wrote: Has ISO addressed the Euro character? Yes. It's at 0x20AC in ISO/IEC 10646-1. There has been an attempt to create a series of 'touched up' 8859 standards. The problem with these is that you get all the issues of character set confusion that

Re: Han character names?

2000-07-12 Thread Asmus Freytag
At 12:56 PM 7/11/00 +, [EMAIL PROTECTED] wrote: If you bought a copy of the book, you would have known. I saw 2.0 in the Barnes & Noble book store the other evening, but they only had one left and it was a struggle to get to it through the competing crowd... Of course, they were competing

Re: Miscellaneous comments/questions.

2000-07-13 Thread Asmus Freytag
At 07:50 AM 7/13/00 -0800, Antoine Leca wrote: Alex Bochannek wrote: A similar issue was very interesting to observe in France and Germany. The use of the English language in advertisement seems to run rampant in Germany while almost all ads that include English in France (mostly tag

Re: Unicode FAQ addendum

2000-07-20 Thread Asmus Freytag
There's no updating needed. The key is that The Unicode Standard, Version 3.0 recognizes UTF-16 as the default encoding. Therefore code values (or units) which are defined as 'minimal bit combination that can represent a unit of encoded text' are 16-bit. In UTF-16, one sometimes needs two of
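The point about 16-bit code units can be sketched in a few lines of Python. This is only an illustration of the standard UTF-16 surrogate arithmetic; the function name `utf16_code_units` is made up for this example and is not taken from the message.

```python
def utf16_code_units(ch: str) -> list[int]:
    """Return the 16-bit UTF-16 code units for a single character."""
    cp = ord(ch)
    if cp < 0x10000:
        return [cp]                    # one 16-bit code unit is enough
    offset = cp - 0x10000              # 20 bits to split across two units
    high = 0xD800 + (offset >> 10)     # high (leading) surrogate
    low = 0xDC00 + (offset & 0x3FF)    # low (trailing) surrogate
    return [high, low]


# "A" fits in one code unit; U+10000 needs a surrogate pair.
print([hex(u) for u in utf16_code_units("A")])
print([hex(u) for u in utf16_code_units("\U00010000")])
```

Either way the code unit stays 16 bits wide; only the number of units per character changes.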

Re: Font for Japanese US applications

2000-07-20 Thread Asmus Freytag
At 08:17 AM 7/20/00 -0800, John O'Conner wrote: 2. Compiling your app as a UNICODE application means that all Win32 API calls use Unicode-enabled versions of the API. Text areas expect you to pass Unicode, and it displays correctly when an appropriate font is used. Even if you don't compile an

Re: Unicode in VFAT file system

2000-07-20 Thread Asmus Freytag
At 09:53 AM 7/20/00 -0800, Ken Krugler wrote: 2. Is little-endian UCS-2 a valid encoding that I just don't know about? Yes, it is. Your example of the VFAT system is a near perfect case, since the details of it form what Unicode calls a 'Higher level protocol' and those may legitimately override

Re: Unicode in VFAT file system

2000-07-20 Thread Asmus Freytag
At 11:34 AM 7/20/00 -0800, John Cowan wrote: 1. Could it be using UTF-16LE? I tried creating an entry with a surrogate pair, but the name was displayed with two black boxes on a Windows 2000-based computer, so I assumed that surrogates were not supported. Probably not. So technically it

Re: Unicode in VFAT file system

2000-07-20 Thread Asmus Freytag
At 11:41 AM 7/20/00 -0800, Ken Krugler wrote: No. UCS-2 and UCS-4 have always been bigendian. Read ISO 10646-1:1993, section "6.3 Octet order" (page 7): When serialized as octets, a more significant octet shall precede less significant octets. The section continues: "When not serialized

RE: 127 strokes beyond the radical?!

2000-07-21 Thread Asmus Freytag
At 03:42 AM 7/21/00 -0800, [EMAIL PROTECTED] wrote: Patrick Andries wrote: De : [EMAIL PROTECTED] On page 876, the character U+6B8B is listed as being 127 strokes beyond the radical. I'd say it's more like 6 strokes beyond the radical. I believe it to be 5 strokes and it is already

RE: Unicode in VFAT file system

2000-07-21 Thread Asmus Freytag
At 04:58 AM 7/21/00 -0800, [EMAIL PROTECTED] wrote: If UCS-2LE is a *standard* encoding (and it is in fact mentioned in UTR-17), how does VFAT directories qualify as a "higher level protocol"? My understanding of "higher level protocol" is that it is a *non* standard usage of some kind, allowed

Re: Unicode in VFAT file system

2000-07-21 Thread Asmus Freytag
At 07:14 AM 7/21/00 -0800, [EMAIL PROTECTED] wrote: Why does it say there are three varieties when a 16-bit datum can only be serialised in two orders? If the scheme UTF-16 doesn't have a BOM, isn't it just one of the other two? When it does have a BOM, it can still be serialised in two ways, so
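For readers wondering how three encoding schemes arise from two byte orders, here is a small sketch using Python's built-in codecs. It only shows how Python happens to expose the distinction (its plain "utf-16" encoder writes a BOM in the platform's byte order); it is not a statement of what the thread concluded.

```python
import codecs

text = "A"  # U+0041

be = text.encode("utf-16-be")   # fixed big-endian, no BOM written
le = text.encode("utf-16-le")   # fixed little-endian, no BOM written
auto = text.encode("utf-16")    # BOM first, then the platform's byte order

print(be, le, auto)

# A reader of plain "utf-16" uses the BOM to pick the byte order.
print((codecs.BOM_UTF16_BE + be).decode("utf-16"))
print((codecs.BOM_UTF16_LE + le).decode("utf-16"))
```

The serialised bytes still come in only two orders; the third label differs in that the order is announced in the data rather than fixed by the label.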

Re: Euro

2000-07-29 Thread Asmus Freytag
At 12:13 PM 7/28/00 -0800, Roozbeh Pournader wrote: I was not talking about the shape. I think all of us have seen it, and many have also read the documents which define its exact shape using a ruler and a compass. I was talking about the origin of the shape. In some sense, except for purists,

Re: Euro

2000-08-06 Thread Asmus Freytag
At 07:46 PM 7/30/00 -0800, John Cowan wrote: Yeah, how WOULD you make a serifed, rounded E that doesn't look silly and doesn't look like a C with an extra line? Well, maybe you can, I dunno. Anyone who can do that, I'd like to see it.

Re: is there any way to change already defined character codes?

2000-08-09 Thread Asmus Freytag
At 11:01 PM 8/7/00 -0800, Jianping Yang wrote: Not really for Unicode in which we have relocated some codepoints for Hangul between Unicode 1.1 and 2.0 :) Regards, Jianping. "Christopher J. Fynn" wrote: Allowing changes like this would break existing implementations of these standards -

Re: Zero-width ligator

2000-08-10 Thread Asmus Freytag
At 09:36 AM 8/10/00 -0800, Roozbeh Pournader wrote: That seems problematic to me, when used for Arabic. How should one use ZWNJ between two Arabic letters to stop the ligature? They'll get disconnected! (in those rare cases...) Use ZWJ ZWNJ ZWJ and you will get the intended effect. A./
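As a concrete illustration of the suggested sequence, the sketch below builds ZWJ ZWNJ ZWJ between lam and alef. The helper name `break_ligature_keep_joining` is hypothetical, and whether a given font or rendering engine honours the sequence is outside what this code can show.

```python
import unicodedata

ZWJ = "\u200D"    # ZERO WIDTH JOINER
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER


def break_ligature_keep_joining(first: str, second: str) -> str:
    """Insert ZWJ ZWNJ ZWJ between two letters, as suggested in the reply."""
    return first + ZWJ + ZWNJ + ZWJ + second


lam, alef = "\u0644", "\u0627"   # ARABIC LETTER LAM, ARABIC LETTER ALEF
s = break_ligature_keep_joining(lam, alef)
print([f"U+{ord(c):04X} {unicodedata.name(c)}" for c in s])
```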

Re: surrogate terminology

2000-10-02 Thread Asmus Freytag
This discussion has become quite "surreal". In the meantime, I and other people who have the need to write about these characters have, with more or less encouragement from the Unicode Editorial Committee started to use the terms "Supplementary Planes", "Supplementary Characters" etc. This

Preliminary charts for Unicode 3.2 draft

2000-10-17 Thread Asmus Freytag
Preliminary character charts are now available for those characters that are proposed to go into Unicode 3.2 (and into AMD1 to ISO/IEC 10646-1:2000). The majority of the proposed characters are mathematical symbols and arrows. The new URL is: http://www.unicode.org/charts/draftunicode32/

RE: Preliminary charts for Unicode 3.2 draft

2000-10-21 Thread Asmus Freytag
At 07:44 AM 10/20/00 -0800, you wrote: Asmus, Do you have a list of the Unicode 3.1 codes? Carl They will appear in due course on ...draftunicode31 You guys are all so eager! A./ PS: I've made some font fixes for the draftunicode32 charts - however they don't affect any of the new

Re: Unicode Technical Reports (Formerly: RE: TR22)

2000-12-12 Thread Asmus Freytag
At 09:53 PM 12/9/00 -0800, Asmus Freytag wrote: Hello, UniCoders! Whatever happened to UniCode Technical Report *#12* -- what's it about?! Is TR12 closer to adoption by UniCode? Unicode Technical Report 12 was superseded by additions to Unicode 3.0 before it was even advanced to final TR stage

Re: Character map

2000-12-12 Thread Asmus Freytag
At 12:50 PM 12/11/00 -0800, James Kass wrote: Michael (michka) Kaplan wrote: The Windows NT4 charmap does a fair job of this, and the Windows 2000 one does a better job. For a presentation that follows the Unicode Standard, try Unibook for NT or Win95 on http://www.unicode.org/unibook

[unicode] Re: Moving mail lists

2001-03-22 Thread Asmus Freytag
At 03:38 AM 3/22/01 +, Christopher John Fynn wrote: But you can also filter mails based on the To: header "To: [EMAIL PROTECTED]" - every mail client I've seen that supports filtering lets you filter based on that header. Except if the message is a cc:... Actually of more interest to me

Re: Identifiers

2001-04-16 Thread Asmus Freytag
At 09:24 AM 4/16/01 +0900, Martin Duerst wrote: NFC only eliminates things that are supposed to look exactly the same. NFKC eliminates quite a bit more than that. NFKC eliminates some things that are quite distinct - it should not be seen as a general purpose folding mechanism. A./
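The difference between the two forms is easy to see with Python's `unicodedata` module. The sample characters below are chosen for this example only and do not come from the message.

```python
import unicodedata

samples = {
    "e + combining acute": "e\u0301",
    "superscript two": "\u00B2",
    "fi ligature": "\uFB01",
}

for label, s in samples.items():
    nfc = unicodedata.normalize("NFC", s)
    nfkc = unicodedata.normalize("NFKC", s)
    # NFC only merges canonically equivalent sequences;
    # NFKC also folds compatibility characters such as the superscript and the ligature.
    print(label, [hex(ord(c)) for c in nfc], [hex(ord(c)) for c in nfkc])
```

Here NFC leaves the superscript two and the ligature alone, while NFKC turns them into a plain "2" and the two letters "fi", which is the kind of loss of distinction the reply warns about.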

Fwd: Re: Byte Order Marks

2001-04-19 Thread Asmus Freytag
Date: Thu, 19 Apr 2001 12:59:43 -0700 To: Tomas McGuinness [EMAIL PROTECTED] From: Asmus Freytag [EMAIL PROTECTED] Subject: Re: Byte Order Marks At 02:58 PM 4/19/01 +0200, you wrote: If its absent is it safe to assume any particular order (i.e. Big or Little Endian?) The default order is Big
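A minimal sketch of the rule under discussion, assuming the truncated answer continues as "big-endian": honour a BOM if one is present, otherwise fall back to big-endian. The function name `decode_utf16` is made up for this example, and the sketch does not try to tell a UTF-16 little-endian BOM apart from the start of a UTF-32 one.

```python
import codecs


def decode_utf16(data: bytes) -> str:
    """Decode UTF-16 bytes, using the BOM if present, else assuming big-endian."""
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le")
    # No BOM: fall back to the default order named in the message (assumed big-endian).
    return data.decode("utf-16-be")


print(decode_utf16(b"\xfe\xff\x00A"))   # BOM says big-endian
print(decode_utf16(b"\xff\xfeA\x00"))   # BOM says little-endian
print(decode_utf16(b"\x00A"))           # no BOM, big-endian assumed
```

Note that Python's own "utf-16" decoder behaves differently when the BOM is missing (it uses the platform's byte order), which is why the fallback is spelled out explicitly here.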

Re: Egyptian Hieroglyphics

2001-04-20 Thread Asmus Freytag
At 10:20 AM 4/20/01 -0400, Dean A. Snyder wrote: ... the Unicode Consortium should only entertain proposals to the standard after ACTIVELY seeking the input from the relevant (scholarly) communities - something which the ICE and UFU projects are doing for two cuneiform script systems. And, if it

RE: ASCII adequacy (was: RE: benefits of unicode)

2001-04-20 Thread Asmus Freytag
At 03:50 PM 4/20/01 -0500, [EMAIL PROTECTED] wrote: I say 0 and 1 are adequate. I find this discussion rather pointless since we all already know that ASCII is adequate if the given premise is that ASCII is adequate. I don't see what's there to discuss. We are just trying to see if tautologies

Re: On the possibility of guidance...

2001-04-23 Thread Asmus Freytag
Hear, hear, At 05:43 PM 4/23/01 -0400, Sarasvati wrote: Dear Subscribers -- This mail list is a public free-for-all with uncontrolled distribution. As a corollary, the act of publishing material on this list is tantamount to unrestricted publication. If you mail out something that should be

Re: Tags and the Private Use Area

2001-04-29 Thread Asmus Freytag
Why Unicode will never endorse certain proposals By making the Private Use Area private, the Unicode Consortium imposed on itself a restriction to stay absolutely neutral on the use of these characters. In other words, it cannot promote or

Re: Tags and the Private Use Area

2001-04-29 Thread Asmus Freytag
William Overington wrote: However, there is something that I feel that the Unicode Consortium could do, if it so wished, without violating that rule. I suggest that the Consortium could, if it so chooses, encode one or more regular unicode characters together with a protocol so that

Re: Tags and the Private Use Area

2001-05-01 Thread Asmus Freytag
specifically for use in the kind of protocol that you describe, it would have shown a preference over other users of the PUA who either don't use any protocol or use a set of PUA characters for the same purpose using a different protocol not recognized by the Consortium. In other words: Asmus

Re: RE: Word, Asian characters, and Arial Unicode

2001-05-07 Thread Asmus Freytag
At 09:54 AM 5/7/01 -0700, Rick McGowan wrote: Now, Word2000 or some other product, or some specific set of fonts may not be what a classicist wants, but that limitation is not because the width of many characters are somehow CONSTRAINED by the East Asian Width property. While that is true, any

Re: FW: Enclosed Alphanumerics

2001-05-08 Thread Asmus Freytag
For example the MS Mincho font supports these characters as serifed numbers, the Arial Unicode MS font supports these as sans-serif. I believe it should be possible to use these fonts with Word 97. There are many ways to get these fonts, I usually get them by installing the Far East support for

Limbu script proposal

2001-05-22 Thread Asmus Freytag
of the Limbu Script L2 001-138 Summary proposal form (Limbu) L2 001-139 Printed samples of Limbu Please follow up with Rick or Ken if you have issues with any of the contents. A./ Asmus Freytag Unicode Liaison to WG2

RE: Unicode-based Cyrillic-Latin transliteration table

2001-05-30 Thread Asmus Freytag
At 12:02 AM 5/29/01 -0700, James Williams wrote: Can someone please help me understand whether support for double byte is the same as being Unicode compliant. Any elaboration would be greatly appreciated. If for instance, being Unicode compliant has any additional value/benefits, etc... I'd like

Re: Annotation characters

2001-07-23 Thread Asmus Freytag
At 01:57 PM 7/23/01 +0900, Martin Duerst wrote: The language here is slightly different, and I have no idea whether the intent was exactly the same, but in any case it seems that the intents were very close to each other. IA characters were from the beginning intended for in-process use, in

RE: [OT] o-circumflex

2001-09-07 Thread Asmus Freytag
At 11:50 AM 9/7/01 -0500, Ayers, Mike wrote: Words with the same spelling and different pronunciation are uncommon but exist in English, the classic example being read and its own past tense. Actually, this is a bit more common than you think, since the pronunciation of vowels in English

Re: [OT] o-circumflex

2001-09-07 Thread Asmus Freytag
At 01:06 PM 9/7/01 -0400, David Gallardo wrote: As a practical matter, you need to take the diacritics into account when sorting, even in English where they (may or may not) have linguistic significance, otherwise you'll get nondeterministic behaviour. In other words, résumé and resume should
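One way to picture the determinism argument is a two-level sort key: base letters first, accents as a tie-breaker. This is a toy sketch, not the Unicode Collation Algorithm, and `sort_key` is a name invented for this example.

```python
import unicodedata


def sort_key(word: str) -> tuple[str, str]:
    """Primary key: letters with combining marks removed. Tie-breaker: the decomposed word."""
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return (base.casefold(), decomposed)


words = ["r\u00e9sum\u00e9", "resume", "result"]

# The same result comes out regardless of the input order,
# because the accents break the tie between "resume" and its accented form.
print(sorted(words, key=sort_key))
print(sorted(reversed(words), key=sort_key))
```

If the accents were dropped from the key entirely, the two spellings would compare equal and their relative order would depend on the input order.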

Re: [OT] o-circumflex

2001-09-08 Thread Asmus Freytag
At 09:04 PM 9/7/01 -0700, Mark Davis wrote: I disagree. What you want is a merged database field. See http://www.macchiato.com/slides/icu_collation.ppt Mark Mark, David took the remainder of our discussion off the alias. I won't repeat it here, just to note that we've agreed that merged

Re: [OT] o-circumflex

2001-09-08 Thread Asmus Freytag
At 02:45 PM 9/8/01 -0700, Mark Davis wrote: If you use a Danish tailoring of the UCA that equates Å and AA (at least at a primary and secondary level), then they will sort the same way. A string search that uses the same tailoring will also find Ålborg when given Aalborg (and vice versa). But if

Re: What code point is assigned for the Newton unit?

2001-09-13 Thread Asmus Freytag
Your letter makes clear that Unicode needs to do a better job of identifying the preferred character code for many situations. The information is there to a large extent, but buried in the fine print or in data tables. You will see that there is a canonical decomposition from U+212B to
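The kind of information "buried in data tables" can be queried programmatically. The sketch below simply asks the Unicode data bundled with Python about U+212B; it shows what that data reports and is not a reconstruction of the rest of the message.

```python
import unicodedata

angstrom_sign = "\u212B"

print(unicodedata.name(angstrom_sign))            # character name
print(unicodedata.decomposition(angstrom_sign))   # canonical decomposition mapping, as hex
nfc = unicodedata.normalize("NFC", angstrom_sign)
print(f"U+{ord(nfc):04X}", unicodedata.name(nfc)) # what normalization turns it into
```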

Re: PDUTR #26 posted

2001-09-13 Thread Asmus Freytag
At 11:42 AM 9/13/01 +, Marcin 'Qrczak' Kowalczyk wrote: IMHO Unicode would have been a better standard if UTF-16 hadn't existed. Decidedly not. In fact, Unicode would not be widely implemented today. Just UTF-8 and UTF-32, code points in the range U+..7FFF, no surrogates, no

Re: FW: 6 questions

2001-09-18 Thread Asmus Freytag
At 12:26 PM 9/18/01 -0700, Kenneth Whistler wrote: 3.Why don't noBreak formatted Unicode characters have a canonical decomposition (the compatibility decomposition surrounded by glue)? A long story. But the short answer is that such a decomposition would cause problems for

Re: UTF-8 UCS-2/UTF-16 conversion for library use

2001-09-23 Thread Asmus Freytag
At 10:21 AM 9/21/01 -0700, Kenneth Whistler wrote: It is my impression, however, that most significant applications tend, these days, to be I/O bound and/or network transport bound, rather than compute bound. ... We don't hear much, anymore, about how wasteful Unicode is in its storage of

Re: plane business

2001-10-01 Thread Asmus Freytag
There are 66 non-characters as of Unicode 3.1, there were 34 non-characters before. There are no hidden non-characters, but there were 'hidden' planes in Unicode 3.0 - hidden in the limited sense that they were defined as character and non-character locations, but no characters were assigned,
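The two counts can be checked with a little arithmetic. The sketch below enumerates the noncharacters as they are commonly described (the last two code points of each of the 17 planes, plus the contiguous range U+FDD0..U+FDEF); the function name `noncharacters` is made up for this example.

```python
def noncharacters() -> list[int]:
    """Enumerate the noncharacter code points."""
    # 32 code points in the range U+FDD0..U+FDEF
    block = list(range(0xFDD0, 0xFDF0))
    # U+xxFFFE and U+xxFFFF at the end of each of the 17 planes: 2 * 17 = 34
    plane_ends = [plane * 0x10000 + tail
                  for plane in range(17)
                  for tail in (0xFFFE, 0xFFFF)]
    return block + plane_ends


all_nonchars = noncharacters()
plane_end_count = sum(1 for cp in all_nonchars if cp & 0xFFFE == 0xFFFE)
print(plane_end_count, len(all_nonchars))
```

The 34 plane-end code points account for the earlier figure, and adding the 32-code-point range gives 66.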

Re: plane business

2001-10-02 Thread Asmus Freytag
At 10:42 PM 10/1/01 -0700, Bernard Miller wrote: --- Asmus Freytag [EMAIL PROTECTED] wrote: There are 66 non-characters as of Unicode 3.1, there were 34 non-characters before. I understand now.. the non characters in 16 higher planes were defined first, then the ones in the arabic

Re: Letters d L l and t with caron

2001-10-24 Thread Asmus Freytag
At 03:12 PM 10/24/01 -0400, tom emerson wrote: Asmus Freytag writes: FWIW, Robert Bringhurst's The Elements of Typographic Style, 2nd Edition has in part this to say about caron: Do you have the date of this book? 1996. It's a fabulous book, 'caron' aside: I don't doubt

Re: origin of term caron

2001-10-24 Thread Asmus Freytag
At 06:32 PM 10/24/01 -0500, G. Adam Stanislav wrote: The first time I encountered the term caron was in the eighties when studying the design of Adobe PostScript fonts. Not being a native English speaker, I simply took it for the English word for this diacritic. This opens up the possibility that

Fwd: Re: origin of the term caron

2001-10-27 Thread Asmus Freytag
Here's some more info on a possible origin. A./ Date: Fri, 26 Oct 2001 13:52:38 -0400 (EDT) From: Barbara Beeton [EMAIL PROTECTED] well, guys, i don't think we're going to get anything much better than this. this recollection predates 1984 by a *long* time. cheers.

Re: Worst case scenarios on SCSU

2001-10-31 Thread Asmus Freytag
At 05:50 PM 10/31/01 -0800, Kenneth Whistler wrote: I have no quarrel with the claim that the SCSU scheme could be implemented directly on UTF-32 data. But as Unicode Technical Standard #6 is currently written, that is not how to do it conformantly. Actually, no specific encoding form is

Re: restricting the meaning of characters over time

2001-11-12 Thread Asmus Freytag
At 11:26 AM 11/7/01 -0800, Eric Muller wrote: Let's rewind to 1996. I encode a document, and I want a math less-than or equal character. The picture I want for it has the equal bar slanted. Looking throughout my Unicode 2.0 standard, I conclude that U+2264, LESS-THAN OR EQUAL is what I want (with

Re: Indic editing (was: RE: The real solution)

2001-11-28 Thread Asmus Freytag
At 12:37 PM 11/27/01 -0800, James Kass wrote: Isn't that where it belongs? Default display for isolated combining marks shows them with the dotted circle. No it does not. That's an artifact of the Unicode code chart notation. 25CC in many fonts (and in the charts for that matter) looks

RE: Indic editing (was: RE: The real solution)

2001-11-28 Thread Asmus Freytag
At 12:32 PM 11/28/01 +0100, Marco Cimarosti wrote: I don't think that Unicode requires that a non spacing mark *has* to be placed on something in order to be displayable. However, some fonts may chose to represent a stand-alone non spacing mark as floating on some default glyph, for either

Re: FW: A product compatibility question

2001-10-09 Thread Asmus Freytag
At 01:43 PM 10/9/01 -0400, Gary P. Grosso wrote: Because of Unicode's Han unification, I was under the impression that to get both Traditional Chinese and Simplified Chinese to really look right would require using different fonts for each. To have different fonts for the same characters in a

RE: FW: A product compatibility question

2001-10-09 Thread Asmus Freytag
At 03:43 PM 10/9/01 -0500, Ayers, Mike wrote: Oooh - a swing and a miss! No -- a pretty complete misunderstanding of my posting on your part. The implication of my statements is that rich text support is required at least at some level of your architecture as soon as you want to go

Ext-B fonts updated

2001-10-13 Thread Asmus Freytag
We've finally been able to obtain better fonts for the new characters in CJK-Extension B. The PDF chart is at http://www.unicode.org/charts/PDF/U2.pdf. Enjoy. A./ PS: Fair warning: the complete PDF file is 13MB and contains only the glyph and code point, no other information about the

Re: Clean and Unicode compliance

2001-12-14 Thread Asmus Freytag
W3C's HTML validation service seems to have no such problems. We've been using it to validate all the files on the unicode site regularly. A validator *should* look between the and in order to catch invalid entity references, esp. invalid NCRs. For UTF-8, it would ideally also check that no

Re: Clean and Unicode compliance

2001-12-14 Thread Asmus Freytag
James, NCRs *are* markup. And validating that the encoding matches the declaration (e.g. UTF-8 is not ill-formed) has nothing whatsoever to do with content, but all with verifying that the file conforms to the HTML specification. All this is completely different from spelling and grammar

Re: U+2028

2001-12-14 Thread Asmus Freytag
At 02:01 AM 12/15/01 +, Christian Cooke wrote: The text annotations to U+000A and U+000D in Unicode 3.0 do not refer to U+2028 and do not recommend the use of U+2028 as the preferred character for text processing in this context. Does the UTC have a recommendation about using U+2028

RE: Astral planes (was: RE: Plane One use, was Re: HTML Validatio n)

2001-12-18 Thread Asmus Freytag
At 10:38 AM 12/18/01 -0800, Rick Cameron wrote: It looks like UCS-2 and UCS-4 are defined in ISO 10646. Does that standard restrict the valid range of UCS-4 to 0..10? It will with AMD1 to ISO/IEC 10646-1:2000 which is expected to pass final balloting and head for publication in 2002. If

RE: Astral planes (was: RE: Plane One use, was Re: HTML Validatio n)

2001-12-18 Thread Asmus Freytag
At 03:38 PM 12/18/01 -0800, Rick Cameron wrote: Are you planning to add an explicit statement to the Unicode standard that the valid range for scalar values is 0..10? (Or is such a statement there, and I've just missed it?) see below: In particular, as the use of 32-bit variables to hold

Re: Microsoft input method, 950, and Unicode mapping

2001-12-18 Thread Asmus Freytag
On top of that, it looks like 950 maps a bogus symbol or punctuation character to U+2574. (2574 is one of a set of 4, and only 1 is mapped for starters. Fonts covering CP950 give a way different image for that character than you'd expect from either the charts or the names... I let some

Re: Microsoft input method, 950, and Unicode mapping

2001-12-19 Thread Asmus Freytag
At 10:38 AM 12/19/01 +, Kevin Bracey wrote: In message [EMAIL PROTECTED] Asmus Freytag [EMAIL PROTECTED] wrote: On top of that, it looks like 950 maps a bogus symbol or punctuation character to U+2574. (2574 is one of a set of 4, and only 1 is mapped for starters. Fonts

Re: PDUTR #25: Unicode Support for Mathematics

2001-12-29 Thread Asmus Freytag
At 12:34 AM 12/28/01 -0600, [EMAIL PROTECTED] wrote: If you want to define text/math, and provide the disappearing parenthesis and precedence tables and everything, then that's fine, but I don't see why it should be part of Unicode, anymore than full music rendering is part of Unicode. It's a

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-29 Thread Asmus Freytag
At 12:07 PM 12/29/01 +0100, Stefan Persson wrote: Seeing that Unicode already has left-to-right and right-to-left override characters, I wonder if a top-to-bottom override character might also be reasonable. Which are the code points for these characters? Please see

Re: Characters vs. glyphs in scholarly fonts

2001-12-30 Thread Asmus Freytag
At 03:41 PM 12/29/01 -0500, David J. Perry wrote: The ancient Roman monetary unit sestertius is not yet in Unicode. It might well be accepted if proposed, but would be given one codepoint. However, this unit appears in a variety of ways in inscriptions: IIS, HS, II with a horizontal line

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-30 Thread Asmus Freytag
At 02:33 PM 12/30/01 -0500, Tex Texin wrote: It is a bit inconsistent and therefore confusing. I searched for bidirectional which immediately pointed me at the general punctuation pages in a pdf file. Searching for bidirectional in that file turns up empty. This is one of the few cases of an

Re: Vertical scripts (was: Tategaki (was: Re: Updated...))

2001-12-31 Thread Asmus Freytag
At 12:22 PM 12/31/01 -0500, Tex Texin wrote: I was fooled by that earlier in the year as well. The links to the other pages should be at the top of the web page to highlight that the page is a partial list and to make it easy to reference the other pages. Most people will not scroll to the bottom

Fwd: PDUTR #25: Unicode Support for Mathematics

2002-01-04 Thread Asmus Freytag
version 2.3+CL 01/14/2001 with nmh-1.0.4 To: Barbara Beeton [EMAIL PROTECTED], Asmus Freytag [EMAIL PROTECTED], Murray Sargent III [EMAIL PROTECTED] cc: [EMAIL PROTECTED] (linux-utf8) Subject: PDUTR #25: Unicode Support for Mathematics X-URL: http://www.cl.cam.ac.uk/~mgk25/ Date: Thu, 03 Jan

Re: Question

2002-01-15 Thread Asmus Freytag
At 06:26 PM 1/15/02 -0800, Kenneth Whistler wrote: Hello. I am looking for help with Unicode. I was recently told by my credit card processing company that I need to Upgrade my site to unicode 3.2 in order to get a perl script working. There has got to be a disconnect here somewhere.

Re: The benefit of a symbol for 2 pi

2002-01-18 Thread Asmus Freytag
At 10:06 AM 1/18/02 -0700, Robert Palais wrote: Which seems to make Unicode a defender of the status quo. Inaction is as political as action. We are holders of the standards for the technology for encoding symbols, and we won't admit new symbols until they are widely used... not necessarily the

RE: The benefit of a symbol for 2 pi

2002-01-18 Thread Asmus Freytag
Just an aside on terminology: At 08:02 PM 1/18/02 +0100, Marco Cimarosti wrote: 3) A newly added operator (ZWL) which allows joining two characters into a it's CGJ for Combining Grapheme Joiner 4) A set of operators called Ideographic Description Character (IDC) for They are for Ideographic

The original ideals

2002-01-18 Thread Asmus Freytag
At 11:02 AM 1/18/02 -0800, Barry Caplan wrote: I've always been under the impression that one of the original goals of the Unicode effort was to do away with the sort of multi-width encodings we are all too familiar with (EUC, JIS, SJIS, etc.). This was to be accomplished by using a fixed width

Re: The benefit of a symbol for 2 pi

2002-01-18 Thread Asmus Freytag
At 11:36 AM 1/18/02 -0800, Rick McGowan wrote: It is our job as a standardizing organization to standardize what is IN USE so that (as a goal) people can standard-ly communicate those symbols internationally without ambiguity. It is _NOT_ our job, and never will be our job, to invent new symbols

Re: Devanagari

2002-01-20 Thread Asmus Freytag
At 12:48 AM 1/20/02 -0800, James Kass wrote: The arguments about relative size are true, but in this day and age are considered unimportant. Graphics files are extremely large in comparison with text files of any script and so are sound files. Devanagari UTF-8 is three bytes. The four byte
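The size claim is straightforward to verify. The sketch below measures UTF-8 lengths for a few sample characters picked for this example; `utf8_len` is a helper name invented here.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


samples = {
    "LATIN CAPITAL LETTER A (U+0041)": "A",
    "DEVANAGARI LETTER KA (U+0915)": "\u0915",
    "a supplementary-plane character (U+20000)": "\U00020000",
}
for label, ch in samples.items():
    print(label, utf8_len(ch))

# Every code point in the Devanagari block U+0900..U+097F takes three bytes.
print(all(utf8_len(chr(cp)) == 3 for cp in range(0x0900, 0x0980)))
```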

Re: Unicode 3.2: BETA files updated

2002-01-25 Thread Asmus Freytag
At 06:29 AM 1/24/02 +, David Hopwood wrote: Kenneth Whistler wrote: And StandardizedVariants.html has been updated again, with more of the missing glyphs provided. I can't see any difference between plain U+2278 (either in the draft code chart or StandardizedVariants.html) and U+2278

RE: Unicode 3.2: BETA files updated

2002-01-25 Thread Asmus Freytag
At 11:31 AM 1/25/02 -0800, Julie Allen wrote: John Hudson asked, As Unicode continues to grow, I wonder if we can expect another book-- or multiple volumes -- at some stage, or if the standard will become a purely electronic document? Has any decision been taken about this? There are

Re: Unicode 3.2: BETA files updated

2002-01-25 Thread Asmus Freytag
At 10:58 PM 1/24/02 +, David Hopwood wrote: One possibility is to make VS1 specify what is now the reference glyph, and VS2 specify the alternate glyph. Unmarked would mean either. Boy, great minds do think alike. I proposed that in a paper to the UTC last year. ;-) You realize that this

Re: The benefit of a symbol for 2 pi

2002-01-26 Thread Asmus Freytag
At 07:40 PM 1/26/02 -0500, [EMAIL PROTECTED] wrote: One of the new characters scheduled for Unicode 3.2 is U+213F DOUBLE-STRUCK CAPITAL PI (A 500-byte GIF is attached.) Double-struck pi! What better symbol to represent 2 * pi? These double struck symbols are used by mathematical software

Re: Variation Selection (Was Re: Unicode 3.2: BETA files updated)

2002-01-27 Thread Asmus Freytag
At 12:33 AM 1/27/02 -0800, Mark Davis \(jtcsv\) wrote: I find it fairly pointless to say that a font supports the variation selection sequence U+03B8, U+FE00 if it does not provide a visual distinction from U+03B8; and such a distinction should be based on the entry description. Thus, of the

Re: Variation Selection (Was Re: Unicode 3.2: BETA files updated)

2002-01-28 Thread Asmus Freytag
At 12:43 PM 1/27/02 -0800, Mark Davis \(jtcsv\) wrote: It sounds like what you are saying, in concrete terms, is that Font #6 at the bottom of: http://www.macchiato.com/utc/variation_selection/variation_selection_followup.htm is conformant. If that is so, then we would have to have an

Re: Proposing Fraktur

2002-01-29 Thread Asmus Freytag
Kana (Hiragana/Katakana): Two (essentially) iso-phonic(?) systems, where each symbol in one set has a corresponding symbol in the other set, both denoting the same sound value. The set of forms are historically unrelated. There is little overlap in the

When to use markup: (Was:Introducing the idea of a ROMAN VARIANT SELECTOR (was: Re: Proposing Fraktur))

2002-01-31 Thread Asmus Freytag
At 09:42 AM 1/30/02 +0100, Karl Pentzlin wrote: The question is, are typesetting rules part of the script? (I mean rules in the sense of obligatory regulations, not guidelines). This distinction is a very German way of approaching the question. If yes, (in my opinion) the plain text must carry

Re: Old Italic and Bidi Mirroring

2002-02-02 Thread Asmus Freytag
Found this message in my outbox; just sending this out now for completeness. At 12:47 PM 1/7/02 +0330, Roozbeh Pournader wrote: Accordingly, the Old Italic script has a default directionality of strong left-to-right in this standard. When directional overrides are used to produce

Re: FW: Using Unicode Characters in ASCII Streams

2002-02-05 Thread Asmus Freytag
At 10:32 AM 2/5/02 -0800, Magda Danish (Unicode) wrote: Begin forwarded message: From: [EMAIL PROTECTED] Date: 2002-02-05 10:44:20 -0800 To: [EMAIL PROTECTED] Subject: Using Unicode Characters in ASCII Streams Hallo, we are a manufacturer of time and attendance terminals which are

Re: Unicode and Security

2002-02-07 Thread Asmus Freytag
At 11:53 AM 2/7/02 -0600, David Starner wrote: a superset of a number of preexisting character sets, so that it was possible for those users to move to Unicode without problems. Since important preexisting character sets separated Greek, Cyrillic and Latin scripts, Unicode had to. Had Unicode not

Re: Unicode and Security

2002-02-07 Thread Asmus Freytag
At 01:21 PM 2/7/02 -0500, Elliotte Rusty Harold wrote: I'm not sure Unicode can be fixed at this point. The flaws may be too deeply embedded. The real solution may involve waiting until companies and people start losing significant amounts of money as a result of the flaws in Unicode, and

Re[2]: Unicode and Security

2002-02-08 Thread Asmus Freytag
At 06:18 PM 2/8/02 +0100, Philipp Reichmuth wrote: Oh, it is very well possible to design a character set that supports all of Latin, Cyrillic and Greek without being susceptible to this problem beyond the familiar 1-l-|, 0-O dimension. The main premise is to encode glyphs instead of characters

Re: This spoofing and security thread

2002-02-13 Thread Asmus Freytag
At 06:37 PM 2/11/02 +, Juliusz Chroboczek wrote: We, ASCII-age programmers, are used to considering plain text rendering as being injective up to binary identity. We carefully choose fonts that distinguish between O and 0, 1 and l. We use editors that warn us about non-native line ending

Re: Unicode and end users

2002-02-14 Thread Asmus Freytag
At 09:22 AM 2/14/02 +, Martin Kochanski wrote: Are there, in fact, many circumstances in which it is necessary for an end user to create files that do *not* have a BOM at the beginning? In principle this is a requirement for data being labelled *external to the data* as being in either

Re: Smiles, faces, etc

2002-02-16 Thread Asmus Freytag
Whether or not they would get support to be encoded is almost irrelevant as long as no-one comes forward and makes a formal proposal with solid background information. Only then can this issue be settled where it matters: in the UTC. Discussions on open lists like this, unless accompanied by

Re: Unicode and end users

2002-02-16 Thread Asmus Freytag
At 12:37 PM 2/16/02 -0800, Doug Ewell wrote: Why would anyone, faced with a UTF-8 file that contains invalid sequences, want to retain the invalid sequences, much less convert the file to another encoding form that either (a) preserves the invalid sequences or (b) leaves a marker showing where
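The two behaviours being contrasted (reject the input, or keep going and leave a marker) map onto the error handlers of Python's UTF-8 decoder. The byte string below is a made-up example of ill-formed input.

```python
bad = b"abc\xffdef"   # 0xFF can never appear in well-formed UTF-8

# Strict decoding refuses the whole input and reports where it went wrong.
try:
    bad.decode("utf-8")
except UnicodeDecodeError as exc:
    print("rejected at byte offset", exc.start)

# errors="replace" keeps the valid text and marks the damage with U+FFFD.
marked = bad.decode("utf-8", errors="replace")
print([f"U+{ord(c):04X}" for c in marked])
```

Neither option preserves the original invalid bytes; the replacement character only records that something was there.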

Updated Unicode Technical Report #20: Unicode in XML and other markup languages

2002-02-18 Thread Asmus Freytag
Unicode Technical Report #20 at http://www.unicode.org/unicode/reports/tr20/ has been updated. This report is jointly published as Unicode Technical Report and W3C Note. It has been updated primarily to reflect the addition of characters to Unicode 3.1 and the pending addition of characters to

Re: Unicode Search Engines

2002-02-19 Thread Asmus Freytag
At 09:52 PM 2/18/02 -0800, Doug Ewell wrote: So if some language turns out to need a with horn in the future, its readers will have to cross their fingers that rendering engines become capable of displaying U+0061 U+031B properly. Support for such arbitrary combination is apparently in the works

Re: Recent Threats

2002-02-27 Thread Asmus Freytag
Would you by chance mean 'threads' ? There is a difference, you know ;-) A./ At 04:49 PM 2/26/02 +0700, Stefan Probst wrote: Good Evening, can somebody pls. explain to me dummy, what the long threats about R(o|u)mania, Canada, California, Yankees, and Initials in various countries..

Re: ISO 3166 (country codes) Maintenance Agency Web pages move

2002-02-27 Thread Asmus Freytag
At 05:36 PM 2/27/02 -0500, John Cowan wrote: numbering houses (which seems to be 18th century) I would have ventured that it is much older than that, dimly recalling some older maps from a small museum I once visited. There's a difference between house numbers and street addresses. House
