Re: Unicode forms for internal storage

2004-01-21 Thread Elliotte Rusty Harold
its own issues). No, I don't plan to support XML 1.1, certainly not in XOM 1.0 and beyond that not unless someone asks for it and demonstrates a real need. This is discussed in the XOM FAQ at http://www.cafeconleche.org/XOM/faq.xhtml#d0e183 -- Elliotte Rusty Harold [EMAIL PROTECTED

Re: Unicode forms for internal storage

2004-01-21 Thread Elliotte Rusty Harold
. However, that's going to have to wait for 1.1. For 1.0, I just want to pick something that's a reasonable compromise across the most common cases. -- Elliotte Rusty Harold [EMAIL PROTECTED] Effective XML (Addison-Wesley, 2003) http://www.cafeconleche.org/books/effectivexml http

Unicode forms for internal storage

2004-01-20 Thread Elliotte Rusty Harold
as the translation between UTF-8 and UTF-16 the class is currently performing on every call to setValue and getValue, ideally faster. Has anyone done any work on Unicode formats for this use-case? Does anyone have any references or ideas to share?

Re: Unicode forms for internal storage

2004-01-20 Thread Elliotte Rusty Harold
is if you're transmitting this encoding on the wire, which I am definitely not doing. But SCSU looks like a really nice option. Thanks.

RE: Unicode forms for internal storage

2004-01-20 Thread Elliotte Rusty Harold
, of the string (therefore max of 32,767 octets per string, which shouldn't ordinarily be a problem). That would be a problem. I definitely cannot rule out long strings, where long is quite a bit larger than 32K.

Re: Chinese rod numerals

2004-01-10 Thread Elliotte Rusty Harold
was ever a distinguishing characteristic of two otherwise identical characters?

Re: Backslash n [OT] was Line Separator and Paragraph Separator

2003-10-22 Thread Elliotte Rusty Harold
of justice! (Tongue firmly in cheek.) -- Elliotte Rusty Harold [EMAIL PROTECTED] Processing XML with Java (Addison-Wesley, 2002) http://www.cafeconleche.org/books/xmljava http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA

RE: Line Separator and Paragraph Separator

2003-10-21 Thread Elliotte Rusty Harold
it in a real text file. I have. It shows up in a lot of old text files as a page separator character. It's also occasionally used as a document separator when someone wants to stuff multiple XML documents in the same file.

NFC code

2003-10-19 Thread Elliotte Rusty Harold
/#Code_Sample. However, it's marked "Copyright © 1998-1999 Unicode, Inc. All Rights Reserved." so I can't copy it into my own project. Is there anything else out there?

Re: NFC code

2003-10-19 Thread Elliotte Rusty Harold
use this for now.

Re: Non-ascii string processing?

2003-10-08 Thread Elliotte Rusty Harold
documents. It also would have been much more compatible with existing parsers and tools. :-(

RE: Non-ascii string processing?

2003-10-07 Thread Elliotte Rusty Harold
would argue that the schema language should itself be written in terms of grapheme clusters rather than characters, but it isn't and thus we need to handle characters to implement a validator in accordance with the spec.

Re: Everson Mono

2003-02-14 Thread Elliotte Rusty Harold
be more appropriate for the next time I need something like this. Is there any list anywhere of the ranges or code points that are included? -- +---++---+ | Elliotte Rusty Harold | [EMAIL PROTECTED] | Writer/Programmer

Re: ISO 8859-11 (Thai) cross-mapping table

2002-10-08 Thread Elliotte Rusty Harold
if Java uses one-byte per boolean in arrays or not?) -- +---++---+ | Elliotte Rusty Harold | [EMAIL PROTECTED] | Writer/Programmer | +---++---+ | XML in a Nutshell, 2nd

Re: ISO 8859-11 (Thai) cross-mapping table

2002-10-08 Thread Elliotte Rusty Harold
; an odd value fails. Interesting. Do you have any references on that I can explore further? A quick Google search didn't turn up anything relevant. I'm curious to see how the algorithm actually works.

ISO 8859-11 (Thai) cross-mapping table

2002-10-05 Thread Elliotte Rusty Harold
The Unicode data files at http://www.unicode.org/Public/MAPPINGS/ISO8859/ do not include a mapping for ISO-8859-11, Thai. Is there any particular reason for this? Is ISO-8859-11 unfinished or deprecated or unable to be mapped to Unicode or some such? If none of these things are true, is there

Re: ISO 8859-11 (Thai) cross-mapping table

2002-10-05 Thread Elliotte Rusty Harold
. There are about 10 characters in TIS-620 that are mapped to the Unicode replacement character. This is from 1998 though. Has Unicode's Thai support improved any in later versions?

Character for e, 2.71828...

2002-04-07 Thread Elliotte Rusty Harold
of any kind. Is this a mistake in the Unicode data? -- +---++---+ | Elliotte Rusty Harold | [EMAIL PROTECTED] | Writer/Programmer | +---++---+ | The XML Bible, 2nd

Re: [OT] Re: The exact birthday of French: 0842-02-14

2002-03-28 Thread Elliotte Rusty Harold
Ages. If the hypothesis proved to be true, Islamic history probably wouldn't be affected very much at all.

RE: [OT] Re: The exact birthday of French: 0842-02-14

2002-03-28 Thread Elliotte Rusty Harold

Re: The exact birthday of French: 0842-02-14

2002-03-27 Thread Elliotte Rusty Harold
:8VRf94MWzUgC:www.cl.cam.ac.uk/~mgk25/volatile/Niemitz-1997.pdf+did+Charlemagne+existhl=en

Re: The exact birthday of French: 0842-02-14

2002-03-27 Thread Elliotte Rusty Harold
fell into the Dark Ages has been a hotly debated subject for a long time. It's astonishing to consider that the answer might be that it never happened at all.

Re: ISO 3166 (country codes) Maintenance Agency Web pages move

2002-02-25 Thread Elliotte Rusty Harold
. That might be a faulty assumption in the long run.

Re: Off-Topic (Re: This spoofing and security thread)

2002-02-14 Thread Elliotte Rusty Harold
there though.

Re: Off-Topic (Re: This spoofing and security thread)

2002-02-14 Thread Elliotte Rusty Harold
impressive since I think e makes up about 20% of the letters in typical German.

Re: This spoofing and security thread

2002-02-13 Thread Elliotte Rusty Harold

Re: Unicode and Security

2002-02-07 Thread Elliotte Rusty Harold

Re: Unicode and Security

2002-02-07 Thread Elliotte Rusty Harold
think we can count on client software getting this right. (Hell, Microsoft can't even stop e-mail from running scripts.) The problem needs to be fixed closer to the source.

Re: Unicode and Security

2002-02-07 Thread Elliotte Rusty Harold
On Thu, Feb 07, 2002 at 10:34:20AM -0500, Elliotte Rusty Harold wrote: Unicode is a character encoding, not a glyph encoding. Furthermore, it's a superset of a number of preexisting character sets, so that it was possible for those users to move to Unicode without problems. Since important

Re: Unicode and Security

2002-02-07 Thread Elliotte Rusty Harold
obvious. In Unicode, it is not.

Re: Unicode and Security

2002-02-07 Thread Elliotte Rusty Harold
? Of course it's not.

Re: Unicode and Security

2002-02-07 Thread Elliotte Rusty Harold
are no substitute for prevention.

Re: Unicode and Security

2002-02-07 Thread Elliotte Rusty Harold
systems; but when we're designing something truly new like internationalized domain names it only makes sense to avoid these known problems.

Re: Unicode and Security

2002-02-07 Thread Elliotte Rusty Harold
in the face of this.

Re: HTML Validation (was Re: Clean and Unicode compliance)

2001-12-16 Thread Elliotte Rusty Harold
though. I suspect a lot of our tools haven't been thoroughly tested with Plane 1 and are likely to have these sorts of bugs in them.

Re: Unicode aware drawing program

2001-12-09 Thread Elliotte Rusty Harold
overlapping shapes with anything approaching an adequate degree of facility.

Unicode aware drawing program

2001-12-06 Thread Elliotte Rusty Harold
could write a Java program to auto-generate an SVG image, but that feels a little like using a hearse to haul dirt around my farm. It would probably work, but isn't really the right tool for the job. :-)

Seeking fonts for recently added characters

2001-12-05 Thread Elliotte Rusty Harold
CROSS;So;0;ON;N; The entire musical symbols block -- +---++---+ | Elliotte Rusty Harold | [EMAIL PROTECTED] | Writer/Programmer | +---++---+ | Java I/O

Re: Two new documents

2001-10-08 Thread Elliotte Rusty Harold
on the grounds that they were idiosyncratic (i.e. didn't show up anywhere except the PHAISTOS DISK). Has that changed?

RE: Playing with Unicode (was: Re: UTF-17)

2001-06-25 Thread Elliotte Rusty Harold

Re: XML Blueberry Requirements

2001-06-21 Thread Elliotte Rusty Harold
. -- +---++---+ | Elliotte Rusty Harold | [EMAIL PROTECTED] | Writer/Programmer | +---++---+ | The XML Bible (IDG Books, 1999) | | http

Re: XML Blueberry Requirements

2001-06-21 Thread Elliotte Rusty Harold

Re: XML Blueberry Requirements

2001-06-21 Thread Elliotte Rusty Harold
is not theoretical, but IBM can damn well fix its own software without polluting XML for the rest of us.) There is no justification here for a new, incompatible version of XML.

Re: XML Blueberry Requirements

2001-06-21 Thread Elliotte Rusty Harold
in the wild.) This is just what I can think of off the top of my head. There's probably more.

Re: Decimal Unicodepoints

2001-04-25 Thread Elliotte Rusty Harold
. :-)

Re: The Unicode Standard, Version 3.1

2001-03-31 Thread Elliotte Rusty Harold
with the word processor, but haven't been able to display any supplementary characters in the HTML browser yet. Which word processor? Which HTML browser?

Re: [langue-fr] L'anglais est-il une langue universelle ?

2001-01-02 Thread Elliotte Rusty Harold
, and it does make life simpler and more pleasant than it otherwise would be.

Re: [langue-fr] L'anglais est-il une langue universelle ?

2000-12-30 Thread Elliotte Rusty Harold

Re: [langue-fr] L'anglais est-il une langue universelle ?

2000-12-29 Thread Elliotte Rusty Harold
e're weird. The average citizen of any country has neither the time, money, nor interest to learn more than two languages; nor should they have to.

RE: [OT] Re: the Ethnologue

2000-11-30 Thread Elliotte Rusty Harold
"mutually incomprehensible forms of spoken English"? -- +---++-------+ | Elliotte Rusty Harold | [EMAIL PROTECTED] | Writer/Programmer | +---++---+ | The

Re: Java and Unicode

2000-11-16 Thread Elliotte Rusty Harold
are attractive. It may take the next post-Java language to really solve them.

Re: Java and Unicode

2000-11-16 Thread Elliotte Rusty Harold
At 7:26 AM -0800 11/16/00, Valeriy E. Ushakov wrote: On Thu, Nov 16, 2000 at 05:58:27 -0800, Elliotte Rusty Harold wrote: public char charAt(int index) This method is used to walk strings, looking at each character in turn, a useful thing to do. Clearly it would be possible to replace

Re: Java and Unicode

2000-11-15 Thread Elliotte Rusty Harold
this?

Microsoft Office 2001 Mac

2000-10-10 Thread Elliotte Rusty Harold
and fonts for all this, but so far Word 98 can't take advantage of it.

Armenian numbers

2000-09-04 Thread Elliotte Rusty Harold
bet double as digits? Or are there some uniquely Armenian digits that Unicode is missing?

Re: Unicode FAQ addendum

2000-07-20 Thread Elliotte Rusty Harold
ou're receiving strings through an API like SAX that can properly decode whatever stream it's sitting on top of", then I see your point. But it might help to be a little more explicit.

Security Risks of Unicode

2000-07-16 Thread Elliotte Rusty Harold
te as well.