On Fri, 23 Mar 2001, Geoff Hutchison wrote:
> At 10:59 PM -0600 3/23/01, Gilles Detillieux wrote:
> >What do you consider "good" vs. "bad" characters?  Remember that most of
> >us have little or no XML experience, so you need to define your terms
> >if we're to understand each other.
> 
> Sorry, I didn't realize his response was only to me--it's been a busy 
> week. A "bad character" was evidently a high-bit character--something 
> above 7-bit ASCII.
> 
> I haven't the faintest idea why this would choke an XML parser 
> considering SGML-encoding is defined for most Unicode character sets.
> 
> >Are there others that should get similar treatment, or is this 
> >another matter altogether?
> 
> It may be an issue that the high-bit characters in 3.1 aren't 
> re-encoded into SGML equivalents in htsearch. But I still don't see 
> why this is causing grief for an XML parser.

Sorry for that... I have to be more careful where I send mail. I didn't mean it
to go to just you.

It is obvious I need to research XML some more. I know that when I used
XML::Parser (a Perl module based on expat) it choked on the bad characters. I do
know that the philosophy of XML is "If something's not exactly right, crash and
burn!" Parsers have to be very anal about things.
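
For illustration, here is a minimal sketch of the kind of failure described
above. It is not the actual htsearch output: it uses Python's
xml.parsers.expat (also built on expat) rather than XML::Parser, and the
sample documents and the helper name parses() are made up for the example.
It assumes the "bad character" is a raw ISO-8859-1 byte in a document with no
encoding declaration, which an XML parser has to treat as UTF-8.

```python
import xml.parsers.expat


def parses(data: bytes) -> bool:
    """Return True if expat accepts the document, False if it rejects it.

    Hypothetical helper for this sketch only.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
        return True
    except xml.parsers.expat.ExpatError:
        return False


# Raw Latin-1 byte 0xE9 (e-acute), no encoding declaration: the parser
# assumes UTF-8, and a lone 0xE9 is not a valid UTF-8 sequence.
raw_latin1 = b"<doc>caf\xe9</doc>"

# Same bytes, but the document declares its encoding.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?><doc>caf\xe9</doc>'

# Same character written as a numeric character reference (pure ASCII).
char_ref = b"<doc>caf&#233;</doc>"

for name, doc in [("raw", raw_latin1), ("declared", declared), ("ref", char_ref)]:
    print(name, "ok" if parses(doc) else "rejected")
```

If that assumption holds, either declaring the encoding or re-encoding the
high-bit characters as numeric references should be enough to keep the parser
happy.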

I'll hit the books and give a good answer with references when I find the
answer.

-- 
Jonathan Gardner
[EMAIL PROTECTED]
(425)820-2244 x123


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
