On Fri, 23 Mar 2001, Geoff Hutchison wrote:
> At 10:59 PM -0600 3/23/01, Gilles Detillieux wrote:
> >What do you consider "good" vs. "bad" characters? Remember that most of
> >us have little or no XML experience, so you need to define your terms
> >if we're to understand each other.
>
> Sorry, I didn't realize his response was only to me--it's been a busy
> week. A "bad character" was evidently a high-bit character--something
> above 7-bit ASCII.
>
> I haven't the faintest idea why this would choke an XML parser
> considering SGML-encoding is defined for most Unicode character sets.
>
> >Are there others that should get similar treatment, or is this
> >another matter altogether?
>
> It may be an issue that the high-bit characters in 3.1 aren't
> re-encoded into SGML equivalents in htsearch. But I still don't see
> why this is causing grief for an XML parser.

Sorry about that... I have to be more careful about where I send mail. I didn't
mean for it to go to just you.

It is obvious I need to research XML some more. I do know that when I used
XML::Parser (a Perl module based on expat), it choked on the bad characters. I
also know that the philosophy of XML is "If something's not exactly right, crash
and burn!" Parsers have to be very strict about things.

I'll hit the books and give a good answer with references when I find the
answer.
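
In the meantime, here is a minimal sketch of the kind of failure I mean. It is
in Python rather than Perl, using xml.etree.ElementTree, which also sits on top
of expat. It rests on assumptions I have not verified against htsearch: that the
output carries no encoding declaration (so the parser falls back to UTF-8) and
that the high-bit characters are raw ISO-8859-1 bytes. The sample word and the
helper names parses() and escape_high_bit() are made up for illustration.

```python
import xml.etree.ElementTree as ET


def parses(doc: bytes) -> bool:
    """Return True if the expat-backed parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def escape_high_bit(text: str) -> bytes:
    """Replace every non-ASCII character with a numeric character reference.

    Note: this does NOT escape '&' or '<'; that would need a separate step.
    """
    return text.encode("ascii", "xmlcharrefreplace")


# Hypothetical sample: one Latin-1 "high-bit" character (e-acute, byte 0xE9).
word = "caf\u00e9"

# Raw Latin-1 byte, no encoding declaration: the parser assumes UTF-8.
raw = b"<r>" + word.encode("latin-1") + b"</r>"

# Same bytes, but the document states which encoding it uses.
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + raw

# High-bit character rewritten as &#233; so the document is pure ASCII.
escaped = b"<r>" + escape_high_bit(word) + b"</r>"

if __name__ == "__main__":
    for label, doc in (("raw", raw), ("declared", declared), ("escaped", escaped)):
        print(label, "parses" if parses(doc) else "is rejected")
```

If those assumptions hold, either declaring the encoding or re-encoding the
high-bit characters as numeric references should be enough to keep a parser
happy, but I still need to confirm which of these applies to htsearch.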
--
Jonathan Gardner
[EMAIL PROTECTED]
(425)820-2244 x123
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html