Looking at the W3C spec, it seems that putting a DTD fragment, for instance:
<!ENTITY pound "£">
into the XML document should be sufficient to render &pound; correctly as
a pound sign (Latin-1 character 163). However, when I try to parse this XML
with Sablotron it gives the following:
Error [code:2] [URI:file:/home/httpd/neoweb/html/tmp/odd.xml] [line:3]
XML parser error 2: syntax error
odd.xml
-------
<?xml version="1.0" standalone="no"?>
<!DOCTYPE moreovernews SYSTEM
"http://p.moreover.com/xml_dtds/moreovernews.dtd">
<!ENTITY pound "£">
<allproducts>
<quid>&pound;</quid>
<quid2>&#163;</quid2>
<quid3>�</quid3>
<amp>&amp;</amp>
<acute>&acute;</acute>
<apos>&apos;</apos>
<quot>&quot;</quot>
</allproducts>
Presumably I'm not issuing the fragment correctly, but I'm kind of stumped
now. I'd appreciate any pointers as to where I'm going wrong.
Thanks
Nick
> -----Original Message-----
> From: Sebastian Rahtz
> [mailto:[EMAIL PROTECTED]]
> Sent: 06 September 2000 20:39
> To: Sablotron Mailing List
> Subject: Re: [Sab] Disappearing special browser chars bug?
>
>
> Nick Vincent writes:
> > <?xml version="1.0" standalone="no"?>
> > <!DOCTYPE moreovernews SYSTEM
> > "http://p.moreover.com/xml_dtds/moreovernews.dtd">
> >
> > <allproducts>
> > <quid>&pound;</quid>
> > <amp>&amp;</amp>
> > <acute>&acute;</acute>
> > </allproducts>
>
> If the entities are defined in your external DTD, the Expat parser does
> NOT read them. It only reads the internal DTD subset in the document
> itself. It is maddening, but such is life. Many other parsers *do* read
> the DTD, so you can get caught out like this.
>
> > Is this by design,
> yes
>
> > and if so is there any way round it?
>
> define the entities in the DTD subset in each document
>
> > I've tried using the &#163; syntax to specify a pound sign, but a 'Â£'
> > (Ascii 194, Ascii 163) appears in the document when I parse it, and is
> > subsequently displayed in the browser. This also looks a bit silly.
>
> that's because the default output encoding is UTF-8, which any decent
> browser should understand. Does Sablotron support other output
> encodings? I cannot remember.
>
> you are in a Unicode world now. probably best to bite the bullet and
> get software which understands Unicode/UTF-8. or run post-processing
> converters (like iconv)
>
> Sebastian
>
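
For anyone reading this thread later, here is a rough sketch of the two points
in the reply above. It is written in Python, whose xml.etree.ElementTree
module parses with Expat, so it only stands in for Sablotron and is not
Sablotron itself. The document is trimmed to a single element, the "&#163;"
replacement text is an assumption about what the declaration should contain,
and parse_quid is a hypothetical helper made up for the illustration.

```python
import xml.etree.ElementTree as ET  # ElementTree parses with Expat

# Declaration placed after the DOCTYPE has already closed, as in odd.xml.
BROKEN = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE moreovernews SYSTEM
"http://p.moreover.com/xml_dtds/moreovernews.dtd">
<!ENTITY pound "&#163;">
<allproducts>
<quid>&pound;</quid>
</allproducts>
"""

# Same declaration moved inside the [ ... ] internal subset of the DOCTYPE.
# The external DTD named by SYSTEM is not fetched here.
FIXED = """<?xml version="1.0" standalone="no"?>
<!DOCTYPE moreovernews SYSTEM
"http://p.moreover.com/xml_dtds/moreovernews.dtd" [
<!ENTITY pound "&#163;">
]>
<allproducts>
<quid>&pound;</quid>
</allproducts>
"""


def parse_quid(document):
    """Return the text of <quid>, or None if the document does not parse."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return None
    return root.findtext("quid")


broken_result = parse_quid(BROKEN)  # declaration outside the DOCTYPE
fixed_result = parse_quid(FIXED)    # declaration in the internal subset

# The pound sign is one character but two bytes in UTF-8; a viewer that
# assumes Latin-1 shows those two bytes as two separate characters.
utf8_bytes = list(fixed_result.encode("utf-8"))
misread = fixed_result.encode("utf-8").decode("latin-1")

print(broken_result, repr(fixed_result), utf8_bytes, repr(misread))
```

The only difference between the two documents is where the <!ENTITY ...> line
sits: inside the square brackets of the DOCTYPE it is part of the internal
subset, which is the part Expat reads. The last two values show where the
194/163 pair comes from when UTF-8 output is viewed as Latin-1.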