Looking at the W3C spec, it seems that putting a DTD fragment, for instance:

<!ENTITY pound "&#163;">

into the XML document should be sufficient to render &pound; correctly as
a pound sign (character 163, i.e. U+00A3, which is Latin-1 rather than
ASCII). However, when I try to parse this XML with Sablotron it gives the
following:

Error [code:2] [URI:file:/home/httpd/neoweb/html/tmp/odd.xml] [line:3] 
  XML parser error 2: syntax error

odd.xml
-------
<?xml version="1.0" standalone="no"?>
<!DOCTYPE moreovernews SYSTEM
"http://p.moreover.com/xml_dtds/moreovernews.dtd">
<!ENTITY pound "&#163;">

<allproducts>
        <quid>&pound;</quid>
        <quid2>&#163;</quid2>
        <quid3>£</quid3>
        <amp>&amp;</amp>
        <acute>&acute;</acute>
        <apos>&apos;</apos>
        <quot>&quot;</quot>
</allproducts>

Presumably I'm not issuing the fragment correctly, but I'm kind of stumped
now.  I'd appreciate any pointers as to where I'm going wrong.
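
For anyone wanting to reproduce this outside Sablotron, here is a minimal
sketch using Python's binding to Expat (the parser Sablotron is built on).
The cut-down documents, the example.dtd name and the helper names are made
up for illustration. It contrasts a declaration placed after the DOCTYPE, as
in odd.xml, with one placed inside the DOCTYPE's square-bracketed internal
subset, which as far as I can tell is the only place the XML 1.0 grammar
allows an <!ENTITY> declaration within the document itself.

```python
import xml.parsers.expat

# Declaration after the DOCTYPE has closed, as in odd.xml
# (hypothetical cut-down copy; example.dtd is never fetched here).
MISPLACED = b"""<?xml version="1.0" standalone="no"?>
<!DOCTYPE allproducts SYSTEM "example.dtd">
<!ENTITY pound "&#163;">
<allproducts><quid>&pound;</quid></allproducts>
"""

# Same declaration moved into the internal subset: [ ... ] inside the DOCTYPE.
INTERNAL_SUBSET = b"""<?xml version="1.0" standalone="no"?>
<!DOCTYPE allproducts SYSTEM "example.dtd" [
  <!ENTITY pound "&#163;">
]>
<allproducts><quid>&pound;</quid></allproducts>
"""


def parse_text(doc):
    """Parse doc with Expat and return the concatenated character data."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    parser.Parse(doc, True)  # True marks this as the final chunk
    return "".join(chunks)


def try_parse(doc):
    """Return (text, None) on success or (None, error) if Expat rejects doc."""
    try:
        return parse_text(doc), None
    except xml.parsers.expat.ExpatError as err:
        return None, err


if __name__ == "__main__":
    for name, doc in (("misplaced", MISPLACED),
                      ("internal subset", INTERNAL_SUBSET)):
        text, err = try_parse(doc)
        # ascii() keeps the output printable on any terminal encoding
        print(name, "->", ascii(text) if err is None else "error: %s" % err)
```

If the first variant fails on line 3 and the second one parses, the fragment
itself is fine and only its position in the prolog is the problem.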

Thanks

Nick


> -----Original Message-----
> From: Sebastian Rahtz
> [mailto:[EMAIL PROTECTED]]
> Sent: 06 September 2000 20:39
> To: Sablotron Mailing List
> Subject: Re: [Sab] Disappearing special browser chars bug?
> 
> 
> Nick Vincent writes:
>  > <?xml version="1.0" standalone="no"?>
>  > <!DOCTYPE moreovernews SYSTEM
>  > "http://p.moreover.com/xml_dtds/moreovernews.dtd">
>  > 
>  > <allproducts>
>  >    <quid>&pound;</quid>
>  >    <amp>&amp;</amp>
>  >    <acute>&acute;</acute>
>  > </allproducts>
> 
> If the entities are defined in your DTD, the Expat parser does NOT
> read them. It only reads the DTD subset in the document itself. It is
> maddening, but such is life. Many other parsers *do* read the DTD, so
> you can get caught out like this
> 
>  > Is this by design,
> yes
> 
>  >  and if so is there any way round it?  
> 
> define the entities in the DTD subset in each document
> 
>  > I've tried using the &#163; syntax to specify a pound sign, but a 'Â£'
>  > (Ascii 194, Ascii 163) appears in the document when I parse it, and is
>  > subsequently displayed in the browser.  This also looks a bit silly.
> 
> that's because the default output encoding is UTF8, which any decent
> browser should understand. Does Sablotron support other output
> encodings? I cannot remember.
> 
> you are in a Unicode world now. probably best to bite the bullet and
> get software which understands Unicode/UTF8. or run post-processing
> converters (like iconv)
> 
> Sebastian
> 
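
The 194/163 pair mentioned in the quoted exchange can be reproduced with a
few lines of Python. This is only a sketch of the encoding arithmetic, not of
anything Sablotron does internally; the variable names are made up, and it
assumes the browser was reading the UTF-8 output as Latin-1.

```python
pound = "\u00a3"  # the pound sign, code point 163

# A UTF-8 serialiser writes one character as two bytes.
utf8_bytes = pound.encode("utf-8")

# A reader that assumes Latin-1 turns those two bytes into two characters.
as_latin1 = utf8_bytes.decode("latin-1")

# A post-processing converter (iconv or similar) targeting Latin-1
# would write a single byte instead.
latin1_bytes = pound.encode("latin-1")

print(list(utf8_bytes))
print(ascii(as_latin1))  # ascii() keeps this printable on any terminal
print(list(latin1_bytes))
```

So the extra character is a symptom of the bytes being read with the wrong
encoding, not of the parser inventing anything.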
