In a moment of interantihistamine lucidity tonight, it occurred to me that XML::Parser normally doesn't go get the DTD that an XML document refers to, and normally doesn't have to.

But, this train of thought continued, suppose one had this bit of code in a PXML document:

  <item><c>'dumb' => 1</c></item>
  <p>Enables na&iuml;ve parsing.</p>

To resolve the &iuml;, XML::Parser would have to get and process the PXML DTD and see it pulling in the W3C's XHTML entity references, which it would have to go get and process.

This is all fine for my local copy of nsgmls, as I've set up the local catalog file to redirect queries on the PXML DTD, as well as on the W3C things, to local files. But I figure that XML::Parser would have to hit perl.com (or wherever else I keep the DTD) and W3C.org to get all the files necessary to know that &iuml; is an "ï" character. I don't know if XML::Parser uses a CATALOG file, but even if it did, that's just one more thing people have to bother with.

So I'm thinking of bypassing this problem by banishing from the DTD /all/ those definitions of character entities. (Altho this still leaves &amp;, &lt;, &gt;, &apos;, and &quot;, which are predefined.)

This would mean that if you wanted a "ï" character in PXML, you'd have three alternatives:

1) just use a "ï". Just make sure that the XML document's declared encoding agrees with the one your editor's using, and make a point of not putting your POD thru 8-bit-impure lines. As we're not living in 1985, the latter requirement is presumably not problematic. (Anyone planning to transmit their POD to 1985 and back might consider UTF-7, God help us all.)

2) use a numeric character reference, i.e., &#239; or &#xEF;

3) brave souls just define the entity for themselves: <!ENTITY iuml "&#239;">

If anyone would be rather put out by these, or has some other suggestion, then SPEAK NOW, or forever hold your Reese's Pieces!
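For the curious, here's a rough sketch of how those three alternatives behave when the parser never looks at an external DTD. XML::Parser sits on top of expat; rather than guess at its exact options, this uses Python's standard-library ElementTree, which wraps the same expat library, so take it as an illustration of expat-style behavior and not of XML::Parser's API. The helper name text_of and the sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET


def text_of(doc):
    """Parse doc without fetching any DTD; return the root's text, or None on a parse error."""
    try:
        return ET.fromstring(doc).text
    except ET.ParseError:
        return None


samples = {
    # The problem case: a named entity with no declaration in sight.
    "undeclared": "<p>Enables na&iuml;ve parsing.</p>",
    # 1) the literal character
    "literal": "<p>Enables na\u00efve parsing.</p>",
    # 2) numeric character references, decimal and hex
    "decimal": "<p>Enables na&#239;ve parsing.</p>",
    "hex": "<p>Enables na&#xEF;ve parsing.</p>",
    # 3) the entity declared in the document's own internal subset
    "declared": '<!DOCTYPE p [<!ENTITY iuml "&#239;">]>'
                "<p>Enables na&iuml;ve parsing.</p>",
}

for name, doc in samples.items():
    # None means the parser gave up on the document.
    print(name, repr(text_of(doc)))
```

Only the undeclared case should come back as None; the other four need nothing beyond the document itself.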
Anyhoo, I figure that if I take all the character entity declarations out of the PXML DTD, there'll be no need for XML parsers to go snare the DTD, so that one could even sensibly label PXML documents as standalone!

(BTW, did I mention that Pod::PXML's xml2pod is a validating parser? The DTD is hardwired in -- well, the element content models, at least. I suppose it'd be trivial to add attribute validity checking.)

(BTW, Reese's Pieces is a trademark of Hershey Foods Corporation. But I prefer Droste dark chocolate anyway.)

--
Sean M. Burke  [EMAIL PROTECTED]  http://www.spinn.net/~sburke/
