ID:               41041
User updated by:  bleathem at gmail dot com
Reported By:      bleathem at gmail dot com
Status:           Bogus
Bug Type:         DOM XML related
Operating System: Mac OS X 10.4.9
PHP Version:      5.2.1
Assigned To:      rrichards
New Comment:

Thanks, this solved my problem exactly, and sorry for wasting your time.
I did read through the PHP documentation extensively; however, I was
looking in the section on DOM rather than the one on libxml. Believe me,
the effort involved in submitting a bug report (installing the latest
version, writing sample code, etc.) makes it a last resort!


Previous Comments:
------------------------------------------------------------------------

[2007-04-10 16:11:25] [EMAIL PROTECTED]

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

You need to substitute the entities (LIBXML_NOENT) when loading the XML
into the DOM document.

------------------------------------------------------------------------

[2007-04-10 15:50:45] bleathem at gmail dot com

Description:
------------
I've created an XML file that uses a doctype to define HTML entities,
and an accompanying Relax NG file to validate it. If the XML element
that contains the entity follows a Relax NG <interleave> block, the
validation fails.

I have demonstrated this in the accompanying source code. The variable
$xml is validated against two Relax NG schemas: one where the preceding
elements are contained in an <interleave> block, and one where they are
not. The validation fails in the <interleave> case.

I have also tried the validation with an online validator (Java-based,
uses Jing), see: http://hsivonen.iki.fi/validator/ so the problem is not
in the XML or the Relax NG, but rather in the validator itself.

I have found other circumstances where the presence of entities causes
the validation to fail; I can provide these if they are necessary.

Reproduce code:
---------------
<?php
$xml = <<<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root[
<!ENTITY mu "mu">
]>
<eec:field xmlns:eec="http://www.triumf.info/common/xml/eec"
           xmlns="http://www.w3.org/1999/xhtml">
  <eec:title>Isotope</eec:title>
  <eec:name>sec_isotope</eec:name>
  <eec:type>text</eec:type>
  <eec:value>&mu;</eec:value>
</eec:field>
EOF;

$rng = <<<EOF
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:eec="http://www.triumf.info/common/xml/eec">
  <start>
    <element name="eec:field">
      <interleave>
        <element name="eec:title"><text/></element>
        <element name="eec:name"><text/></element>
        <element name="eec:type"><text/></element>
      </interleave>
      <element name="eec:value"><text/></element>
    </element>
  </start>
</grammar>
EOF;

$rng2 = <<<EOF
<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:eec="http://www.triumf.info/common/xml/eec">
  <start>
    <element name="eec:field">
      <element name="eec:title"><text/></element>
      <element name="eec:name"><text/></element>
      <element name="eec:type"><text/></element>
      <element name="eec:value"><text/></element>
    </element>
  </start>
</grammar>
EOF;

ini_set('track_errors', 1);
ini_set('error_reporting', E_ALL | E_STRICT);

$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_DTDLOAD|LIBXML_DTDATTR);

echo "<h3>1st Time</h3>";
if ($dom->relaxNGValidateSource($rng))
    echo "Relax NG validated";
else
    echo $php_errormsg;

echo "<h3>2nd Time</h3>";
if ($dom->relaxNGValidateSource($rng2))
    echo "Relax NG validated";
else
    echo $php_errormsg;
?>

Expected result:
----------------
1st Time
Relax NG validated
2nd Time
Relax NG validated

Actual result:
--------------
1st Time
Element value has extra content: mu
2nd Time
Relax NG validated

------------------------------------------------------------------------

-- 
Edit this bug report at http://bugs.php.net/?id=41041&edit=1
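
For reference, the suggestion from the 2007-04-10 16:11:25 comment
(substituting entities with LIBXML_NOENT at load time) might look like
the sketch below. The trimmed-down document and the variable names are
assumptions made for illustration and are not taken from the report;
only the load flags and the namespace URI come from it. The sketch loads
the same XML with and without LIBXML_NOENT and prints the class of the
node found inside <eec:value>, which is what the validator gets to see.

```php
<?php
// Hypothetical, trimmed-down document: one element whose only content
// is a reference to the internally declared entity "mu".
$xml = <<<EOF
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
  <!ENTITY mu "mu">
]>
<eec:field xmlns:eec="http://www.triumf.info/common/xml/eec">
<eec:value>&mu;</eec:value>
</eec:field>
EOF;

$ns = 'http://www.triumf.info/common/xml/eec';

// Flags as used in the report: the entity reference is kept in the
// tree as a separate node instead of being turned into text.
$plain = new DOMDocument();
$plain->loadXML($xml, LIBXML_DTDLOAD | LIBXML_DTDATTR);
$plainChild = $plain->getElementsByTagNameNS($ns, 'value')->item(0)->firstChild;

// With LIBXML_NOENT added: the parser substitutes the entity, so the
// element holds ordinary text.
$subst = new DOMDocument();
$subst->loadXML($xml, LIBXML_NOENT | LIBXML_DTDLOAD | LIBXML_DTDATTR);
$substChild = $subst->getElementsByTagNameNS($ns, 'value')->item(0)->firstChild;

// Show what kind of node each document holds inside <eec:value>.
echo get_class($plainChild), "\n";
echo get_class($substChild), ": ", $substChild->nodeValue, "\n";
```

Loading the document from the reproduce code with the same three flags
before calling relaxNGValidateSource() is the change the comment
describes. Note that LIBXML_NOENT also enables substitution of external
entities, so it is best reserved for XML from a trusted source.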