Edit report at https://bugs.php.net/bug.php?id=48080&edit=1
ID:                 48080
Updated by:         cataphr...@php.net
Reported by:        jose dot rob dot jr at gmail dot com
Summary:            Add support for forcing DOM to validate a DOMDocument with a DTD
Status:             Assigned
Type:               Feature/Change Request
Package:            DOM XML related
PHP Version:        5.2.9
Assigned To:        cataphract
Block user comment: N
Private report:     N

New Comment:

The particular solution proposed in this request for handling the external DTD is not implemented, but you can now intercept the loading of such an external entity with the external entity loader.

See https://svn.php.net/viewvc/php/php-src/trunk/ext/libxml/tests/libxml_set_external_entity_loader_basic.phpt?revision=315672&view=markup

Previous Comments:
------------------------------------------------------------------------
[2011-09-08 09:49:11] bj...@php.net

Gustavo; So this is fixed?

------------------------------------------------------------------------
[2011-08-29 05:08:32] cataphr...@php.net

Added libxml_set_external_entity_loader() in PHP 5.4/trunk, which also solves this problem.

------------------------------------------------------------------------
[2010-10-31 12:49:50] php at example dot com

It should also be noted that this affects any DOMDocuments using the standard XHTML SystemIDs. The W3C decided to block all requests to their URIs.

See http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic

------------------------------------------------------------------------
[2009-04-26 17:17:38] jose dot rob dot jr at gmail dot com

Description:
------------
I need to validate XML files before loading them, so I created a DTD and hosted it. With Python I can distribute the DTD file with the program and validate the XML file locally.

A Python example:
---
from io import StringIO

from lxml import etree

# Document prolog that points at the remotely hosted DTD.
xmlstart = b"""<?xml version="1.0" encoding="utf8"?>
<!DOCTYPE example PUBLIC '-//Example//Example DTD' 'http://example.com/mydtd.dtd'>"""
xmlok = xmlstart + b"<example>The XML file</example>"
xmlinvalid = xmlstart + b"<example><a>test</a>The XML file</example>"

# The DTD is shipped with the program and parsed from memory.
dtddata = "<!ELEMENT example (#PCDATA) >"
dtd = etree.DTD(StringIO(dtddata))

print("Valid XML:")
xml1 = etree.XML(xmlok)
validation = dtd.validate(xml1)
print(validation)
print(dtd.error_log.filter_from_errors())

print("Invalid XML:")
xml2 = etree.XML(xmlinvalid)
validation = dtd.validate(xml2)
print(validation)
print(dtd.error_log.filter_from_errors())
----

The only way I have found to port this script is to use DOMDocument::validate(), but this method fetches the DTD from http://example.com/mydtd.dtd, so it is slower, generates traffic, and fails when example.com is off-line.

I suggest adding an argument, as in DOMDocument::validate($source), where $source is a string containing the DTD source, to avoid situations like this.

Reproduce code:
---------------
<?php
$xmlstart=<<<XML
<?xml version="1.0" encoding="utf8"?>
<!DOCTYPE example PUBLIC '-//Example//Example DTD' 'http://example.com/mydtd.dtd'>
XML;

$xmlok=$xmlstart."<example>The XML file</example>";
$xmlinvalid=$xmlstart."<example><a>test</a>The XML file</example>";
$dtddata="<!ELEMENT example (#PCDATA) >";

print "<h1>Valid XML:</h1>";

$xml1=DOMDocument::loadXML($xmlok);
$validation=(int)$xml1->validate($dtddata); //Example that would work
print "<p><b>$validation</b></p>";

print "<h1>Invalid XML:</h1>";
$xml1=DOMDocument::loadXML($xmlinvalid);
$validation=(int)$xml1->validate($dtddata); //Example that would work
print "<p><b>$validation</b></p>";
?>

Expected result:
----------------
Valid XML:
1
Invalid XML:
Warning: DOMDocument::validate() [function.DOMDocument-validate]: Element example was declared #PCDATA but contains non text nodes in /script/path/xml.php on line 19
Warning: DOMDocument::validate() [function.DOMDocument-validate]: No declaration for element a in /script/path/xml.php on line 19
0

Actual result:
--------------
When no argument is passed to validate and DTD server is off-line:

Valid XML:
Warning: DOMDocument::validate(http://example.com/mydtd.dtd) [function.DOMDocument-validate]: failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in /script/path/xml.php on line 14
Warning: DOMDocument::validate() [function.DOMDocument-validate]: I/O warning : failed to load external entity "http://example.com/mydtd.dtd" in /script/path/xml.php on line 14
Warning: DOMDocument::validate() [function.DOMDocument-validate]: Could not load the external subset "http://example.com/mydtd.dtd" in /script/path/xml.php on line 14
0
Invalid XML:
Warning: DOMDocument::validate(http://example.com/mydtd.dtd) [function.DOMDocument-validate]: failed to open stream: HTTP request failed! HTTP/1.1 404 Not Found in /script/path/xml.php on line 19
Warning: DOMDocument::validate() [function.DOMDocument-validate]: I/O warning : failed to load external entity "http://example.com/mydtd.dtd" in /script/path/xml.php on line 19
Warning: DOMDocument::validate() [function.DOMDocument-validate]: Could not load the external subset "http://example.com/mydtd.dtd" in /script/path/xml.php on line 19
0

------------------------------------------------------------------------
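
For illustration, the workaround named in the newest comment (libxml_set_external_entity_loader(), available from PHP 5.4) might be applied to the reporter's example roughly as follows. This is only a sketch, not the approach from the linked test file verbatim: the helper name validateWithLocalDtd() and the $localDtds map are hypothetical and introduced here for the example, and the declared encoding is written as "UTF-8" rather than the report's "utf8". The loader returns a stream holding the DTD source for known system IDs and null for anything else, so no network request is attempted.

```php
<?php
// Hypothetical helper (not part of PHP): validate $xml against DTDs held in
// memory. $localDtds maps a system ID (the DTD URL) to the DTD source.
function validateWithLocalDtd($xml, array $localDtds)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    libxml_set_external_entity_loader(
        function ($public, $system, $context) use ($localDtds) {
            // Unknown entity: returning null makes the resolution fail
            // rather than falling back to fetching the URL.
            if (!is_string($system) || !isset($localDtds[$system])) {
                return null;
            }
            // Hand libxml a readable stream containing the DTD source.
            $stream = fopen('php://temp', 'r+');
            fwrite($stream, $localDtds[$system]);
            rewind($stream);
            return $stream;
        }
    );

    $doc = new DOMDocument();
    $valid = $doc->loadXML($xml) && $doc->validate();

    // Restore the default loader and the previous error handling mode.
    libxml_set_external_entity_loader(null);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $valid;
}

$xmlstart = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
    . "<!DOCTYPE example PUBLIC '-//Example//Example DTD' 'http://example.com/mydtd.dtd'>\n";
$dtds = array('http://example.com/mydtd.dtd' => '<!ELEMENT example (#PCDATA) >');

var_dump(validateWithLocalDtd($xmlstart . '<example>The XML file</example>', $dtds));
var_dump(validateWithLocalDtd($xmlstart . '<example><a>test</a>The XML file</example>', $dtds));
```

The first call is expected to report the document as valid and the second as invalid, matching the "Expected result" in the report, without contacting example.com. The entity loader is a process-wide libxml setting, which is why the sketch resets it before returning.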