Edit report at https://bugs.php.net/bug.php?id=48080&edit=1

 ID:                 48080
 Updated by:         cataphr...@php.net
 Reported by:        jose dot rob dot jr at gmail dot com
 Summary:            Add support for forcing DOM to validate a
                     DOMDocument with a DTD
 Status:             Assigned
 Type:               Feature/Change Request
 Package:            DOM XML related
 PHP Version:        5.2.9
 Assigned To:        cataphract
 Block user comment: N
 Private report:     N

 New Comment:

The particular solution proposed in this request for handling the external 
DTD is not implemented, but you can now intercept the loading of such 
external entities with the external entity loader. See 
https://svn.php.net/viewvc/php/php-src/trunk/ext/libxml/tests/libxml_set_external_entity_loader_basic.phpt?revision=315672&view=markup
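
For illustration, here is a minimal sketch of how the reporter's case could be 
handled with libxml_set_external_entity_loader(), modelled on the linked test. 
It assumes PHP 5.4 or later; the variable names ($dtddata, $xmlstart) are taken 
from the report, and serving the DTD from a php://temp stream is only one of 
several possible choices, not the prescribed one.

```php
<?php
// DTD shipped with the application instead of being fetched over HTTP.
$dtddata = '<!ELEMENT example (#PCDATA)>';

// The callback receives the public ID, the system ID and a context array.
// It may return a stream, a path/URI string, or null to refuse the load.
libxml_set_external_entity_loader(
    function ($publicId, $systemId, $context) use ($dtddata) {
        if ($systemId !== 'http://example.com/mydtd.dtd') {
            return null; // assumption: refuse any other external entity
        }
        $stream = fopen('php://temp', 'r+');
        fwrite($stream, $dtddata);
        rewind($stream);
        return $stream;
    }
);

$xmlstart = '<?xml version="1.0" encoding="utf-8"?>' . "\n"
    . "<!DOCTYPE example PUBLIC '-//Example//Example DTD' "
    . "'http://example.com/mydtd.dtd'>\n";

// Collect validation errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$ok = new DOMDocument();
$ok->loadXML($xmlstart . '<example>The XML file</example>');
$okValid = $ok->validate();

$bad = new DOMDocument();
$bad->loadXML($xmlstart . '<example><a>test</a>The XML file</example>');
$badValid = $bad->validate();

var_dump($okValid, $badValid);

// Restore the default loader so later code is unaffected.
libxml_set_external_entity_loader(null);
```

With the loader in place, validate() never contacts example.com; the first 
document should validate and the second should not.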


Previous Comments:
------------------------------------------------------------------------
[2011-09-08 09:49:11] bj...@php.net

Gustavo, so this is fixed?

------------------------------------------------------------------------
[2011-08-29 05:08:32] cataphr...@php.net

Added libxml_set_external_entity_loader() in PHP 5.4/trunk, which also solves 
this problem.

------------------------------------------------------------------------
[2010-10-31 12:49:50] php at example dot com

It should also be noted that this affects any DOMDocuments using the standard 
XHTML SystemIDs. The W3C decided to block all requests to their URIs. See 
http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic
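
As a sketch of how this can be worked around with the external entity loader 
available since PHP 5.4, the public IDs of the XHTML DTD and its entity sets 
can be mapped to local copies. The dtd/ directory, the file names in it, and 
the helper resolveFromCatalog() are hypothetical; the sketch assumes the files 
were downloaded once and are shipped with the application.

```php
<?php
// Hypothetical layout: local copies of the XHTML 1.0 Strict DTD and the
// entity sets it includes, stored in a dtd/ directory next to this script.
$catalog = [
    '-//W3C//DTD XHTML 1.0 Strict//EN'       => __DIR__ . '/dtd/xhtml1-strict.dtd',
    '-//W3C//ENTITIES Latin 1 for XHTML//EN' => __DIR__ . '/dtd/xhtml-lat1.ent',
    '-//W3C//ENTITIES Symbols for XHTML//EN' => __DIR__ . '/dtd/xhtml-symbol.ent',
    '-//W3C//ENTITIES Special for XHTML//EN' => __DIR__ . '/dtd/xhtml-special.ent',
];

// Hypothetical helper: return the local path for a known public ID,
// or null so that unknown entities are not loaded at all.
function resolveFromCatalog(array $catalog, $publicId)
{
    if ($publicId !== null && isset($catalog[$publicId])) {
        return $catalog[$publicId];
    }
    return null;
}

// The loader may return a path string; libxml then reads that file
// instead of requesting the system ID from www.w3.org.
libxml_set_external_entity_loader(
    function ($publicId, $systemId, $context) use ($catalog) {
        return resolveFromCatalog($catalog, $publicId);
    }
);
```

Keying the map on the public ID rather than the system ID keeps it independent 
of whichever URI a given document happens to use for the same DTD.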

------------------------------------------------------------------------
[2009-04-26 17:17:38] jose dot rob dot jr at gmail dot com

Description:
------------
I need to validate XML files before loading them, so I created a DTD and 
hosted it.

With Python I can distribute the DTD file with the program and validate the XML 
file locally.
A Python example:
---
from lxml import etree
from StringIO import StringIO

xmlstart="""<?xml version="1.0" encoding="utf8"?>
<!DOCTYPE example PUBLIC '-//Example//Example DTD' 
'http://example.com/mydtd.dtd'>"""

xmlok=xmlstart+"<example>The XML file</example>";
xmlinvalid=xmlstart+"<example><a>test</a>The XML file</example>";

dtddata="<!ELEMENT example (#PCDATA) >";

f=StringIO(dtddata);
dtd=etree.DTD(f);

print "Valid XML:";
xml1=etree.XML(xmlok);
validation=dtd.validate(xml1);
print validation;
print dtd.error_log.filter_from_errors();

print "Invalid XML:";
xml2=etree.XML(xmlinvalid);
validation=dtd.validate(xml2);
print validation;
print dtd.error_log.filter_from_errors();
----
The only way I found to port this script is to use DOMDocument::validate(), but 
this method fetches the DTD from http://example.com/mydtd.dtd, which is slower, 
generates traffic, and fails when example.com is off-line.

I suggest adding an optional argument, as in DOMDocument::validate($source), 
where $source is a string containing the DTD source, to avoid situations like 
this.

Reproduce code:
---------------
<?php
$xmlstart=<<<XML
<?xml version="1.0" encoding="utf8"?>
<!DOCTYPE example PUBLIC '-//Example//Example DTD' 
'http://example.com/mydtd.dtd'>
XML;

$xmlok=$xmlstart."<example>The XML file</example>";
$xmlinvalid=$xmlstart."<example><a>test</a>The XML file</example>";
$dtddata="<!ELEMENT example (#PCDATA) >";

print "<h1>Valid XML:</h1>";
$xml1=DOMDocument::loadXML($xmlok);
$validation=(int)$xml1->validate($dtddata); //Example that would work
print "<p><b>$validation</b></p>";
print "<h1>Invalid XML:</h1>";
$xml1=DOMDocument::loadXML($xmlinvalid);
$validation=(int)$xml1->validate($dtddata); //Example that would work
print "<p><b>$validation</b></p>";
?>

Expected result:
----------------
Valid XML:

1

Invalid XML:

Warning: DOMDocument::validate() [function.DOMDocument-validate]: Element 
example was declared #PCDATA but contains non text nodes in 
/script/path/xml.php on line 19

Warning: DOMDocument::validate() [function.DOMDocument-validate]: No 
declaration for element a in /script/path/xml.php on line 19

0

Actual result:
--------------
When no argument is passed to validate and DTD server is off-line:

Valid XML:

Warning: DOMDocument::validate(http://example.com/mydtd.dtd) 
[function.DOMDocument-validate]: failed to open stream: HTTP request failed! 
HTTP/1.1 404 Not Found in /script/path/xml.php on line 14

Warning: DOMDocument::validate() [function.DOMDocument-validate]: I/O warning : 
failed to load external entity "http://example.com/mydtd.dtd" in 
/script/path/xml.php on line 14

Warning: DOMDocument::validate() [function.DOMDocument-validate]: Could not 
load the external subset "http://example.com/mydtd.dtd" in /script/path/xml.php 
on line 14

0

Invalid XML:

Warning: DOMDocument::validate(http://example.com/mydtd.dtd) 
[function.DOMDocument-validate]: failed to open stream: HTTP request failed! 
HTTP/1.1 404 Not Found in /script/path/xml.php on line 19

Warning: DOMDocument::validate() [function.DOMDocument-validate]: I/O warning : 
failed to load external entity "http://example.com/mydtd.dtd" in 
/script/path/xml.php on line 19

Warning: DOMDocument::validate() [function.DOMDocument-validate]: Could not 
load the external subset "http://example.com/mydtd.dtd" in /script/path/xml.php 
on line 19

0


------------------------------------------------------------------------


