Edit report at http://bugs.php.net/bug.php?id=48080&edit=1
ID: 48080
Comment by: php at example dot com
Reported by: jose dot rob dot jr at gmail dot com
Summary: Add support for forcing DOM to validate a DOMDocument with a DTD
Status: Open
Type: Feature/Change Request
Package: Feature/Change Request
PHP Version: 5.2.9
Block user comment: N
New Comment:
Note that this also affects any DOMDocument that uses the standard XHTML
system identifiers, because the W3C has decided to block requests to
those URIs. See
http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic
Previous Comments:
------------------------------------------------------------------------
[2009-04-26 17:17:38] jose dot rob dot jr at gmail dot com
Description:
------------
I need to validate XML files before loading them, so I created a DTD
and hosted it.
With Python I can distribute the DTD file with the program and validate
the XML file locally.
A Python example:
---
from io import StringIO
from lxml import etree

# Bytes, because lxml rejects str input that carries an encoding declaration.
xmlstart = b"""<?xml version="1.0" encoding="utf8"?>
<!DOCTYPE example PUBLIC '-//Example//Example DTD'
'http://example.com/mydtd.dtd'>"""
xmlok = xmlstart + b"<example>The XML file</example>"
xmlinvalid = xmlstart + b"<example><a>test</a>The XML file</example>"

# The DTD ships with the program, so nothing is fetched from example.com.
dtddata = "<!ELEMENT example (#PCDATA) >"
dtd = etree.DTD(StringIO(dtddata))

print("Valid XML:")
xml1 = etree.XML(xmlok)
print(dtd.validate(xml1))
print(dtd.error_log.filter_from_errors())

print("Invalid XML:")
xml2 = etree.XML(xmlinvalid)
print(dtd.validate(xml2))
print(dtd.error_log.filter_from_errors())
----
The only way I have found to port this script is to use
DOMDocument::validate(), but this method fetches the DTD from
http://example.com/mydtd.dtd, which is slower, generates traffic, and
fails when example.com is off-line.
I suggest adding an optional argument, as in
DOMDocument::validate($source), where $source is a string containing the
DTD source, to avoid situations like this.
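For comparison only, here is a rough sketch of the same idea in lxml, the
library used in the Python example above: the parser intercepts the
system ID named in the DOCTYPE and is handed the DTD from a string, so
validation never touches the network. LOCAL_DTDS, LocalDTDResolver and
is_valid are hypothetical names chosen for illustration, and the sketch
assumes lxml is installed. It is not a PHP workaround.

```python
from lxml import etree

# Hypothetical in-memory DTD table, keyed by the system ID in the DOCTYPE.
LOCAL_DTDS = {
    "http://example.com/mydtd.dtd": "<!ELEMENT example (#PCDATA)>",
}

XML_START = (
    b'<?xml version="1.0" encoding="utf-8"?>\n'
    b"<!DOCTYPE example PUBLIC '-//Example//Example DTD' "
    b"'http://example.com/mydtd.dtd'>\n"
)


class LocalDTDResolver(etree.Resolver):
    """Serve known DTDs from memory instead of fetching them."""

    def resolve(self, system_url, public_id, context):
        dtd_source = LOCAL_DTDS.get(system_url)
        if dtd_source is None:
            return None  # unknown ID: leave it to lxml's default handling
        return self.resolve_string(dtd_source, context)


def is_valid(xml_bytes):
    """Parse with DTD validation on; return (valid, error messages)."""
    parser = etree.XMLParser(dtd_validation=True)
    parser.resolvers.add(LocalDTDResolver())
    try:
        etree.fromstring(xml_bytes, parser)
    except etree.XMLSyntaxError as exc:
        return False, [str(exc)]
    return True, []


if __name__ == "__main__":
    print(is_valid(XML_START + b"<example>The XML file</example>"))
    print(is_valid(XML_START + b"<example><a>test</a>The XML file</example>"))
```

The resolver returns None for system IDs it does not know, so other
documents keep their normal behaviour. A request like the one in this
report would give DOMDocument a comparable way to supply the DTD locally.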
Reproduce code:
---------------
<?php
$xmlstart=<<<XML
<?xml version="1.0" encoding="utf8"?>
<!DOCTYPE example PUBLIC '-//Example//Example DTD'
'http://example.com/mydtd.dtd'>
XML;
$xmlok=$xmlstart."<example>The XML file</example>";
$xmlinvalid=$xmlstart."<example><a>test</a>The XML file</example>";
$dtddata="<!ELEMENT example (#PCDATA) >";
print "<h1>Valid XML:</h1>";
$xml1=DOMDocument::loadXML($xmlok);
$validation=(int)$xml1->validate($dtddata); //Example that would work
print "<p><b>$validation</b></p>";
print "<h1>Invalid XML:</h1>";
$xml1=DOMDocument::loadXML($xmlinvalid);
$validation=(int)$xml1->validate($dtddata); //Example that would work
print "<p><b>$validation</b></p>";
?>
Expected result:
----------------
Valid XML:
1
Invalid XML:
Warning: DOMDocument::validate() [function.DOMDocument-validate]:
Element example was declared #PCDATA but contains non text nodes in
/script/path/xml.php on line 19
Warning: DOMDocument::validate() [function.DOMDocument-validate]: No
declaration for element a in /script/path/xml.php on line 19
0
Actual result:
--------------
When no argument is passed to validate() and the DTD server is off-line:
Valid XML:
Warning: DOMDocument::validate(http://example.com/mydtd.dtd)
[function.DOMDocument-validate]: failed to open stream: HTTP request
failed! HTTP/1.1 404 Not Found in /script/path/xml.php on line 14
Warning: DOMDocument::validate() [function.DOMDocument-validate]: I/O
warning : failed to load external entity "http://example.com/mydtd.dtd"
in /script/path/xml.php on line 14
Warning: DOMDocument::validate() [function.DOMDocument-validate]: Could
not load the external subset "http://example.com/mydtd.dtd" in
/script/path/xml.php on line 14
0
Invalid XML:
Warning: DOMDocument::validate(http://example.com/mydtd.dtd)
[function.DOMDocument-validate]: failed to open stream: HTTP request
failed! HTTP/1.1 404 Not Found in /script/path/xml.php on line 19
Warning: DOMDocument::validate() [function.DOMDocument-validate]: I/O
warning : failed to load external entity "http://example.com/mydtd.dtd"
in /script/path/xml.php on line 19
Warning: DOMDocument::validate() [function.DOMDocument-validate]: Could
not load the external subset "http://example.com/mydtd.dtd" in
/script/path/xml.php on line 19
0
------------------------------------------------------------------------