From:
Operating system: Linux
PHP version: 5.4.0alpha3
Package: DOM XML related
Bug Type: Bug
Bug description: DOMDocument::LoadHTMLFile fails with %xx sequences in filename.
Description:
------------
DOMDocument::LoadHTMLFile appears to urldecode its argument, which causes
problems when attempting to load a file whose name contains a %xx sequence.
This issue was brought up in ##php on freenode when someone was attempting to
load a file named 'Linux_Files%2Fetc%2Fbash.bashrc.html'. The suggested
workaround was to use LoadHTML + file_get_contents instead.
There was a small debate over whether this is a bug or just a documentation
problem (perhaps LoadHTMLFile expects a URL).
DOMDocument::Load() is also affected.
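The LoadHTML + file_get_contents workaround mentioned above could look roughly
like the sketch below. The helper name load_html_from_literal_path is
hypothetical (it is not part of the DOM extension), and the sketch assumes that
file_get_contents() treats the name as a plain local path, so the %xx sequences
are never decoded.

```php
<?php
// Hypothetical helper, not part of the DOM extension: read the file with
// PHP's own file functions, then parse the resulting string, so the file
// name is never handed to loadHTMLFile().
function load_html_from_literal_path($path)
{
    // file_get_contents() returns false (and emits a warning) on failure.
    $html = file_get_contents($path);
    if ($html === false || $html === '') {
        return false;
    }

    $doc = new DOMDocument();
    // loadHTML() parses markup from a string; no file name is involved here.
    if (!$doc->loadHTML($html)) {
        return false;
    }

    return $doc;
}
```

Calling load_html_from_literal_path('Linux_Files%2Fetc%2Fbash.bashrc.html')
should then return a DOMDocument on success or false if the file cannot be
read or parsed.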
Test script:
---------------
Contents of 'Linux_Files%2Fetc%2Fbash.bashrc.html'
---------------------------------------8<---------------------------------------
<html>
<head>
<title></title>
</head>
<body>
</body>
</html>
---------------------------------------8<---------------------------------------
Contents of 'test.php'
---------------------------------------8<---------------------------------------
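<?php
// The file name contains literal %2F sequences; it is not URL-encoded.
$file = 'Linux_Files%2Fetc%2Fbash.bashrc.html';
$doc = new DOMDocument();
// Case 1: pass the literal file name.
$doc->loadHTMLFile($file);
var_dump($doc->getElementsByTagName('body')->length);
echo str_repeat('-', 80), "\r\n";
$doc2 = new DOMDocument();
// Case 2: pass the urlencode()d name, in which each '%' becomes '%25'.
$doc2->loadHTMLFile(urlencode($file));
var_dump($doc2->getElementsByTagName('body')->length);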
---------------------------------------8<---------------------------------------
Expected result:
----------------
Expect ->loadHTMLFile($file) to succeed and
->loadHTMLFile(urlencode($file)) to fail with a file-not-found type error.
Actual result:
--------------
->loadHTMLFile($file) fails with errors:
PHP Warning: DOMDocument::loadHTMLFile(): I/O warning : failed to load external entity "Linux_Files%2Fetc%2Fbash.bashrc.html" in /home/kicken/test.php on line 6
Warning: DOMDocument::loadHTMLFile(): I/O warning : failed to load external entity "Linux_Files%2Fetc%2Fbash.bashrc.html" in /home/kicken/test.php on line 6
->loadHTMLFile(urlencode($file)) succeeds.
--
Edit bug report at https://bugs.php.net/bug.php?id=55374&edit=1