Edit report at https://bugs.php.net/bug.php?id=65364&edit=1
ID: 65364
User updated by: mike at skew dot org
Reported by: mike at skew dot org
-Summary: baseURI should always be a real URI
+Summary: In doc not loaded from a URL, baseURI should still be a real URI
Status: Open
Type: Feature/Change Request
Package: DOM XML related
Operating System: FreeBSD 8.4-RELEASE
PHP Version: 5.4.17
Block user comment: N
Private report: N
New Comment:
Made summary (request title) more descriptive.
Previous Comments:
------------------------------------------------------------------------
[2013-07-31 05:28:46] mike at skew dot org
Description:
------------
When loading an XML document from memory (string, file, whatever), such as via
DOMDocument::loadXml(), PHP tells libxml to use the current working directory,
or I think sometimes the script's folder path, as the base URI. libxml (and
PHP's thin API) expose this value as the baseURI property in the DOM,
overriding it only as needed when xml:base is in scope.
As pointed out in the comments on Bug #44367, neither PHP nor libxml claims to
support DOM Core Level 3, so maybe this PHP/libxml "baseURI" property shouldn't
be expected to behave as it would in a L3-compliant DOM. Nevertheless, I think
it's reasonable to expect that something called "baseURI" would in fact be
formatted as a URI, thus making it useful as an actual base URI for resolving
URI references like href attribute values in XHTML documents and Atom feeds.
So, my request is that the path be prefixed with "file://", at the very least.
This will have the added benefit of making things forward-compatible with DOM
Core L3, if such support is intended for the future.
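
Until something like this is implemented, a caller can approximate the
requested behavior in userland. The following is only a minimal sketch, not
what PHP or libxml does: to_file_uri() is a hypothetical helper name, and it
assumes the value is either already a URI or an absolute POSIX path (a Windows
path such as C:\x would be mistaken for a URI with scheme "C").

```php
<?php
// Hypothetical helper (not part of PHP or libxml): turn the value reported
// by baseURI into an absolute URI when it is a bare absolute POSIX path.
function to_file_uri(string $base): string
{
    // Already has a scheme (e.g. "http:", "file:"): leave it unchanged.
    if (preg_match('/^[A-Za-z][A-Za-z0-9+.-]*:/', $base) === 1) {
        return $base;
    }
    // Percent-encode each path segment but keep the "/" separators.
    $segments = array_map('rawurlencode', explode('/', $base));
    return 'file://' . implode('/', $segments);
}

echo to_file_uri('/usr/home/x/'), "\n"; // prints file:///usr/home/x/
```

Values that already carry a scheme pass through untouched, so the helper
should be safe to apply to documents loaded from a URL as well.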
Also, my experience with XML apps suggests that a more typical default would be
a file URI corresponding to the script itself, rather than the cwd or script
folder. This would have no effect on resolution of relative references (except
an empty string), so it seems harmless in that regard. However, I would
understand if there's a desire to avoid disclosing the script's file name in
the DOM.
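
To illustrate why the choice between the folder and the script file should not
matter for ordinary relative references, here is a deliberately simplified
sketch. resolve_simple() is a hypothetical name, and it handles only plain
relative paths (no "..", no query or fragment, no absolute references); it is
not a full RFC 3986 resolver.

```php
<?php
// Hypothetical, simplified resolver: replaces everything after the last "/"
// of the base with the reference. An empty reference means "the base itself".
function resolve_simple(string $base, string $ref): string
{
    if ($ref === '') {
        return $base; // same-document reference
    }
    return substr($base, 0, strrpos($base, '/') + 1) . $ref;
}

$folder_base = 'file:///usr/home/x/';           // cwd or script folder
$script_base = 'file:///usr/home/x/script.php'; // hypothetical script name

// Both bases give the same result for a non-empty relative path.
var_dump(resolve_simple($folder_base, 'feed.xml') === resolve_simple($script_base, 'feed.xml'));
// Only the empty reference distinguishes the two bases.
var_dump(resolve_simple($folder_base, '') === resolve_simple($script_base, ''));
```

The first comparison yields true and the second false, which matches the
"except an empty string" caveat above.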
Test script:
---------------
<?php
$xml_string = "<greeting>hello world</greeting>";
$doc = new DOMDocument();
$doc->loadXml($xml_string);
$document_base = $doc->baseURI;
$document_element_base = $doc->documentElement->baseURI;
echo " DOMDocument->baseURI = " . $document_base;
// A URI, as opposed to a URI reference, is always absolute.
// That is, it has a scheme/protocol (something before ":").
// strpos() returns false when ":" is absent; compare with === so that a
// match at offset 0 is not mistaken for "not found".
echo (strpos($document_base, ":") === false ? " (not a real URI!)" : " (probably OK!)") . "\n";
echo " DocumentElement->baseURI = " . $document_element_base;
echo (strpos($document_element_base, ":") === false ? " (not a real URI!)" : " (probably OK!)") . "\n";
?>
Expected result:
----------------
DOMDocument->baseURI = file:///usr/home/x/ (probably OK!)
DocumentElement->baseURI = file:///usr/home/x/ (probably OK!)
Actual result:
--------------
DOMDocument->baseURI = /usr/home/x/ (not a real URI!)
DocumentElement->baseURI = /usr/home/x/ (not a real URI!)
------------------------------------------------------------------------