ID:               44367
User updated by:  daniel dot oconnor at gmail dot com
Reported By:      daniel dot oconnor at gmail dot com
Status:           Bogus
Bug Type:         DOM XML related
Operating System: Windows
PHP Version:      5.2.5
Assigned To:      rrichards
New Comment:

See http://www.w3.org/TR/grddl/#base_misc &
http://www.apps.ietf.org/rfc/rfc3986.html#sec-5.1

The way to determine baseURI is:
1. Look for it on the root document element (HTML - <base>, XML - <foo xml:base="">)
2. Couldn't find that? Use the URL we retrieved the document with
   * And make sure we follow redirects!
3. Couldn't find that? Application specific (but we don't really have a setBaseURI())

So, condition #1 is broken in 5.2.5 when you do:

<?php
$doc = DOMDocument::load('http://www.w3.org/2001/sw/grddl-wg/td/inline-rdf6.xml');
var_dump($doc->baseURI); //Expected http://wwww.example.org/

produces:
string(53) "http://www.w3.org/2001/sw/grddl-wg/td/inline-rdf6.xml"


Previous Comments:
------------------------------------------------------------------------

[2008-03-10 14:09:30] [EMAIL PROTECTED]

Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.php.net/manual/ and the instructions on how to report
a bug at http://bugs.php.net/how-to-report.php

Don't know about GRDDL, but for DOM trees, the base URI of a DOMDocument
is the URI it's loaded from (or, for a memory-based tree, the current dir).
You need to check on the document element to get the base URI you are
looking for.

------------------------------------------------------------------------

[2008-03-08 22:20:31] [EMAIL PROTECTED]

Rob, please take a look

------------------------------------------------------------------------

[2008-03-08 05:09:06] daniel dot oconnor at gmail dot com

Description:
------------
The W3C clarified a few xml:base issues when publishing the GRDDL spec.

You can see the tests at
http://www.w3.org/TR/grddl-tests/#ambiguous-infoset.

Basically:
 * DOMDocument::loadXML does not detect xml:base attributes
 * simplexml_load_file does not detect xml:base attributes (or they are lost during the importNode phase)
 * simplexml_load_string does not detect xml:base attributes (or they are lost during the importNode phase)
 * DOMDocument does not deal with nested xml:base
 * DOMDocument does not deal with redirected xml:base locations

To clarify on the redirect-xml:base stuff...
If I request http://foo.com/example.xml and that redirects me to
http://bar.com/example.xml and bar.com/example.xml said
xml:base = http://foo.com/example.xml ...
then http://bar.com/example.xml's baseURI should be
http://bar.com/example.xml

Reproduce code:
---------------
<?php
$url = 'http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml';
$xml = file_get_contents($url);

//Load a url
$doc = DOMDocument::load($url);
var_dump($doc->baseURI); //Expected http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

//Load an xml document with xml:base
$doc = DOMDocument::loadXML($xml);
var_dump($doc->baseURI); //Expected http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

//Does it work with importNode?
$sxe = simplexml_load_file($url);
$dom_sxe = dom_import_simplexml($sxe);

$dom = new DOMDocument('1.0');
$dom_sxe = $dom->importNode($dom_sxe, true);
$dom_sxe = $dom->appendChild($dom_sxe);
var_dump($doc->baseURI); //Expected (maybe) http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

// Alternative?
$sxe = simplexml_load_string($xml);
$dom_sxe = dom_import_simplexml($sxe);

$dom = new DOMDocument('1.0');
$dom_sxe = $dom->importNode($dom_sxe, true);
$dom_sxe = $dom->appendChild($dom_sxe);
var_dump($doc->baseURI); //Expected (maybe) http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

//What about documents with an invalid xml:base (not on the top level element)?
$doc = DOMDocument::load('http://www.w3.org/2001/sw/grddl-wg/td/inline-rdf6.xml');
var_dump($doc->baseURI); //Expected http://wwww.example.org/

//What about documents with a *redirected xml:base* ?
//Note: this test case is a little broken because of a W3C server change - it *should* redirect to 'http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml'
// and thus have a funky new xml:base value
$doc = DOMDocument::load('http://www.w3.org/2001/sw/grddl-wg/td/xmlWithBase.xml');
var_dump($doc->baseURI); //Expected http://www.w3.org/2001/sw/grddl-wg/td/base/xmlWithBase.xml

Expected result:
----------------
See reproduce code

Actual result:
--------------
See reproduce code

------------------------------------------------------------------------

-- 
Edit this bug report at http://bugs.php.net/?id=44367&edit=1
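
For anyone landing on this thread: the 2008-03-10 comment says to check the document element rather than the document itself. Below is a minimal sketch of that workaround, assuming a PHP build with the DOM extension enabled. The inline XML, the element names and the example.org URLs are made up for illustration; they are not the W3C GRDDL test files, and the exact value printed for the document node may differ between setups.

```php
<?php
// Hypothetical inline document (not one of the W3C test files).
$xml = <<<XML
<?xml version="1.0"?>
<root xml:base="http://www.example.org/">
  <child xml:base="sub/"/>
</root>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// Per the 2008-03-10 comment, the document node reports where the tree
// was loaded from (or the current dir for an in-memory tree), not xml:base.
var_dump($doc->baseURI);

// Element nodes are where xml:base is taken into account.
$root  = $doc->documentElement;
$child = $root->getElementsByTagName('child')->item(0);

$rootBase  = $root->baseURI;   // xml:base declared on the root element
$childBase = $child->baseURI;  // relative xml:base resolved against the root's
var_dump($rootBase, $childBase);
```

So code that needs the xml:base-aware value can read $doc->documentElement->baseURI (or the baseURI of whichever element it is working with) instead of $doc->baseURI. This does not address the redirect case from the description.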
