Forwarding a bug report I received.
--- Begin Message ---
Hello!

I wrote an RSS parser using PHP 5.0RC1 and SimpleXML and noticed a lot
of irregularities (bugs). I'll summarize them here.

1) There is no way to know if a tag actually exists or not
$xml = simplexml_load_string('<root><a>123</a><b>123</b></root>');
echo(count($xml->nonExistantElement)); // outputs 1
echo(isset($xml->nonExistantElement)); // outputs 1
echo(!empty($xml->nonExistantElement)); // outputs 1
var_dump($xml->nonExistantElement);
/* outputs:
object(simplexml_element)#2 (0) {
}
*/
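
A possible workaround for the existence check, offered as a minimal sketch rather than a fix: item 3 below shows xpath() returning an empty array when nothing matches, so a relative XPath query can stand in for isset(). The helper name has_child_element is hypothetical and not part of SimpleXML, and the sketch assumes xpath() behaves for this document the way it does in the later examples.

```php
<?php
// Hypothetical helper (not part of SimpleXML): reports whether $node has
// at least one child element named $name, using a relative XPath query.
function has_child_element($node, $name)
{
    $matches = $node->xpath($name);
    // xpath() is expected to return an array; anything else counts as "absent".
    return is_array($matches) && count($matches) > 0;
}

$xml = simplexml_load_string('<root><a>123</a><b>123</b></root>');
var_dump(has_child_element($xml, 'a'));                  // bool(true)
var_dump(has_child_element($xml, 'nonExistantElement')); // bool(false)
```

The helper only handles plain element names without namespace prefixes.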

2) XPath doesn't work at all for files with default namespaces.

$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/atom.xml');
/* Atom feed with a default namespace, the first line is:
<feed version="0.3" xmlns="http://purl.org/atom/ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en">
*/
echo(count($xml->xpath('/feed'))); //outputs 0


$xml = simplexml_load_file('atom.xml');
/* The same feed with a removed default namespace, the first line is:
<feed version="0.3" xmlns:dc="http://purl.org/dc/elements/1.1/"
xml:lang="en">
*/
echo(count($xml->xpath('/feed'))); //outputs 1


$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.xml');
/* RSS 0.91 feed with no namespaces defined, <rss> is the root element
*/
echo(count($xml->xpath('/rss'))); // outputs 1


$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.rdf');
/* RSS 1.0 (RDF) with no default namespace, the first line is:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" ...>
*/
var_dump($xml->xpath('/RDF')); // empty array
var_dump($xml->xpath('/rdf:RDF')); // works as expected

Unfortunately, using namespace prefixes in XPath is not an acceptable option
either: if the prefix is not defined in the document, the XPath engine raises a
PHP warning. Suppressing the warning with an @ in front of the function call
results in the function always returning true.
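
For context on the default-namespace case: in XPath 1.0 an unprefixed name such as feed only matches elements that are in no namespace, which would explain the 0 above. The sketch below shows two ways around that. The inline Atom document is a trimmed-down stand-in for the real feed, the atom prefix is an arbitrary choice, and registerXPathNamespace() is assumed to be available (it may not exist in 5.0RC1). The local-name() query needs no registration, so it also covers feeds whose namespace is not known in advance.

```php
<?php
// Trimmed-down stand-in for the Atom feed above (illustrative sample only).
$atom = '<feed version="0.3" xmlns="http://purl.org/atom/ns#" xml:lang="en">'
      . '<title>Example</title></feed>';
$xml = simplexml_load_string($atom);

// Unprefixed name test: matches only elements in no namespace.
$unprefixed = $xml->xpath('/feed');

// Namespace-agnostic query: compare the local name and ignore the namespace.
$byLocalName = $xml->xpath('/*[local-name()="feed"]');

// Bind a prefix of our own choosing ("atom") to the feed's default namespace.
$xml->registerXPathNamespace('atom', 'http://purl.org/atom/ns#');
$byPrefix = $xml->xpath('/atom:feed');

echo count($unprefixed), "\n";  // 0
echo count($byLocalName), "\n"; // 1
echo count($byPrefix), "\n";    // 1
```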

3) Tags with namespace prefixes are not referenceable.

$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.rdf');
/* Tags of interest:
<item
rdf:about="http://jeremy.zawodny.com/blog/archives/001752.html">...<dc:date>2004-03-20T20:25:55-08:00</dc:date>...</item>
*/
var_dump($xml);
/*  ... 
  ["item"]=>
  array(15) {
    [0]=>
    object(simplexml_element)#3 (6) {
      ["title"]=>
      string(24) "Slashdot feature request"
      ...
      ["date"]=> (that's the <dc:date>)
      string(25) "2004-03-20T22:51:00-08:00"
    }
... */
echo($xml->item[0]->title); // "Slashdot feature request"
echo($xml->item[0]->date); // nothing
echo($xml->item[0]->{'dc:date'}); // nothing
var_dump($xml->item[0]->xpath('date')); // empty array
var_dump($xml->item[0]->xpath('dc:date')); // works as expected
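
One way to reach prefixed tags without XPath, sketched under assumptions: children() and attributes() accept a namespace URI and return only the nodes in that namespace. The inline document is a cut-down illustration that reuses the item URL and dc:date from the excerpt above, with a placeholder title. Whether the namespace argument is honoured in 5.0RC1 is not confirmed here.

```php
<?php
// Cut-down illustration of the RDF feed (sample document, not the real feed).
$rdf = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
     . ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
     . '<item rdf:about="http://jeremy.zawodny.com/blog/archives/001752.html">'
     . '<title>Example</title>'
     . '<dc:date>2004-03-20T20:25:55-08:00</dc:date>'
     . '</item></rdf:RDF>';
$xml = simplexml_load_string($rdf);
$item = $xml->item[0];

// Select the children that live in the Dublin Core namespace, by URI.
$dc = $item->children('http://purl.org/dc/elements/1.1/');
echo $dc->date, "\n"; // 2004-03-20T20:25:55-08:00

// Namespaced attributes work the same way through attributes().
$rdfAttrs = $item->attributes('http://www.w3.org/1999/02/22-rdf-syntax-ns#');
echo $rdfAttrs['about'], "\n"; // http://jeremy.zawodny.com/blog/archives/001752.html
```

Addressing the namespace by URI rather than by prefix keeps the code working if a feed binds the same namespace to a different prefix.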


I don't know C well enough to be able to fix those bugs, so can the
SimpleXML maintainer take a look?


Regards,
Alek Andreev
[EMAIL PROTECTED]


--- End Message ---
-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
