Forwarding a bug report I received.
--- Begin Message ---
Hello!
I wrote an RSS parser using PHP 5.0RC1 and SimpleXML and noticed a lot
of irregularities (bugs). I'll summarize them here.
1) There is no way to know whether a tag actually exists
$xml = simplexml_load_string('<root><a>123</a><b>123</b></root>');
echo(count($xml->nonExistantElement)); // outputs 1
echo(isset($xml->nonExistantElement)); // outputs 1
echo(!empty($xml->nonExistantElement)); // outputs 1
var_dump($xml->nonExistantElement);
/* outputs:
object(simplexml_element)#2 (0) {
}
*/
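One way to test for a child element without relying on count(), isset() or empty() might be to ask XPath for it and count the matches. The following is only a minimal sketch, not a fix for the extension: the helper name hasChild is hypothetical, the type declarations assume PHP 7 or later, and $name is assumed to be a plain element name that is safe to embed in an XPath expression.

```php
<?php
// Hypothetical helper: reports whether $node has at least one child element
// named $name. Assumes $name is a simple, trusted element name.
function hasChild(SimpleXMLElement $node, string $name): bool
{
    // A relative XPath such as "a" selects the child elements with that name.
    $matches = $node->xpath($name);
    return is_array($matches) && count($matches) > 0;
}

$xml = simplexml_load_string('<root><a>123</a><b>123</b></root>');

var_dump(hasChild($xml, 'a'));                  // bool(true)
var_dump(hasChild($xml, 'nonExistantElement')); // bool(false)
```

Because the check goes through xpath(), it does not depend on how property access treats missing elements.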
2) XPath doesn't work at all for files with default namespaces.
$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/atom.xml');
/* Atom feed with a default namespace, the first line is:
<feed version="0.3" xmlns="http://purl.org/atom/ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en">
*/
echo(count($xml->xpath('/feed'))); //outputs 0
$xml = simplexml_load_file('atom.xml');
/* The same feed with a removed default namespace, the first line is:
<feed version="0.3" xmlns:dc="http://purl.org/dc/elements/1.1/"
xml:lang="en">
*/
echo(count($xml->xpath('/feed'))); //outputs 1
$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.xml');
/* RSS 0.91 feed with no namespaces defined, <rss> is the root element
*/
echo(count($xml->xpath('/rss'))); // outputs 1
$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.rdf');
/* RSS 1.0 (RDF) with no default namespace, the first line is:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" ...>
*/
var_dump($xml->xpath('/RDF')); // empty array
var_dump($xml->xpath('/rdf:RDF')); // works as expected
Unfortunately, using namespace prefixes in XPath is not an acceptable option either,
because if the namespace is not defined, the XPath engine raises a PHP warning.
Suppressing the warning with an @ in front of the function call results in the function
always returning true.
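For context, XPath 1.0 matches an unprefixed name such as feed only against elements in no namespace, which is consistent with the counts above. Two possible workarounds are sketched below under stated assumptions: the inline document is a made-up stand-in for the Atom feed (only its root attributes come from the report), the prefix atom is an arbitrary choice, and registerXPathNamespace() is assumed to be available, which is the case only in later PHP releases (5.1 and up).

```php
<?php
// Made-up stand-in for the Atom feed; the <title> child is illustrative only.
$atom = '<feed version="0.3" xmlns="http://purl.org/atom/ns#"'
      . ' xmlns:dc="http://purl.org/dc/elements/1.1/" xml:lang="en">'
      . '<title>Example</title></feed>';
$xml = simplexml_load_string($atom);

// An unprefixed name test only matches elements in no namespace.
$plain = $xml->xpath('/feed');

// Workaround A: compare the local name and ignore the namespace entirely.
$byLocalName = $xml->xpath('/*[local-name()="feed"]');

// Workaround B: bind a prefix of our choosing to the default namespace URI.
$xml->registerXPathNamespace('atom', 'http://purl.org/atom/ns#');
$byPrefix = $xml->xpath('/atom:feed');

echo count($plain), ' ', count($byLocalName), ' ', count($byPrefix), "\n"; // 0 1 1
```

Workaround A needs no knowledge of the namespace URI, so the same expression can be used on feeds with and without a default namespace, at the cost of also matching a same-named element from any other namespace.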
3) Tags with namespace prefixes cannot be referenced.
$xml = simplexml_load_file('http://jeremy.zawodny.com/blog/index.rdf');
/* Tags of interest:
<item
rdf:about="http://jeremy.zawodny.com/blog/archives/001752.html">...<dc:date>2004-03-20T20:25:55-08:00</dc:date>...</item>
*/
var_dump($xml);
/* ...
["item"]=>
array(15) {
[0]=>
object(simplexml_element)#3 (6) {
["title"]=>
string(24) "Slashdot feature request"
...
["date"]=> (that's the <dc:date>)
string(25) "2004-03-20T22:51:00-08:00"
}
... */
echo($xml->item[0]->title); // "Slashdot feature request"
echo($xml->item[0]->date); // nothing
echo($xml->item[0]->{'dc:date'}); // nothing
var_dump($xml->item[0]->xpath('date')); // empty array
var_dump($xml->item[0]->xpath('dc:date')); // works as expected
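In PHP releases where children() accepts a namespace URI, namespaced child elements can be reached through it rather than through plain property access. The sketch below assumes such a release. The inline document is a trimmed-down, made-up stand-in for index.rdf, and it assumes the dc prefix there is bound to the same Dublin Core URI as in the Atom feed above.

```php
<?php
// Made-up stand-in for index.rdf; only the <item> fields quoted in the
// report are reproduced, and the dc namespace binding is an assumption.
$rdf = '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
     . ' xmlns:dc="http://purl.org/dc/elements/1.1/">'
     . '<item rdf:about="http://jeremy.zawodny.com/blog/archives/001752.html">'
     . '<title>Slashdot feature request</title>'
     . '<dc:date>2004-03-20T20:25:55-08:00</dc:date>'
     . '</item></rdf:RDF>';
$xml = simplexml_load_string($rdf);

$item = $xml->item[0];

// Select the children of <item> that live in the Dublin Core namespace.
$dc = $item->children('http://purl.org/dc/elements/1.1/');
$date = (string) $dc->date;

echo $date, "\n"; // 2004-03-20T20:25:55-08:00
```

Passing the namespace URI rather than the prefix keeps the lookup working even if a feed binds Dublin Core to a prefix other than dc.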
I don't know C well enough to be able to fix those bugs, so can the
SimpleXML maintainer take a look?
Regards,
Alek Andreev
[EMAIL PROTECTED]
--- End Message ---
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php