[PHP] parsing malformed xml documents
Hi guys,

I'm parsing some XML documents and fetching nodes using XPath and the PHP 5.0 DOM. Unfortunately, some documents have leading whitespace or missing tags. In some situations the script just skips that document, or even crashes without any error message.

I tried loading them as HTML and disabling validation, but that didn't do the trick, as they contain invalid HTML tags. I wonder if there is some way (maybe an external class, or something similar) to accomplish this.

I know that, theoretically, it would be better to have well-formed XML (I think so too), but I need to handle these documents regardless, and they're created by an external source outside my control.

Thanks in advance,

Mariano Guadagnini

--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.385 / Virus Database: 268.3.5/300 - Release Date: 03/04/2006

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
Re: [PHP] parsing malformed xml documents
Mariano Guadagnini wrote:
> Hi guys,
>
> I'm parsing some XML documents and fetching nodes using XPath and the
> PHP 5.0 DOM. Unfortunately, some documents have leading whitespace or
> missing tags. In some situations the script just skips that document,
> or even crashes without any error message. I tried loading them as HTML
> and disabling validation, but that didn't do the trick, as they contain
> invalid HTML tags. I wonder if there is some way (maybe an external
> class, or something similar) to accomplish this.
>
> I know that, theoretically, it would be better to have well-formed XML
> (I think so too), but I need to handle these documents regardless, and
> they're created by an external source outside my control.

How about something like this:

    // First try the lenient HTML parser; @ suppresses parser warnings.
    $dom = @DOMDocument::loadHTML($xml);
    if (is_object($dom)) {
        $xpath = new DOMXPath($dom);
    } else {
        // If that fails, let tidy repair the markup and try again.
        $xml = tidy_repair_string($xml);
        if ($xml) {
            $dom = @DOMDocument::loadHTML($xml);
            if (is_object($dom)) {
                $xpath = new DOMXPath($dom);
            }
        }
    }

-Rasmus
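For readers on newer PHP versions, the same "try, repair, retry" idea can be sketched with an instance-based DOMDocument, since calling loadHTML() statically is no longer allowed in PHP 8. This is only a sketch under stated assumptions: it assumes PHP 7.1 or later, the function name load_lenient_xpath is hypothetical, and it swaps in libxml's recovery mode as the first fallback rather than the HTML parser used in the reply above. The tidy step runs only if the tidy extension is installed.

```php
<?php
/**
 * Hypothetical helper (not from the thread): load $xml with progressively
 * more lenient strategies and return a DOMXPath, or null if nothing parses.
 */
function load_lenient_xpath(string $xml): ?DOMXPath
{
    // Leading whitespace before an XML declaration is a fatal parse error.
    $xml = trim($xml);
    if ($xml === '') {
        return null;
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // Attempt 1: strict parse of the trimmed input.
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml);

    // Attempt 2: libxml recovery mode keeps whatever could be parsed.
    if (!$ok) {
        $dom = new DOMDocument();
        $dom->recover = true;
        $ok = $dom->loadXML($xml) && $dom->documentElement !== null;
    }

    // Attempt 3: repair with tidy in XML mode, if the extension is available.
    if (!$ok && function_exists('tidy_repair_string')) {
        $repaired = tidy_repair_string(
            $xml,
            ['input-xml' => true, 'output-xml' => true],
            'utf8'
        );
        if (is_string($repaired) && trim($repaired) !== '') {
            $dom = new DOMDocument();
            $ok = $dom->loadXML($repaired);
        }
    }

    // Restore the caller's libxml error-handling state.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? new DOMXPath($dom) : null;
}
```

A caller would check the result for null before querying, for example `$xpath = load_lenient_xpath($raw); if ($xpath !== null) { $nodes = $xpath->query('//item'); }`. Recovery mode may silently drop content it cannot parse, so results for badly damaged input should be treated with caution.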