Roger B. wrote:

>>Is there something wrong with the HTML parsers?
>>
>
>Nikolas: Are they installed by default on most servers? If not, can
>those running in sandboxes install them?
>

Everything except perhaps the C libraries. All the other implementations I listed are written purely in their respective scripting languages and should work if you just copy the source. Well, there's the possibility that some require an obscure lib somewhere, but the source I skimmed indicated that they were based on basic string functions and/or regular expressions. I should also note that I'm not sure how well each of them stands up to crazy real-world input, because I've never tried using them on arbitrary pages.
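
For illustration only, here is a rough sketch of the kind of check I mean, using Python's standard-library html.parser module as a stand-in for a parser written purely in its scripting language. Python itself, the LinkCollector class, the collect_links helper, and the MESSY sample are all assumptions made up for this example; none of them is one of the implementations discussed in this thread.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: gathers href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def collect_links(html):
    """Feed a (possibly malformed) HTML string and return the hrefs found."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up messy input: unclosed <b> and <a>, mismatched </p>,
# uppercase tag and attribute names, and an unquoted attribute value.
MESSY = '<p>Unclosed <b>bold <A HREF="/one">first</p><a href=/two>second</a>'

print(collect_links(MESSY))
```

A single sample like this says little about arbitrary pages, so a real test would need a much larger and nastier set of inputs.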
Hrm, I should find, or make up, some test inputs ...

>From the perspective of my niche, I can tell you that ColdFusion can
>use jTidy to make sense of random HTML, but it is (a) installed in
>virtually zero CF hosting environments, and (b) cannot normally be
>added by an individual developer working in a sandbox. (It's also
>riddled with bugs, but I'm just grateful to have it at all... I steer
>clear of gift horses' mouths whenever possible.)
>
>--
>Roger Benningfield

My condolences.

-Nikolas 'Atrus' Coukouma
