On 17.01.2013 at 23:38, Sean P. DeNigris <[email protected]> wrote:
> fstephany wrote
>> http://www.squeaksource.com/Soup.html
>
> Def works in 1.4... Soup is a must if you may have to deal with ill-formed
> HTML (i.e. web scraping in general) because it's the only library I know of
> that handles it robustly. I've used it a lot and it's pretty
> straightforward.
>

Ok, thanks for the update. I'm not sure that handling ill-formed input is a major requirement, but it is good to have. Do you know whether HTML5 would be treated as ill-formed? Apart from that, I'm interested in whether it emits some kind of document model, or what it produces otherwise. Well, I'll have a look.

thanks,

Norbert
