Norbert 

Soup produces a limited AST that contains all the information, but it is not
really a full and nice AST for HTML.

Stef


> 
> On 17.01.2013 at 23:38, Sean P. DeNigris <[email protected]> wrote:
> 
>> fstephany wrote
>>> http://www.squeaksource.com/Soup.html
>> 
>> Def works in 1.4... Soup is a must if you may have to deal with ill-formed
>> HTML (i.e. web scraping in general) because it's the only library I know of
>> that handles it robustly. I've used it a lot and it's pretty
>> straightforward.
>> 
> Ok, thanks for the update. I'm not sure handling ill-formedness is a major 
> requirement, but it is good to have. Do you know whether HTML5 would be 
> treated as ill-formed? 
> Apart from that, I'm interested in whether some kind of document model is 
> emitted, or what it does instead. Well, I'll have a look. 
> 
> thanks,
> 
> Norbert
> 
> 

