> cedreek wrote
>> To me, far better than using Soup.
>
> Ah, interesting! I use Soup almost exclusively. What did you find superior
> about XMLParserHTML? I may give it a try...
>
It’s mainly XPath, which I find easier than navigating the HTML tree with Soup or even with XMLHTMLParser. I usually copy the XPath from a web inspector, though I have to tweak it a bit.

> cedreek wrote
>> Google chrome pharo integration helps top to scrap complex full JS web
>> site like google ;)
>
> Also interesting! Any publicly available examples? How does one load "Google
> chrome pharo integration"? Also, there is often the "poor man's" way (albeit
> requiring manual intervention) by inspecting the Ajax http requests in a
> developer console and then recreating directly in Pharo.
>

I have only tried it once. There is a Google Chrome plugin that lets you use headless Chrome to get the fully loaded HTML page. I need to try it again. A simple example I’d like to do is to scrape Google and remove the advertised content ^^

By the way, this is Torsten’s package: https://github.com/astares/Pharo-Chrome

Happy scraping ;-) And thanks, Torsten, for everything ^^

Cedrick

>
>
> -----
> Cheers,
> Sean
> --
> Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
>