[web2py] Re: [Scheduler] Scraping javascript pages

Nico de Groot Wed, 02 Mar 2016 01:39:10 -0800

I've used PhantomJS in combination with selenium (ghostdriver) and Python 
requests. PhantomJS is a headless browser, so you can run your scraping as a 
task for the web2py scheduler on your server.
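
A minimal sketch of what such a task could look like, assuming a Selenium 
release that still ships webdriver.PhantomJS (later releases removed it) and 
a phantomjs binary on the PATH. The function names, the timeout value and the 
example URL are placeholders of mine, not something from a real project:

```python
def make_phantomjs_driver():
    """Start a PhantomJS-backed Selenium driver (needs selenium + phantomjs)."""
    # Imported lazily so this module still loads where selenium is missing.
    from selenium import webdriver
    # webdriver.PhantomJS exists in the Selenium releases current in 2016;
    # later Selenium versions dropped it.
    return webdriver.PhantomJS()


def scrape_rendered_page(url, driver_factory=make_phantomjs_driver, timeout=30):
    """Load url in a headless browser and return the HTML after scripts ran."""
    driver = driver_factory()
    try:
        driver.set_page_load_timeout(timeout)
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always stop the browser process, even on errors


# In a web2py model file (where `db` is already defined) the function could
# be registered and queued roughly like this:
#
#   from gluon.scheduler import Scheduler
#   scheduler = Scheduler(db, tasks=dict(scrape=scrape_rendered_page))
#   scheduler.queue_task('scrape', pvars=dict(url='http://example.com/'),
#                        period=3600, repeats=0)
```

Passing the driver in through driver_factory is only a convenience so the 
function can be tried out with a stand-in object before PhantomJS is 
installed on the server.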


First analyse the website to find the Ajax URLs and the headers they are 
called with. Then, where necessary, construct the matching requests yourself, 
for example when the URLs are protected with authentication.
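
As a rough illustration of that step with Python requests: the sketch below 
rebuilds an Ajax call by hand. The header set, the cookie name "sessionid" 
and the URLs are assumptions for the example; the real values have to be 
copied from what the browser's network inspector shows for the site.

```python
def build_ajax_request(url, session_cookie, referer):
    """Return keyword arguments for requests.get that mimic the page's Ajax call."""
    headers = {
        # Many sites check this header to tell Ajax calls from page loads.
        "X-Requested-With": "XMLHttpRequest",
        "Referer": referer,
        "Accept": "application/json",
    }
    # Hypothetical cookie name; use the one the site sets after login.
    cookies = {"sessionid": session_cookie}
    return {"url": url, "headers": headers, "cookies": cookies, "timeout": 30}


def fetch_ajax_json(url, session_cookie, referer):
    """Send the reconstructed request and return the decoded JSON body."""
    import requests  # third-party; imported here so the helper above has no dependency
    response = requests.get(**build_ajax_request(url, session_cookie, referer))
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return response.json()
```

The session cookie itself can be taken from the PhantomJS session after 
logging in there, or from a login request made with requests directly, 
depending on how the site handles authentication.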

See http://techstonia.com/scraping-with-phantomjs-and-python.html for a simpler 
example.

Nico de Groot 

