I've used PhantomJS in combination with Selenium (GhostDriver) and Python 
requests. PhantomJS is a headless browser, so you can run your scraping as a 
task in the web2py scheduler on your server.

First analyse the website to find the Ajax URLs and the headers it sends, then 
construct the matching requests yourself if necessary, for example when the 
URLs are protected by authentication.
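
As a rough sketch of that second step, something like the following could 
rebuild an Ajax call outside the browser. It uses the standard library's 
urllib.request rather than the requests package, only so that it runs without 
extra installs. The URL, the credentials, the choice of HTTP Basic 
authentication and the helper names build_ajax_request and fetch_json are all 
assumptions for illustration; copy the real URL and headers from your 
browser's network inspector.

```python
import base64
import json
import urllib.request


def build_ajax_request(url, username=None, password=None, extra_headers=None):
    """Build (but do not send) a GET request that mimics a browser Ajax call."""
    headers = {
        # Many sites check this header to recognise Ajax calls (assumption:
        # the target site does too; verify in the network inspector).
        "X-Requested-With": "XMLHttpRequest",
        "Accept": "application/json",
    }
    if username is not None and password is not None:
        # Assumption: the endpoint uses HTTP Basic authentication.
        raw = "{}:{}".format(username, password).encode("utf-8")
        headers["Authorization"] = "Basic " + base64.b64encode(raw).decode("ascii")
    # Any further headers observed in the browser can be passed in here.
    headers.update(extra_headers or {})
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_json(request, timeout=10):
    """Send the prepared request and decode a JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A scheduler task would then call fetch_json(build_ajax_request(...)) with the 
values found during the analysis. If the site uses a login form or cookies 
instead of Basic authentication, the header construction has to change 
accordingly.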

See http://techstonia.com/scraping-with-phantomjs-and-python.html for a simpler 
example.

Nico de Groot 
