Thank you very much. This is helpful!

Actually, what I want to do is:
1) read a *<table>DATA</table>* from an external page and insert it into the 
database.
2) use cron or a scheduler to update the database table periodically if the 
source web page has changed.
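
For 2), one possible way to decide whether the source page has changed is to compare a hash of the fetched HTML with the hash stored on the previous run. This is only a sketch under that assumption: `page_digest` and `has_changed` are hypothetical helper names, and where the last digest is kept (a database field, a file) is left open.

```python
import hashlib


def page_digest(html):
    """Return a SHA-256 hex digest of the page source."""
    return hashlib.sha256(html.encode("utf-8")).hexdigest()


def has_changed(html, last_digest):
    """Compare the page against the digest stored on the previous run.

    Returns a (changed, new_digest) tuple; the caller is expected to
    store new_digest for the next run.
    """
    digest = page_digest(html)
    return digest != last_digest, digest
```

The scheduled job would fetch the page, call `has_changed`, and only re-import the table when the first value is true. Note that any byte-level difference in the page (for example a timestamp in the footer) counts as a change, so hashing only the table markup may be more robust.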

Could you give me an idea how to get going with 1)?

Say I have done this:
* fetched the rows (<td>CONTENT</td>) of the table into the elements 
collection
* stripped the unwanted parts off and separated the columns 

Now how do I push this data into the database table?

Here's what I plan to do:
* create a model corresponding to the table columns and data (all strings and 
numbers)
* loop over the data sliced from the elements collection with a for loop and 
assign it to the database table fields.

Would you confirm this approach?
Is there a more efficient way?
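
The plan above can be sketched roughly as follows. This is a framework-independent illustration using only the Python standard library (`html.parser` and `sqlite3`), not web2py's own API; in web2py the cells would come from `page.elements('td')` and each row would be written through the DAL, e.g. with `db.mytable.insert(...)`. The table name `reading`, its columns `name` and `value`, and the `SAMPLE` markup are all made up for the example.

```python
import sqlite3
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    """Return the table content as a list of rows of stripped strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def store_rows(conn, rows):
    """Insert all rows in one call; assumes two columns: text, number."""
    conn.execute("CREATE TABLE IF NOT EXISTS reading (name TEXT, value REAL)")
    conn.executemany(
        "INSERT INTO reading (name, value) VALUES (?, ?)",
        [(name, float(value)) for name, value in rows],
    )
    conn.commit()


# Hypothetical sample markup standing in for the fetched page.
SAMPLE = (
    "<table><tr><td>alpha</td><td>1.5</td></tr>"
    "<tr><td>beta</td><td>2</td></tr></table>"
)
```

Usage would be `store_rows(sqlite3.connect("data.db"), parse_rows(SAMPLE))`. Passing all rows to a single `executemany` call and committing once is usually cheaper than committing after every row, which addresses the efficiency question for larger tables.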

Thank you and kind regards,
Timmie

On Friday, May 3, 2013 13:48:14 UTC+2, Anthony wrote:
>
> That same code works in a controller -- it was merely being demonstrated 
> in a shell. Instead of urllib.urlopen, you can now use fetch (which also 
> works on GAE):
>
> from gluon.tools import fetch
> page = TAG(fetch('http://www.web2py.com'))
> page.elements('div') # gives you a list of all DIV elements in the page 
> # (as web2py DIV helper objects)
>
> Actually, at the moment, the above will generate an error because 
> apparently there is an unbalanced <a> tag somewhere on the web2py.com page.
>
> Anthony
>
> On Friday, May 3, 2013 3:15:55 AM UTC-4, Timmie wrote:
>>
>> Hello, 
>> is there an example how to use this: 
>>
>> scraping utils 
>> https://groups.google.com/forum/?fromgroups=#!topic/web2py/skcc2ql3zOs 
>>
>> in a controller? 
>>
>> Especially the first lines (fetching the page and getting it into an 
>> element) are what I am looking for. 
>>
>>
>> The above example is made for shell access. 
>>
>> Thanks and kind regards, 
>> Timmie 
>>
>>
>>
