I think that sounds reasonable.
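
Here is a minimal sketch of that loop, in case it helps. It uses only the Python standard library (html.parser and sqlite3) so it runs outside web2py. The table name "scraped", the columns "name" and "amount", and the SAMPLE markup are all made up for illustration, so adjust them to your page.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical stand-in for the fetched page; the real markup will differ.
SAMPLE = (
    "<table>"
    "<tr><th>name</th><th>amount</th></tr>"
    "<tr><td>apples</td><td>3</td></tr>"
    "<tr><td>pears</td><td> 4.5 </td></tr>"
    "</table>"
)


class TableRows(HTMLParser):
    """Collect the stripped text of every <td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the <tr> currently being read
        self._cell = None  # text fragments of the <td> currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without <td> cells, e.g. the header row
                self.rows.append(self._row)
            self._row = None


def parse_rows(html):
    """Return the table body as a list of rows, each a list of strings."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def store_rows(conn, rows):
    """Replace the contents of the (hypothetical) 'scraped' table with rows."""
    conn.execute("CREATE TABLE IF NOT EXISTS scraped (name TEXT, amount REAL)")
    conn.execute("DELETE FROM scraped")
    conn.executemany(
        "INSERT INTO scraped (name, amount) VALUES (?, ?)",
        [(name, float(amount)) for name, amount in rows],
    )
    conn.commit()
```

In a web2py app the same loop would go through the DAL instead of raw SQL: define the table once in a model with db.define_table, then call db.scraped.insert(name=..., amount=...) per row, or db.scraped.bulk_insert(list_of_dicts) to push all rows in one call. Wiping and reloading the table on every run, as the sketch does, is the simplest way to handle the periodic update; whether that is acceptable depends on how large the table is.
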
On Friday, May 3, 2013 3:50:29 PM UTC-4, Timmie wrote:
>
> Thank you very much. This is helpful!
>
> Actually, what I want to do is
> 1) read a *<table>DATA</table>* from an external page and insert it into the
> database.
> 2) use a cron job / scheduler to update the database table periodically if
> the source web page has changed
>
> Could you give me an idea how to get going with 1)?
>
> Say I have done this:
> * fetched the rows (<td>CONTENT</td>) of the table into the elements
> collection
> * stripped unwanted parts off and separated the columns
>
> Now how do I push the data into the database table?
>
> Here's what I plan to do:
> * create a model corresponding to the table columns and data types (all
> strings and numbers)
> * read the data sliced from the elements collection in a for loop and
> assign it to the database table fields.
>
> Would you confirm this approach?
> Is there a more efficient way?
>
> Thank you and kind regards,
> Timmie
>
> On Friday, May 3, 2013 1:48:14 PM UTC+2, Anthony wrote:
>>
>> That same code works in a controller -- it was merely being demonstrated
>> in a shell. Instead of urllib.urlopen, you can now use fetch (which also
>> works on GAE):
>>
>> from gluon.tools import fetch
>> page = TAG(fetch('http://www.web2py.com'))
>> # gives you a list of all DIV elements in the page (as web2py DIV helper objects)
>> page.elements('div')
>>
>> Actually, at the moment, the above will generate an error because
>> apparently there is an unbalanced <a> tag somewhere on the web2py.com page.
>>
>> Anthony
>>
>> On Friday, May 3, 2013 3:15:55 AM UTC-4, Timmie wrote:
>>>
>>> Hello,
>>> is there an example of how to use this:
>>>
>>> scraping utils
>>> https://groups.google.com/forum/?fromgroups=#!topic/web2py/skcc2ql3zOs
>>>
>>> in a controller?
>>>
>>> Especially the first lines (fetching the page and getting it into an
>>> element) are what I am looking for.
>>>
>>>
>>> The above example is written for shell access.
>>>
>>> Thanks and kind regards,
>>> Timmie
>>>
>>>
>>>