Hi Adam,
I tried that, but the resulting .zip file is corrupted. I think this is
because the URL points to a web page that links to the file, not to the
file itself, so save_html just saves the web page.
In case you want to have a look, the page is here:
http://data.gbif.org/download/downloadReady.htm?downloadFile=occurrence-search-12978055989365071032693658999911.zip
(notice that it will expire in about seven hours from now).
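
A rough way to confirm this diagnosis from Python, and to pull the real archive link out of the page, might look like the sketch below. It uses only the standard library, not twill. The names ZipLinkFinder, looks_like_zip and find_zip_links are made up for illustration, and it assumes the download page exposes the file through an ordinary <a href="....zip"> link, which I have not verified for the GBIF page.

```python
import io
import zipfile
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class ZipLinkFinder(HTMLParser):
    """Hypothetical helper: collect href values of <a> tags ending in .zip."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Compare the URL path only, so query strings do not get in the way.
        if href and urlparse(href).path.lower().endswith(".zip"):
            self.links.append(href)


def looks_like_zip(data):
    """Return True if the downloaded bytes form a readable ZIP archive."""
    return zipfile.is_zipfile(io.BytesIO(data))


def find_zip_links(html, base_url):
    """Return absolute URLs for every .zip link found in an HTML page."""
    finder = ZipLinkFinder()
    finder.feed(html)
    # Relative links are resolved against the URL the page was fetched from.
    return [urljoin(base_url, href) for href in finder.links]
```

If looks_like_zip() is false for the data that was saved, the saved file is probably the HTML page, and find_zip_links() can be run on it to get the URL to fetch next.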
Thanks a lot,
Avi
On 2/16/2011 9:50 AM, Adam Victor Nazareth Brandizzi wrote:
On Wed, Feb 16, 2011 at 1:20 PM, Avi Bar Massada wrote:
Hi,
Hi, Avi!
I've been using twill with a python script to automate downloads from
web-based databases. Until now, I only needed to fetch text files, so it was
pretty simple. I've been using:
import twill
from twill.commands import go
go("web address")  # placeholder for the real URL
b = twill.get_browser()
data = b.result.get_page()
Now, I'm trying to fetch data from a different website, which generates a
link to a .zip file. Given that I know the direct URL to the zip file, would
it be possible to download it directly using twill? Clicking on the link in
the actual page opens a download dialogue box. Is there any way to bypass it
and just get the file directly?
Here I got a ZIP file with "go"
go http://jsfcompref.appspot.com/faces/chapter04.zip
and wrote it to a file using "save_html"
save_html chapter04.zip
It worked flawlessly:
Diderot:sandbox brandizzi$ unzip chapter04.zip
Archive: chapter04.zip
creating: chapter04/web/
[...]
inflating: build.properties.sample
Have you tried to do it?
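
For a script, the unzip check above can also be done from Python with the standard zipfile module. This is only a sketch: check_zip is a made-up name, and the path passed to it is whatever file save_html wrote.

```python
import zipfile


def check_zip(path):
    """Hypothetical helper: list a ZIP file's members, or raise if damaged."""
    if not zipfile.is_zipfile(path):
        # Typical when an HTML page was saved under a .zip name.
        raise ValueError(f"{path} is not a ZIP archive")
    with zipfile.ZipFile(path) as archive:
        bad = archive.testzip()  # name of the first corrupt member, or None
        if bad is not None:
            raise ValueError(f"corrupt member in {path}: {bad}")
        return archive.namelist()
```

Calling check_zip("chapter04.zip") on a good download returns the list of member names. A ValueError means the saved file is not a usable archive.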
Thanks!
Avi
Good luck!