Hi Adam,

I tried that, but the resulting .zip file is corrupted. I think this is because the URL points to a web page containing a link to the file, not to the file itself, so save_html just saves the web page. In case you want to have a look, the page is here: http://data.gbif.org/download/downloadReady.htm?downloadFile=occurrence-search-12978055989365071032693658999911.zip
(note that it will expire in about seven hours).
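
If the URL really returns an HTML page that merely links to the archive, one possible workaround is to pull the real link out of that page first and then fetch it in binary mode. The following is only a rough sketch using Python 3's standard library rather than twill. The names ZipLinkFinder, find_zip_links and download are made up for illustration, and it assumes the page exposes the archive as an ordinary <a href="..."> ending in .zip, which has not been checked against the GBIF page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class ZipLinkFinder(HTMLParser):
    """Collect href values of <a> tags whose path ends in .zip."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Look only at the path, so a page URL that merely carries
        # "...zip" in its query string is not mistaken for the archive.
        if href and href.split("?")[0].lower().endswith(".zip"):
            self.links.append(href)


def find_zip_links(html, base_url):
    """Return absolute URLs of all .zip links found in the HTML text."""
    finder = ZipLinkFinder()
    finder.feed(html)
    return [urljoin(base_url, href) for href in finder.links]


def download(url, filename):
    """Write the raw bytes served at url to filename (binary mode)."""
    with urlopen(url) as response, open(filename, "wb") as out:
        out.write(response.read())
```

In use, the HTML of the "download ready" page would be passed to find_zip_links together with the page's own URL (so that relative links resolve), and the first result handed to download.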

Thanks a lot,
Avi


On 2/16/2011 9:50 AM, Adam Victor Nazareth Brandizzi wrote:
On Wed, Feb 16, 2011 at 1:20 PM, Avi Bar Massada<[email protected]>  wrote:
Hi,
Hi, Avi!

I've been using twill with a python script to automate downloads from
web-based databases. Until now, I only needed to fetch text files, so it was
pretty simple. I've been using:

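from twill.commands import go
import twill

go("web address")           # load the page
b = twill.get_browser()     # the browser object twill is driving
data = b.result.get_page()  # content of the page just loaded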

Now, I'm trying to fetch data from a different website, which generates a
link to a .zip file. Given that I know the direct URL to the zip file, would
it be possible to download it directly using twill? Clicking on the link in
the actual page opens a download dialogue box. Is there any way to bypass it
and just get the file directly?

I fetched a ZIP file with "go":

go http://jsfcompref.appspot.com/faces/chapter04.zip

and wrote it to a file using "save_html":

save_html chapter04.zip

It worked flawlessly:

Diderot:sandbox brandizzi$ unzip chapter04.zip
Archive:  chapter04.zip
    creating: chapter04/web/
   [...]
   inflating: build.properties.sample
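
The same check can be made from a Python script without calling unzip, which helps tell a genuine archive from an HTML page saved under a .zip name. This is a small sketch, not part of twill: describe_download is a hypothetical helper, and the HTML test is a crude guess based on the first bytes of the file.

```python
import zipfile


def describe_download(filename):
    """Return 'zip', 'html', or 'unknown' for a file saved to disk."""
    if zipfile.is_zipfile(filename):
        return "zip"
    # Not a ZIP archive: peek at the start of the file to see whether
    # a web page was saved instead of the archive itself.
    with open(filename, "rb") as f:
        head = f.read(512).lstrip().lower()
    if head.startswith((b"<!doctype", b"<html")):
        return "html"
    return "unknown"
```

A result of "html" for a file written by save_html would suggest the URL served a web page rather than the archive.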

Have you tried to do it?

Thanks!
Avi

Good luck!

_______________________________________________
twill mailing list
[email protected]
http://lists.idyll.org/listinfo/twill
