simple download manager
I want to automate downloading from a particular site that has some ads and requires clicking a lot of buttons before the download starts. What library should I use to handle HTTP?

I also need to support big files (1 GB), so the library should hand the data to me chunk by chunk.

-- https://mail.python.org/mailman/listinfo/python-list
Re: simple download manager
On Wed, Nov 5, 2014 at 1:53 AM, Kiuhnm gandal...@mail.com wrote:
> I want to automate downloading from a particular site that has some ads and requires clicking a lot of buttons before the download starts. What library should I use to handle HTTP? I also need to support big files (1 GB), so the library should hand the data to me chunk by chunk.

You may be violating the site's terms of service, so be aware of what you're doing. This could be a really simple job (just figure out what the last HTTP query is, and replicate that), or it could be insanely complicated (crypto, JavaScript, and/or timestamped URLs could easily be involved).

To start off, I would recommend not writing a single line of Python code, but just pulling up Mozilla Firefox with Firebug, or Google Chrome with its built-in inspection tools, or some equivalent, and watching the exact queries that go through. Once you figure out what queries are happening, you can figure out how to do them in Python.

ChrisA
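As an illustration of the "replicate the last query" idea, here is a minimal sketch of rebuilding a request that was observed in the browser's network inspector. Everything specific in it is an assumption made for the example: the URL, the `id` parameter, the header values, and the helper name `build_request` are placeholders, not details of the site discussed in this thread. It uses only the standard library and only builds the request; nothing is sent.

```python
import urllib.request
from urllib.parse import urlencode


def build_request(url, params=None, headers=None):
    """Build (but do not send) a GET request mirroring one seen in the browser."""
    if params:
        # Append the query string exactly as the browser would have sent it.
        url = url + "?" + urlencode(params)
    return urllib.request.Request(url, headers=headers or {}, method="GET")


# Placeholder values: copy the real ones from the inspector's network tab.
req = build_request(
    "https://example.invalid/download",
    params={"id": "12345"},
    headers={
        "User-Agent": "Mozilla/5.0",
        "Referer": "https://example.invalid/page",
    },
)

# Sending it would be: urllib.request.urlopen(req)
print(req.get_method(), req.full_url)
```

If the site relies on cookies, JavaScript-computed tokens, or timestamped URLs, a static copy like this will not be enough, which is the complicated case described above.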
Re: simple download manager
On Tuesday, November 4, 2014 4:00:51 PM UTC+1, Chris Angelico wrote:
> You may be violating the site's terms of service, so be aware of what you're doing. This could be a really simple job (just figure out what the last HTTP query is, and replicate that), or it could be insanely complicated (crypto, JavaScript, and/or timestamped URLs could easily be involved).
>
> To start off, I would recommend not writing a single line of Python code, but just pulling up Mozilla Firefox with Firebug, or Google Chrome with its built-in inspection tools, or some equivalent, and watching the exact queries that go through. Once you figure out what queries are happening, you can figure out how to do them in Python.

It'll be tricky. I'm sure of that, but if the browser can do it, so can I :) Fortunately, there are no captchas.
Re: simple download manager
On Tuesday, November 4, 2014 4:10:59 PM UTC+1, Kiuhnm wrote:
> It'll be tricky. I'm sure of that, but if the browser can do it, so can I :) Fortunately, there are no captchas.

There are no captchas, but the site is behind Cloudflare (DDoS protection). Anyway, I now know what to do. To deal with Cloudflare's JavaScript challenge I'm going to use jsdb, a neat little JavaScript interpreter.

By the way, I'm using requests instead of urllib, but I still need to figure out how to download big files and write them to disk.
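For the big-file part, requests can stream a response body instead of loading it all into memory: pass `stream=True` and iterate with `iter_content`. The following is a minimal sketch, not a complete download manager. The function names `write_chunks` and `download`, the `.part` temporary suffix, the 1 MiB chunk size, and the 30-second timeout are all choices made for this example, and it does nothing about the Cloudflare challenge.

```python
import os


def write_chunks(chunks, dest_path):
    """Write an iterable of byte chunks to dest_path; return bytes written."""
    written = 0
    tmp_path = dest_path + ".part"  # illustrative naming choice
    with open(tmp_path, "wb") as fh:
        for chunk in chunks:
            if chunk:  # skip any empty chunks
                fh.write(chunk)
                written += len(chunk)
    # Rename only after the whole body arrived, so a partial file
    # never sits under the final name.
    os.replace(tmp_path, dest_path)
    return written


def download(url, dest_path, session=None, chunk_size=1024 * 1024):
    """Stream url to dest_path without holding the whole body in memory."""
    if session is None:
        import requests  # third-party; only needed when no session is passed
        session = requests.Session()
    # stream=True defers reading the body until iter_content is consumed.
    with session.get(url, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        return write_chunks(resp.iter_content(chunk_size=chunk_size), dest_path)
```

Passing in an existing `requests.Session` lets the download reuse whatever cookies and headers were set up while getting past the earlier pages.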