Re: [Bug-wget] downloading links in a dynamic site
Vinh Nguyen wrote:

Dear list,

My goal is to download some PDF files from a dynamic site (I am not sure of the terminology). For example, I would execute:

    wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=0

and would get my 10 PDF files. On the page I can click a Next link (to get more files), and I execute:

    wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=10

However, the downloaded files are identical to the previous ones. I tried the cookie and referer settings:

    wget -U firefox --cookies=on --keep-session-cookies --save-cookies=cookie.txt -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=0

    wget -U firefox --referer='http://site.com/?sortorder=asc&p_o=0' --cookies=on --load-cookies=cookie.txt --keep-session-cookies --save-cookies=cookie.txt -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=10

but the results are again identical. Any suggestions?

Thanks.
Vinh

Look at the page source to see how they are generating the URLs. Maybe they are using some ugly JavaScript, although that discards the benefit of paging...
Re: [Bug-wget] downloading links in a dynamic site
On Mon, Jul 26, 2010 at 1:51 PM, Vinh Nguyen vinhdi...@gmail.com wrote:

That's displayed in the source. Also, when I manually enter the URL, changing =10, =20, =30, I get the right page, so I don't think it's a JavaScript issue. What else could it be besides referer and cookies?

Confirmed that it also works in a DIFFERENT browser (conkeror and firefox). Hmm, what can be the difference between wget and these browsers?
Re: [Bug-wget] downloading links in a dynamic site
On Mon, Jul 26, 2010 at 2:02 PM, Vinh Nguyen vinhdi...@gmail.com wrote:

On Mon, Jul 26, 2010 at 1:51 PM, Vinh Nguyen vinhdi...@gmail.com wrote:

That's displayed in the source. Also, when I manually enter the URL, changing =10, =20, =30, I get the right page, so I don't think it's a JavaScript issue. What else could it be besides referer and cookies?

Confirmed that it also works in a DIFFERENT browser (conkeror and firefox). Hmm, what can be the difference between wget and these browsers?

This issue is RESOLVED. Put 'quotes' around the URL. I thought I had this the entire time.

Thanks everyone.
Vinh
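
For anyone finding this thread later, here is a minimal sketch of why the quotes matter. It assumes the URL's query parameters are joined by `&`, the usual separator. The `show_url` function is a hypothetical stand-in for wget that only prints the argument it receives, so nothing is downloaded. The offsets 0, 10, 20 follow the paging step mentioned in the thread.

```shell
#!/bin/sh
# show_url is a hypothetical stand-in for wget: it prints the URL argument it receives.
show_url() { printf '%s\n' "$1"; }

# Unquoted: the shell ends the command at '&' and runs it in the background;
# 'p_o=10' is then parsed as a separate shell variable assignment.
unquoted=$(show_url http://site.com/?sortorder=asc&p_o=10)

# Quoted: the whole URL reaches the program as a single argument.
quoted=$(show_url 'http://site.com/?sortorder=asc&p_o=10')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Dry run: print one correctly quoted wget command per page offset.
for offset in 0 10 20; do
    echo "wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' 'http://site.com/?sortorder=asc&p_o=$offset'"
done
```

In the unquoted case the program never sees the p_o parameter, so every page offset requests the same first page, which would match the identical downloads described above. A browser is unaffected because the address bar does not pass through a shell.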
[Bug-wget] downloading links in a dynamic site
Dear list,

My goal is to download some PDF files from a dynamic site (I am not sure of the terminology). For example, I would execute:

    wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=0

and would get my 10 PDF files. On the page I can click a Next link (to get more files), and I execute:

    wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=10

However, the downloaded files are identical to the previous ones. I tried the cookie and referer settings:

    wget -U firefox --cookies=on --keep-session-cookies --save-cookies=cookie.txt -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=0

    wget -U firefox --referer='http://site.com/?sortorder=asc&p_o=0' --cookies=on --load-cookies=cookie.txt --keep-session-cookies --save-cookies=cookie.txt -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=10

but the results are again identical. Any suggestions?

Thanks.
Vinh