Re: [Bug-wget] downloading links in a dynamic site
On Mon, Jul 26, 2010 at 1:51 PM, Vinh Nguyen vinhdi...@gmail.com wrote:
> That's displayed in the source. Also, when I try to manually enter the
> url, changing =10, =20, =30, I get the right page, so I don't think
> it's a javascript issue. What else could it be besides referer and
> cookies?

Confirmed that it also works in a DIFFERENT browser (conkeror and
firefox). Hmm, what can be the difference between wget and these
browsers?
Re: [Bug-wget] downloading links in a dynamic site
On Mon, Jul 26, 2010 at 2:02 PM, Vinh Nguyen vinhdi...@gmail.com wrote:
> On Mon, Jul 26, 2010 at 1:51 PM, Vinh Nguyen vinhdi...@gmail.com wrote:
>> That's displayed in the source. Also, when I try to manually enter
>> the url, changing =10, =20, =30, I get the right page, so I don't
>> think it's a javascript issue. What else could it be besides referer
>> and cookies?
>
> Confirmed that it also works in a DIFFERENT browser (conkeror and
> firefox). Hmm, what can be the difference between wget and these
> browsers?

This issue is RESOLVED. Put 'quotes' around the url. I thought I had
this the entire time. Thanks everyone.

Vinh
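For the archives, the reason quoting fixes this: an unquoted `&` in the URL is a shell control operator, so the shell backgrounds the command at the `&` and wget only ever receives the URL up to that point, i.e. always the first page. A minimal sketch of what the command actually receives, using a stand-in function rather than wget itself (the URL is the placeholder from the original post):

```shell
#!/bin/sh
# Stand-in for wget: print each argument the shell hands us, one per line.
show_args() { printf '%s\n' "$@"; }

# Quoted: the full URL, including the p_o pagination parameter, arrives
# as a single argument.
show_args 'http://site.com/?sortorder=asc&p_o=10'

# Unquoted, the shell would instead parse
#   show_args http://site.com/?sortorder=asc&p_o=10
# as "show_args http://site.com/?sortorder=asc" run in the background,
# followed by a separate command "p_o=10" -- so the program never sees
# the p_o parameter, and every page request looks like page 0.
```

Single quotes are the safest choice here, since they also protect `?` and `*` from any glob expansion.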
[Bug-wget] downloading links in a dynamic site
Dear list,

My goal is to download some pdf files from a dynamic site (not sure on
the terminology). For example, I would execute:

  wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=0

and would get my 10 pdf files. On the page I can click a Next link (to
get more files), and I execute:

  wget -U firefox -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=10

However, the downloaded files are identical to the previous ones. I
tried the cookies setting and referer setting:

  wget -U firefox --cookies=on --keep-session-cookies --save-cookies=cookie.txt -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=0

  wget -U firefox --referer='http://site.com/?sortorder=asc&p_o=0' --cookies=on --load-cookies=cookie.txt --keep-session-cookies --save-cookies=cookie.txt -r -l1 -nd -e robots=off -A '*.pdf,*.pdf.*' http://site.com/?sortorder=asc&p_o=10

but the results again are identical. Any suggestions? Thanks.

Vinh