Re: Recursive downloading of pages through the "action" attributes of the following "form" tags
I can reproduce this issue with the latest version (1.21.4) using the following pages:

[form.html, post.html and link.html: the HTML source of the three pages is not preserved in this copy of the message]

A basic recursive command downloads only form.html, whereas I expected all 3 pages to be downloaded.

    wget-1.21.4$ ./src/wget -r http://127.0.0.1/form.html
    --2023-05-15 01:08:55--  http://127.0.0.1/form.html
    Connecting to 127.0.0.1:80... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 145 [text/html]
    Saving to: '127.0.0.1/form.html'

    127.0.0.1/form.html 100%[===>] 145 --.-KB/s in 0s

    2023-05-15 01:08:55 (18.6 MB/s) - '127.0.0.1/form.html' saved [145/145]

    FINISHED --2023-05-15 01:08:55--

Regards,
Florian

On 4/17/23 21:22, BERBAR Florian wrote:
> Hi folks,
>
> I have a question about recursive downloading of web pages.
>
> When trying to download all pages of a website with the recursive option (--recursive) of wget 1.21, the page processing does not seem to follow the "action" attributes of "form" tags.
>
> - Is this the expected behavior?
> - Is there a combination of options that downloads all pages of a website reachable through "action" attributes?
>
> Example with 3 HTML pages:
> - Page 1 - form.html: HTML form with an "action" attribute pointing to "Page 2".
> - Page 2 - post.html: HTML page with a link to "Page 3".
> - Page 3 - link.html: HTML page without any link.
>
> I tried this command to download all three pages, but only "Page 1" was downloaded:
>
> $ wget -r https://host/form.html
>
> I also tried the "--follow-tags=form" option, but observed the same behavior.
>
> Regards,
> Florian
Re: Recursive downloading of pages through the "action" attributes of the following "form" tags
Hello Tim,

The 3 pages used during my tests are the following:

[form.html, post.html and link.html: the HTML source of the three pages is not preserved in this copy of the message]

I tried to download all three pages in recursive mode with the following command, but only the first page (form.html) was downloaded:

$ wget -r http://127.0.0.1/form.html

Regards,
Florian

On 4/22/23 20:21, Tim Rühsen wrote:
> On 17.04.23 21:22, BERBAR Florian wrote:
>> Hi folks,
>>
>> I have a question about recursive downloading of web pages.
>>
>> When trying to download all pages of a website with the recursive option (--recursive) of wget 1.21, the page processing does not seem to follow the "action" attributes of "form" tags.
>>
>> - Is this the expected behavior?
>> - Is there a combination of options that downloads all pages of a website reachable through "action" attributes?
>>
>> Example with 3 HTML pages:
>> - Page 1 - form.html: HTML form with an "action" attribute pointing to "Page 2".
>> - Page 2 - post.html: HTML page with a link to "Page 3".
>> - Page 3 - link.html: HTML page without any link.
>>
>> I tried this command to download all three pages, but only "Page 1" was downloaded:
>>
>> $ wget -r https://host/form.html
>>
>> I also tried the "--follow-tags=form" option, but observed the same behavior.
>
> Generally, Wget supports form tags with action attributes. So you may have run into malformed HTML, or there is a bug in Wget.
>
> Could you please give us a copy of that page, or at least the part of the HTML that contains the form tags?
>
> Regards, Tim
>
>> Regards,
>> Florian
Recursive downloading of pages through the "action" attributes of the following "form" tags
Hi folks,

I have a question about recursive downloading of web pages.

When trying to download all pages of a website with the recursive option (--recursive) of wget 1.21, the page processing does not seem to follow the "action" attributes of "form" tags.

- Is this the expected behavior?
- Is there a combination of options that downloads all pages of a website reachable through "action" attributes?

Example with 3 HTML pages:
- Page 1 - form.html: HTML form with an "action" attribute pointing to "Page 2".
- Page 2 - post.html: HTML page with a link to "Page 3".
- Page 3 - link.html: HTML page without any link.

I tried this command to download all three pages, but only "Page 1" was downloaded:

$ wget -r https://host/form.html

I also tried the "--follow-tags=form" option, but observed the same behavior.

Regards,
Florian
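
To make the three-page layout concrete, here is a minimal Python sketch of the traversal I expected: starting from form.html, follow both "href" links and form "action" attributes. It is not how wget is implemented. The page contents in PAGES are hypothetical stand-ins (the real markup is not shown here), and UrlCollector and crawl are names made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical page contents: stand-ins for the three test pages,
# NOT the real markup from the report.
PAGES = {
    "form.html": '<html><body><form action="post.html" method="post">'
                 '<input type="submit"></form></body></html>',
    "post.html": '<html><body><a href="link.html">link</a></body></html>',
    "link.html": "<html><body>No links here.</body></html>",
}


class UrlCollector(HTMLParser):
    """Collect <a href> and <form action> values from one HTML document."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Map each tag of interest to the attribute that carries its URL.
        wanted = {"a": "href", "form": "action"}.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.urls.append(value)


def crawl(start, base="http://127.0.0.1/"):
    """Breadth-first walk over PAGES, following links and form actions."""
    seen, queue = [], [start]
    while queue:
        page = queue.pop(0)
        if page in seen or page not in PAGES:
            continue
        seen.append(page)
        parser = UrlCollector()
        parser.feed(PAGES[page])
        for url in parser.urls:
            # Resolve relative URLs, then map back to the in-memory page name.
            absolute = urljoin(base + page, url)
            queue.append(absolute[len(base):])
    return seen


print(crawl("form.html"))
```

With these stand-in pages the walk visits form.html, post.html and link.html in that order, which is the result I expected from the recursive download.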