Perhaps I'm the only one who did not know how to use wget to download
multiple .pdf files from a website rather than mirroring the whole site.
If others have also tried and failed, this information may be useful.

   After reading the curl and wget man pages, I tried various options to
download ~50 .pdf files from a website. All attempts failed.

   Web searches turned up many threads and blog posts showing how both
tools can download an entire site, but not how to fetch only .pdf, .jpg,
or other specific file types while skipping the HTML itself. Then I found
a thread on linuxquestions.org where a responder pointed out that wget
honors the site's robots.txt file by default, and that file was blocking
the downloads.
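   For illustration, a robots.txt like the hypothetical one below is all
it takes to stop a default wget recursion; wget fetches the file before
crawling and obeys its Disallow rules (the example.com URL and the rules
shown are made up, not taken from any real site):

```
# Hypothetical robots.txt served at http://www.example.com/robots.txt
User-agent: *
Disallow: /
```

A "Disallow: /" entry for all user agents tells well-behaved crawlers,
wget included, not to retrieve anything under the site root.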

   The original poster used that information to create the solution. Use this
command followed by the full URL:

wget --convert-links -r -A "*.pdf" -erobots=off http://www....

   It works!
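   For reference, here is the same command written out with each option
annotated; the URL is a placeholder (the original post elided it), and
the behavior described for each flag is my reading of the wget man page:

```shell
# -r              recurse through linked pages
# -A "*.pdf"      accept only files matching *.pdf; other files wget
#                 fetches while crawling are deleted after their links
#                 are followed
# --convert-links rewrite links in downloaded files so they work locally
# -e robots=off   execute "robots = off", i.e. ignore the site's robots.txt
wget -r -A "*.pdf" --convert-links -e robots=off http://www.example.com/
```

Note that "*.pdf" is quoted so the shell passes the pattern to wget
literally instead of expanding it against files in the current directory.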

Rich
_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug