This works better, since many files are served with additional info appended to the name (for example a query string). Without the trailing wildcard, wget downloads each such file and then deletes it because the name no longer matches the accept pattern.
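When an -A pattern contains a wildcard, wget matches it as a shell-style glob against the whole remote filename. A quick sh sketch (the filenames here are made-up examples, and match is just a stand-in helper) shows why the broader pattern keeps files that carry a suffix:

```shell
# Stand-in for wget's -A glob matching: pattern vs. remote filename.
match() {
  case "$2" in
    $1) echo "accepted" ;;
    *)  echo "rejected" ;;
  esac
}

match "*.pdf"  "report.pdf"            # accepted
match "*.pdf"  "report.pdf?version=2"  # rejected -- downloaded, then deleted
match "*.pdf*" "report.pdf?version=2"  # accepted
```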
wget --convert-links -r -A "*.pdf*" -erobots=off http://www....

On Sat, 2015-06-20 at 09:19 -0700, Rich Shepard wrote:
> Perhaps I'm the only one who did not know how to use wget to download
> multiple .pdf files from a website rather than the site itself. If others
> also have tried and failed this information may be useful.
>
> After reading the curl and wget man pages I tried various options to
> download ~50 .pdf files from a web site. All attempts failed.
>
> Web searches revealed many threads and blogs that showed how both can be
> used to download an entire site, but not just .pdf or .jpg or other file
> types and not the html itself. Then I found a thread on linuxquestions.org
> where a responder pointed out that the robots.txt file prevented file
> downloads.
>
> The original poster used that information to create the solution. Use this
> command followed by the full URL:
>
> wget --convert-links -r -A "*.pdf" -erobots=off http://www....
>
> It works!
>
> Rich
> _______________________________________________
> PLUG mailing list
> [email protected]
> http://lists.pdxlinux.org/mailman/listinfo/plug

_______________________________________________
PLUG mailing list
[email protected]
http://lists.pdxlinux.org/mailman/listinfo/plug
