On 22/09/15 19:10, El Gato wrote:
Hi, everyone.
I am having trouble with wget64 on Windows. I am using a batch script
to download files from a host:
@echo OFF
FOR /L %%i in (1, 1, 9999) DO (
cls
echo Downloading file %%i
wget64.exe -A pdf,chm -e robots=off --progress=bar --show-progress ^
  -r -np -nd -nc -HDit-ebooks.info,filepi.com --content-disposition -a ^
  wget.log it-ebooks.info/book/%%i/
)
|wget| downloads |index.html| (which I feel is unnecessary), then
proceeds to the hosted file and downloads it if the file does not
already exist at the destination. After that, however, it fails to
retrieve the |index.html| of the next book, so the next download never
starts.
Is it really necessary to download |index.html| and if that is the
case, how can I tell |wget| to erase and download the new one every time?
wget should be downloading index.html and then deleting it, since you
are only accepting pdf and chm files (it needs index.html to find the
links to those files). That is what it does here.
As a bit of unsolicited advice, I would recommend printing the URLs
(replace the body of the for loop with an echo) and feeding the list to
wget with "wget -i -". That way a single wget process can reuse the open
connection instead of you launching nearly 10000 instances (and
connecting to the server nearly 10000 times).
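
A rough sketch of what I mean, untested against that site. It writes the
list to a file rather than piping it to "wget -i -", which avoids cmd's
quirks with piped FOR loops. The file name urls.txt and the explicit
http:// scheme are my own assumptions; everything else is taken from your
script.

```batch
@echo OFF
REM Sketch only: urls.txt is a hypothetical file name, and the explicit
REM http:// scheme is an assumption (the original script omits it).
(FOR /L %%i in (1, 1, 9999) DO echo http://it-ebooks.info/book/%%i/) > urls.txt

REM A single wget process reads every URL from the list, so it can reuse
REM the connection instead of reconnecting once per book.
wget64.exe -A pdf,chm -e robots=off --progress=bar --show-progress ^
  -r -np -nd -nc -HDit-ebooks.info,filepi.com --content-disposition ^
  -a wget.log -i urls.txt
```

If the per-book "Downloading file N" message matters to you, wget.log
should still record each URL as it is fetched.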