Thanks! I will try this when I get time...
Here is a note: I am using the --spider option now, and it looks like it still downloads each file, saves it to disk, and then removes it when it is done. I don't mind that, but it doesn't match the documentation.

Also, when I use wget in spider mode, the end of the log output lists all the broken links. But I also need to know which page each broken link appears on (when that page is on the site I am crawling). That would help me track down the 404s on my site. I have a vision for how this should work to make it awesome.

Is there any way to do that, or does anyone want to add this functionality?

Thanks!

On Sat, Jul 23, 2011 at 7:12 AM, Giuseppe Scrivano <[email protected]> wrote:

> Hello,
>
> Patrick Steil <[email protected]> writes:
>
> > If I run this command:
> >
> > wget www.domain.org/news?page=1 options= -r --no-clobber --html-extension
> > --convert-links -np --include-directories=news
> >
> > Here is what it does today:
> >
> > 1. When --html-extension is turned on, the --noclobber is not changing the
> > name of the downloaded files, but it DOES rewrite the file as the date/time
> > stamp changes every time I run the above command.
>
> I couldn't reproduce it. I have `strace'd but I can't see any syscall
> which could modify the time stamp. Can you please attach the strace
> and the wget debug log? You can get it by:
>
> strace -o strace.log wget <args> -d -o wget.log
>
> > 2. If I turn off --html-extension, then as soon as WGET sees that the first
> > file has already been downloaded it stops and does not continue to
> > spider/download any further pages.
>
> AFAICS, the behaviour you get using --no-clobber and -r is documented,
> and it should work exactly as you described it (a newer version is
> ignored). The old version is still traversed for links.
>
> Cheers,
> Giuseppe

--
Patrick Steil | ChurchBuzz.org
Church Website Optimization <http://www.churchbuzz.org/>
Like us on Facebook <http://facebook.com/churchbuzz>!
Mobile: 940-391-9250
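
To illustrate the broken-link report asked about in the message above, here is a minimal Python sketch that records, for every broken URL, the pages that link to it. It is not wget functionality and does not parse wget's log. It is a small standalone crawler, and the names `LinkExtractor`, `fetch`, and `find_broken_links` are hypothetical helpers invented for this example. It assumes every fetched page is HTML and treats any status other than 200 as broken.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse
import urllib.error
import urllib.request


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch(url):
    """Return (status, body_text); status is None if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""
    except urllib.error.URLError:
        return None, ""


def find_broken_links(start_url, fetcher=fetch, max_pages=500):
    """Crawl from start_url and map each broken URL to the pages linking to it.

    Every discovered link is checked once, but only pages on the starting
    host are parsed for further links.
    """
    host = urlparse(start_url).netloc
    referrers = defaultdict(set)   # target URL -> pages that link to it
    status_of = {}                 # URL -> status of the one fetch we made
    queue = [start_url]
    while queue and len(status_of) < max_pages:
        url = queue.pop(0)
        if url in status_of:
            continue
        status, body = fetcher(url)
        status_of[url] = status
        if status != 200 or urlparse(url).netloc != host:
            continue
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).scheme not in ("http", "https"):
                continue
            referrers[target].add(url)
            if target not in status_of:
                queue.append(target)
    return {u: sorted(referrers[u])
            for u, s in status_of.items() if s != 200 and u in referrers}
```

Calling `find_broken_links("http://www.domain.org/news?page=1")` would return a dictionary whose keys are the broken URLs and whose values are the referring pages. The `fetcher` parameter is there so that the crawl logic can be exercised without network access.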
