On Monday 21 October 2013 12:33:10 Alexander Tobias Heinrich wrote:
> For example, I tried:
> wget --tries=3 --retry-connrefused --no-clobber --load-cookies=cookies.txt
> --convert-links --page-requisites --adjust-extension --recursive
> --include-directories /strategy/live-poker,/download
> http://www.pokerstrategy.com/strategy/live-poker
>
> This correctly downloads only the html documents I want and also gets the
> media files from the /download folder, but:
> - does not modify the html so that <img>-Tags point to the downloaded files
> (however, it does modify <a>-Tags that link to local html documents)
> - does not get media files from other domains.
>
> If for example I add --span-hosts, it simply gets too much (all documents
> from different language versions of the website that I don't need).
>
> Note: For the example URL I provided here you won't need to log in and thus
> the load-cookies option can be waived.

Hi Alexander,

please have a look at the 'Recursive Accept/Reject Options' section of the docs. You can restrict the domains to be followed with --domains. --include-directories and/or --exclude-directories might also help.

I am not sure that you can achieve your goal with a single call to Wget. Missing files or directories could be downloaded with separate calls to Wget; --input-file combined with --force-html and/or --base might help here.

Regards, Tim
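
A minimal, untested sketch of the two-call approach suggested above. The host media.example.com and the file missing.html are hypothetical placeholders (the real media host and any locally saved page are not named in this thread), and whether the directory filters let the media files through would need checking against the actual site. The script only prints the Wget commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints the Wget commands by default; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

SITE_HOST="www.pokerstrategy.com"
MEDIA_HOST="media.example.com"   # hypothetical: replace with the real media host

# Call 1: recurse, allow host spanning, but only into the listed domains.
run wget --tries=3 --retry-connrefused \
    --convert-links --page-requisites --adjust-extension --recursive \
    --span-hosts --domains="$SITE_HOST,$MEDIA_HOST" \
    --include-directories=/strategy/live-poker,/download \
    "http://$SITE_HOST/strategy/live-poker"

# Call 2: fetch whatever is still missing from links in a saved HTML file.
# missing.html is a hypothetical local file; --base resolves its relative links.
run wget --tries=3 --page-requisites \
    --force-html --base="http://$SITE_HOST/" \
    --input-file=missing.html
```

Restricting --span-hosts with --domains is meant to avoid pulling in the other language versions, assuming they live on different hosts; if they share the host, --exclude-directories would be the option to try instead.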
