Hello wget list,

A question that has been bugging me for quite some time…
If a site has a large number of hotlinked images, videos, etc., how could one perform an infinite recursive crawl that also fetches those hotlinked resources, without invoking -H, which would grab unwanted material and in some cases get way out of control?

Heritrix has an option for this: https://webarchive.jira.com/wiki/display/Heritrix/unexpected+offsite+content

HTTrack has one as well, via the --near flag: http://www.httrack.com/html/fcguide.html

This is essentially the only thing preventing me from using wget exclusively for my web archiving needs… am I missing something?

Thanks,
Ben
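P.S. For context, the closest approximation I've managed is combining -H with a domain whitelist. This only works when I already know every host the hotlinked material lives on (example.com and cdn.example.net below are placeholders, not real targets):

```shell
# Sketch of my current workaround: recurse over example.com,
# spanning hosts only to the domains listed in --domains.
# This misses hotlinked files on hosts I didn't anticipate,
# and listing too many domains risks exactly the runaway
# crawl I'm trying to avoid.
wget --recursive --level=inf \
     --page-requisites \
     --span-hosts --domains=example.com,cdn.example.net \
     http://example.com/
```

What I'm after is the Heritrix/HTTrack-style behavior: restrict recursion to the start host, but still grab embedded offsite resources one hop away.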
