After rereading the wget manual (gnu.org/software/wget/manual/wget.html) I still don't see how to combine the following options:

1) fetch all linked pages and their requisites (style sheets, images, ...), starting from a given URL, ONLY if they belong to the same domain as that URL;

2) for externally linked pages, fetch only that single page (with its requisites) without crawling the rest of that site;

3) some regexp functionality to filter certain sites more finely.

The thing is that with '--domains=domain-list' you can restrict recursion to those domains, but you cannot allow for single external pages. Also, some sites carry the exact same information in more than one language, and you may not be interested in downloading every language variant.

Can you achieve this with wget?

Anyway, wget may not have been designed for exactly this. Do you know of any other open-source project that would, for example, consolidate many pages into one, or strip out all inline scripts from, say, googlesyndication.com and similar crud?

thanks lbrtchx
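For reference, the closest combination I can put together from the manual is sketched below. This is only an approximation, assuming GNU wget >= 1.14 (which added '--reject-regex'); the domain, start URL, and language regex are placeholders, not anything from the question. As I read the manual, '--domains' limits *recursion*, while '--page-requisites' together with '--span-hosts' still lets single off-domain requisites (images, style sheets) be fetched without crawling the other site. It does not cover point 2 for full external pages, only their requisites.

```shell
# Sketch, assuming GNU wget >= 1.14; example.com, the start URL and the
# language regex are hypothetical placeholders.
#
# --domains       : recursion stays on example.com
# --span-hosts    : but page requisites may come from other hosts
# --page-requisites: also fetch CSS, images, etc. for each page
# --reject-regex  : skip duplicate language variants (point 3);
#                   it matches against the full URL
wget --recursive --level=inf \
     --page-requisites \
     --span-hosts \
     --domains=example.com \
     --convert-links \
     --reject-regex='/(de|fr|es)/' \
     'https://example.com/start.html'
```

Whether '--span-hosts' plus '--page-requisites' behaves this way in all wget versions is worth verifying against the manual before relying on it.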
