Hello,

When mirroring a website, how do I download just the HTML content (whether static, PHP, ASP, etc.) and ignore images, CSS, JS, and everything else? At first I thought of creating an accept list, but I can't rely on the file extension because many HTML pages don't have one (e.g. http://en.wikipedia.org/wiki/Foo). Then I thought of a reject list, but there are so many different kinds of non-HTML content. A rough sketch of the two approaches I had in mind is below.
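For reference, this is roughly what I was considering (the URL is a placeholder and the extension lists are only examples, clearly not exhaustive):

    # Reject-list idea: skip common asset extensions, but the list can never cover everything
    wget --mirror --reject 'jpg,jpeg,png,gif,css,js,ico,pdf,zip' http://example.com/

    # Accept-list idea: misses extensionless pages such as /wiki/Foo
    wget --mirror --accept 'html,htm,php,asp,aspx' http://example.com/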
Is there a way to do this with wget?

Thanks,
Richard
