On 26/06/13 00:20, Yiwei Yang wrote:
> Hi,
> I'm trying to understand how wget -p finds "everything that supports
> the web page" to be downloaded. Could someone point me to where I could
> find this part of the code in the wget source? Thank you!
> Lucy
See src/html-url.c
The interesting tags are described at known_tags (line 91), whose
attributes are listed at tag_url_attributes (line 140).
The real work is done by src/html-parse.c, but you can treat it as a
black box implementing map_html_tags().
get_urls_html() calls map_html_tags(), which then calls collect_tags_mapper()
for each of the tags we marked as interesting.
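
If it helps to see the shape of that flow, here is a minimal sketch of the
callback pattern. It is not the wget code: struct attr, struct tag, url_attrs,
map_tags(), collect_urls() and demo_collect() are all names made up for this
illustration, the tag/attribute table is a tiny invented subset, and the
"parser" is handed tags that are already parsed.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT the real wget definitions. */
struct attr { const char *name; const char *value; };
struct tag  { const char *name; const struct attr *attrs; int nattrs; };

/* Which attribute of which tag carries a URL (illustrative subset only). */
static const struct { const char *tag; const char *attr; } url_attrs[] = {
  { "img",    "src"  },
  { "link",   "href" },
  { "script", "src"  },
};

#define MAX_URLS 16
struct url_list { const char *urls[MAX_URLS]; int count; };

typedef void (*tag_mapper) (const struct tag *, void *);

/* Stand-in for the parser: the tags are already parsed here, so this only
   invokes the callback once per tag. */
static void
map_tags (const struct tag *tags, int ntags, tag_mapper mapper, void *arg)
{
  for (int i = 0; i < ntags; i++)
    mapper (&tags[i], arg);
}

/* Callback: record the value of every attribute listed in url_attrs. */
static void
collect_urls (const struct tag *t, void *arg)
{
  struct url_list *out = arg;
  size_t n = sizeof url_attrs / sizeof url_attrs[0];

  for (size_t i = 0; i < n; i++)
    {
      if (strcmp (t->name, url_attrs[i].tag) != 0)
        continue;
      for (int j = 0; j < t->nattrs; j++)
        if (strcmp (t->attrs[j].name, url_attrs[i].attr) == 0
            && out->count < MAX_URLS)
          out->urls[out->count++] = t->attrs[j].value;
    }
}

/* Driver: runs the mapper over a small fixed "document" and returns the
   number of URLs collected. */
int
demo_collect (struct url_list *out)
{
  static const struct attr img_attrs[]  = { { "src", "logo.png" },
                                            { "alt", "Logo" } };
  static const struct attr a_attrs[]    = { { "href", "other.html" } };
  static const struct attr link_attrs[] = { { "rel", "stylesheet" },
                                            { "href", "style.css" } };
  static const struct tag tags[] = {
    { "img",  img_attrs,  2 },
    { "a",    a_attrs,    1 },
    { "link", link_attrs, 2 },
  };

  out->count = 0;
  map_tags (tags, 3, collect_urls, out);
  return out->count;
}
```

Calling demo_collect() on a struct url_list collects the img src and the
link href; the a tag is skipped only because this toy table does not list it.
The real tables in html-url.c are larger and carry extra flags, so read them
for the actual behaviour.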
Regards