On Mon, 28 Jan 2002, Neal Richter wrote:
> The more user-side interactive the page is, generally the worse off you
> will be.
This essentially sums up the problem. A spider cannot emulate a human user. Even if you assume a spider could parse and run the JavaScript (or whatever else drives the page), that doesn't mean it can actually use it for navigation. How is it supposed to know that one drop-down menu contains relative URLs while another holds something else entirely?

> After reading the Retriever & HTML parsing code, htdig pretty much treats
> web pages as documents-to-parse, and not programs-to-run. So without good
> default behavior, it may not be too successful on pages with dynamic
> content for navigation purposes.

Beyond everything that's been discussed so far, I'll interject that there are perfectly good ways of pointing a spider at URLs even if you use some sort of dynamic navigation. For example, the <LINK> tag:

<http://www.w3.org/TR/html4/struct/links.html#h-12.3>

In particular, the HTML specifications have a section about "Links and search engines." :-)

-- 
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/

_______________________________________________
htdig-general mailing list
<[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
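As an illustration of the <LINK> approach (the filenames here are made up, and the rel values are taken from the HTML 4.01 link-type list), a page whose visible navigation is a JavaScript menu can still expose plain URLs in its <HEAD> for a spider to follow:

```html
<!-- Sketch only: page2.html and sitemap.html are hypothetical names. -->
<HTML>
<HEAD>
  <TITLE>Chapter 1</TITLE>
  <!-- "next" and "contents" are standard HTML 4.01 link types -->
  <LINK rel="next" href="page2.html">
  <LINK rel="contents" href="sitemap.html">
</HEAD>
<BODY>
  <!-- The visible navigation here might be a JavaScript drop-down a
       spider cannot run; the LINK elements above give it ordinary
       URLs to index anyway. -->
</BODY>
</HTML>
```

A spider that understands <LINK> can harvest those hrefs from the document head without ever executing the page's scripts.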

