Greetings Neal,

On Sat, 4 Oct 2003 11:00, Neal Richter wrote:
> If the timestamps are the same we don't bother to download it.
>
> > I think you misinterpreted what Lachlan suggested, i.e. the case
> > where Y does NOT change.  If Y is the only document with a link
> > to X, and Y does not change, it will still have the link to X, so
> > X is still "valid".  However, if Y didn't change, and htdig
> > (without -i) doesn't reindex Y, then how will it find the link to
> > X to validate X's presence in the db?
>
> Changing Y is the point!

Agreed, changing Y is what triggers the current bug.  However, I
believe that a simple fix of the current bug will introduce a *new*
bug for the more common case that Y *doesn't* change.  Reread
Gilles's scenario and try to answer his question.  I'd explain it
more clearly, but I don't have a napkin handy :)

If we get around to implementing Google's link analysis, as Geoff
suggested, then we may be able to fix the problem properly.  It seems
that any fix will have to look at all links *to* a page, and then
mark as "obsolete" those *links* where (a) the link-from page ("Y")
is changed and (b) it no longer contains the link.  After the dig,
all pages must be checked (in the database), and those with no links
which are not obsolete can themselves be marked as obsolete.

> However I would strongly recommend we enable head_before_get by
> default.  We're basically wasting bandwidth like drunken sailors
> with it off!!!

Good suggestion.  If we want some code bloat, we could have an "auto"
mode, which would use head_before_get unless -i is specified, but not
when -i is specified (since we'll always have to do the "get"
anyway)...

Cheers,
Lachlan

-- 
[EMAIL PROTECTED]
ht://Dig developer DownUnder  (http://www.htdig.org)

_______________________________________________
ht://Dig Developer mailing list: [EMAIL PROTECTED]
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-dev
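
For what it's worth, the obsolete-link idea in the message above can be sketched roughly as follows. This is not htdig code and does not reflect its database layout: the Link record, both function names, and the start_urls parameter are made up for illustration, and Python is used only for brevity. It assumes the database keeps one record per link, and that start URLs must be exempt because nothing links to them.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical link record; not an htdig structure."""
    source: str            # the link-from page ("Y")
    target: str            # the linked-to page ("X")
    obsolete: bool = False


def mark_obsolete_links(links, changed_pages, current_outlinks):
    """Mark links where (a) the source page changed and
    (b) the source page no longer contains the link.

    changed_pages:    set of URLs that were re-fetched because they changed
    current_outlinks: dict mapping each changed URL to the set of URLs
                      it links to now
    Links from unchanged pages are left alone, so an unchanged Y keeps
    X valid even though Y was not reindexed.
    """
    for link in links:
        if link.source not in changed_pages:
            continue
        if link.target not in current_outlinks.get(link.source, set()):
            link.obsolete = True


def find_obsolete_pages(pages, links, start_urls):
    """After the dig, return the pages that have no non-obsolete link
    pointing to them. Start URLs are exempt (an assumption of this sketch)."""
    live_targets = {link.target for link in links if not link.obsolete}
    return {p for p in pages if p not in live_targets and p not in start_urls}
```

In the scenario under discussion: if Y is unchanged, its link to X is never touched and X stays valid; if Y changes and drops the link, the link is marked obsolete and X is reported by find_obsolete_pages. The sketch makes a single pass only. It does not cascade, so links *from* a page that has just become obsolete are not themselves marked obsolete.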