According to Geoff Hutchison:
> On Tue, 20 Mar 2001, Gilles Detillieux wrote:
> > It turns out that htdig does a depth-order traversal of the document tree,
> > so really the hop count should always be increasing, never decreasing.
>
> <sigh> It's been ages since I caught myself in loops (pun intended) with
> hopcounts. Alas, it is not quite so simple. In part, servers can refer to
> each other.
>
> www.foo.com -> 1.html -> 2.html
> www.bar.com -> www.foo.com/2.html (oops!)
>
> It's complicated because with multiple servers, we don't always do an
> exact depth-first search. For example, with 3.2 we can index a few
> documents in a row on one server before jumping to another, which is great
> for HTTP performance, but... So the servers keep URLs in a priority queue
> by hopcount.

Oh, right! I was just thinking of the simple case of a single server.
I think in this case, the hopcount should always increase or stay the
same, but never decrease.
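
To illustrate the single-server case, here is a rough sketch in Python.
It is not htdig's actual C++ code: crawl_order and the toy link map are
made up for illustration. It keeps one queue ordered by hopcount and
records the order in which documents come off it.

```python
import heapq

def crawl_order(links, start):
    """Return (url, hopcount) pairs in the order a hopcount-keyed queue visits them."""
    best = {start: 0}        # lowest hopcount seen so far for each URL
    queue = [(0, start)]     # min-heap keyed by hopcount
    done = set()
    visited = []
    while queue:
        hops, url = heapq.heappop(queue)
        if url in done:
            continue         # stale entry; already visited at a lower hopcount
        done.add(url)
        visited.append((url, hops))
        for target in links.get(url, []):
            # A link only ever affects the referenced document's hopcount.
            if target not in best or hops + 1 < best[target]:
                best[target] = hops + 1
                heapq.heappush(queue, (hops + 1, target))
    return visited

# Hypothetical single-server link structure.
links = {
    "/": ["/1.html", "/a.html"],
    "/1.html": ["/2.html"],
    "/a.html": ["/2.html", "/"],
    "/2.html": [],
}
order = crawl_order(links, "/")
print(order)
```

The hop counts come off the queue as 0, 1, 1, 2, so within this one
queue they increase or stay the same, but never decrease.
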
> In 3.1, URLs are put on the queue in a semi-haphazard fashion. Let's
> continue the example above. We put foo.com/1 onto the queue (hop 1), then
> go to bar.com and add some URLs, including foo.com/2 (also hop 1). We go
> back to foo.com to index 1.html and then we hit the problem in question.
>
> > In 3.2.0b3, Geoff tried to fix it, but IMHO ended up breaking it even
> > more, with this patch: "http://www.htdig.org/mail/1998/11/0345.html".
>
> I believe that's 3.1.0b3.

Oops, yes.

> And the previous code is obviously wrong as the
> example above indicates. (We'd suddenly give 2.html credit for a longer
> path!?) But I think I was imagining a non-existent possibility.
>
> > hop count to drop? It seems any change should be to the referenced
> > document, not to the current one. Can you let me know if my patch
> > breaks anything?
>
> Your code is correct, but I still need to think about loops for a bit
> more. It's good you brought it up since I see some other cleanups in
> there...

OK, I'll have to have a good look at the new Retriever code to see what I
can backport.
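
For reference, the rule I have in mind for the two-server example can be
sketched as below. This is only an illustration in Python, not the patch
itself. record_link is a hypothetical helper, and I am assuming that both
server root pages start at hop 0 and that 1.html and 2.html in the
example live at www.foo.com/1.html and www.foo.com/2.html.

```python
def record_link(hopcount, current, target):
    """Note a link current -> target; only the target's hop count may change."""
    candidate = hopcount[current] + 1
    if target not in hopcount or candidate < hopcount[target]:
        hopcount[target] = candidate   # first path seen, or a shorter one
    return hopcount[target]

# Assumed starting points: both server roots at hop 0.
hopcount = {"www.foo.com/": 0, "www.bar.com/": 0}

# Replay the order of events from the example.
record_link(hopcount, "www.foo.com/", "www.foo.com/1.html")        # 1.html queued at hop 1
record_link(hopcount, "www.bar.com/", "www.foo.com/2.html")        # 2.html queued at hop 1
record_link(hopcount, "www.foo.com/1.html", "www.foo.com/2.html")  # candidate is 2, existing 1 wins

print(hopcount)
```

After the third call, 2.html keeps its hop count of 1 from the shorter
path through www.bar.com, and the hop count of 1.html, the current
document, is left alone.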
--
Gilles R. Detillieux E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre WWW: http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba Phone: (204)789-3766
Winnipeg, MB R3E 3J7 (Canada) Fax: (204)789-3930