Well, first off, you should probably add the various Apache FancyIndexing query strings (?C=M, ?C=N, etc.) to your exclude_urls patterns. Otherwise you'll index a slew of almost-duplicate documents.
I'm not very familiar with htdig or Apache. Could you clarify your answer?
Take a look at http://www.htdig.org/FAQ.html#q4.23. That FAQ entry gives a couple of solutions to the FancyIndexing problem.
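For example, something like the following in your config file should keep the sort links out of the index. The exact query strings depend on your Apache version (older servers emit ?N=, ?M=, ?S=, ?D= instead of ?C=...), so check what your directory listings actually generate; the /cgi-bin/ and .cgi entries shown are just the common defaults:

```
# htdig.conf -- skip Apache FancyIndexing sort links
# exclude_urls matches simple substrings, so "?C=" covers
# ?C=N, ?C=M, ?C=S and ?C=D in one pattern
exclude_urls: /cgi-bin/ .cgi ?C= ?N= ?M= ?S= ?D=
```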
As for your infinite loop, that does happen. That's why there's work on duplicate detection in 3.2, as well as the max_hop_count attribute:
http://www.htdig.org/attrs.html#max_hop_count
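A hop-count limit is just one line in the config file; the value below is purely illustrative, and you'd set it to whatever depth your site can plausibly reach:

```
# htdig.conf -- stop following links more than N hops
# away from any start_url (N=10 is only an example)
max_hop_count: 10
```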
So how do you propose preventing the cycles? I can't decrease max_hop_count because I don't know how deep my tree will be.
If you can neither fix the problem directly nor limit the dig by hop count, take a look at http://www.htdig.org/attrs.html#exclude_urls. If you know in advance where the loops will occur, you may be able to come up with URL patterns (e.g. repeated path segments such as /dir1/dir1/dir1) that exclude the offending URLs.
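Since exclude_urls patterns are plain substrings, a repeated-segment pattern is enough to break a symlink loop. /dir1/ here is just a stand-in for whatever directory actually loops on your server:

```
# htdig.conf -- break a known symlink cycle
# any URL containing the doubled segment is skipped,
# so /dir1/dir1/, /dir1/dir1/dir1/, etc. are all excluded
exclude_urls: /dir1/dir1/
```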
Jim
_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html

