On Tue, Feb 18, 2003 at 12:57:30PM -0600, Gilles Detillieux wrote:
> Actually, in the values above, max_doc_size is 4 times the value of
> max_head_length, not a smaller value.  That makes sense, though, as
> the document "head" (i.e. the chunk of plain text extracted from the
> start of the document) would never be larger than max_doc_size.

*laugh* I appear to have issues in all my emails today. I might as well
just head back to bed as the extra coffee doesn't seem to have helped!
I think in one instance of troubleshooting I actually made my "head"
larger than the "doc_size" to force the pages into submission. In the
end it was manually deleting databases that got the results to appear
the way I wanted them to.

> The rule of thumb is that max_doc_size should be at least as large as
> the largest file you want to index, and max_head_length is somewhere
> between 0 and max_doc_size, depending on how important it is to you
> to make sure an excerpt containing the search word can be displayed
> (vs. how much disk space you're willing to use up to accomplish that).

Because I'm indexing a discussion forum, the right values for
max_doc_size and max_head_length are moving targets...
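
For reference, the rule of thumb quoted above could be written as a
config fragment along these lines. This is only a sketch: the numbers
are made-up placeholders (not the values from earlier in this thread),
chosen so that max_doc_size is 4 times max_head_length, and they would
need revisiting as the forum's largest pages grow.

```
# Illustrative values only; not the settings discussed in this thread.

# Upper limit on the amount of data retrieved per document.
# Should be at least as large as the largest page you want indexed.
max_doc_size: 400000

# How much text from the start of each document is stored for excerpts.
# Somewhere between 0 and max_doc_size; larger values cost more disk space.
max_head_length: 100000
```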

And while I'm here making absurd statements about The Way Things
Work...does max_head_length count any content that is included in
htdig_noindex sections? What about max_doc_size?

The config file info says:
http://htdig.org/attrs.html#max_doc_size
"This is the upper limit to the amount of data retrieved for documents."
Including noindex? Or omitting noindex?

And you know what else I'd like? A library of config files (start_urls
probably ought to be omitted so that people don't index other people's
sites by mistake). I learn a LOT from reading other people's configs...

emma :)

-- 
Emma Jane Hogbin
[[ 416 417 2868 ][ www.xtrinsic.com ]]


