On Tue, Feb 18, 2003 at 01:33:05PM +0300, Aleksey Serba wrote:
> I'm using rundig to reindex.
> http://htdig.org/rundig.html
>     Rundig uses the "-i" option to htdig, so it always reindexes your web
>     site from scratch when you run it.

I never use rundig. I don't trust it. ;) I had the same problem you were
having on a French crawl of a web site. The settings below are what I had
to change (and then I had to run htdig + htfuzzy + htmerge about 5 times
to get the excerpts to actually show up in the search results page). You
may also want to delete your databases manually before you start, to make
sure they really are gone.
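
A minimal sketch of that manual routine (clear the old databases, then run
the three tools by hand instead of through rundig) might look like the
following. The config path, the database directory, the "db.*" file
pattern, and the choice of fuzzy algorithms are all assumptions for
illustration, so adjust them to your install. The order htdig, htmerge,
htfuzzy follows what the stock rundig script does, as far as I recall. It
defaults to a dry run so nothing is deleted until you have checked the
output.

```python
import glob
import os
import subprocess


def reindex_commands(config_path, fuzzy_algorithms=("endings", "synonyms")):
    """Build the command lines for one full reindex pass.

    The fuzzy algorithm names are an assumption; use whichever ones your
    search_algorithm setting actually needs.
    """
    commands = [
        ["htdig", "-i", "-c", config_path],   # -i: initial dig, from scratch
        ["htmerge", "-c", config_path],
    ]
    for algorithm in fuzzy_algorithms:
        commands.append(["htfuzzy", "-c", config_path, algorithm])
    return commands


def reindex(config_path, database_dir, dry_run=True):
    """Delete old database files, then run the reindex commands in order."""
    # Assumption: the databases live in database_dir and are named db.*
    stale = sorted(glob.glob(os.path.join(database_dir, "db.*")))
    commands = reindex_commands(config_path)
    if dry_run:
        for path in stale:
            print("would remove", path)
        for command in commands:
            print("would run", " ".join(command))
        return commands
    for path in stale:
        os.remove(path)
    for command in commands:
        subprocess.run(command, check=True)  # stop at the first failure
    return commands


if __name__ == "__main__":
    # Hypothetical paths; replace with your own before turning off dry_run.
    reindex("/etc/htdig/htdig.fr.conf", "/var/lib/htdig/db", dry_run=True)
```

Call reindex(..., dry_run=False) only once the dry-run listing shows the
files and commands you expect.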

This is the full config file:
http://xtrinsic.com/geek/articles/workfiles/htdig.fr.txt

I'm pretty sure these are the only relevant parts:
max_head_length:        50000000  # english is 10x bigger
                                  # and it's big enough
excerpt_show_top: false           # true=show top, false=show word
no_excerpt_show_top: false        # if word not found show text below
no_excerpt_text: (Aucun des critères de recherche n'a \
        été trouvé au début de ce document.)


I'm pretty sure max_doc_size affects whether or not a search result is
found at all, not whether or not it displays. It's currently set to a
value larger than max_head_length:

# To limit network connections, ht://Dig will only pull up to a certain
# limit of bytes. This prevents the indexing from dying because the server
# keeps sending information.
#
max_doc_size:           200000000
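
If you want to check how the two limits relate without counting zeros by
eye, a small script along these lines can do it. This is only a sketch:
it is not part of ht://Dig, the function name is made up for this example,
and it only understands simple "name: number" lines, ignoring anything
after the digits.

```python
import re

# Matches "attribute_name: 12345" at the start of a line. Anything after the
# digits (for example a trailing "# comment") is ignored.
NUMERIC_ATTR = re.compile(r"^\s*([A-Za-z_]\w*)\s*:\s*(\d+)")


def numeric_attributes(config_text):
    """Return {name: int} for every numeric attribute in the config text."""
    values = {}
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # whole-line comment
        match = NUMERIC_ATTR.match(line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


# The two settings quoted in this message.
SAMPLE = """\
max_head_length:        50000000  # english is 10x bigger
# To limit network connections, ht://Dig will only pull up to a certain
max_doc_size:           200000000
"""

attrs = numeric_attributes(SAMPLE)
head = attrs["max_head_length"]
doc = attrs["max_doc_size"]
if head > doc:
    print("max_head_length (%d) exceeds max_doc_size (%d)" % (head, doc))
else:
    print("max_doc_size (%d) >= max_head_length (%d)" % (doc, head))
```

To run it against a real file, read the file and pass its contents to
numeric_attributes() in place of SAMPLE.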

-- 
Emma Jane Hogbin
[[ 416 417 2868 ][ www.xtrinsic.com ]]

