I'm having a problem where my search results are out of date with
respect to the site, even though htdig is definitely running, is
definitely fetching the files from the web server, and is not reporting
errors. Perhaps I am misunderstanding what an update dig does? I thought
that it checked every document in its database, rescanned each one if it
had changed, followed any links to new documents, and removed a document
if it got a 404.

I run htdig and htmerge with the -a command-line option. I then move the
*.docdb.work, *.docs.index.work and *.words.db.work files to *.docdb,
*.docs.index and *.words.db respectively. I don't actually use
wildcards; the *s are just there because I have different databases for
different sites. I then copy the *.docdb file back to *.docdb.work so
that it is there for the next update dig. The *.wordlist.work file is
left alone, ready for the next update.
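
To make the procedure concrete, the file rotation described above can be
sketched roughly as follows. The function name rotate_htdig_db and the
config file path are hypothetical placeholders, not anything htdig
provides; the base path is the database_base value from the config
further down.

```shell
#!/bin/sh
# Sketch of the rotation step only. rotate_htdig_db is a hypothetical
# helper; its argument is the database base path for one site.

rotate_htdig_db() {
    base="$1"
    # Promote the freshly built work files to the live names.
    mv "$base.docdb.work"      "$base.docdb"      || return 1
    mv "$base.docs.index.work" "$base.docs.index" || return 1
    mv "$base.words.db.work"   "$base.words.db"   || return 1
    # Copy the document database back so the next update dig starts from it.
    cp "$base.docdb" "$base.docdb.work" || return 1
    # $base.wordlist.work is deliberately left where it is.
}

# Intended sequence (commented out; the config path is an assumption):
# htdig   -a -c /path/to/lancashire.conf
# htmerge -a -c /path/to/lancashire.conf
# rotate_htdig_db /export2/www/nq/htdig-db/lancashire
```

Stopping at the first failed mv or cp means a half-finished dig never
partially replaces the live files.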

Does that procedure sound correct? All the pages on the sites use
server-side includes, and hence don't have Last-Modified: headers. Could
that be confusing matters?
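
One way to confirm the missing header is to look at the response headers
for a page directly. This is only a sketch: has_last_modified is a
hypothetical helper that reads raw HTTP headers on standard input, and
the commented-out example assumes curl is available.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the headers on stdin include a
# Last-Modified line (header names are matched case-insensitively).
has_last_modified() {
    grep -qi '^Last-Modified:'
}

# Example use (commented out because it needs network access):
# curl -sI http://www.thisislancashire.co.uk/lancashire/ | has_last_modified \
#     && echo "Last-Modified present" || echo "Last-Modified absent"
```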

I have been running tail -f on *.wordlist.work while htdig is running,
and it just seems to be adding lines like

+707
+494
+689
+495
+709
+478
+1072
+504

rather than any new words. This seems odd to me, but then I have never
done this before. After htmerge has finished, those lines aren't there
any more.

I'm running htdig 3.1.5, compiled with gcc 2.8.1 on Solaris 2.6. I've
attached one of my config files, with the comments removed to save
space.

Adam Rice
database_dir:           /export2/www/nq/htdig-db

start_url:              http://www.thisislancashire.co.uk/lancashire/

limit_urls_to:          http://www.thisislancashire.co.uk/lancashire/

database_base:          ${database_dir}/lancashire

search_results_wrapper: /export/www/nq/htdig-test/templates/lancashire/search_results.html
nothing_found_file:     /export/www/nq/htdig-test/templates/lancashire/nothing_found.html
syntax_error_file:      /export/www/nq/htdig-test/templates/lancashire/syntax_error.html

exclude_urls:           /cgi-bin/ .cgi (old) (updating) xml /archive/ /thisisunited/

maintainer:             [EMAIL PROTECTED]


max_head_length:        256

use_meta_description:   true

keywords_meta_tag_names:        htdig-keywords

excerpt_show_top:       yes

date_format:            %d %B %Y

allow_numbers:          yes

translate_amp:          true
translate_lt_gt:        true
translate_quot:         true

max_doc_size:           2000000

search_algorithm:       exact:1 synonyms:0.5 endings:0.1

template_map: Long long ${common_dir}/long.html \
                Short short ${common_dir}/short.html \
                Tidy tidy ${common_dir}/newsquest.html
template_name: tidy

next_page_text:         <img src=/htdig/buttonr.gif border=0 align=middle width=30 height=30 alt=next>
no_next_page_text:
prev_page_text:         <img src=/htdig/buttonl.gif border=0 align=middle width=30 height=30 alt=prev>
no_prev_page_text:
page_number_text:       "<img src=/htdig/button1.gif border=0 align=middle width=30 height=30 alt=1>" \
                        "<img src=/htdig/button2.gif border=0 align=middle width=30 height=30 alt=2>" \
                        "<img src=/htdig/button3.gif border=0 align=middle width=30 height=30 alt=3>" \
                        "<img src=/htdig/button4.gif border=0 align=middle width=30 height=30 alt=4>" \
                        "<img src=/htdig/button5.gif border=0 align=middle width=30 height=30 alt=5>" \
                        "<img src=/htdig/button6.gif border=0 align=middle width=30 height=30 alt=6>" \
                        "<img src=/htdig/button7.gif border=0 align=middle width=30 height=30 alt=7>" \
                        "<img src=/htdig/button8.gif border=0 align=middle width=30 height=30 alt=8>" \
                        "<img src=/htdig/button9.gif border=0 align=middle width=30 height=30 alt=9>" \
                        "<img src=/htdig/button10.gif border=0 align=middle width=30 height=30 alt=10>"
no_page_number_text:    "<img src=/htdig/button1.gif border=2 align=middle width=30 height=30 alt=1>" \
                        "<img src=/htdig/button2.gif border=2 align=middle width=30 height=30 alt=2>" \
                        "<img src=/htdig/button3.gif border=2 align=middle width=30 height=30 alt=3>" \
                        "<img src=/htdig/button4.gif border=2 align=middle width=30 height=30 alt=4>" \
                        "<img src=/htdig/button5.gif border=2 align=middle width=30 height=30 alt=5>" \
                        "<img src=/htdig/button6.gif border=2 align=middle width=30 height=30 alt=6>" \
                        "<img src=/htdig/button7.gif border=2 align=middle width=30 height=30 alt=7>" \
                        "<img src=/htdig/button8.gif border=2 align=middle width=30 height=30 alt=8>" \
                        "<img src=/htdig/button9.gif border=2 align=middle width=30 height=30 alt=9>" \
                        "<img src=/htdig/button10.gif border=2 align=middle width=30 height=30 alt=10>"


