Hello,

I want htdig to exclude URLs that contain the "?" query separator.
I am using the configuration file below, but such URLs are still
being indexed. I am using htdig 3.1.4. Is this a bug?

I know I can exclude such URLs in htsearch by setting the exclude
query string argument, but I also noticed that if I set it to
"? /graphics/" the exclusion no longer works.

Does anybody know what the problem is?


The command line is called by PHP like this:

REQUEST_METHOD=GET 
QUERY_STRING="words=forms&format=htdig&exclude=%3F+%2Fgraphics%2F&matchesperpage=10&method=or&page=1&sort=score"
 /usr/local/htdocs/htdig/cgi-bin/htsearch -c setup/htdig.conf
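
To make clear what value the exclude argument carries, here is a minimal
Python sketch that decodes the QUERY_STRING above with the standard
library. It only illustrates the URL decoding. Splitting the value on
whitespace is an assumption about how a list of patterns would be read,
not a statement of what htsearch does internally.

```python
from urllib.parse import parse_qs

# Query string copied from the command line above.
QUERY_STRING = (
    "words=forms&format=htdig&exclude=%3F+%2Fgraphics%2F"
    "&matchesperpage=10&method=or&page=1&sort=score"
)

# parse_qs decodes %XX escapes and turns "+" into a space.
params = parse_qs(QUERY_STRING)
exclude_value = params["exclude"][0]

# Assumption: the value is treated as a whitespace-separated pattern list.
exclude_patterns = exclude_value.split()

print(repr(exclude_value))
print(exclude_patterns)
```

The decoded value is "? /graphics/", a "?" and "/graphics/" separated by
one space.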

The configuration is this:

database_dir: /usr/local/htdig/db/test
start_url: http://local.test.org/test/
maintainer: [EMAIL PROTECTED]
search_algorithm: exact:1 synonyms:0.5 endings:0.1
exclude_urls: ?
limit_urls_to: http://local.test./test/
bad_extensions: .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
max_head_length: 10000
max_doc_size: 200000
no_excerpt_show_top: true
valid_punctuation: : .-_/!#$%^&*��
template_map: htdig htdig library/htdig_template.html
search_results_header: library/htdig_header.html
search_results_footer: 
nothing_found_file: library/htdig_nomatch.html
syntax_error_file: library/htdig_syntaxerror.html
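
For reference, the behaviour expected from "exclude_urls: ?" can be
sketched as follows, assuming that exclude_urls is a space-separated
list of substrings and that a URL is rejected when it contains any of
them. The function name and the sample URLs are hypothetical, and this
is not htdig's actual code.

```python
def is_excluded(url, patterns):
    """Return True if the URL contains any of the exclusion substrings."""
    return any(pattern in url for pattern in patterns)


# Pattern list as it would be read from "exclude_urls: ?".
patterns = "?".split()

# Hypothetical URLs under the start_url from the configuration above.
urls = [
    "http://local.test.org/test/index.html",
    "http://local.test.org/test/page.php?id=1",
]

for url in urls:
    print(url, "excluded" if is_excluded(url, patterns) else "indexed")
```

Under that assumption only the second URL, the one with a query string,
should be skipped.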


Regards,
Manuel Lemos

Web Programming Components using PHP Classes.
Look at: http://phpclasses.UpperDesign.com/?[EMAIL PROTECTED]
--
E-mail: [EMAIL PROTECTED]
URL: http://www.mlemos.e-na.net/
PGP key: http://www.mlemos.e-na.net/ManuelLemos.pgp
--

