Hello, 

I have a problem with boolean searches and numbers in Word documents...

I have indexed a folder with Word, Excel and PDF documents, and
everything worked fine. However, a boolean search with a search term like
"testword not 301" returns TWO Word documents which
contain both terms, the testword and the number. In the indexed
files there are 9 documents which contain the testword and the
number.
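
To make the expected behaviour concrete, here is a minimal sketch in Python. It is not ht://Dig code; the toy index and the file names are invented for illustration. It only shows the set logic I would expect from a query like "testword not 301": no document that contains the number should appear in the result.

```python
# Toy inverted index: word -> set of documents containing that word.
# The file names are hypothetical and only serve the illustration.
index = {
    "testword": {"a.doc", "b.doc", "c.xls", "d.pdf"},
    "301": {"a.doc", "b.doc", "e.pdf"},
}


def boolean_not(index, wanted, unwanted):
    """Return the documents that contain `wanted` but not `unwanted`."""
    return index.get(wanted, set()) - index.get(unwanted, set())


result = boolean_not(index, "testword", "301")
print(sorted(result))  # prints ['c.xls', 'd.pdf']
```

In this sketch a.doc and b.doc contain both terms and are excluded; in my real search two such Word documents still show up.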

I use the Debian woody htdig 3.1.6-3 package with doc2html.pl and catdoc.
Maybe somebody has an idea what is going wrong.

Ahoj Joerg xxx 

  

Here is part of my config file:

####CUT#### 

method_names: or Any and All boolean Boolean 
template_map:  builtin-long builtin-long builtin-long builtin-short builtin-short builtin-short
sort_names:  score Score time Time revscore Revscore revtime Revtime title A-Z revtitle Z-A

minimum_word_length: 2 

maximum_word_length: 30 

allow_numbers: true 

max_head_length:        10000 

max_doc_size: 20000000 

no_excerpt_show_top:    true 

locale: de_DE 

#search_algorithm:      exact:1 synonyms:0.5 endings:0.1 
search_algorithm:       exact:1 endings:0.1 substring:0.1 
#search_algorithm:      exact:1 endings:0.5 
lang_dir:             ${common_dir}/german 
bad_word_list:        ${lang_dir}/bad_words 
endings_affix_file:   ${lang_dir}/german.aff 
endings_dictionary:   ${lang_dir}/german.0 
endings_root2word_db: ${lang_dir}/root2word.db 
endings_word2root_db: ${lang_dir}/word2root.db 

external_parsers: application/msword->text/html /opt/htdig/bin/doc2html.pl \
                  application/vnd.ms-excel->text/html /opt/htdig/bin/doc2html.pl \
                  application/pdf->text/html /opt/htdig/bin/doc2html.pl

debian_pdf_parser: xpdf 

####CUT##### 




_______________________________________________
ht://Dig general mailing list: <[EMAIL PROTECTED]>
ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
List information (subscribe/unsubscribe, etc.)
https://lists.sourceforge.net/lists/listinfo/htdig-general
