According to Dennis & Marlis Merbach:
> it seems there is something wrong with pdf indexing on my machine:
> 
> pdf documents get indexed by htdig (at least it lists them in -v mode with 
> filesize and without protest) but never show up in the search results.

A listing in -v output isn't conclusive.  It only shows that htdig reached
the file, not that it extracted any usable text from it.  You'd need -vvvv
to see which words it's actually putting in the index from a given
document.  (This will produce LOTS of output.)

> Are there any settings I could have forgotten or any required parameters for 
> htsearch?
...
> htdig 3.1.5 on Suse Linux 7.3, Acrobat Reader 4
> 
> with this config:
> 
> database_dir:         /opt/www/htdig/db
> start_url:                    http://akropolis/inbas/htdocs/index.html
> limit_urls_to:                http://akropolis/inbas/
> exclude_urls:         /cgi-bin/ .cgi template=  inka_original
> bad_extensions:               .php .css .inc .wav .gz .z .sit .au .zip .tar .hqx .exe .com .gif \
>                       .jpg .jpeg .aiff .class .map .ram .tgz .bin .rpm .mpg .mov .avi
> maintainer:           [EMAIL PROTECTED]
> max_head_length:      10000
> max_doc_size:         10000000
> excerpt_length:               200
> no_excerpt_show_top:  true
> search_algorith:      exact:1 endings:0.5 substring: 0.5
> lang_dir:             ${common_dir}/german
> bad_word_list:                ${lang_dir}/bad_words
> endings_affix_file:   ${lang_dir}/german.aff
> endings_dictionary:   ${lang_dir}/german.0
> endings_root2word_db: ${lang_dir}/root2word.db
> endings_word2root_db: ${lang_dir}/word2root.db
> locale: de_DE
> keyword_meta_tag_names:       keywords description
> pdf_parser: /usr/local/Acrobat4/bin/acroread -toPostScript

This may be your problem.  Acrobat 4 doesn't work very reliably for
generating PostScript files in this way.  You'd be much better off with
doc2html.pl and xpdf.  See http://www.htdig.org/FAQ.html#q4.9
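
As a rough sketch of what that FAQ entry describes, the relevant part of
the config would look something like the fragment below.  The install path
for doc2html.pl is an assumption, so adjust it to wherever you put the
script.  The script itself also has to be edited to point at the pdftotext
binary that comes with xpdf.

```
# Sketch only: /usr/local/bin/doc2html.pl is an assumed install location.
# Route PDFs through doc2html.pl, which converts them to HTML for htdig.
external_parsers: application/pdf->text/html /usr/local/bin/doc2html.pl

# The acroread-based parser is commented out so it is no longer involved.
# pdf_parser: /usr/local/Acrobat4/bin/acroread -toPostScript
```

After changing this you'd need to reindex from scratch (htdig -i, or
rundig) so the PDFs are fetched and parsed again.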

-- 
Gilles R. Detillieux              E-mail: <[EMAIL PROTECTED]>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/~grdetil
Dept. Physiology, U. of Manitoba  Phone:  (204)789-3766
Winnipeg, MB  R3E 3J7  (Canada)   Fax:    (204)789-3930

_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
