I installed the CVS version of ht://Dig on 15-Oct-2004. I am indexing a batch of PDF files of about 30kB each. Everything seems to work OK, except that if I search for a phrase that I know is right at the end of one of the documents, I get the message "None of the search words were found in the top of this document." for each of the hits. I have increased max_head_length to 100kB, 1MB and 10MB, but whatever I do it just won't work. I have also tried reducing max_head_length to a low value and searching for phrases in the middle of a document to check that the setting has any effect at all. It does.

I never want a situation where the excerpt cannot be displayed; the documents are all small enough for that not to be an issue. I also don't want to see the top of the document instead of an excerpt. For the application I am working on, excerpts are very important.

The relevant htdig.conf settings are:


max_head_length: 10000000
max_doc_size: 10000000
excerpt_length: 800
max_excerpts: 3
no_excerpt_show_top: false
excerpt_show_top: false


Any help much appreciated.

Thanks,
SimonB

