On Wed, 2 Jun 2004, Janine Sisk wrote:

> When you search for a particular word (which happens to be my client's
> last name :) in my site, you get five matches. Four of them look
> entirely normal; the excerpt starts from the beginning of the text on
> the page. The fifth is different. The excerpt is snipped out of the
> middle of the document and contains the search term. The search term
> is a hyperlink, which goes to the first footnote in the full article.
> This footnote has nothing to do with the word in question.
>
> This seems bizarre to me. What could be different about this one match?

The behavior you describe doesn't sound out of the ordinary. Unless you
tell htsearch otherwise, it tries to find an occurrence of the search
term in the text saved from the dig and then uses that part of the text
as the excerpt. In most cases this is a good thing, since it gives some
context for the term in the result list.

However, if you are using a small value for the max_head_length
attribute (I believe the default is only 512 bytes), then in many cases
the saved text won't include an instance of the search term. In that
case, the text used for the excerpt depends on other attribute settings.
If the no_excerpt_show_top attribute is set to true, htsearch just
displays whatever text it found at the top of the document when the page
was indexed. If no_excerpt_show_top is set to false, a generic message
is used for the excerpt (the message can be changed with the
no_excerpt_text attribute).

The fact that you encountered a linked term in the excerpt is a separate
issue. Most likely this is due to the add_anchors_to_excerpt attribute,
which causes the search term in the excerpt to be linked to the closest
anchor in the document. That would explain why the link leads to an
unrelated footnote.

If your goal is to have only the top of the page shown in each excerpt,
there is also an excerpt_show_top attribute that will do exactly that.

> I tried to find some documentation on how the excerpts are created, but
> failed. Will I need to dive into the code to figure this out?

There are a number of attributes that affect excerpts. The valid
ht://Dig attributes are listed at http://www.htdig.org/confindex.html.
If you use your browser to find occurrences of 'excerpt' in that page,
you should be able to dig up most of the related attributes.

Jim
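
For illustration, the attributes mentioned above might be combined in a
configuration file roughly as follows. This is only a sketch: the values
are made-up examples rather than recommendations, the no_excerpt_text
message is a placeholder, and it is worth checking each attribute
against the attribute list for your ht://Dig version. Since
max_head_length controls how much text is saved at dig time, a change to
it would presumably need a re-index before it shows up in results.

```
# Hypothetical htdig.conf fragment; all values are illustrative only.

# Save more text per document during the dig, so the stored text is
# more likely to contain the search term.
max_head_length:        10000

# If the term is not found in the saved text, show the top of the
# document instead of the generic "no excerpt" message.
no_excerpt_show_top:    true

# Message used when no_excerpt_show_top is false and no excerpt is
# found (placeholder wording).
no_excerpt_text:        <em>(No excerpt available)</em>

# Do not link the matched term to the closest anchor in the document.
add_anchors_to_excerpt: false

# Uncomment to always show the top of the page as the excerpt.
# excerpt_show_top:     true
```

Turning off add_anchors_to_excerpt alone should be enough to get rid of
the stray footnote link while keeping excerpts that show the term in
context.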

