On Wed, 2 Jun 2004, Janine Sisk wrote:

> When you search for a particular word (which happens to be my client's 
> last name :) in my site,  you get five matches.  Four of them look 
> entirely normal;  the excerpt starts from the beginning of the text on 
> the page.  The fifth is different.  The excerpt is snipped out of the 
> middle of the document and contains the search term.  The search term 
> is a hyperlink, which goes to the first footnote in the full article.  
> This footnote has nothing to do with the word in question.
> 
> This seems bizarre to me.  What could be different about this one match?

The behavior you describe doesn't sound out of the ordinary. Unless you
tell htsearch otherwise, it tries to find an occurrence of the search
term in the text saved from the dig and then uses that part of the text
as the excerpt. In most cases this is a good thing, since it gives some
context for the term in the result list. However, if you are using a
small value for the max_head_length attribute (I believe the default is
only 512 bytes), then in many cases the saved text won't include an
instance of the search term. In that case, the text used for the excerpt
depends on other attribute settings. If the no_excerpt_show_top
attribute is set to true, htsearch just displays whatever text it found
at the top of the document when the page was indexed. If
no_excerpt_show_top is set to false, a generic message is used for the
excerpt (the message can be changed with the no_excerpt_text attribute).
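
As a rough sketch, the attributes mentioned above might be set in your
htdig.conf along these lines. The values are only illustrative
assumptions, not recommendations, and the message text is made up for
the example. Since the excerpt text is saved at dig time, a change to
max_head_length should only take effect after the site is dug again.

```
# Save more of each document at dig time so the stored text is more
# likely to contain the search term (10000 is an illustrative value).
max_head_length:        10000

# When the saved text has no occurrence of the term, show the top of
# the document rather than a generic message.
no_excerpt_show_top:    true

# Used only when no_excerpt_show_top is false; example wording only.
no_excerpt_text:        <em>(No excerpt available.)</em>
```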

The fact that you encountered a linked term in the excerpt is a separate
issue. Most likely this is due to the add_anchors_to_excerpt attribute;
this attribute causes links to the closest anchors to be associated with
the search term in the excerpt.

If your goal is to have only the top of the page shown in each excerpt,
there is also an excerpt_show_top attribute that will do exactly that.
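
For example, something like the following in htdig.conf should give
plain top-of-page excerpts with no linked search terms. Treat it as a
sketch and check both attributes against the documentation for your
version before relying on it.

```
# Always use the top of the saved text as the excerpt.
excerpt_show_top:        true

# Don't link search terms in the excerpt to the nearest anchor.
add_anchors_to_excerpt:  false
```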

> 
> I tried to find some documentation on how the excerpts are created, but 
> failed.  Will I need to dive into the code to figure this out?

There are a number of attributes that affect excerpts. The valid ht://Dig
attributes are listed at http://www.htdig.org/confindex.html. If you use
your browser to find occurrences of 'excerpt' in the page, you should be
able to dig up most of the related attributes.

Jim

