Re: Finding out which field caused the search hit

mark harwood Mon, 10 Sep 2007 02:15:20 -0700

>> the answer has never been satisfactory

Is this the original question?
    http://www.nabble.com/Which-field-matched---tf4141549.html#a11780757



What actually formed the basis of a document match is hidden in a tree of
heterogeneous Query objects and to be efficient their match output is
limited to document ids and scores-  not some detailed analysis of
which sections of a document matched. It is therefore hard/impossible
to have any highlighting solution which provides answers for all query
types and the existing Highlighter relies on a rough heuristic where 
QueryTermExtractor output is used to find the list of query terms used and 
field TokenStreams are analyzed for content.

You mention Highlighter performance would be bad for wildcard queries. Have you 
tried it? If it does turn out to be bad (many wildcard variants produced) might 
I suggest the following:

1) Dissect the unrewritten Query and find all WildcardQuery objects
2) Create a custom analyzer that re-implements the wildcard logic and produces 
a highlighter-friendly token stream i,e
    Given a query of 
            Fred W*
    and data of
            Fred West was arrested
    the analyzer would produce:
            Fred [W*|West] was arrested
    ..where the tokens "W*" and "West" appear at the same position
3) Add a special wildcard term (W*) to the list of Query terms given to the 
Highlighter. This would then match with the W* injected into the content in 
step 2)

This would avoid the overhead of picking through all the wildcard variants 
produced by the wildcardQuery but at the cost of extra coding on your part and 
the runtime cost of re-executing wildcard logic on all terms in the selected 
documents' TokenStreams. The difference in runtime cost may prove minimal.

Cheers
Mark






      ___________________________________________________________
Yahoo! Answers - Got a question? Someone out there knows the answer. Try it
now.
http://uk.answers.yahoo.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Finding out which field caused the search hit

Reply via email to