Thanks. I've started digging into how DebugComponent does its thing, and I'm 
not all the way down the rabbit hole yet, but Lucene's IndexSearcher.explain() 
method has this comment:

"This is intended to be used in developing Similarity implementations, and, for 
good performance, should not be displayed with every hit. Computing an 
explanation is as expensive as executing the query over the entire index."

That makes me wonder whether I'd pay almost all of the debugQuery=true 
performance penalty anyway if I do as you suggest.
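
For reference, here is roughly what I understand the traversal suggested in 
the quoted message below to look like. This is only a sketch: ExplainNode is a 
hypothetical stand-in I made up so the example compiles without Lucene on the 
classpath (the real Explanation class exposes similar match/description/details 
accessors), and the "weight(field:term in N)" description shape is my 
assumption about what the leaf nodes look like, not something I've verified 
across query types. It also says nothing about the cost of computing the 
explanation in the first place, which is my actual worry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a Lucene Explanation node (NOT the real class).
class ExplainNode {
    final boolean match;          // did this part of the query match the doc?
    final String description;     // human-readable description of the node
    final List<ExplainNode> details = new ArrayList<>();

    ExplainNode(boolean match, String description, ExplainNode... children) {
        this.match = match;
        this.description = description;
        Collections.addAll(details, children);
    }
}

class MatchedFields {
    // Assumed leaf description shape: "weight(field:term in 12) [...], result of:"
    private static final Pattern WEIGHT =
        Pattern.compile("^weight\\(([^:()\\s]+):");

    // Returns the field names found in matching nodes, in first-seen order.
    static Set<String> collect(ExplainNode root) {
        Set<String> fields = new LinkedHashSet<>();
        walk(root, fields);
        return fields;
    }

    private static void walk(ExplainNode node, Set<String> fields) {
        if (!node.match) {
            return; // a non-matching subtree contributed nothing
        }
        Matcher m = WEIGHT.matcher(node.description);
        if (m.find()) {
            fields.add(m.group(1));
        }
        for (ExplainNode child : node.details) {
            walk(child, fields);
        }
    }
}
```

Calling MatchedFields.collect() on the root node for one document would give 
the fields_matched list from my original question, with no text parsing beyond 
pulling the field name out of each description.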


-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Friday, December 07, 2012 10:47 AM
To: solr-user@lucene.apache.org
Subject: Re: Which fields matched?

The debugQuery "explain" is simply a text display of what Lucene has already 
calculated. As such, you could do a custom search component that gets the 
non-text Lucene "Explanation" object for the query and then traverse it to get 
your matched field list without all the text. No parsing would be required, but 
the Explanation structure could get messy.

-- Jack Krupansky

-----Original Message-----
From: Jeff Wartes
Sent: Friday, December 07, 2012 11:59 AM
To: solr-user@lucene.apache.org
Subject: Which fields matched?


If I have an arbitrarily complex query that uses ORs, something like:
q=(simple_fieldtype:foo OR complex_fieldtype:foo) AND 
(another_simple_fieldtype:bar OR another_complex_fieldtype:bar)

I want to know which fields actually contributed to the match for each 
document returned. Something like:
docID=1, fields_matched=simple_fieldtype,complex_fieldtype,another_complex_fieldtype
docID=2, fields_matched=simple_fieldtype,another_complex_fieldtype


My basic use case is that I have several copyField'ed variations on the same 
data (using different complex FieldTypes), and I want to know which 
variations contributed to the document so I can conclude things like "Well, 
this document matched the field with the SynonymFilterFactory, but not the 
one without, so this particular document must've been a synonym match."

I know you could probably lift this from debugQuery output, but that's a 
non-starter due to parsing complexity and query performance impact.
I think you could edge into some of this using the HighlightComponent 
output, but that's a non-starter because it requires fields be stored=true. 
Most of my fieldTypes are intended solely for indexing/search, and make no 
sense from a stored/retrieval standpoint. And to be clear, I really don't 
care about which terms matched anyway, only which fields.

If there's an easy way to get this, I'd love to hear it. Otherwise, I'm 
mostly looking for a head start on where to go looking for this data so I 
can add my own Component or something - assuming the data is even available 
in the solr layer?

Thanks. 