I have a complicated problem to solve, and I don't know enough about Lucene/Solr to phrase the question properly, so this is something of a shot in the dark. My requirement is to always return search results in completely "collapsed" form, rolling up duplicates with a count. Duplicates are defined by whatever fields are requested: if the search requests fields A, B, and C, then all matched documents that have identical values for those three fields are "dupes". The field list may change with every search request. What I do know at index time is the superset of all fields that may ever appear in the field list.
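To make the requirement concrete, here's a tiny sketch of the roll-up semantics I mean — a group-by on the requested fields with a count per distinct value tuple (field names, documents, and the `rollup` helper are all hypothetical, just for illustration):

```java
import java.util.*;

public class RollupDemo {
    // Group documents by the values of the requested fields; count duplicates.
    static Map<List<String>, Integer> rollup(List<Map<String, String>> docs,
                                             List<String> requestedFields) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (Map<String, String> doc : docs) {
            List<String> key = new ArrayList<>();
            for (String f : requestedFields) key.add(doc.get(f));
            counts.merge(key, 1, Integer::sum);   // same tuple -> bump the count
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = Arrays.asList(
            Map.of("A", "x", "B", "y", "C", "z"),
            Map.of("A", "x", "B", "y", "C", "z"),   // dupe of the first on (A, B, C)
            Map.of("A", "x", "B", "q", "C", "z"));
        System.out.println(rollup(docs, Arrays.asList("A", "B", "C")));
        // two groups: [x, y, z] with count 2, [x, q, z] with count 1
    }
}
```

The point is that "duplicate" is only meaningful relative to the field list of the current request, which is why nothing can be precomputed at index time beyond the superset of fields.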
I know this can't be done with configuration alone, and retrieving all 1M+ matching documents to post-process in Java doesn't seem performant. A very smart person told me that a custom hit collector should be able to do the filtering for me. So perhaps I create a custom search handler that exposes a custom hit collector, which can use FieldCache or DocValues to examine all the matches and filter the results as described above. Assuming this is a viable solution path, can anyone suggest helpful posts, code fragments, or books for me to review? I admit to being out of my depth, but this requirement isn't going away and I'm grasping at straws right now. Thanks. (Using Solr 4.9.)
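To sketch what I imagine the hit collector doing per matched document — this is only my guess at the approach, using plain int arrays as a stand-in for the per-segment DocValues/FieldCache ordinals (in Lucene 4.9 the real collector would load per-field ordinals in `setNextReader` and read them in `collect(int doc)`; everything named here is hypothetical):

```java
import java.util.*;

public class CollectorSketch {
    // ordsPerField[f][doc] stands in for the ordinal of doc's value in field f,
    // as a SortedDocValues / FieldCache terms index would provide it.
    static Map<List<Integer>, Integer> countGroups(int[][] ordsPerField,
                                                   int[] matchedDocs) {
        Map<List<Integer>, Integer> counts = new HashMap<>();
        for (int doc : matchedDocs) {              // what collect(int doc) would see
            List<Integer> key = new ArrayList<>();
            for (int[] ords : ordsPerField)        // one ordinal per requested field
                key.add(ords[doc]);
            counts.merge(key, 1, Integer::sum);    // roll up duplicates with a count
        }
        return counts;
    }

    public static void main(String[] args) {
        // Fields A and B for docs 0..2: docs 0 and 1 share ordinals (0, 2).
        int[][] ords = { {0, 0, 1}, {2, 2, 2} };
        System.out.println(countGroups(ords, new int[]{0, 1, 2}));
        // one group of 2 (docs 0 and 1), one group of 1 (doc 2)
    }
}
```

If something like this is the right shape, the remaining questions are how to wire it into a search handler and whether comparing ordinals (rather than term bytes) is safe across segments.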