You could run MLT for the document in question, gather the doc IDs
from the MLT results, and negate them in a subsequent query. I'm not
sure how well that would hold up with very large result sets, but it's
something to try.
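
A rough sketch of that first idea, in Python, just to show the shape of
the follow-up query. It assumes the unique key field is called "id" and
that the MLT doc IDs have already been collected into a list; the helper
names (quote_term, build_exclusion_fq, build_params) are made up for
illustration and are not part of Solr or any client library. The
exclusion goes into an fq parameter as a single negated clause.

```python
from urllib.parse import urlencode


def quote_term(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_exclusion_fq(doc_ids, field="id"):
    """Build a negated filter query that drops the given doc IDs.

    The field name "id" is an assumption about the schema.
    Returns None when there is nothing to exclude.
    """
    quoted = [quote_term(str(doc_id)) for doc_id in doc_ids]
    if not quoted:
        return None
    return f"-{field}:({' OR '.join(quoted)})"


def build_params(user_query, mlt_doc_ids):
    """Return a URL-encoded parameter string for the follow-up search."""
    params = [("q", user_query)]
    fq = build_exclusion_fq(mlt_doc_ids)
    if fq is not None:
        params.append(("fq", fq))
    return urlencode(params)


if __name__ == "__main__":
    # Hypothetical IDs standing in for the MLT results.
    print(build_params("broad terms", ["doc1", "doc2"]))
```

Each excluded ID becomes one boolean clause, so a very long list may run
into Solr's maxBooleanClauses limit; capping the number of MLT results
is one way to keep the list manageable.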

Another approach would be to gather the "interesting terms" from the
document in question and then negate those terms in subsequent queries.
Perhaps, with many negated terms, Solr would rank documents that match
fewer of the negated terms above those that match more, simulating a
ranked "less like" effect.
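
A minimal sketch of the second idea, again in Python. It assumes the
interesting terms have already been fetched (for example from the MLT
handler with mlt.interestingTerms=list) and that the content field is
called "text"; both the field name and the helper names (escape_term,
negate_terms) are assumptions for illustration. One caveat: in the
standard query syntax a "-term" clause excludes matching documents
outright and does not contribute to the score, so what this builds is a
hard filter and not a ranked "less like" ordering.

```python
import re

# Characters and operators with special meaning in the Lucene/Solr
# standard query syntax; each match gets a backslash in front of it.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/]|&&|\|\|)')


def escape_term(term):
    """Backslash-escape query syntax characters in a single term."""
    return _SPECIAL.sub(r"\\\1", term)


def negate_terms(user_query, terms, field="text"):
    """Append one prohibited clause per interesting term.

    The field name "text" is an assumption about the schema.
    """
    clauses = [f"-{field}:{escape_term(term)}" for term in terms]
    return " ".join([f"({user_query})"] + clauses)


if __name__ == "__main__":
    # Hypothetical interesting terms from an agriculture-related document.
    print(negate_terms("solar power", ["crop", "soil", "irrigation"]))
```

Because every negated term removes all documents containing it, a long
term list can cut the result set sharply; starting with only the top few
interesting terms is probably safer.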

On Fri, 2012-04-20 at 15:38 -0700, Charlie Maroto wrote:
> Hi all,
> 
> Is there a way to implement the opposite to MoreLikeThis (LessLikeThis, I
> guess :).  The requirement we have is to remove all documents with content
> like that of a given document id or a text provided by the end-user.  In
> the current index implementation (not using Solr), the user can narrow
> results by indicating what document(s) are not relevant to him and then
> request to remove from the search results any document whose content is
> like that of the selected document(s)
> 
> Our index has close to 100 million documents and they cover multiple topics
> that are not related to one another.  So, a search for some broad terms may
> retrieve documents about engineering, agriculture, communications, etc.  As
> the user is trying to discover the relevant documents, he may select an
> agriculture-related document to exclude it and those documents like it from
> the results set; same w/ engineering-like content, etc. until most of the
> documents are about communications.
> 
> Of course, some exclusions may actually remove relevant content but those
> filters can be removed to go back to the previous set of results.
> 
> Any ideas from similar implementations or suggestions are welcomed!
> Thanks,
> Carlos

