Hi Jim,
I came across your post while searching for any requirements for diversity 
in results.

I'm working on an approach that lets you limit how many results share any one 
value of a chosen field (in your case "type").
With this approach, all of the results are still selected on their 
individual merits (keeping their natural score), but an additional rule 
allows at most N results of any one type. The thinking behind this logic is 
described here: 
https://issues.apache.org/jira/browse/LUCENE-6066?focusedCommentId=14219901&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14219901
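
To make the rule concrete, here is a minimal Python sketch of the "at most N 
per type" idea. It is only an illustration under stated assumptions, not the 
collector discussed in the Lucene issue: the function name `diversify`, the 
dict layout of a hit, and the sample scores are all made up, and it filters a 
fully sorted list rather than collecting hits as they stream in.

```python
from collections import Counter

def diversify(hits, field, max_per_value, size):
    """Return up to `size` hits in descending score order,
    keeping at most `max_per_value` hits for any one value of `field`.
    Scores are left untouched; surplus hits are simply skipped."""
    kept = []
    seen = Counter()  # how many hits we have kept per field value
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        value = hit[field]
        if seen[value] >= max_per_value:
            continue  # this value already has its N slots filled
        seen[value] += 1
        kept.append(hit)
        if len(kept) == size:
            break
    return kept

# Hypothetical results for a "James Bond" query.
hits = [
    {"id": 1, "type": "DVD", "score": 9.1},
    {"id": 2, "type": "DVD", "score": 8.7},
    {"id": 3, "type": "DVD", "score": 8.2},
    {"id": 4, "type": "Book", "score": 5.0},
    {"id": 5, "type": "DVD", "score": 7.9},
    {"id": 6, "type": "Photo", "score": 3.3},
    {"id": 7, "type": "Book", "score": 4.1},
]

top = diversify(hits, "type", max_per_value=2, size=5)
print([h["id"] for h in top])  # prints [1, 2, 4, 7, 6]
```

The two best DVDs keep their natural scores and stay on top, while the third 
and fourth DVDs give up their slots to the best Books and the Photo.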

Would this approach work in your case?


On Saturday, March 16, 2013 6:20:52 AM UTC, Jim wrote:
>
> I would like to apologize ahead of time because this is a difficult 
> situation to explain. 
>
> Say I have a field named "type" on every one of my documents.  That type 
> can be "Book", "DVD", or "Photo". 
>
> I then run a query on all items, let's say it is James Bond. 
>
> I will get back results that are mostly DVDs, and the Books, maybe even 
> Photos, get pushed to the back due to having a lower score.  Who knows why, 
> maybe the rest of the document's content doesn't mention James Bond as much 
> as the DVDs do.  The goal is to mix in the top-scoring Books and Photos in 
> with the top-scoring DVDs so that we end up with a diverse result set.  So 
> basically, lower the score of the more common result (DVD), and/or raise the 
> score of the less common result (Book, Photo). 
>
> I want to write a query so that once I find a document with the type "DVD", 
> I make the next document with type "DVD" have a lower score, and the one 
> after that an even lower score.  The problem is that I don't want to do this 
> at index time, because I want it to vary based on the results of a query (if 
> the query comes back with 75 DVDs, the last DVD will have a much lower score 
> than if the query comes back with 25 DVDs). 
>
> I've thought of two possible ways of going about this, but I am not sure 
> how to implement them. 
>
> One is to somehow use the custom_filters_score or custom_score query to get 
> the current facet count on the "type" field.  Or to somehow store that count 
> per document found?  I have no way of really explaining this better than 
> that.  But if I had that information somehow, I could write a custom_score 
> query that inversely affects the document's score based on how many times 
> we've seen the same "type". 
>
> The other option is to alter the score of the results based on the facet 
> counts for a facet based on the "type" field.  Basically, inversely 
> affecting the documents with that "type" based on how high the count is. 
> For example, if DVDs have 75 results and Books have 2, alter the score of 
> all DVD items by 1/75, and alter the score of all Books by 1/2.  This is 
> just a broad example; these numbers wouldn't work perfectly. 
>
> I had already planned on doing the 2nd possible solution with two queries: 
> one to get the facets for the query, and the second would use that 
> information and send the altered boosts in custom_score queries.  But I 
> would ideally like to do this all in one query. 
>
> Any ideas?  I know it is a lot to swallow, but it would give the search 
> user a larger variety of options. 
>
>
>
> -- 
> View this message in context: 
> http://elasticsearch-users.115913.n3.nabble.com/Writing-a-custom-score-or-custom-filters-score-query-based-on-field-value-frequency-tp4031780.html
>  
> Sent from the ElasticSearch Users mailing list archive at Nabble.com. 
>
