I'm trying to find a way to prevent multiple posts by the same author from 
appearing in search results. So far I've tried random scoring, which lets 
me keep pagination. However, I can still get up to 4 posts by the same 
author on a given page of 10 results.

Is there any way to score a document based on how many times a certain 
field value occurs in the result set? As far as I'm aware, you cannot 
persist a variable or object in a scoring script.

I've looked into several ways of accomplishing this, but most of them have 
significant drawbacks. One is to remove the duplicates and query again for 
a new set of results that excludes the authors already shown. However, 
that second query can itself return several posts by the same author, so 
I'm left querying one at a time to replace each duplicate. This breaks 
deep pagination, because the result set used to replace duplicates 
eventually runs out of pages before the standard search does. I've also 
tried aggregations, which are not pageable.
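
For reference, the remove-and-requery workaround described above looks 
roughly like the Python sketch below. It is only an illustration: the hit 
layout (dicts with an "author" key) and the fetch_excluding callable, 
which stands in for a second search that excludes the given authors, are 
hypothetical and not part of any Elasticsearch client.

```python
from typing import Callable, Dict, List, Set

Hit = Dict[str, object]


def dedupe_page(
    hits: List[Hit],
    page_size: int,
    fetch_excluding: Callable[[Set[str], int], List[Hit]],
) -> List[Hit]:
    """Keep the first hit per author, then backfill the page.

    fetch_excluding(excluded_authors, wanted) is a hypothetical callable
    that runs another search excluding the given authors.
    """
    seen: Set[str] = set()
    page: List[Hit] = []

    # First pass: keep only the first hit for each author.
    for hit in hits:
        author = str(hit["author"])
        if author not in seen and len(page) < page_size:
            seen.add(author)
            page.append(hit)

    # Backfill: the replacement results can contain repeated authors too,
    # so this may take several extra queries for a single page.
    while len(page) < page_size:
        extra = fetch_excluding(set(seen), page_size - len(page))
        added = False
        for hit in extra:
            author = str(hit["author"])
            if author not in seen and len(page) < page_size:
                seen.add(author)
                page.append(hit)
                added = True
        if not added:
            break  # replacement results are exhausted or all duplicates

    return page
```

The backfill loop is where the extra queries come from, and the early 
break is the case where the replacement results run out first.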

Is there any functionality to spread out documents, or reduce a 
document's score, based on how many times a document by the same author 
(or with the same field value) occurs?
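
To make the desired behaviour concrete, here is a minimal client-side 
sketch in Python of the kind of penalty in question. It assumes each hit 
is a dict with "author" and "score" keys, and the penalty factor of 0.5 
is an arbitrary choice for illustration. It only reorders hits that have 
already been fetched, so it does not solve the pagination problem.

```python
from collections import Counter
from typing import Dict, List


def penalize_repeated_authors(hits: List[Dict], penalty: float = 0.5) -> List[Dict]:
    """Scale each hit's score by penalty ** (earlier hits by the same author)."""
    seen: Counter = Counter()
    rescored = []

    # Walk the hits from best to worst so the strongest post by each
    # author keeps its full score and later ones are pushed down.
    for hit in sorted(hits, key=lambda h: h["score"], reverse=True):
        author = hit["author"]
        adjusted = hit["score"] * (penalty ** seen[author])
        seen[author] += 1
        rescored.append({**hit, "adjusted_score": adjusted})

    # Re-sort by the adjusted score; Python's sort is stable, so ties
    # keep their original relative order.
    rescored.sort(key=lambda h: h["adjusted_score"], reverse=True)
    return rescored
```

With scores 10, 9 and 7 for one author and 8 and 6 for two others, the 
second and third posts by the repeated author drop below the other two 
authors' posts.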
