Isn't the top_hits aggregation pageable? See the "from" parameter listed on this page:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html

Certainly you don't want to page through everything (you want scan/scroll for that), but it should give adequate paging for most search uses.

Or do you just want to eliminate duplicate authors within a single page (i.e. a set of 10) of results?

-Doug

On Tuesday, December 9, 2014, Travis sturzl <travisstu...@gmail.com> wrote:

> I'm trying to find a way to prevent multiple posts from the same author
> from appearing in search results. So far I've tried random scoring, which
> allows me to maintain pagination. However, I can still have up to 4 posts
> from the same author in a given page of 10 results.
>
> Is there any way to score a document based on how many times a certain
> field value occurs in the result set? As far as I'm aware you cannot
> persist a variable or object in a scoring script.
>
> I've looked into several methods of accomplishing this, but many of them
> have quite a few cons. One is removing the duplicates and calling again
> to retrieve a new set of results with the current authors excluded.
> However, this can also return multiple posts from the same author. So I'm
> left to query one by one to replace duplicate authors in a result set, and
> this breaks deep pagination, because eventually the other result set,
> which is used to replace duplicates, runs out of pages before the standard
> search does. I've also tried aggregation, which is not pageable.
>
> Is there any functionality to spread out or reduce the score of a
> document based on how many times a document with the same author (or
> field) occurs?
--
Doug Turnbull
Search & Big Data Architect
OpenSource Connections <http://o19s.com>

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CALG6HL9JBdZ6aFNA6czCd6%3DqUycC-m63fMree0zh%3DdyPqJ%3DnKQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
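
For illustration only, the top_hits suggestion in the reply above might look roughly like the request body built below. This is a sketch under assumptions, not something taken from the thread: the field names "author" and "body", the aggregation names "by_author" and "top_posts", and the helper build_top_hits_body are all hypothetical. Note that, per the top_hits documentation, "from" is an offset into the hits inside each bucket, so here it steps through one author's posts rather than through the list of authors.

```python
import json


def build_top_hits_body(query_text, author_field="author", text_field="body",
                        max_authors=10, hits_from=0, hits_per_author=1):
    """Build a search body that groups matching posts by author.

    All field and aggregation names are hypothetical placeholders.
    """
    return {
        # Skip the normal hit list; results are read from the aggregation.
        "size": 0,
        "query": {"match": {text_field: query_text}},
        "aggs": {
            "by_author": {
                # One bucket per distinct author value, at most max_authors.
                "terms": {"field": author_field, "size": max_authors},
                "aggs": {
                    "top_posts": {
                        # "from" and "size" apply within each author bucket.
                        "top_hits": {"from": hits_from, "size": hits_per_author}
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent to the _search endpoint.
    print(json.dumps(build_top_hits_body("elasticsearch"), indent=2))
```

With a body like this, each entry under aggregations.by_author.buckets in the response should carry its own top_posts.hits.hits list, so keeping hits_per_author at 1 yields at most one post per author.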