Hi, I want to fetch a large, fixed number of documents at random from Elasticsearch to compute some statistics (100,000 out of 10 million documents). The randomness has to be reproducible, so that I get the same documents with every request.
My problem is that scan and scroll is fast, but as I understand it, the order is not predictable. On the other hand, I could use the 'random_score' function with a fixed seed in my query. That would fix the ordering problem, but deep pagination is very slow. Has anyone done this before? Any ideas or pointers on how to do this with Elasticsearch? Any help is appreciated.

Cheers, Sebastian
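
For reference, here is a minimal sketch of the kind of request body the 'random_score' idea above would need. It only builds the JSON and does not talk to a cluster. The helper name `seeded_random_query` and its parameters are made up for illustration. The optional `field` argument is an assumption based on more recent Elasticsearch documentation, which pairs `seed` with a `field` for reproducible scores. Older versions may accept `seed` alone.

```python
import json


def seeded_random_query(seed, size, offset=0, field=None):
    """Build a search body that orders all documents by a seeded random score.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    random_score = {"seed": seed}
    if field is not None:
        # Assumption: newer Elasticsearch versions expect a field next to the seed.
        random_score["field"] = field

    return {
        "from": offset,   # paging offset; large values are the slow part
        "size": size,     # number of hits per page
        "query": {
            "function_score": {
                "query": {"match_all": {}},
                "random_score": random_score,
            }
        },
    }


# Same seed -> same body -> (on an unchanged index) the same ordering.
body = seeded_random_query(seed=42, size=1000)
print(json.dumps(body, indent=2))
```

This only illustrates the fixed-seed ordering. Paging through 100,000 hits with `from` and `size` still runs into the deep pagination cost described above.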