Thanks!

On Fri, Apr 11, 2014 at 12:29 AM, Adrien Grand <[email protected]> wrote:
> Elasticsearch exposes the total number of hits in the search responses,
> let's call it T. So if your page size is P, you know that there are
> `ceil(T / P)` pages.
>
> On Fri, Apr 11, 2014 at 5:48 AM, Mohit Anchlia <[email protected]> wrote:
>
>> I have one more follow-up question: how can one know whether there are
>> more documents or not? This is to avoid one extra last call if possible.
>>
>> On Thu, Apr 10, 2014 at 3:47 PM, Mohit Anchlia <[email protected]> wrote:
>>
>>> Thanks Adrien and Nikolas, it's very helpful.
>>>
>>> On Thu, Apr 10, 2014 at 3:19 PM, Adrien Grand <[email protected]> wrote:
>>>
>>>> On Thu, Apr 10, 2014 at 11:13 PM, Nikolas Everett <[email protected]> wrote:
>>>>
>>>>> This one is easy. Elasticsearch/Lucene has to keep a min heap, from +
>>>>> size entries big, of the documents you find and their scores.
>>>>> Technically it is min(from + size, max(rescore_window_size)). Anyway,
>>>>> that means some part of the query has O(n) space and O(n * log(n))
>>>>> time complexity where n is from + size. That part might be dwarfed by
>>>>> some other action but it is there. And technically in the worst case
>>>>> the time complexity is more like O(hits * log(n)) but that's not
>>>>> likely.
>>>>
>>>> Everything that Nikolas said is correct. I'd like to add that starting
>>>> with Elasticsearch 1.2.0, paging with scroll is going to be more
>>>> efficient[1] since the worst case will be O(hits * log(size)) instead of
>>>> O(hits * log(from + size)). If you are interested in why it is possible,
>>>> the reason is that on each shard, scroll is going to keep track of the
>>>> least document that is part of the hits of the previous page, so that you
>>>> can just ignore documents that compare greater than this document instead
>>>> of adding them to the priority queue.
>>>>
>>>> The issue with realtime is that it creates lots of segments that
>>>> usually get merged very quickly. On the other hand, scroll works by
>>>> asking the shard to keep open the view over the index that was used for
>>>> the first page, until the scroll is closed. This can delay space
>>>> reclamation and force Elasticsearch to keep a significant number of
>>>> files open (beware of running out of file descriptors).
>>>>
>>>> If you have heavy search traffic, I would recommend not using scroll
>>>> for every user because of its cost. It is usually a better idea to
>>>> just increase the from parameter and prevent your users from performing
>>>> deep paging since it might kill your cluster. (If you go to any web
>>>> search engine, you'll see that even if they tell you that your query
>>>> matched millions of documents, they only allow you to get hits for a
>>>> few tens of pages.)
>>>>
>>>> [1] https://github.com/elasticsearch/elasticsearch/issues/4940
>>>>
>>>> --
>>>> Adrien Grand
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j6JwVMTfHr%2BdFbqRvBWJ2%2B2zAAR6g8T9C31-gXpYN4LWQ%40mail.gmail.com
>>>> For more options, visit https://groups.google.com/d/optout.
>
> --
> Adrien Grand
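
For anyone reading this thread later, here is a minimal Python sketch of the two ideas discussed above: the `ceil(T / P)` page arithmetic from Adrien's reply, and the bounded min heap of from + size entries that Nikolas describes. The function names (`page_count`, `has_more`, `top_hits`) and the (score, doc_id) tuple layout are made up for illustration. This is not how Elasticsearch or Lucene implement it, only the same technique in a few lines.

```python
import heapq


def page_count(total_hits, page_size):
    """Number of pages for T total hits and page size P, i.e. ceil(T / P)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Integer form of ceil(T / P); avoids float rounding on large totals.
    return (total_hits + page_size - 1) // page_size


def has_more(total_hits, from_, page_size):
    """True if at least one hit lies beyond the page starting at from_."""
    return from_ + page_size < total_hits


def top_hits(scored_docs, from_, size):
    """Return the page [from_, from_ + size) of hits, best score first.

    Keeps a min heap of at most n = from_ + size entries, so space is O(n)
    and time is O(hits * log(n)) in the worst case.
    """
    if from_ < 0 or size < 0:
        raise ValueError("from_ and size must be non-negative")
    n = from_ + size
    if n == 0:
        return []
    heap = []  # heap[0] is always the weakest entry currently kept
    for entry in scored_docs:  # entry is a (score, doc_id) tuple
        if len(heap) < n:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # Better than the weakest kept entry: swap it in.
            heapq.heapreplace(heap, entry)
    ranked = sorted(heap, reverse=True)
    return ranked[from_:from_ + size]
```

With T = 101 and P = 10, `page_count(101, 10)` gives 11, and `has_more(101, 100, 10)` is false, so the extra last call can be skipped. The heap in `top_hits` grows with from + size rather than with the page size, which is why deep paging gets expensive.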
