Elasticsearch returns the total number of hits in every search response
(`hits.total`); let's call it T. So if your page size is P, you know that
there are `ceil(T / P)` pages, and there is nothing left to fetch once
`from + size >= T`.
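
As a minimal sketch of that arithmetic (the helper names `page_count` and `has_more` are made up for illustration and are not part of any Elasticsearch client), assuming T has already been read from the search response:

```python
def page_count(total_hits: int, page_size: int) -> int:
    """Number of pages needed for total_hits documents: ceil(T / P)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Integer form of ceil(T / P), which avoids floating-point division.
    return (total_hits + page_size - 1) // page_size


def has_more(total_hits: int, from_: int, size: int) -> bool:
    """True if another page exists after the one starting at from_."""
    return from_ + size < total_hits


if __name__ == "__main__":
    total = 95  # hypothetical value of T taken from a response
    print(page_count(total, 10))    # number of pages
    print(has_more(total, 80, 10))  # is there a page after hits 80..89?
    print(has_more(total, 90, 10))  # is there a page after hits 90..94?
```

Checking `has_more` before requesting the next page is what saves the extra last call. Note that T can change between requests if documents are indexed or deleted in the meantime.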


On Fri, Apr 11, 2014 at 5:48 AM, Mohit Anchlia <[email protected]> wrote:

> I have one more follow-up question: how can one know whether there are
> more documents or not? This is to avoid one extra last call if possible.
>
>
> On Thu, Apr 10, 2014 at 3:47 PM, Mohit Anchlia <[email protected]> wrote:
>
>> Thanks Adrien and Nikolas, it's very helpful.
>>
>>
>> On Thu, Apr 10, 2014 at 3:19 PM, Adrien Grand <
>> [email protected]> wrote:
>>
>>> On Thu, Apr 10, 2014 at 11:13 PM, Nikolas Everett <[email protected]> wrote:
>>>
>>>> This one is easy.  Elasticsearch/Lucene has to keep a min-heap, from +
>>>> size entries big, of the best documents found so far and their scores.
>>>> Technically its size is min(from + size, max(rescore_window_size)).  Anyway,
>>>> that means some part of the query has O(n) space and O(n * log(n)) time
>>>> complexity where n is from + size.  That part might be dwarfed by some
>>>> other action but it is there.  And technically in the worst case the time
>>>> complexity is more like O(hits * log(n)), but that's not likely.
>>>>
>>>
>>> Everything that Nikolas said is correct. I'd like to add that starting
>>> with Elasticsearch 1.2.0, paging with scroll is going to be more
>>> efficient[1] since the worst case will be O(hits * log(size)) instead of
>>> O(hits * log(from + size)). If you are interested in why this is possible,
>>> the reason is that on each shard, scroll is going to keep track of the
>>> last (least competitive) document among the hits of the previous page, so
>>> that it can simply skip documents that rank ahead of this document, which
>>> were already returned, instead of adding them to the priority queue.
>>>
>>> The issue with realtime is that it creates lots of segments that usually
>>> get merged very quickly. On the other hand, scroll works by asking the
>>> shard to keep open the view over the index that was used for the first
>>> page, until the scroll is closed. This can delay space reclamation and
>>> force Elasticsearch to keep a significant number of files open (beware of
>>> running out of file descriptors).
>>>
>>> If you have significant search traffic, I would recommend not using
>>> scroll for every user because of its cost. It is usually a better idea to
>>> just increase the from parameter and prevent your users from performing
>>> deep paging, since it might kill your cluster. (If you go to any web search
>>> engine, you'll see that even if they tell you that your query matched
>>> millions of documents, they only allow you to get hits for a few tens of
>>> pages.)
>>>
>>> [1] https://github.com/elasticsearch/elasticsearch/issues/4940
>>>
>>> --
>>> Adrien Grand
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j6JwVMTfHr%2BdFbqRvBWJ2%2B2zAAR6g8T9C31-gXpYN4LWQ%40mail.gmail.com<https://groups.google.com/d/msgid/elasticsearch/CAL6Z4j6JwVMTfHr%2BdFbqRvBWJ2%2B2zAAR6g8T9C31-gXpYN4LWQ%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>

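The heap behaviour described in the quoted messages above can be illustrated with a small sketch. It is not Lucene's actual collector: the function names are hypothetical, documents are reduced to bare scores, and scores are assumed to be distinct (a real implementation also needs a tie-breaker such as the document id).

```python
import heapq


def top_n(scores, n):
    """Return the n highest scores, best first, using a min-heap of size <= n."""
    heap = []
    for score in scores:
        if len(heap) < n:
            heapq.heappush(heap, score)
        elif score > heap[0]:
            # The new score beats the weakest entry kept so far: swap it in.
            heapq.heapreplace(heap, score)
    return sorted(heap, reverse=True)


def page_from_size(scores, from_, size):
    """from/size paging: the heap must hold from + size entries."""
    return top_n(scores, from_ + size)[from_:from_ + size]


def page_after(scores, size, last_score=None):
    """Scroll-like paging: skip everything already returned, keep a heap of size entries."""
    remaining = (s for s in scores if last_score is None or s < last_score)
    return top_n(remaining, size)


if __name__ == "__main__":
    scores = [5, 1, 9, 3, 7, 2, 8]
    first = page_after(scores, 2)
    second = page_after(scores, 2, last_score=first[-1])
    print(first, second)
    print(page_from_size(scores, 2, 2))  # same hits as `second`
```

Both approaches return the same second page; the difference is that `page_from_size` maintains a heap of from + size entries while `page_after` only ever holds size entries, which is where the log(size) versus log(from + size) factor comes from.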


-- 
Adrien Grand
