The implication is the memory that has to be allocated on each shard.
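
A rough back-of-the-envelope sketch of that cost, assuming the usual query-then-fetch behaviour in which every shard collects up to from + size candidate hits and the coordinating node then merges the candidates from all shards. The function name and parameters are hypothetical and the figures are upper bounds on entry counts, not measured memory:

```python
def topn_queue_entries(num_shards, size, from_=0):
    """Upper bound on the candidate hits tracked for one search request.

    Assumption: each shard keeps a top-N queue of (from_ + size) entries,
    and the coordinating node merges one such list per shard.
    """
    per_shard = from_ + size
    coordinator = num_shards * per_shard
    return per_shard, coordinator


# One request with size=1,000,000 on 32 shards versus a batch of 100.
for size in (1_000_000, 100):
    per_shard, coordinator = topn_queue_entries(num_shards=32, size=size)
    print(f"size={size}: {per_shard} per shard, {coordinator} to merge")
```

Under these assumptions the cost follows the requested size rather than the number of documents that actually match, which is why a huge size is expensive even when only a few hundred documents come back.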

David

> On 14 Dec 2014, at 05:46, Ron Sher <[email protected]> wrote:
> 
> Again, why not use a very large count size? What are the implications of 
> using a very large count?
> Regarding performance - it seems doing 1 request with a very large count 
> performs better than using scan scroll (with count of 100 using 32 shards)
> 
>> On Wednesday, December 10, 2014 10:53:50 PM UTC+2, David Pilato wrote:
>> No I did not say that. Or I did not mean that. Sorry if it was unclear.
>> I said: don’t use large sizes:
>> 
>>>> Never use size:10000000 or from:10000000. 
>> 
>> 
>> You should read this: 
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-scan
>> 
>> -- 
>> David Pilato | Technical Advocate | Elasticsearch.com
>> @dadoonet | @elasticsearchfr | @scrutmydocs
>> 
>> 
>> 
>>> On 10 Dec 2014, at 21:16, Ron Sher <[email protected]> wrote:
>>> 
>>> So you're saying there's no impact on elasticsearch if I issue a large 
>>> size? 
>>> If that's the case then why shouldn't I just call size of 1M if I want to 
>>> make sure I get everything?
>>> 
>>>> On Wednesday, December 10, 2014 8:22:47 PM UTC+2, David Pilato wrote:
>>>> Scan/scroll is the best option to extract a huge amount of data.
>>>> Never use size:10000000 or from:10000000. 
>>>> 
>>>> It's not realtime because you basically scroll over a given set of 
>>>> segments and all new changes that will come in new segments won't be taken 
>>>> into account during the scroll.
>>>> Which is good because you won't get inconsistent results.
>>>> 
>>>> About size, I would try and test. It depends on the size of your docs, I 
>>>> believe. Try with 10000 and see how it goes when you increase it. You may 
>>>> discover that getting 10*10000 docs is the same as 1*100000. :)
>>>> 
>>>> Best
>>>> 
>>>> David
>>>> 
>>>>> On 10 Dec 2014, at 19:09, Ron Sher <[email protected]> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I was wondering about best practices to get all data according to some 
>>>>> filters.
>>>>> The options as I see them are:
>>>>> 1. Use a very big size that will return all accounts, i.e. use some value 
>>>>>    like 1m to make sure I get everything back (even if I need just a few 
>>>>>    hundreds or tens of documents). This is the quickest way, development 
>>>>>    wise.
>>>>> 2. Use paging - using size and from. This requires looping over the result 
>>>>>    and the performance gets worse as we advance to later pages. Also, we 
>>>>>    need to use preference if we want to get consistent results over the 
>>>>>    pages. Also, it's not clear what's the recommended size for each page.
>>>>> 3. Use scan/scroll - this gives consistent paging but also has several 
>>>>>    drawbacks: If I use search_type=scan then it can't be sorted; using 
>>>>>    scan/scroll is (maybe) less performant than paging (the documentation 
>>>>>    says it's not for realtime use); again not clear which size is 
>>>>>    recommended.
>>>>> So you see - many options and not clear which path to take.
>>>>> 
>>>>> What do you think?
>>>>> 
>>>>> Thanks,
>>>>> Ron
>>>>> 
>>>>> -- 
>>>>> You received this message because you are subscribed to the Google Groups 
>>>>> "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>>> email to [email protected].
>>>>> To view this discussion on the web visit 
>>>>> https://groups.google.com/d/msgid/elasticsearch/764a37c5-1fec-48c4-9c66-7835d8141713%40googlegroups.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>> 
>>> 
>> 
> 

