Hi Jörg. Thanks for your reply.
Here is my filter.
{"filter":
{
"terms" : {
"_id" : [ "QSxrbEM8TKe5zr8931xBjA", "wj63ghegRwC6qLsWq2chkA",
"hYEhDbAqQwSRxhYfvDgFkg", "4bZmPE1fTYqijphRyyWiuQ",
"Fhq53yYyT3CEw6vclKu_NA", "XL2atBraTEyx57MefjFVhA",
"951i0dZkT064FlQkzHnnWA", "O8Ixbir1TrGT_IA3wKfsHg",
"8k4U7KsuTmsThqxy-5YaKw", "GNOoQTHglf22kzcE7EOf8g",
"-RQeY48fTg2kYnh2M4E1cQ", "u8DGBdfVR9WRVj6d9E4Ebw",
"WFHSXd7UQvCMYFBhFcTsng", "qnQ7q7FyTsg397lM1EWgqA",
"wRQtUzdMRy2qOkMCNxdpgA", "Ll83iglxSUS_Gs7mjkMt8w",
"d2sxZ1oBTfuvAfov5EJ0iw", "cyht-vB4Q-mMSg9N5jcGXg",
"bNSVaO47QTOCkfJhWo0qjg", "BHuhm55IRerKnynJ8WgFTw",
"fHKA4PF2QteWm8E7dW7CAw", "DLE6A7tyQJ-zcKcCa6IPSA",
"qfelTW7-SuGRQ0GKbngARA", "R7VHHJhYsUqfuxYof8BJ8w",
"W4PqiJfPSlSFjVKFsGkA4Q", "Juq62zOsRdheuW3O6Gb2KA",
"U9v0IKj_RrgRNjE31ZTt2g", "uNHa0kOOT5qjPpzxZcs35A",
"SwOgVNgIRwyVU3pEEycBuQ", "LaEpxFGIQgCArsNZ2rd4Pw",
"CiJ9gouZsbmTtxTWx7w6lA", "TaQV_I01RfCq3B6uAtIBoQ",
"9Jpjo5k-RlGfLVLF6nDgze", "57YpjRdASsrrae-RD3spog",
"bmA4EWFSTiKUaDzaNcCFKQ", "Fui9z_UbRe6AY1VhAr8Crw",
"2PORr5BzSDOmBXgmQkO5Zg", "snfwTmtuTv-uj5mOWSJpgA",
"0nHIrtePSaeW8aWArh_Mrg", "s0g9QHnjTgWX3rCIu1g0Hg",
"Jl67fACuQvCFgZxXAFtDOg" ],
"_cache" : true,
"_cache_key" : "my_terms_cache"
}
}
}
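For comparison, here is a minimal sketch of how the same lookup might be expressed as an ids filter. The helper name `terms_to_ids_filter` is hypothetical, and the example uses only two of the IDs above to keep it short; it just reshapes the request body and does not talk to a cluster.

```python
import json


def terms_to_ids_filter(search_body):
    """Reshape a {"filter": {"terms": {"_id": [...]}}} body into an ids filter body."""
    ids = search_body["filter"]["terms"]["_id"]
    # The ids filter takes the document IDs under the "values" key.
    return {"filter": {"ids": {"values": list(ids)}}}


# Shortened version of the terms filter above (two IDs only).
terms_body = {
    "filter": {
        "terms": {
            "_id": ["QSxrbEM8TKe5zr8931xBjA", "wj63ghegRwC6qLsWq2chkA"],
            "_cache": True,
            "_cache_key": "my_terms_cache",
        }
    }
}

ids_body = terms_to_ids_filter(terms_body)
print(json.dumps(ids_body, indent=2))
```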
I already tried the "ids filter", but I got the same behaviour. One thing I
noticed is that on one of the cluster's nodes the search thread pool fills up
(something like Queue: 50 and Count: 47) while on the others it does not
(something like Queue: 0 and Count: 1). If I remove this node from the
cluster, another one starts showing the same problem.
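A rough sketch of how the saturated node could be spotted from a node stats response. The dictionary layout (`nodes` -> node id -> `thread_pool` -> `search` with `threads` and `queue`) is my assumption about what `GET /_nodes/stats/thread_pool` returns, and mapping "Count" to `threads` is also an assumption. The sample data is made up to mirror the numbers above; the function name is hypothetical.

```python
def find_saturated_nodes(nodes_stats, queue_threshold=10):
    """Return (node_name, threads, queue) for nodes whose search queue is at or above the threshold."""
    hot = []
    for node_id, node in nodes_stats["nodes"].items():
        search = node["thread_pool"]["search"]
        if search["queue"] >= queue_threshold:
            hot.append((node.get("name", node_id), search["threads"], search["queue"]))
    # Longest queue first, so the worst node is at the top.
    return sorted(hot, key=lambda row: row[2], reverse=True)


# Made-up sample shaped like the assumed stats response.
sample_stats = {
    "nodes": {
        "node-a": {"name": "data-1", "thread_pool": {"search": {"threads": 47, "queue": 50}}},
        "node-b": {"name": "data-2", "thread_pool": {"search": {"threads": 1, "queue": 0}}},
    }
}

saturated = find_saturated_nodes(sample_stats)
print(saturated)
```

Running this periodically against every data node would show whether the load really moves to a different node after the hot one is removed.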
My current environment is:
- 7 data nodes with 16 GB RAM (8 GB for ES) and 8 cores each;
- 4 load balancer nodes (no data, no master) with 4 GB RAM (3 GB for ES) and 8
cores each;
- 4 master nodes (master only, no data) with 4 GB RAM (3 GB for ES) and 8 cores
each;
- Search thread pool size of 47 (the other pools use the default config);
- 1 index with 7 shards and 2 replicas;
- 14.6 GB index size (14,524,273 documents).
I'm executing this filter with 50 concurrent users.
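To put those numbers in perspective, here is a back-of-the-envelope calculation. It assumes the 14.6 GB figure covers primaries only and that shards are spread evenly over the data nodes, neither of which I have verified; it also assumes each search without routing is sent to one copy of every shard.

```python
data_nodes = 7
primary_shards = 7
replicas = 2
index_size_gb = 14.6        # assumed to be the size of the primaries only
documents = 14_524_273
concurrent_users = 50

# Every primary has `replicas` extra copies.
shard_copies = primary_shards * (1 + replicas)
# With even allocation, each data node holds this many shard copies.
copies_per_node = shard_copies / data_nodes
# Average size and document count of one primary shard.
size_per_primary_gb = index_size_gb / primary_shards
docs_per_primary = documents / primary_shards
# A search without routing queries one copy of every shard.
shard_requests_in_flight = concurrent_users * primary_shards

print(shard_copies, copies_per_node, shard_requests_in_flight)
print(round(size_per_primary_gb, 2), round(docs_per_primary))
```

So each shard is small (about 2 GB), and 50 concurrent searches turn into a few hundred shard-level requests spread over the data nodes.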
Regards
On Tuesday, June 3, 2014 at 20:33:45 UTC-3, Jörg Prante wrote:
>
> Can you show your test code?
>
> You seem to be looking at the wrong settings - by adjusting node number, shard
> number, and replica number alone, you cannot find out the maximum node
> performance. Concurrency settings, index optimizations, query
> optimizations, thread pooling, and most of all fast disk subsystem I/O are
> all important.
>
> Jörg
>
>
> On Wed, Jun 4, 2014 at 12:18 AM, Marcelo Paes Rech wrote:
>
>> Thanks for your reply, Nikolas. It helps a lot.
>>
>> What about the number of documents in each shard, or the size of each shard?
>> And what about no-data nodes or master-only nodes: when are they necessary?
>>
>> In some tests I did, when I increased the number of requests (like 100 users at
>> the same moment, repeated again and again), 5 nodes with 1 shard and 2
>> replicas each and 16 GB RAM (8 GB for ES and 8 GB for the OS) weren't enough. The
>> response time started to rise above 5 s (I think less than 1 s would be
>> acceptable in this case).
>>
>> This test has a lot of documents (something like 14 million).
>>
>>
>> Thanks. Regards.
>>
>> On Monday, June 2, 2014 at 17:09:04 UTC-3, Nikolas Everett wrote:
>>
>>>
>>>
>>>
>>> On Mon, Jun 2, 2014 at 3:52 PM, Marcelo Paes Rech wrote:
>>>
>>> Hi guys,
>>>>
>>>> I'm looking for an article or a guide on the best cluster
>>>> configuration. I have read a lot of articles saying "change this configuration"
>>>> and "you must create X shards per node", but I haven't seen anything like
>>>> an official Elasticsearch guide for creating a cluster.
>>>>
>>>> What I would like to know is information like:
>>>> - How to calculate how many shards will be good for the cluster.
>>>> - How many shards do we need per node? And if this is variable, how do
>>>> I calculate this?
>>>> - How much memory do I need per node and how many nodes?
>>>>
>>>> I think Elasticsearch is well documented, but the documentation is very fragmented.
>>>>
>>>>
>>>>
>>> For some of these that is because "it depends" is the answer. For
>>> example, you'll want larger heaps for aggregations and faceting.
>>>
>>> There are some rules of thumb:
>>> 1. Set Elasticsearch's heap memory to 1/2 of RAM but not more than
>>> 30GB. Bigger than that and the JVM can't do pointer compression and you
>>> effectively lose RAM.
>>> 2. #1 implies that having much more than 60GB of RAM on each node
>>> doesn't make a big difference. It helps but it's not really as good as
>>> having more nodes.
>>> 3. The most efficient way of sharding is likely one shard on
>>> each node. So if you have 9 nodes and a replication factor of 2 (so 3
>>> total copies) then 3 shards is likely to be more efficient than having 2 or
>>> 4. But this only really matters when those shards get lots of traffic.
>>> And it breaks down a bit when you get lots of nodes. And in the presence
>>> of routing. It's complicated.
>>>
>>> But these are really just starting points, safe-ish defaults.
>>>
>>> Nik
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/94b8ecf9-efc4-4046-a862-63b670ccc23e%40googlegroups.com.
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
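Nik's rules of thumb quoted above can be written down as a small sketch. The function names are hypothetical, and the results are only the "safe-ish defaults" he describes, not a sizing formula: half of RAM for the heap capped at 30 GB, and roughly one shard copy per node.

```python
def suggested_heap_gb(ram_gb, cap_gb=30):
    """Rule 1: give Elasticsearch half of the RAM, but never more than the cap."""
    return min(ram_gb / 2, cap_gb)


def suggested_primary_shards(nodes, replicas):
    """Rule 3: aim for about one shard copy per node.

    Each primary has `replicas` extra copies, so the number of primaries is
    the node count divided by the total copies per shard (at least 1).
    """
    return max(1, nodes // (replicas + 1))


# Nik's example: 9 nodes with a replication factor of 2 gives 3 shards.
print(suggested_primary_shards(9, 2))
# A 16 GB node gets an 8 GB heap; a 128 GB node is capped at 30 GB.
print(suggested_heap_gb(16), suggested_heap_gb(128))
```

For my 7 data nodes with 2 replicas this rule would suggest 2 primary shards rather than 7, although as Nik says it only really matters when the shards get a lot of traffic.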