Thanks Jörg. We had heard of others pre-creating indices, but we saw it as 
a workaround rather than a regular practice. What you say makes it sound 
like something we should adopt.


On Tuesday, May 13, 2014 12:13:10 PM UTC+1, Jörg Prante wrote:
>
> You should create indices before bulk indexing. First, bulk indexing works 
> much better if all indices and their mappings are already present: the 
> operations run faster and without conflicts, and cluster state updates 
> are less frequent, which reduces noise and hiccups. Second, setting the 
> index refresh interval to -1 and the replica count to 0 while in bulk 
> indexing mode helps performance a lot.
>
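
A minimal sketch of what pre-creating with those settings could look like, written as plain request builders so nothing here depends on a particular client library. The setting names (number_of_shards, number_of_replicas, refresh_interval) and the /_settings endpoint are standard Elasticsearch; the function names, the 3-shard default (taken from the original question), and the restore values of 1 replica and "1s" are assumptions for illustration. Mappings are left out.

```python
import json


def create_index_request(name, shards=3):
    """Build (method, path, body) for creating an index explicitly
    before bulk indexing, with bulk-friendly settings."""
    body = {
        "settings": {
            "number_of_shards": shards,
            "number_of_replicas": 0,   # no replicas while bulk loading
            "refresh_interval": "-1",  # disable periodic refresh
        }
    }
    return "PUT", "/" + name, json.dumps(body)


def restore_settings_request(name, replicas=1, refresh_interval="1s"):
    """Build (method, path, body) for restoring normal settings after
    the bulk load. The default values here are assumptions."""
    body = {
        "index": {
            "number_of_replicas": replicas,
            "refresh_interval": refresh_interval,
        }
    }
    return "PUT", "/" + name + "/_settings", json.dumps(body)


if __name__ == "__main__":
    # Hypothetical index names, in the style of the log message quoted below.
    for i in range(1110, 1113):
        print(create_index_request("index_%d" % i))
```

Each tuple can be sent with whatever HTTP client is already in use. The point is only that index creation becomes an explicit step ahead of the bulk request rather than a side effect of it.
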
> If you create 1000+ shards per node, you seem to be exceeding the limits of 
> your system. Do not expect admin operations like index creation to work in 
> O(1) time; they are O(n/c), with n the number of affected shards and c the 
> thread pool size for the operation (the total number of nodes also counts, 
> but I neglect it here). So yes, it is expected that index creation 
> operations take longer when your nodes reach their limits, but there can be 
> plenty of reasons for that (a growing shard count is just one of them). And 
> yes, it is expected that you see the 30s cluster action timeout in these 
> cases.
>
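
To put rough numbers on the O(n/c) point, here is a back-of-the-envelope sketch. It is only a counting model, not how Elasticsearch schedules work internally: the pool sizes are hypothetical, replicas are assumed to be 0, and (as above) the node count is ignored in the second function.

```python
import math


def shards_per_node(indices, shards_per_index, replicas, data_nodes):
    """Average number of shard copies each data node holds."""
    total = indices * shards_per_index * (1 + replicas)
    return total / data_nodes


def creation_rounds(affected_shards, pool_size):
    """Number of sequential rounds needed if pool_size shards can be
    handled at once (the n/c term, rounded up)."""
    return math.ceil(affected_shards / pool_size)


if __name__ == "__main__":
    # Figures from the original question: 1000 indices, 3 shards each,
    # 3 data nodes. Zero replicas is an assumption.
    print(shards_per_node(1000, 3, 0, 3))
    # A batch of 100 new indices touches 300 shards; pool sizes of 4 and 8
    # are made-up values for comparison.
    print(creation_rounds(300, 4), creation_rounds(300, 8))
```

Under this model, doubling the pool size roughly halves the number of rounds, while the per-round cost (which the model does not capture) is what grows as nodes approach their limits.
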
> There is no strictly predictable resource limit for a node. It all 
> depends heavily on factors outside Elasticsearch (JVM, CPU, memory, 
> disk I/O, your indexing and search workload), so it is up to you to 
> calibrate your node capacity. After adding nodes, you will observe that ES 
> scales well and can handle more shards.
>
> Jörg
>
>
> On Tue, May 13, 2014 at 11:59 AM, Paul <[email protected]> wrote:
>
>> We are seeing a slowdown in shard initialization speed as the number of 
>> shards/indices in our cluster grows.
>>
>> With up to a few hundred indices/shards in the cluster, bulk-creating 
>> hundreds of new indices at a time is fine: we see them pass 
>> through the states and get a green cluster in a reasonable amount of time.
>>
>> As the total cluster size grows to 1000+ indices (3000+ shards), we begin 
>> to notice that the first rounds of initialization take longer to process. 
>> It seems to speed up after the first few batches, but this slowdown leads 
>> to "failed to process cluster event (create-index [index_1112], cause 
>> [auto(bulk api)]) within 30s" messages in the master logs. The 
>> indices are eventually created.
>>
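
Since the "auto(bulk api)" cause in that message means the indices were being created implicitly by the bulk request, one way to move to explicit creation is to work out which index names a bulk body refers to before sending it. A small sketch, assuming the standard newline-delimited bulk format (an action line, followed by a source line for index/create/update actions); the function name is made up:

```python
import json

# Bulk actions that are followed by a source/document line.
ACTIONS_WITH_SOURCE = {"index", "create", "update"}


def indices_in_bulk_body(bulk_body):
    """Return the distinct index names named in a bulk request body,
    in order of first appearance."""
    names = []
    lines = [ln for ln in bulk_body.splitlines() if ln.strip()]
    i = 0
    while i < len(lines):
        action = json.loads(lines[i])
        # Each action line is a single-key object: {"index": {...metadata...}}
        op, meta = next(iter(action.items()))
        name = meta.get("_index")  # absent if the index is given in the URL
        if name is not None and name not in names:
            names.append(name)
        # Skip the source line when the action carries one.
        i += 2 if op in ACTIONS_WITH_SOURCE else 1
    return names
```

Any name in the result that does not exist yet can then be created up front, so the bulk request itself never has to wait on a create-index cluster event.
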
>>
>> Has anyone else experienced this? (Did you find the cause or a way to fix it?)
>>
>> Is this somewhat expected behaviour? Are we approaching something 
>> incorrectly? (There are 3 data nodes involved, with 3 shards per index.)
>>  
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/elasticsearch/f34157df-b34e-4d69-a8bd-d8cffb2e5667%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
