Re: Upper limits on indexes/shards in a cluster

[email protected] Tue, 21 Oct 2014 12:26:04 -0700

You may be hitting
https://github.com/elasticsearch/elasticsearch/issues/7385


Jörg

On Tue, Oct 21, 2014 at 9:17 PM, [email protected] <
[email protected]> wrote:

> This has nothing to do with OpenJDK.
>
> IndicesStatusRequest (deprecated, will be removed from future versions) is
> a heavy request. Something on your machines may be taking longer than
> 5 seconds, so the request times out.
>
> The IndicesStatus action uses Directories.estimateSize of Lucene. This
> call might take some time on large directories, maybe you have many
> segments/unoptimized shards/indices.
>
> Jörg
>
> On Tue, Oct 21, 2014 at 6:21 PM, David Ashby <[email protected]>
> wrote:
>
>> I should also note that I've been using OpenJDK. I'm currently in the
>> process of moving to the official Oracle binaries; are there specific
>> optimizations there that help with inter-cluster IO? There are some
>> hints at that in this very old github-elasticsearch interview
>> <http://exploringelasticsearch.com/github_interview.html>.
>>
>>
>> On Monday, October 20, 2014 3:49:39 PM UTC-4, David Ashby wrote:
>>>
>>> example log line: [DEBUG][action.admin.indices.status] [Red Ronin]
>>> [*index*][1], node[t60FJtJ-Qk-dQNrxyg8faA], [R], s[STARTED]: failed to
>>> executed [org.elasticsearch.action.admin.indices.status.
>>> IndicesStatusRequest@36239161] 
>>> org.elasticsearch.transport.NodeDisconnectedException:
>>> [Shotgun][inet[/IP:9300]][indices/status/s] disconnected
>>>
>>> When the cluster gets into this state, all requests hang waiting for...
>>> something to happen. Each individual node returns 200 when curled locally.
>>> A huge number of the above log lines appear at the end of this process,
>>> one for every single shard on the node, which is a huge vomit into my logs.
>>> As soon as a node is restarted, the cluster "snaps back", immediately
>>> fails outstanding requests, and begins rebalancing. While locked up, it
>>> even stops responding to bigdesk requests.
>>>
>>> On Monday, October 20, 2014 11:34:36 AM UTC-4, David Ashby wrote:
>>>>
>>>> Hi,
>>>>
>>>> We've been using elasticsearch on AWS for our application for two
>>>> purposes: as a search engine for user-created documents, and as a cache for
>>>> activity feeds in our application. We made a decision early on to treat
>>>> every customer's content as a distinct index, for full logical separation
>>>> of customer data. We have about three hundred indexes in our cluster, with
>>>> the default 5-shards/1-replica setup.
>>>>
>>>> Recently, we've had major problems with the cluster "locking up" to
>>>> requests and losing track of its nodes. We initially responded by
>>>> attempting to remove possible CPU and memory limits, and placed all nodes
>>>> in the same AWS placement group, to maximize inter-node bandwidth, all to
>>>> no avail. We eventually lost an entire production cluster, resulting in a
>>>> decision to split the indexes across two completely independent clusters,
>>>> each cluster taking half of the indexes, with application-level logic
>>>> determining where the indexes were.
>>>>
>>>> All that is to say: with our setup, are we running into an undocumented
>>>> *practical* limit on the number of indexes or shards in a cluster? It
>>>> ends up being around 3000 shards with our setup. Our logs show evidence of
>>>> nodes timing out their responses to massive shard status-checks, and it
>>>> gets *worse* the more nodes there are in the cluster. It's also stable
>>>> with only *two* nodes.
>>>>
>>>> Thanks,
>>>> -David
>>>>
>>>  --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/7046bd31-5c8a-4e33-9ab4-97cdd8bfd436%40googlegroups.com
>> <https://groups.google.com/d/msgid/elasticsearch/7046bd31-5c8a-4e33-9ab4-97cdd8bfd436%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
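
For reference, the "around 3000 shards" figure in David's original post
follows directly from 300 indexes with the default 5 shards and 1 replica.
Below is a minimal sketch of that arithmetic in Python. The function names
are hypothetical, and the per-node average assumes shards are spread evenly
across nodes, which the thread does not confirm.

```python
def total_shards(indexes: int, primaries_per_index: int, replicas: int) -> int:
    """Total shard copies in a cluster.

    Each primary shard has `replicas` additional copies, so one index
    contributes primaries_per_index * (1 + replicas) shards.
    """
    return indexes * primaries_per_index * (1 + replicas)


def average_shards_per_node(total: int, nodes: int) -> float:
    """Average shards per node, ASSUMING an even spread (illustrative only)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total / nodes


if __name__ == "__main__":
    # Figures from the original post: ~300 indexes, 5 shards, 1 replica.
    whole_cluster = total_shards(indexes=300, primaries_per_index=5, replicas=1)
    print(whole_cluster)  # 3000

    # After splitting the indexes across two independent clusters,
    # each cluster holds half of the indexes.
    half_cluster = total_shards(indexes=150, primaries_per_index=5, replicas=1)
    print(half_cluster)  # 1500

    # The two-node case mentioned as stable in the original post.
    print(average_shards_per_node(whole_cluster, nodes=2))  # 1500.0
```

Since the status check touches every shard (the logs show one failure line
per shard), this total is probably a more useful number to watch than the
index count alone.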

