Hmm, maybe. We are using the Elastica PHP library and call getStatus()->getServerStatus() relatively often (to try to work around Elastica's lack of proper error handling for unreachable nodes) to determine whether we have a node we can connect to. If that call ultimately maps to an IndicesStatusRequest, we might be shooting ourselves in the foot.
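
If it does, a much lighter reachability probe should sidestep the problem. Here's a rough sketch using Elastica's generic Client::request() against _cluster/health instead of the deprecated _status endpoint -- the helper name is mine and I haven't tested this, so treat it as a starting point rather than anything official:

    <?php
    use Elastica\Client;
    use Elastica\Request;
    use Elastica\Exception\ExceptionInterface;

    // Hypothetical replacement for the getStatus()->getServerStatus() probe.
    // _cluster/health only reads cluster state; it doesn't walk every
    // shard's Lucene directory the way the deprecated _status API does.
    function isClusterReachable(Client $client)
    {
        try {
            $response = $client->request('_cluster/health', Request::GET);
            $data = $response->getData();
            return isset($data['status']) && $data['status'] !== 'red';
        } catch (ExceptionInterface $e) {
            // Connection refused, node timeout, etc. -- treat as unreachable.
            return false;
        }
    }

That way the check doesn't fan out to every shard in the cluster.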
On Tuesday, October 21, 2014 3:25:10 PM UTC-4, Jörg Prante wrote:

> Maybe you are hit by
> https://github.com/elasticsearch/elasticsearch/issues/7385
>
> Jörg
>
> On Tue, Oct 21, 2014 at 9:17 PM, [email protected] <[email protected]> wrote:
>
>> This has nothing to do with OpenJDK.
>>
>> IndicesStatusRequest (deprecated, will be removed from future versions)
>> is a heavy request; there may be something on your machines which takes
>> longer than 5 seconds, so the request times out.
>>
>> The IndicesStatus action uses Directories.estimateSize from Lucene. That
>> call might take some time on large directories; maybe you have many
>> segments/unoptimized shards/indices.
>>
>> Jörg
>>
>> On Tue, Oct 21, 2014 at 6:21 PM, David Ashby <[email protected]> wrote:
>>
>>> I should also note that I've been using OpenJDK. I'm currently in the
>>> process of moving to the official Oracle binaries; are there specific
>>> optimization changes there that help with inter-cluster IO? There are
>>> some hints at that in this very old github-elasticsearch interview
>>> <http://exploringelasticsearch.com/github_interview.html>.
>>>
>>> On Monday, October 20, 2014 3:49:39 PM UTC-4, David Ashby wrote:
>>>>
>>>> Example log line:
>>>>
>>>> [DEBUG][action.admin.indices.status] [Red Ronin] [*index*][1],
>>>> node[t60FJtJ-Qk-dQNrxyg8faA], [R], s[STARTED]: failed to executed
>>>> [org.elasticsearch.action.admin.indices.status.IndicesStatusRequest@36239161]
>>>> org.elasticsearch.transport.NodeDisconnectedException:
>>>> [Shotgun][inet[/IP:9300]][indices/status/s] disconnected
>>>>
>>>> When the cluster gets into this state, all requests hang waiting for...
>>>> something to happen. Each individual node returns 200 when curled
>>>> locally. A huge number of copies of the above log line appear at the end
>>>> of this process -- one for every single shard on the node, which is a
>>>> huge vomit into my logs. As soon as a node is restarted, the cluster
>>>> "snaps back", immediately fails outstanding requests, and begins
>>>> rebalancing. It even stops responding to bigdesk requests.
>>>>
>>>> On Monday, October 20, 2014 11:34:36 AM UTC-4, David Ashby wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> We've been using Elasticsearch on AWS for our application for two
>>>>> purposes: as a search engine for user-created documents, and as a
>>>>> cache for activity feeds in our application. We made a decision early
>>>>> on to treat every customer's content as a distinct index, for full
>>>>> logical separation of customer data. We have about three hundred
>>>>> indexes in our cluster, with the default 5-shards/1-replica setup.
>>>>>
>>>>> Recently, we've had major problems with the cluster "locking up" to
>>>>> requests and losing track of its nodes. We initially responded by
>>>>> attempting to remove possible CPU and memory limits, and placed all
>>>>> nodes in the same AWS placement group to maximize inter-node
>>>>> bandwidth, all to no avail. We eventually lost an entire production
>>>>> cluster, resulting in a decision to split the indexes across two
>>>>> completely independent clusters, each cluster taking half of the
>>>>> indexes, with application-level logic determining where each index
>>>>> lived.
>>>>>
>>>>> All that is to say: with our setup, are we running into an
>>>>> undocumented *practical* limit on the number of indexes or shards in
>>>>> a cluster? It ends up being around 3000 shards with our setup
>>>>> (300 indexes x 5 shards x 2 copies). Our logs show evidence of nodes
>>>>> timing out their responses to massive shard status-checks, and it
>>>>> gets *worse* the more nodes there are in the cluster. It's also
>>>>> stable with only *two* nodes.
>>>>>
>>>>> Thanks,
>>>>> -David
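
P.S. One way to sanity-check the "heavy request" theory from earlier in the thread: time _status against _cluster/health from the same app server. Rough sketch, again via Elastica's raw request API (untested, and it assumes a node listening on localhost:9200):

    <?php
    use Elastica\Client;
    use Elastica\Request;

    $client = new Client(array('host' => 'localhost', 'port' => 9200));

    // Compare wall-clock time of the deprecated indices status API
    // against the lightweight cluster health API.
    foreach (array('_status', '_cluster/health') as $path) {
        $start = microtime(true);
        $client->request($path, Request::GET);
        printf("%-16s %.3fs\n", $path, microtime(true) - $start);
    }

If _status really is calling Directories.estimateSize across ~3000 shards, the gap should be obvious.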
