Cool. If I read right, your response time statistics graph includes:

1 - network latency between the client nodes and the load balancer
2 - network latency between the load balancer and the cluster eligible masters
3 - performance of the load balancer

My interest in checking out 1 & 2 would depend on the network topology. I would for sure want to do something to rule out 3. Any possibility of letting at least one of the client nodes bypass the LB for a minute or two?

Then, I might be tempted to set up a script to hit _cat/thread_pool for 60 seconds at a time, trying various thread pools and fields, looking for spikes. Maybe the same thing with _nodes/stats.

On Wednesday, April 15, 2015 at 1:48:17 PM UTC-4, Daryl Robbins wrote:
>
> Thanks, Glen. Yes, I have run top: the Java Tomcat process is the only
> thing running at the time. I also checked the thread activity in JProfiler
> and nothing out of the ordinary popped up.
>
> On Wednesday, April 15, 2015 at 1:36:55 PM UTC-4, Glen Smith wrote:
>>
>> Have you run 'top' on the nodes?
>>
>> On Wednesday, April 15, 2015 at 8:56:20 AM UTC-4, Daryl Robbins wrote:
>>>
>>> Thanks for your response. GC was my first thought too. I have looked
>>> through the logs and ran the app through a profiler, I am not seeing any
>>> spike in GC activity or any other background thread when performance
>>> degrades. Also, the fact that the slowdown occurs exactly every minute at
>>> the same second would point me towards a more deliberate timeout or
>>> heartbeat.
>>>
>>> I am running these tests in a controlled performance environment with
>>> constant light to moderate load. There is no change in the behaviour when
>>> under very light load. I have turned on slow logging for queries/fetches
>>> but am not seeing any slow queries corresponding with the problem. The only
>>> time I see a slow query is post-cold start of the search node, so it is at
>>> least working.
>>>
>>> On Wednesday, April 15, 2015 at 1:00:00 AM UTC-4, Mark Walkom wrote:
>>>>
>>>> Have you checked the logs for GC events or similar? What about the web
>>>> logs for events coming in?
>>>>
>>>> On 15 April 2015 at 09:03, Daryl Robbins <[email protected]> wrote:
>>>>
>>>>> I am seeing a consistent bottleneck in requests (taking about 2+
>>>>> seconds) at the same second every minute across all four of my client
>>>>> nodes who are connecting using the transport client from Java. These
>>>>> nodes are completely independent aside from their reliance on the
>>>>> ElasticSearch cluster and consequently they all happen to pause at the
>>>>> exact same second every minute. The exact second when this happens
>>>>> varies over time, but the four nodes always pause at the same time.
>>>>>
>>>>> I have 4 web nodes that connect to my ES cluster via transport. They
>>>>> connect to a load balancer fronting our 3 dedicated master nodes. The
>>>>> cluster contains 2 or more data nodes dependent on the configuration.
>>>>> Regardless of the number, I am seeing the same symptoms.
>>>>>
>>>>> Any hints on how to proceed to troubleshoot this issue on the
>>>>> ElasticSearch side would be greatly appreciated. Thanks very much!
>>>>>
>>>>> <https://lh3.googleusercontent.com/-GKiOcsPXBjI/VS2ak04mzBI/AAAAAAAAAhQ/aLDlD82AddY/s1600/Screenshot%2B2015-04-14%2B18.53.24.png>
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "elasticsearch" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/elasticsearch/af209904-9113-43d0-8cbc-0c85afe52611%40googlegroups.com
>>>>> .
>>>>> For more options, visit https://groups.google.com/d/optout.
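
P.S. A rough sketch of the polling script I had in mind at the top of this message, in Python. Treat it as a starting point, not a tested tool: the base URL, the column list, and the function names are all my own assumptions. The column names follow the 1.x-style _cat/thread_pool output (one row per node, with columns like search.queue), so adjust them for whatever your version prints. It samples the endpoint for about a minute and reports every sample where a chosen field reaches a threshold, so the timestamps can be lined up against the once-a-minute stall.

```python
import time
import urllib.request

# Assumptions (not from the thread): a node reachable over HTTP at BASE_URL,
# and 1.x-style column names for _cat/thread_pool. Adjust both as needed.
BASE_URL = "http://localhost:9200"
COLUMNS = "host,search.active,search.queue,search.rejected"


def parse_cat(text):
    """Parse whitespace-separated _cat output (with a ?v header row) into dicts."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


def find_spikes(samples, field, threshold):
    """Return (timestamp, host, value) for every row where field >= threshold."""
    spikes = []
    for ts, rows in samples:
        for row in rows:
            value = int(row.get(field, 0))
            if value >= threshold:
                spikes.append((ts, row.get("host", "?"), value))
    return spikes


def fetch_thread_pool(base_url=BASE_URL, columns=COLUMNS):
    """Fetch the raw _cat/thread_pool table from one node."""
    url = f"{base_url}/_cat/thread_pool?v&h={columns}"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def poll(duration_s=60, interval_s=0.5, fetch=fetch_thread_pool):
    """Sample the endpoint repeatedly; return a list of (timestamp, rows)."""
    samples = []
    deadline = time.time() + duration_s
    while time.time() < deadline:
        samples.append((time.time(), parse_cat(fetch())))
        time.sleep(interval_s)
    return samples


# Example use against a live node:
#   for ts, host, value in find_spikes(poll(), "search.queue", threshold=1):
#       print(time.strftime("%H:%M:%S", time.localtime(ts)), host, value)
```

Pointing BASE_URL straight at a node rather than at the LB keeps the load balancer out of the measurement. The same loop should work for _nodes/stats, but that endpoint returns JSON, so parse_cat would need to be swapped for a JSON parser.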
