Mark,

I got a few stack dumps of the instance that was stuck, ssdtest-d03:8011:

http://apaste.info/cofK
http://apaste.info/sv4M
http://apaste.info/cxUf

 


I can get dumps of the other instances if needed.

Thanks,

Rishi.

 

-----Original Message-----
From: Mark Miller <markrmil...@gmail.com>
To: solr-user <solr-user@lucene.apache.org>
Sent: Mon, Jun 17, 2013 1:57 pm
Subject: Re: Solr Cloud Hangs consistently .


Could you give a simple stack trace dump as well?

It's likely the distributed update deadlock that has been reported a few times
now - I think usually with a replication factor greater than 2, but I can't be
sure. The deadlock involves sending docs concurrently to replicas and I wouldn't
have expected it to be so easily hit with only 2 replicas per shard. I should be
able to tell from a stack trace though.

If it is that, it's on my short list to investigate (been there a long time now 
though - but I still hope to look at it soon).

- Mark

On Jun 17, 2013, at 1:44 PM, Rishi Easwaran <rishi.easwa...@aol.com> wrote:

> 
> 
> Hi All,
> 
> I am trying to benchmark SOLR Cloud and it consistently hangs. 
> Nothing in the logs, no stack trace, no errors, no warnings, just seems stuck.
> 
> A little bit about my set up. 
> I have 3 benchmark hosts, each with 96GB RAM, 24 CPUs and 1TB SSD. Each host is configured to have 8 SOLR cloud nodes running at 4GB each.
> JVM configs: http://apaste.info/57Ai
> 
> My cluster has 12 shards with replication factor 2 - http://apaste.info/09sA
> 
> I originally started with SOLR 4.2, Tomcat 5 and JDK 6, as we are already running this configuration in production in non-cloud form.
> It got stuck repeatedly.
> 
> I decided to upgrade to the latest and greatest of everything: SOLR 4.3, JDK 7 and Tomcat 7.
> It still shows the same behaviour and hangs during the test.
> 
> My test schema and config.
> Schema.xml - http://apaste.info/imah
> SolrConfig.xml - http://apaste.info/ku4F
> 
> The test is pretty simple. It's a JMeter test with an update command via SOAP rpc (round robin requests across every node), adding 5 fields from a csv file - id, guid, subject, body, compositeID (guid!id).
> Number of JMeter threads = 150, loop count = 20, number of messages to add per guid = 3; total 150*3*20 = 9000 documents.
> 
> When the cloud gets stuck, I don't get anything in the logs, but when I run netstat I see the following.
> Sample netstat on a stuck run: http://apaste.info/hr0O
> hycl-d20 is my JMeter host. ssd-d01/2/3 are my cloud hosts.
> 
> 
> At the moment my benchmarking efforts are at a standstill.
> 
> Any help from the community would be great. I got some heap dumps and stack dumps, but haven't found a smoking gun yet.
> If I can provide anything else to help diagnose this issue, just let me know.
> 
> Thanks,
> 
> Rishi.


 
