Hi there,

I have been doing some load testing with Solr 4 beta (now trunk). My configuration is fairly simple: two servers, replicating via SolrCloud. SolrCloud is configured as recommended in the wiki:

  <updateRequestProcessorChain name="standard">
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.DistributedUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>

Twice now I've seen sudden thread and file-descriptor spikes along with a complete deadlock, simultaneously on both machines. My max FDs is set to 1024, and (excepting the spikes) I never see usage over 375 FDs.

The first FD spike was with an older trunk revision. It coincided with a corrupt transaction log. I've lost the logs, unfortunately, but Solr tried to re-process the same log over and over, leaking FDs and dying.

The upgraded version has not reported the corrupt transaction log issue prior to deadlock. However, according to the log files, the deadlock persists for about 5 minutes prior to FD exhaustion. The last log line is simply "INFO: end_commit_flush".

Upon restart, I see a frightening number of corrupt transaction log exceptions and "New transaction log already exists" exceptions.

Any thoughts? Contact me for the thread dump; it's 1 MiB.

Thanks,
--Casey C.