Hi Frank,

If your servlet container had a crazy low setting for the max number of threads, I think you would see the CPU underutilized. But I think you would also see errors on the client about connections being requested. Sounds like possibly a VM issue that's not Solr-specific...

Otis
--
Solr & ElasticSearch Support
http://sematext.com/

On Mon, Mar 25, 2013 at 1:18 PM, Frank Wennerdahl <frank.wennerd...@arcadelia.com> wrote:
> Hi.
>
> We are currently benchmarking our Solr setup and are having trouble with
> scaling hardware for a single Solr instance. We want to investigate how one
> instance scales with hardware to find the optimal ratio of hardware vs
> sharding when scaling. Our main problem is that we cannot identify any
> hardware limitations: CPU is far from maxed out, disk I/O is not an issue as
> far as we can see, and there is plenty of RAM available.
>
> In short, we have a couple of questions that we hope someone here could help
> us with. Detailed information about our setup, use case and things we've
> tried is provided below the questions.
>
> Questions:
>
> 1. What could cause Solr to utilize only 2 CPU cores when sending
>    multiple update requests in parallel in a VMWare environment?
>
> 2. Is there a software limit on the number of CPU cores that Solr can
>    utilize while indexing?
>
> 3. Ruling out network and disk performance, what could cause a
>    decrease in indexing speed when sending data over a network as opposed to
>    sending it from the local machine?
>
> We are running three cores per Solr instance; however, only one core
> receives any non-trivial load. We are using VMWare (ESX 5.0) virtual
> machines for hosting Solr and a QNAP NAS containing 12 HDDs in a RAID5 setup
> for storage. Our data consists of a huge number of small documents.
> When indexing we are using Solr's javabin format (although not through
> Solrj; we have implemented the format in C#/.NET) and our batch size is
> currently 1000 documents. The actual size of the data varies, but the
> batches we have used range from approximately 450KB to 1050KB. We're sending
> these batches to Solr in parallel using a number of send threads.
>
> There are two issues that we've run into:
>
> 1. When sending data from one VM to Solr on another VM we observed
>    that Solr did not seem to utilize CPU cores properly. The Solr VM had 8
>    vCPUs available and we were using 4 threads sending data in parallel. We saw
>    a low (~29%) CPU utilization on the Solr VM, with 2 cores doing almost all
>    the work while the remaining cores stayed almost idle. Increasing the
>    number of send threads to 8 yielded the same result, capping our indexing
>    speed at about 4.88MB per second. The client VM had 4 vCPUs which were
>    hardly utilized, as we were reading data from pre-generated files.
>
>    To rule out network limitations we sent the test data to a server on the
>    Solr VM that simply accepted the request and returned an empty response. We
>    were able to send data at 219MB per second, so the network did not seem to
>    be the bottleneck. We also tested sending data to Solr locally from the Solr
>    VM to see if disk I/O was the problem. Surprisingly, we were able to index
>    significantly faster, at 7.34MB per second using 4 send threads (8.4MB with 6
>    send threads), which indicated that the disk was not slowing us down when
>    sending data over the network. Worth noting is that the CPU utilization was
>    now higher (47.81% with 4 threads, 58.8% with 6) and the work was spread out
>    over all cores. As before, we used pre-generated files and the process
>    sending the data used almost no CPU.
>
> 2. We decided to investigate how Solr would scale with additional
>    vCPUs when indexing locally. We increased the number of vCPUs to 16 and the
>    number of send threads to 8. Sadly, we now experienced a decrease in
>    performance: 7MB/s with 8 threads, 6.4MB/s with 12 threads and 4.95MB/s with
>    16 threads. The CPU usage was on average 30%, regardless of the number of
>    threads used. We know that additional vCPUs can cause decreased performance
>    in VMWare virtual machines due to time spent waiting for CPUs to become
>    available. We investigated this using esxtop, which only showed a 1% CSTP.
>    According to VMWare
>    <http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1005362>
>    a CSTP above 3% could indicate that multiple vCPUs are causing performance
>    issues.
>
>    We noticed that the average disk write speed seemed to cap at around 11.5
>    million bytes per second, so we tested the same VM setup using a faster disk.
>    This did not yield any increase in performance (it was actually somewhat
>    slower); neither did using a RAM-mapped drive for Solr.
>
> Any help or ideas of what could be the bottleneck in our setup would be
> greatly appreciated!
>
> Best regards,
>
> Frank Wennerdahl
>
> Developer
>
> Arcadelia AB
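
The local indexing figures in the quoted message are easier to compare when divided by the number of send threads. The following is only a rough sketch of that arithmetic, not part of the original benchmark: the names `MEASUREMENTS` and `per_thread_throughput` are made up for illustration, and it assumes the "4.95/s" figure means 4.95MB/s like the values next to it.

```python
# Local indexing throughput as reported in the quoted message:
# (vCPUs on the Solr VM, send threads, total MB/s).
# The 8-vCPU rows come from issue 1, the 16-vCPU rows from issue 2.
# Assumption: the "4.95/s" figure is in MB/s like its neighbours.
MEASUREMENTS = [
    (8, 4, 7.34),
    (8, 6, 8.4),
    (16, 8, 7.0),
    (16, 12, 6.4),
    (16, 16, 4.95),
]


def per_thread_throughput(measurements):
    """Return (vcpus, threads, total MB/s, MB/s per send thread) rows."""
    return [(vcpus, threads, total, total / threads)
            for vcpus, threads, total in measurements]


if __name__ == "__main__":
    for vcpus, threads, total, per_thread in per_thread_throughput(MEASUREMENTS):
        print(f"{vcpus:>2} vCPUs, {threads:>2} threads: "
              f"{total:5.2f} MB/s total, {per_thread:.2f} MB/s per thread")
```

With these numbers the per-thread rate drops on every step, which would be consistent with the send threads contending for some shared resource rather than being limited by raw CPU, although the figures alone do not say which resource that is.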