Hi Frank,

If your servlet container had a crazy low setting for the max number
of threads, I think you would see the CPU underutilized.  But I think
you would also see errors on the client side when connections are
requested.  Sounds like possibly a VM issue that's not
Solr-specific...
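
For example, in the jetty.xml that ships with the Solr example the
thread pool section looks roughly like this (exact class and element
names vary between Jetty versions), so it's worth checking that
maxThreads hasn't been set very low:

    <Set name="ThreadPool">
      <New class="org.eclipse.jetty.util.thread.QueuedThreadPool">
        <Set name="minThreads">10</Set>
        <Set name="maxThreads">10000</Set>
      </New>
    </Set>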

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Mon, Mar 25, 2013 at 1:18 PM, Frank Wennerdahl
<frank.wennerd...@arcadelia.com> wrote:
> Hi.
>
>
>
> We are currently benchmarking our Solr setup and are having trouble with
> scaling hardware for a single Solr instance. We want to investigate how one
> instance scales with hardware to find the optimal ratio of hardware vs
> sharding when scaling. Our main problem is that we cannot identify any
> hardware limitation: CPU is far from maxed out, disk I/O does not appear to
> be an issue, and there is plenty of RAM available.
>
>
>
> In short, we have a few questions that we hope someone here can help us
> with. Detailed information about our setup, use case and the things we've
> tried is provided below the questions.
>
>
>
> Questions:
>
> 1. What could cause Solr to utilize only 2 CPU cores when sending
> multiple update requests in parallel in a VMware environment?
>
> 2. Is there a software limit on the number of CPU cores that Solr can
> utilize while indexing?
>
> 3. Ruling out network and disk performance, what could cause a decrease
> in indexing speed when sending data over a network as opposed to sending
> it from the local machine?
>
>
>
> We are running three Solr cores (indexes) per Solr instance; however, only
> one of them receives any non-trivial load. We are using VMware (ESX 5.0)
> virtual machines for hosting Solr and a QNAP NAS containing 12 HDDs in a
> RAID5 setup for storage. Our data consists of a huge number of small
> documents. When indexing we use Solr's javabin format (although not through
> SolrJ; we have implemented the format in C#/.NET) and our batch size is
> currently 1000 documents. The actual size of the data varies, but the
> batches we have used range from approximately 450KB to 1050KB. We send
> these batches to Solr in parallel from a number of send threads.
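>
> For reference, here is a rough SolrJ-style sketch of what our C# send
> loop does (this is not our actual client; the URL, field names and
> document contents below are just illustrative):
>
>     import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
>     import org.apache.solr.client.solrj.impl.HttpSolrServer;
>     import org.apache.solr.common.SolrInputDocument;
>     import java.util.ArrayList;
>     import java.util.List;
>     import java.util.UUID;
>     import java.util.concurrent.ExecutorService;
>     import java.util.concurrent.Executors;
>     import java.util.concurrent.TimeUnit;
>
>     public class BatchSender {
>         private static final int SEND_THREADS = 4;  // 4 or 8 in our tests
>         private static final int BATCH_SIZE = 1000; // documents per request
>
>         public static void main(String[] args) throws Exception {
>             // Base URL and core name are illustrative.
>             final HttpSolrServer solr =
>                 new HttpSolrServer("http://solr-vm:8983/solr/main");
>             solr.setRequestWriter(new BinaryRequestWriter()); // javabin
>
>             ExecutorService pool = Executors.newFixedThreadPool(SEND_THREADS);
>             for (int t = 0; t < SEND_THREADS; t++) {
>                 pool.submit(new Runnable() {
>                     public void run() {
>                         try {
>                             // Our real client reads pre-generated files;
>                             // here we just build dummy documents.
>                             List<SolrInputDocument> batch =
>                                 new ArrayList<SolrInputDocument>();
>                             for (int i = 0; i < BATCH_SIZE; i++) {
>                                 SolrInputDocument doc = new SolrInputDocument();
>                                 doc.addField("id", UUID.randomUUID().toString());
>                                 doc.addField("body_t", "small document payload");
>                                 batch.add(doc);
>                             }
>                             solr.add(batch); // one update request per batch
>                         } catch (Exception e) {
>                             e.printStackTrace();
>                         }
>                     }
>                 });
>             }
>             pool.shutdown();
>             pool.awaitTermination(1, TimeUnit.HOURS);
>         }
>     }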
>
>
>
> There are two issues that we've run into:
>
> 1. When sending data from one VM to Solr on another VM we observed that
> Solr did not seem to utilize CPU cores properly. The Solr VM had 8 vCPUs
> available and we were using 4 threads sending data in parallel. We saw low
> (~29%) CPU utilization on the Solr VM, with 2 cores doing almost all the
> work while the rest sat nearly idle. Increasing the number of send threads
> to 8 yielded the same result, capping our indexing speed at about 4.88MB
> per second. The client VM had 4 vCPUs, which were hardly utilized as we
> were reading data from pre-generated files.
>
> To rule out network limitations we sent the test data to a server on the
> Solr VM that simply accepted each request and returned an empty response (a
> sketch of such a sink is included below). We were able to send data at
> 219MB per second, so the network did not seem to be the bottleneck. We also
> tested sending data to Solr locally from the Solr VM to see if disk I/O was
> the problem. Surprisingly, we were able to index significantly faster at
> 7.34MB per second using 4 send threads (8.4MB/s with 6 send threads), which
> indicated that the disk was not slowing us down when sending data over the
> network. Worth noting is that the CPU utilization was now higher (47.81%
> with 4 threads, 58.8% with 6) and the work was spread out over all cores.
> As before we used pre-generated files, and the process sending the data
> used almost no CPU.
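>
> A rough Java sketch of the kind of no-op sink we used for the network
> test (this is not our actual test server; the port and path are made up
> for illustration):
>
>     import com.sun.net.httpserver.HttpExchange;
>     import com.sun.net.httpserver.HttpHandler;
>     import com.sun.net.httpserver.HttpServer;
>     import java.io.IOException;
>     import java.io.InputStream;
>     import java.net.InetSocketAddress;
>
>     public class NoOpUpdateSink {
>         public static void main(String[] args) throws IOException {
>             HttpServer server = HttpServer.create(new InetSocketAddress(8984), 0);
>             server.createContext("/solr/main/update", new HttpHandler() {
>                 public void handle(HttpExchange exchange) throws IOException {
>                     // Drain the request body so the client can stream at
>                     // full speed, then answer with an empty 200 response.
>                     InputStream in = exchange.getRequestBody();
>                     byte[] buf = new byte[8192];
>                     while (in.read(buf) != -1) {
>                         // discard
>                     }
>                     exchange.sendResponseHeaders(200, -1); // -1 = no body
>                     exchange.close();
>                 }
>             });
>             server.start();
>         }
>     }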
>
> 2. We decided to investigate how Solr would scale with additional vCPUs
> when indexing locally. We increased the number of vCPUs to 16 and the
> number of send threads to 8. Sadly, we now experienced a decrease in
> performance: 7MB/s with 8 threads, 6.4MB/s with 12 threads and 4.95MB/s
> with 16 threads. The CPU usage was on average 30%, regardless of the number
> of threads used. We know that additional vCPUs can cause decreased
> performance in VMware virtual machines due to time spent waiting for CPUs
> to become available. We investigated this using esxtop, which only showed
> 1% CSTP. According to VMware
> <http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1005362>
> a CSTP above 3% could indicate that multiple vCPUs are causing performance
> issues.
>
> We noticed that the average disk write speed seemed to cap at around 11.5
> million bytes per second, so we tested the same VM setup using a faster
> disk. This did not yield any increase in performance (it was actually
> somewhat slower), and neither did using a RAM-mapped drive for Solr.
>
>
>
> Any help or ideas about what could be the bottleneck in our setup would be
> greatly appreciated!
>
>
>
> Best regards,
>
> Frank Wennerdahl
>
> Developer
>
> Arcadelia AB
>
