On 8/11/2014 5:27 PM, dancoleman wrote:
> My SolrCloud of 3 shard / 3 replicas is having a lot of OOM errors. Here are
> some specs on my setup: 
>
> hosts: all are EC2 m1.large with 250G data volumes
> documents: 120M total
> zookeeper: 5 external t1.micros

<snip>

> Linux "top" command output with no indexing
> =======================================================
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  8654 root      20   0 95.3g 6.4g 1.1g S 27.6 87.4  83:46.19 java
>
>
> Linux "top" command output with indexing
> =======================================================
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 12499 root      20   0 95.8g 5.8g 556m S 164.3 80.2 110:40.99 java

I think you're likely going to need a much larger heap than 5GB, or
you're going to need a lot more machines and shards, so that each
machine has a much smaller piece of the index.  The java heap is only
one part of the story here, though.

Solr performance is terrible when the OS cannot effectively cache the
index, because Solr must actually read the disk to get the data required
for a query.  Disks are incredibly SLOW.  Even SSD storage is a *lot*
slower than RAM.

Your setup does not have anywhere near enough memory for the size of
your shards.  Amazon's website says that the m1.large instance has 7.5GB
of RAM.  You're allocating 5GB of that to Solr (the java heap) according
to your startup options.  If you subtract a little more for the
operating system and basic system services, that leaves about 2GB of RAM
for the disk cache.  The VIRT column in top (about 95GB) includes the
memory-mapped index files, so after subtracting the 5GB heap, that Solr
instance is handling nearly 90GB of index.  2GB of RAM for caching is
nowhere near enough -- you will want between 32GB and 96GB of total RAM
for that much index.

http://wiki.apache.org/solr/SolrPerformanceProblems#RAM
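
To make the arithmetic above concrete, here is a rough sketch of the
disk cache budget for one of these machines.  The 0.5GB figure for the
operating system is my assumption (standing in for "a little more"), the
90GB index size is only the estimate taken from top, and the function
names are made up for illustration.

```python
# Rough page-cache budget for one Solr node, using the figures quoted above.
TOTAL_RAM_GB = 7.5      # m1.large, per Amazon's website
JAVA_HEAP_GB = 5.0      # heap given to Solr in the startup options
OS_OVERHEAD_GB = 0.5    # assumption: "a little more" for the OS and services
INDEX_GB = 90.0         # approximate index size estimated from top's VIRT column


def disk_cache_gb(total_ram_gb, heap_gb, os_overhead_gb):
    """RAM left for the OS disk cache after the heap and the OS take their share."""
    return max(total_ram_gb - heap_gb - os_overhead_gb, 0.0)


def cached_fraction(cache_gb, index_gb):
    """Fraction of the index that can sit in the disk cache at once (capped at 1)."""
    if index_gb <= 0:
        return 1.0
    return min(cache_gb / index_gb, 1.0)


cache = disk_cache_gb(TOTAL_RAM_GB, JAVA_HEAP_GB, OS_OVERHEAD_GB)
fraction = cached_fraction(cache, INDEX_GB)
print(f"disk cache: {cache:.1f} GB, covers {fraction:.1%} of a {INDEX_GB:.0f} GB index")
```

With these numbers only about 2% of the index fits in the cache, so
nearly every query has to go to the disk.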

Thanks,
Shawn
