Re: SolrCloud OOM Problem

2014-08-13 Thread tuxedomoon
Great info.  Can I ask how much data you are handling with that 6G or 7G
heap?





Re: SolrCloud OOM Problem

2014-08-13 Thread tuxedomoon
Have you used a queue to intercept queries and if so what was your
implementation?  We are indexing huge amounts of data from 7 SolrJ instances
which run independently, so there's a lot of concurrent indexing.






Re: SolrCloud OOM Problem

2014-08-13 Thread Shawn Heisey
On 8/13/2014 5:34 AM, tuxedomoon wrote:
 Great info.  Can I ask how much data you are handling with that 6G or 7G
 heap?

My dev server is the one with the 7GB heap.  My production servers only
handle half the index shards, so they have the smaller heap.  Here is
the index size info from my dev server:

[root@bigindy5 ~]# du -sh /index/solr4/data/
131G    /index/solr4/data/

This represents about 116 million total documents.

Thanks,
Shawn



Re: SolrCloud OOM Problem

2014-08-13 Thread Shawn Heisey
On 8/13/2014 5:42 AM, tuxedomoon wrote:
 Have you used a queue to intercept queries and if so what was your
 implementation?  We are indexing huge amounts of data from 7 SolrJ instances
 which run independently, so there's a lot of concurrent indexing.

On my setup, the queries come from a java webapp that uses SolrJ, which
is running on multiple servers in a cluster.  The updates come from a
custom SolrJ application that I wrote.  There is no queue; Solr is more
than capable of handling the load that we give it.

Full rebuilds are done with the dataimport handler.  The source of all
our Solr data is a MySQL database.
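
For readers who want a concrete picture of that pattern, here is a rough
sketch using the SolrJ 4.x API of the time.  This is not the actual
application code from the thread; the ZooKeeper string, collection name,
and fields are made-up placeholders.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class SolrJExample {
    public static void main(String[] args) throws Exception {
        // SolrJ reaches SolrCloud through the ZooKeeper ensemble.
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("collection1");

        // Update path: add one document and commit it.
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "example-1");
        doc.addField("title", "hello world");
        server.add(doc);
        server.commit();

        // Query path: run a simple search and print the hit count.
        QueryResponse rsp = server.query(new SolrQuery("title:hello"));
        System.out.println(rsp.getResults().getNumFound() + " docs found");

        server.shutdown();
    }
}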

Thanks,
Shawn



Re: SolrCloud OOM Problem

2014-08-13 Thread tuxedomoon
I applied the OPTS you pointed me to; here's the full string:

CATALINA_OPTS="${CATALINA_OPTS} -XX:NewSize=1536m -XX:MaxNewSize=1536m
-Xms12288m -Xmx12288m -XX:NewRatio=3 -XX:SurvivorRatio=4
-XX:TargetSurvivorRatio=90 -XX:MaxTenuringThreshold=8
-XX:+UseConcMarkSweepGC -XX:+CMSScavengeBeforeRemark
-XX:PretenureSizeThreshold=64m -XX:CMSFullGCsBeforeCompaction=1
-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70
-XX:CMSTriggerPermRatio=80 -XX:CMSMaxAbortablePrecleanTime=6000
-XX:+CMSParallelRemarkEnabled -XX:+ParallelRefProcEnabled -XX:+UseLargePages
-XX:+AggressiveOpts"

jConsole is now showing lower heap usage.  It had been climbing to 12G
consistently; now it is only spiking to 10G every 10 minutes or so.

Here's my top output
===
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 4250 root  20   0  129g  14g  1.9g S  2.0 21.3  17:40.61 java











Re: SolrCloud OOM Problem

2014-08-12 Thread Toke Eskildsen
On Tue, 2014-08-12 at 01:27 +0200, dancoleman wrote:
 My SolrCloud of 3 shard / 3 replicas is having a lot of OOM errors. Here are
 some specs on my setup: 
 
 hosts: all are EC2 m1.large with 250G data volumes

Is that 3 (each running a primary and a replica shard) or 6 instances?

 documents: 120M total
 zookeeper: 5 external t1.micros

If your facet fields have many unique values and you have many
concurrent requests, then memory usage will be high. But by the looks of
it, I would guess that the facet fields have relatively few values?

Still, if you have many concurrent queries, you might consider using a
queue in front of your SolrCloud instead of just starting new requests,
in order to set an effective limit on heap usage. 
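
One untested way to sketch that idea with SolrJ is shown below; the class
name and permit count are arbitrary. A semaphore plays the role of the
queue, so only a bounded number of queries are in flight at once and the
rest wait their turn.

import java.util.concurrent.Semaphore;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class ThrottledSolr {
    // At most 8 queries in flight; additional callers block here
    // instead of piling concurrent work onto the Solr heap.
    private final Semaphore permits = new Semaphore(8);
    private final SolrServer server;

    public ThrottledSolr(SolrServer server) {
        this.server = server;
    }

    public QueryResponse query(SolrQuery q) throws Exception {
        permits.acquire();
        try {
            return server.query(q);
        } finally {
            permits.release();
        }
    }
}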

- Toke Eskildsen, State and University Library, Denmark




Re: SolrCloud OOM Problem

2014-08-12 Thread tuxedomoon
I have modified my instances to m2.4xlarge 64-bit with 68.4G memory.  Hate to
ask this but can you recommend Java memory and GC settings for 90G data and
the above memory?  Currently I have

CATALINA_OPTS="${CATALINA_OPTS} -XX:NewSize=1536m -XX:MaxNewSize=1536m
-Xms5120m -Xmx5120m -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled
-XX:+UseConcMarkSweepGC"

Doesn't this mean I am starting with 5G and never going over 5G?

I've seen a few of those uninverted multi-valued field OOMs already on the
upgraded host.

Thanks

Tux









Re: SolrCloud OOM Problem

2014-08-12 Thread Shawn Heisey

On 8/12/2014 3:12 PM, tuxedomoon wrote:
 I have modified my instances to m2.4xlarge 64-bit with 68.4G memory.  Hate to
 ask this but can you recommend Java memory and GC settings for 90G data and
 the above memory?  Currently I have

 CATALINA_OPTS="${CATALINA_OPTS} -XX:NewSize=1536m -XX:MaxNewSize=1536m
 -Xms5120m -Xmx5120m -XX:+UseParNewGC -XX:+CMSParallelRemarkEnabled
 -XX:+UseConcMarkSweepGC"

 Doesn't this mean I am starting with 5G and never going over 5G?


Yes, that's exactly what it means -- you have a heap size limit of 5GB.  
The OutOfMemory error indicates that Solr needs more heap space than it 
is getting.  You'll need to raise the -Xmx value.  It is usually
advisable to configure -Xms to match.
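
A quick way to confirm the heap cap a JVM actually received is plain Java,
nothing Solr-specific; this generic snippet reports the effective -Xmx:

public class HeapCheck {
    public static void main(String[] args) {
        // maxMemory() reports the JVM's heap ceiling (-Xmx) in bytes.
        long max = Runtime.getRuntime().maxMemory();
        System.out.println("Max heap: " + (max / (1024 * 1024)) + " MB");
    }
}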


The wiki page I linked before includes a link to the following page, 
listing the GC options that I use beyond the -Xmx setting:


http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

Thanks,
Shawn



SolrCloud OOM Problem

2014-08-11 Thread dancoleman
My SolrCloud of 3 shard / 3 replicas is having a lot of OOM errors. Here are
some specs on my setup:

hosts: all are EC2 m1.large with 250G data volumes
documents: 120M total
zookeeper: 5 external t1.micros


Re: SolrCloud OOM Problem

2014-08-11 Thread Shawn Heisey
On 8/11/2014 5:27 PM, dancoleman wrote:
 My SolrCloud of 3 shard / 3 replicas is having a lot of OOM errors. Here are
 some specs on my setup: 

 hosts: all are EC2 m1.large with 250G data volumes
 documents: 120M total
 zookeeper: 5 external t1.micros

snip

 Linux top command output with no indexing
 ===
   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
  8654 root  20   0 95.3g 6.4g 1.1g S 27.6 87.4  83:46.19 java


 Linux top command output with indexing
 ===
   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 12499 root  20   0 95.8g 5.8g 556m S 164.3 80.2 110:40.99 java

I think you're likely going to need a much larger heap than 5GB, or
you're going to need a lot more machines and shards, so that each
machine has a much smaller piece of the index.  The java heap is only
one part of the story here, though.

Solr performance is terrible when the OS cannot effectively cache the
index, because Solr must actually read the disk to get the data required
for a query.  Disks are incredibly SLOW.  Even SSD storage is a *lot*
slower than RAM.

Your setup does not have anywhere near enough memory for the size of
your shards.  Amazon's website says that the m1.large instance has 7.5GB
of RAM.  You're allocating 5GB of that to Solr (the java heap) according
to your startup options.  If you subtract a little more for the
operating system and basic system services, that leaves about 2GB of RAM
for the disk cache.  Based on the numbers from top, that Solr instance
is handling nearly 90GB of index.  2GB of RAM for caching is nowhere
near enough -- you will want between 32GB and 96GB of total RAM for that
much index.

http://wiki.apache.org/solr/SolrPerformanceProblems#RAM

Thanks,
Shawn



Re: SolrCloud OOM Problem

2014-08-11 Thread dancoleman
90G is correct; each host is currently holding that much data.

Are you saying that 32GB to 96GB would be needed for each host?  Assuming
we did not add more shards, that is.






Re: SolrCloud OOM Problem

2014-08-11 Thread Shawn Heisey
 90G is correct; each host is currently holding that much data.

 Are you saying that 32GB to 96GB would be needed for each host?  Assuming
 we did not add more shards, that is.

If you want good performance and enough memory to give Solr the heap it
will need, yes. Lucene (the search API that Solr uses) relies on good
operating system caching for the index. Having enough memory to cache the
ENTIRE index is not usually required, but it is recommended.

Alternatively, you can add a lot more hosts and create a new collection
with a lot more shards. The total memory requirement across the whole
cloud won't go down, but each host won't require as much.
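
For example, a new collection with more shards can be created through the
Collections API; the host, collection name, and counts here are
hypothetical:

http://host:8983/solr/admin/collections?action=CREATE&name=newcollection&numShards=12&replicationFactor=2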

Thanks,
Shawn