While I can't be as specific as others here will be, we ran into the same (or a very similar) problem. We simply loaded our servers up with 48GB of RAM and life has been good since. I too would like to be a bit more proactive on the provisioning front, and hopefully someone will come along and help us both out.
FWIW, and I'm sure someone will correct me, but it seems as if the Java GC cannot keep up with cache allocation; in our case everything was fine until the nth query, and then the box would go TU. But leave it to Solr: it would simply 'restart' and start serving queries again.

-----Original Message-----
From: Jason Toy [mailto:jason...@gmail.com]
Sent: Wednesday, August 17, 2011 5:15 PM
To: solr-user@lucene.apache.org
Subject: solr keeps dying every few hours.

I have a large EC2 instance (7.5 GB RAM), and Solr dies on it every few hours with out-of-heap-memory errors. I started raising the minimum heap; currently I use -Xms3072M. I insert about 50k docs an hour and currently have about 65 million docs with about 10 fields each. Is this already too much data for one box? How do I know when I've reached the limit of this server? I have no idea how to get this issue under control. Am I just supposed to keep raising the minimum RAM given to Solr? How do I know what the right amount of RAM is? Must I keep adding more memory as the index grows? I'd rather queries be a little slower if I can use constant memory and have the search read from disk.
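One thing worth noting on the flags: -Xms only sets the initial heap, while -Xmx sets the ceiling the OutOfMemoryError is actually hitting, so raising -Xms alone won't buy any headroom. The other knob is the Solr caches in solrconfig.xml, which are sized in entry counts, not bytes. A rough sketch of the kind of settings involved (the element names and attributes are standard Solr; the values are purely illustrative, not recommendations, so tune them for your own box):

    <!-- solrconfig.xml: bounded caches. "size" is a maximum entry count,
         not bytes, so a few very large cached result sets can still
         exhaust the heap; shrink size and autowarmCount if GC can't
         keep up with allocation. -->
    <filterCache class="solr.FastLRUCache"
                 size="512" initialSize="512" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache"
                 size="512" initialSize="512" autowarmCount="0"/>
    <documentCache class="solr.LRUCache"
                 size="512" initialSize="512"/>

With the heap pinned (e.g. -Xms3072M -Xmx3072M so the JVM doesn't resize) and the caches bounded, memory use stays roughly constant and cold lookups fall through to disk, which is exactly the trade-off Jason describes wanting.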