I have a Solr server set up on CentOS that is queried from a Flask app in a very specific, controlled way. The index holds a large amount (200 million records) of largely static name/address data, along with an internal record ID field and a few integer fields. I'm running 50 threads, each of which needs to search on name, address, and birth date and return an ID value and an integer modeling score as quickly as possible.
Here is the schema.xml information for the fields I'm using:

<field name="external_id" type="string" indexed="true" stored="false" required="false" multiValued="false" />
<field name="internal_id" type="string" indexed="false" stored="true" multiValued="false" />
<field name="score" type="int" indexed="false" stored="true" />
<field name="first_name" type="text_general" indexed="true" stored="true"/>
<field name="last_name" type="text_general" indexed="true" stored="true"/>
<field name="city" type="text_general" indexed="true" stored="true"/>
<field name="state" type="string" indexed="true" stored="true"/>
<field name="birth_year" type="string" indexed="true" stored="false" />
<field name="birth_month" type="string" indexed="true" stored="false" />
<field name="birth_day" type="string" indexed="true" stored="false" />

I had a similar setup working well with 1-4 threads, but since increasing the number of threads querying the Solr server I've been running into out-of-memory errors. I removed the autowarming filter queries from solrconfig.xml, increased the RAM on the box to 24 GB and the JVM to 8 GB, and changed the directoryFactory from MMap to NIOFS. That solved the memory problems, but performance is pretty bad, with most queries taking over 1 second to return a response.

Here's a screenshot showing the breakdown of a heap dump I took before I increased the RAM/JVM the first time:

<http://lucene.472066.n3.nabble.com/file/n4167111/Screen_Shot_2014-10-23_at_11.png>

Since I only query Solr in a very specific way, I'd like to set up the filterCache so that filters on U.S. state abbreviation and birth month are cached. How much memory would I need for that?
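A rough way to put numbers on this, as a sketch under stated assumptions rather than a measurement: as far as I understand, a filterCache entry for a large result set is held as a bitset with one bit per document in the index, and each separate fq parameter gets its own entry. The count of 50 distinct state values is an assumption, and Solr can store small result sets more compactly, so the figures below are upper bounds.

```python
# Back-of-the-envelope filterCache sizing (upper-bound sketch, not a measurement).
MAX_DOC = 200_000_000   # documents in the index, from the description above
NUM_STATES = 50         # assumption: number of distinct values in the state field
NUM_MONTHS = 12         # distinct values in the birth_month field


def bitset_bytes(max_doc: int) -> int:
    """Bytes for one bitset-backed filter entry: one bit per document."""
    return (max_doc + 7) // 8


def cache_bytes(entries: int, max_doc: int = MAX_DOC) -> int:
    """Upper bound on memory for `entries` bitset-backed cache entries."""
    return entries * bitset_bytes(max_doc)


per_entry = bitset_bytes(MAX_DOC)

# Separate fq parameters (fq=state:CA and fq=birth_month:1):
# one entry per state plus one entry per month.
separate = cache_bytes(NUM_STATES + NUM_MONTHS)

# A single combined fq (fq=state:CA AND birth_month:1):
# one entry per (state, month) pair.
combined = cache_bytes(NUM_STATES * NUM_MONTHS)

print(f"per entry:    {per_entry / 1e6:.0f} MB")
print(f"separate fqs: {separate / 1e9:.2f} GB")
print(f"combined fq:  {combined / 1e9:.2f} GB")
```

Under these assumptions, keeping state and birth month as separate fq parameters is the cheaper layout by roughly a factor of ten, and the whole set would fit inside an 8 GB heap with room left for everything else.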
Here's an example of what I previously had (now commented out) in the QuerySenderListener to autowarm the filterCache:

<lst><str name="q">*:*</str><str name="fq">state:CA</str><str name="fq">birth_month:1</str></lst>
<lst><str name="q">*:*</str><str name="fq">state:CA</str><str name="fq">birth_month:2</str></lst>
<lst><str name="q">*:*</str><str name="fq">state:CA</str><str name="fq">birth_month:3</str></lst>
<lst><str name="q">*:*</str><str name="fq">state:CA</str><str name="fq">birth_month:4</str></lst>

The number of documents matching each of these queries ranges from a few thousand to one million.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-filterCache-and-autoWarming-memory-requirements-tp4167111.html
Sent from the Solr - User mailing list archive at Nabble.com.