Hi,

As a proof of concept I have imported roughly 11 million documents into a Solr index. My schema file defines multiple fields, including:
<dynamicField name="*_id" type="text" indexed="true" stored="true"/>
<dynamicField name="*_start" type="tdate" indexed="true" stored="true"/>
<dynamicField name="*_end" type="tdate" indexed="true" stored="true"/>
<dynamicField name="*" type="string" indexed="true" stored="true"/>

These are the most important ones for my question. The average document has around 40 attributes. Each document has:

* a minimum of 2 tdate fields (max of 10)
* a minimum of 2 *_id fields, each containing a space-delimited list of ids (i.e. "4de5656 q23ew9h")

The final dynamicField causes every field in a document to be indexed. This was done firstly to show the flexibility of Solr, and also because I did not know which fields we would query / filter on. The total size of my index is ~18GB.

However... we now know the fields we will be querying on. I have 3 questions:

1) Do unused indexes on the same "dynamicField" affect Solr's performance?

Our query will always be (type:book book_id:*). Will the presence of 4 million documents matching (type:location store_id:*) affect Solr's performance? It sounds like an obvious yes, but that may not be the case.

2) Do unused "dynamicField" indexes affect Solr's performance?

All documents have an attribute "version" which is indexed as "text", yet it is never used in any queries. Does its existence (in 11 million documents) affect performance?

3) How does one improve query times against an existing index?

Once an index is built, is there a method to optimise the query analyzers, or a method of removing unused indexes without rebuilding the entire index?

The latter is a very important one. We want to replace the current schema with a more restrictive version. Most importantly

<dynamicField name="*" type="string" indexed="true" stored="true" />

becomes

<dynamicField name="*" type="string" indexed="false" stored="true" />

But this change alone does not cause the index to shrink. It would be lovely if there was a method to re-analyze an index post import.
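To make the intended change concrete, here is a rough sketch of what the more restrictive schema might look like. It assumes the only fields we query are "type" and "book_id", as in the example query above. The explicit "type" field declaration is hypothetical and not copied from our actual schema; it is there because "type" would otherwise fall through to the catch-all "*" pattern and stop being indexed.

```xml
<!-- Hypothetical explicit field: without it, "type" would match the
     catch-all "*" below and no longer be searchable. -->
<field name="type" type="string" indexed="true" stored="true"/>

<!-- Unchanged from the current schema: id and date fields stay indexed. -->
<dynamicField name="*_id" type="text" indexed="true" stored="true"/>
<dynamicField name="*_start" type="tdate" indexed="true" stored="true"/>
<dynamicField name="*_end" type="tdate" indexed="true" stored="true"/>

<!-- Catch-all: everything else is stored for retrieval but not indexed. -->
<dynamicField name="*" type="string" indexed="false" stored="true"/>
```

Note that "*_id" would still index store_id and every other id field, so this sketch only removes the indexes created by the catch-all.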
More than happy to be referred to related documentation. I have read and considered:

http://wiki.apache.org/solr/SolrPerformanceFactors
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed

But there may be some undocumented knowledge held here.

Thank you in advance for any answers.