Hi,

As a proof of concept I have imported roughly 11 million documents into a
Solr index. My schema file defines several dynamic fields:

<dynamicField name="*_id"    type="text"   indexed="true"  stored="true"/>
<dynamicField name="*_start" type="tdate"  indexed="true"  stored="true"/>
<dynamicField name="*_end"   type="tdate"  indexed="true"  stored="true"/>

<dynamicField name="*"       type="string" indexed="true" stored="true"/>

These are the definitions most relevant to my question.

The average document has around 40 attributes. Each document has:

* a minimum of 2 tdate fields (maximum of 10)
* a minimum of 2 *_id fields, each containing a space-delimited list of ids
(e.g. "4de5656 q23ew9h")
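
To make that shape concrete, a single document in Solr's XML update format
might look roughly like the sketch below. Only "type", "book_id", "version"
and the id list come from my real data; the other field names (author_id,
published_start, published_end) and all remaining values are made up for
illustration.

```xml
<add>
  <doc>
    <!-- falls under the catch-all "*" dynamicField (string) -->
    <field name="type">book</field>
    <field name="version">1</field>

    <!-- *_id fields: space-delimited list of ids -->
    <field name="book_id">4de5656 q23ew9h</field>
    <field name="author_id">4de5656</field>

    <!-- *_start / *_end fields: tdate values (hypothetical names) -->
    <field name="published_start">2011-01-01T00:00:00Z</field>
    <field name="published_end">2011-12-31T00:00:00Z</field>
  </doc>
</add>
```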

The final dynamicField causes every field in a document to be indexed.
This was done partly to show the flexibility of Solr, and partly because I
did not know which fields we would query or filter on. The total size of my
index is ~18GB.

However... we now know the fields we will be querying on.

I have 3 questions:

1) Do unused indexes on the same "dynamicField" affect Solr's performance?
Our query will always be (type:book book_id:*). Will the presence of 4
million documents matching (type:location store_id:*) affect Solr's
performance? The obvious answer seems to be yes, but that may not be the case.

2) Do unused "dynamicField" indexes affect Solr's performance?
All documents have an attribute "version" which is indexed as "text", yet it
is never used in any query. Does its existence (in 11 million documents)
affect performance?

3) How does one improve query times against an existing index?
Once an index is built, is there a method to optimise the query analyzers, or
a method of removing unused indexes without rebuilding the entire index?

The last question is the most important one. We want to replace the current
schema with a more restrictive version. Most importantly,

   <dynamicField name="*" type="string" indexed="true" stored="true" />

becomes

   <dynamicField name="*" type="string" indexed="false" stored="true" />


But this change alone does not cause the index to shrink. It would be lovely
if there was a method to re-analyze an index post import.
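
For reference, the more restrictive schema I have in mind would look roughly
like the sketch below. It assumes "type" needs an explicit indexed field,
since it currently only matches the catch-all; the final list of explicit
fields is not settled yet.

```xml
<!-- Sketch only: explicit field for the catch-all attribute we query on -->
<field name="type" type="string" indexed="true" stored="true"/>

<!-- Unchanged from the current schema -->
<dynamicField name="*_id"    type="text"   indexed="true"  stored="true"/>
<dynamicField name="*_start" type="tdate"  indexed="true"  stored="true"/>
<dynamicField name="*_end"   type="tdate"  indexed="true"  stored="true"/>

<!-- Catch-all: still stored, so values are returned, but no longer indexed -->
<dynamicField name="*"       type="string" indexed="false" stored="true"/>
```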

More than happy to be referred to related documentation.

I have read and considered
http://wiki.apache.org/solr/SolrPerformanceFactors
http://wiki.apache.org/lucene-java/ImproveSearchingSpeed


But there may be some working knowledge held here which is undocumented.

Thank you in advance for any answers.
