On 3/9/06, Øyvind Stegard <[EMAIL PROTECTED]> wrote:
> - How do many stored fields eventually affect indexing/query performance
> compared to if no fields were stored (only indexed)?

Additional stored fields should have no effect on querying (the
internal information about a field is looked up in a hashmap).

Additional stored fields that are used have an impact on indexing, since
that data must be copied every time segments are merged.

Additional stored fields that are not used in most documents (sparse)
should have very little performance impact on indexing.  The field
list is walked a few times linearly (in-memory) during a segment
merge, which should be very fast, but it's still O(n), so don't go
crazy and have a million stored field types.
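
To make the two access patterns above concrete, here is a minimal sketch in
Java. FieldTable and its methods are hypothetical names made up for
illustration; this is not Lucene's actual field-info class, only the general
shape of a hash lookup by name next to a linear walk over the field list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for per-index field metadata; not Lucene's own class.
class FieldTable {
    private final List<String> byNumber = new ArrayList<>();
    private final Map<String, Integer> byName = new HashMap<>();

    // Registers a field name once and returns its number.
    int add(String name) {
        Integer existing = byName.get(name);
        if (existing != null) {
            return existing;
        }
        int number = byNumber.size();
        byNumber.add(name);
        byName.put(name, number);
        return number;
    }

    // Query-style lookup: a single hash probe, independent of the field count.
    int numberOf(String name) {
        Integer number = byName.get(name);
        return number == null ? -1 : number;
    }

    // Merge-style pass: touches every known field once, so it is O(n).
    int walk() {
        int visited = 0;
        for (int i = 0; i < byNumber.size(); i++) {
            visited++;
        }
        return visited;
    }
}
```

With a few hundred field names both operations are negligible; only the
linear walk grows as the number of distinct fields grows.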

> - Are there any known scalability issues with a large amount of distinct
> fields in an index (not necessarily the same set of fields for every doc) ?

If they are indexed fields, yes.
Each indexed field has a 1-byte norm *per document*, regardless of whether
the document contains that field.  In the current version of Lucene,
there is a way to omit these norms on a per-field basis (see
Field.setOmitNorms()) if you don't need length normalization or
index-time field boosting.
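
As a rough illustration of that cost, the sketch below just multiplies it
out. NormsEstimate and normBytes are hypothetical names, and the only
assumption is the one stated above: 1 byte per document for every indexed
field that keeps its norms, and nothing for fields whose norms are omitted.

```java
// Back-of-the-envelope estimate only; it does not call into Lucene.
class NormsEstimate {
    // numDocs: documents in the index.
    // indexedFields: distinct indexed fields across the whole index.
    // fieldsWithNormsOmitted: how many of those fields have norms turned off.
    static long normBytes(long numDocs, int indexedFields, int fieldsWithNormsOmitted) {
        int fieldsWithNorms = indexedFields - fieldsWithNormsOmitted;
        return numDocs * fieldsWithNorms;
    }

    public static void main(String[] args) {
        // 10 million documents, 50 indexed fields, norms kept on all of them.
        System.out.println(normBytes(10_000_000L, 50, 0) + " bytes");
        // Same index with norms omitted on 45 of the 50 fields.
        System.out.println(normBytes(10_000_000L, 50, 45) + " bytes");
    }
}
```

For 10 million documents and 50 indexed fields that comes to 500,000,000
bytes (roughly 477 MiB) of norms; omitting norms on 45 of those fields
brings it down to 50,000,000 bytes.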

-Yonik
http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server

