On 3/9/06, Øyvind Stegard <[EMAIL PROTECTED]> wrote: > - How does many stored fields eventually affect indexing/query performance > compared to if no fields were stored (only indexed) ?
Additional stored fields should have no effect on querying (the internal information about a field is looked up in a hashmap). Additional stored fields that are used has an impact on indexing since that data must be copied every time segments are merged. Additional stored fields that are not used in most documents (sparse) should have very little performance impact on indexing. The field list is walked a few times linearly (in-memory) during a segment merge, which should be very fast, but it's still O(n), so don't go crazy and have a million stored field types. > - Are there any known scalability issues with a large amount of distinct > fields in an index (not necessarily the same set of fields for every doc) ? If they are indexed fields, yes. Each indexed field has a 1 byte norm *per document*, regardless of if the document contains that field. In the current version of lucene, there is a way to omit these norms on a per field basis (see Field.setOmitNorms()) if you don't need length normalization or index-time field boosting. -Yonik http://incubator.apache.org/solr Solr, The Open Source Lucene Search Server --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]