On Thu, Apr 4, 2013 at 11:03 PM, Wei Wang <welshw...@gmail.com> wrote:
> Given the new Lucene 4.2 DocValues API, it seems no matter it is byte,
> short, int, or long, they are all stored as NumericDocValuesField. Does
> this mean "long" values are always stored regardless of the initial type?
> If so, do we still save space if the value range is small? Do we need to
> give some hint to NumericDocValuesField to save space?

Space savings are codec-dependent, but the default codecs use bit packing
to save space, so no hint is needed. For example:
 - if all your values are between 0 and 255, Lucene will only use 8 bits
   per value on average,
 - if your documents only have three distinct values 1, 100 and 10000,
   Lucene will detect that this is a low-cardinality field and only use
   2 bits per value on average.

This makes doc values storage-efficient, and much more memory-efficient
than FieldCache, which people had to use until Lucene 4.0.

-- 
Adrien
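
For anyone who wants to check the bit counts quoted in the reply above, here is a rough sketch of the arithmetic in plain Java. It is not Lucene's codec code and does not use any Lucene classes: the class name `BitsPerValueSketch` and its methods are made up for illustration, and it assumes that the spread between the smallest and largest value fits in a long.

```java
import java.util.TreeSet;

// Illustrative only: reproduces the per-value bit counts from the reply,
// not the actual encoding logic of any Lucene codec.
public class BitsPerValueSketch {

    // Number of bits needed to represent every value in [0, maxValue].
    static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be >= 0");
        }
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    // Range-based packing: store (value - min), so only the spread matters.
    // Assumes (max - min) does not overflow a long.
    static int bitsForRange(long[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return bitsRequired(max - min);
    }

    // Table-based packing for low-cardinality fields: store an index into
    // a table of the distinct values instead of the value itself.
    static int bitsForTable(long[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        TreeSet<Long> distinct = new TreeSet<>();
        for (long v : values) {
            distinct.add(v);
        }
        return bitsRequired(distinct.size() - 1);
    }

    public static void main(String[] args) {
        long[] small = {0, 17, 255, 42};
        long[] lowCardinality = {1, 100, 10000, 100, 1, 10000};

        System.out.println("0..255, range-based: " + bitsForRange(small));
        System.out.println("3 distinct, range-based: " + bitsForRange(lowCardinality));
        System.out.println("3 distinct, table-based: " + bitsForTable(lowCardinality));
    }
}
```

The second and third lines printed by `main` show why detecting low cardinality matters: packing 1, 100 and 10000 by range alone would need enough bits to cover a spread of 9999, while an index into a three-entry table needs far fewer.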