Thanks! Good to know the codec uses a variable-length encoding mechanism here.

On Thu, Apr 4, 2013 at 3:36 PM, Adrien Grand <jpou...@gmail.com> wrote:
> On Thu, Apr 4, 2013 at 11:03 PM, Wei Wang <welshw...@gmail.com> wrote:
> > Given the new Lucene 4.2 DocValues API, it seems no matter it is byte,
> > short, int, or long, they are all stored as NumericDocValuesField. Does
> > this mean "long" values are always stored regardless of the initial type?
> > If so, do we still save space if the value range is small? Do we need to
> > give some hint to NumericDocValuesField to save space?
>
> Space savings are codec-dependent, but the default codecs use bit
> packing to save space. For example:
> - if all your values are between 0 and 255, Lucene will only use 8
>   bits per value on average,
> - if your documents only have three distinct values 1, 100 and 10000,
>   Lucene will detect that this is a low-cardinality field and only use 2
>   bits per value on average.
>
> This makes doc values storage-efficient, and much more
> memory-efficient than FieldCache, that people had to use until Lucene
> 4.0.
>
> --
> Adrien
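
The arithmetic behind the two examples quoted above can be sketched in plain Java. This is only an illustration of the bit-counting idea, not Lucene's actual codec code: the class and method names (BitPackingSketch, bitsRequired, bitsPerValueWithTable) are hypothetical, and it assumes non-negative values and a simple "table of distinct values plus per-document ordinal" layout for the low-cardinality case.

```java
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: it only reproduces the
// bits-per-value arithmetic described in the thread.
final class BitPackingSketch {

    // Number of bits needed to represent every value in [0, maxValue].
    static int bitsRequired(long maxValue) {
        if (maxValue < 0) {
            throw new IllegalArgumentException("maxValue must be >= 0");
        }
        // 64 minus the leading zero bits gives the position of the highest set bit.
        return Math.max(1, 64 - Long.numberOfLeadingZeros(maxValue));
    }

    // Bits per document if each document stores an ordinal into a table
    // of distinct values instead of the value itself (assumed layout).
    static int bitsPerValueWithTable(long[] values) {
        TreeSet<Long> distinct = new TreeSet<>();
        for (long v : values) {
            distinct.add(v);
        }
        // Ordinals run from 0 to distinct.size() - 1.
        return bitsRequired(Math.max(0, distinct.size() - 1));
    }

    public static void main(String[] args) {
        // Values between 0 and 255 fit in 8 bits each.
        System.out.println(bitsRequired(255));
        // Three distinct values need only 2 bits per document for the ordinal.
        System.out.println(bitsPerValueWithTable(new long[] {1, 100, 10000, 100, 1}));
        // Packing the raw values directly would be sized by the largest one.
        System.out.println(bitsRequired(10000));
    }
}
```

Under these assumptions the table approach wins for the second example because the per-document cost depends on the number of distinct values rather than on the largest value.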