Somewhere in those numeric trie terms are the exact integers from your documents, encoded.
You can use oal.util.NumericUtils.prefixCodedToInt to get the int value back from the BytesRef term.

But you need to filter out the "higher level" terms, e.g. by checking NumericUtils.getPrefixCodedIntShift(term) == 0 (your field is an IntField, so use the Int variants of these methods rather than the Long ones). Or use NumericUtils.filterPrefixCodedInts to wrap a TermsEnum.

I believe all the terms you want come first, so once you hit a term where getPrefixCodedIntShift is > 0, the term just before it is your max and you can stop checking.

BTW, in 5.0, the codec API for PostingsFormat has improved, so that you can e.g. pull your own TermsEnum and iterate the terms yourself.

Mike McCandless

http://blog.mikemccandless.com

On Thu, Feb 6, 2014 at 5:16 AM, Ravikumar Govindarajan <ravikumar.govindara...@gmail.com> wrote:
> I use a Codec to flush data. All methods delegate to the actual Lucene42Codec,
> except for intercepting one single field. This field is indexed as an
> IntField [Numeric-Trie...], with precisionStep=4.
>
> The purpose of the Codec is as follows:
>
> 1. Note the first BytesRef for this field
> 2. During the finish() call [TermsConsumer.java], note the last BytesRef for
>    this field
> 3. Convert both the first and last BytesRef to their respective integers
> 4. Store these 2 ints in segment-info diagnostics
>
> The problem with this approach is that the first/last BytesRef is totally
> different from the actual "int" values I try to index. I guess this is
> because Numeric-Trie explodes all the integers into its own format of
> BytesRefs. Hence my Codec stores the wrong values in segment diagnostics.
>
> Is there a way I can record the actual min/max int values correctly in my codec
> and still support NumericRange search?
>
> --
> Ravi
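
To make the shift-filtering idea from the reply concrete, here is a minimal, self-contained Java sketch. It deliberately does not call Lucene: the Term class, the sort order, and every method name below are hypothetical stand-ins that only model the idea that each indexed int produces one term per precision level, and that full-precision terms (shift == 0) sort before the lower-precision ones. In a real codec you would read the shift and the value with the NumericUtils methods mentioned above instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of numeric trie terms; NOT Lucene's actual byte layout. */
final class TrieMinMaxSketch {

    /** One term: how many low bits were dropped, and the remaining sortable bits. */
    static final class Term {
        final int shift;
        final int prefix;
        Term(int shift, int prefix) { this.shift = shift; this.prefix = prefix; }
    }

    /** Flip the sign bit so unsigned order of the bits matches signed int order. */
    static int toSortable(int value) { return value ^ 0x80000000; }

    static int fromSortable(int bits) { return bits ^ 0x80000000; }

    /** Emit one term per precision level for each value, sorted by shift, then by bits. */
    static List<Term> indexTerms(int[] values, int precisionStep) {
        List<Term> terms = new ArrayList<>();
        for (int v : values) {
            for (int shift = 0; shift < 32; shift += precisionStep) {
                terms.add(new Term(shift, toSortable(v) >>> shift));
            }
        }
        // Assumption of this model: the shift dominates the sort order.
        terms.sort((a, b) -> a.shift != b.shift
                ? Integer.compare(a.shift, b.shift)
                : Integer.compareUnsigned(a.prefix, b.prefix));
        return terms;
    }

    /** Returns {min, max} taken from shift == 0 terms only, or null if there are none. */
    static int[] minMax(List<Term> sortedTerms) {
        if (sortedTerms.isEmpty() || sortedTerms.get(0).shift != 0) {
            return null;
        }
        int min = fromSortable(sortedTerms.get(0).prefix);
        int max = min;
        for (Term t : sortedTerms) {
            if (t.shift > 0) {
                break; // first lower-precision term: the previous term was the max
            }
            max = fromSortable(t.prefix);
        }
        return new int[] {min, max};
    }

    public static void main(String[] args) {
        List<Term> terms = indexTerms(new int[] {42, -7, 1000}, 4);
        int[] mm = minMax(terms);
        System.out.println("min=" + mm[0] + " max=" + mm[1]);
        System.out.println("shift of last term=" + terms.get(terms.size() - 1).shift);
    }
}
```

In this model the last term in sort order belongs to the coarsest precision level, which is why decoding the first and last BytesRef directly gives a wrong max, while scanning only until the first term with shift > 0 gives the real min and max.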