I forgot: an alternative is to use the FieldCache parsers, which
automatically throw a RuntimeException as soon as a term holds a
lower-precision value. The FieldCache uninversion uses this to stop iterating:
try {
  while (next != null && next.field().equals("trie")) {
    ints.add(FieldCache.NUMERIC_UTILS_INT_PARSER.parseInt(next.text()));
    next = termEnum.next() ? termEnum.term() : null;
  }
} catch (RuntimeException e) {
  // thrown by the parser at the first lower-precision term: stop iterating
}
See the code of FieldCacheImpl that does exactly that.
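
If you would rather not rely on the exception, the first-character check from my earlier mail (quoted below) can be wrapped in a small helper. This is only a minimal sketch, not an existing NumericUtils method: the class and method names are hypothetical, and the two constants are local stand-ins whose values I am assuming match NumericUtils.SHIFT_START_INT and NumericUtils.SHIFT_START_LONG in 2.9. In real code, reference the NumericUtils constants directly.

```java
public final class NumericTermPrecision {

    // Assumption: local copies of NumericUtils.SHIFT_START_INT and
    // NumericUtils.SHIFT_START_LONG; use the real constants in production code.
    static final char SHIFT_START_INT = (char) 0x60;
    static final char SHIFT_START_LONG = (char) 0x20;

    private NumericTermPrecision() {}

    /** True if the prefix-coded term text is a full-precision int (or float) term. */
    static boolean isFullPrecisionInt(String termText) {
        return !termText.isEmpty() && termText.charAt(0) == SHIFT_START_INT;
    }

    /** True if the prefix-coded term text is a full-precision long (or double) term. */
    static boolean isFullPrecisionLong(String termText) {
        return !termText.isEmpty() && termText.charAt(0) == SHIFT_START_LONG;
    }

    public static void main(String[] args) {
        // Made-up term texts: only the first character (shift marker) matters here.
        String full = SHIFT_START_INT + "xyz";
        String lower = (char) (SHIFT_START_INT + 4) + "xyz";
        System.out.println(isFullPrecisionInt(full));
        System.out.println(isFullPrecisionInt(lower));
    }
}
```

In the loop above, the condition would then become next != null && next.field().equals("trie") && isFullPrecisionInt(next.text()), and the try/catch is no longer needed.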
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: [email protected]
> -----Original Message-----
> From: Uwe Schindler [mailto:[email protected]]
> Sent: Monday, October 26, 2009 10:43 AM
> To: [email protected]
> Subject: RE: Distinct terms values? (like in Luke)
>
> > @Test
> > public void distinct() throws Exception {
> >   RAMDirectory directory = new RAMDirectory();
> >   IndexWriter writer = new IndexWriter(directory, new
> >       WhitespaceAnalyzer(), true, IndexWriter.MaxFieldLength.UNLIMITED);
> >
> >   for (int l = -2; l <= 2; l++) {
> >     Document doc = new Document();
> >     doc.add(new Field("text", "the big brown", Field.Store.NO,
> >         Field.Index.ANALYZED));
> >     doc.add(new NumericField("trie", Field.Store.NO,
> >         true).setIntValue(l));
> >     writer.addDocument(doc);
> >   }
> >
> >   writer.close();
> >
> >   IndexReader reader = IndexReader.open(directory, true);
> >   TermEnum termEnum = reader.terms(new Term("trie", ""));
> >   Term next = termEnum.term();
> >   List<Integer> ints = new ArrayList<Integer>();
> >
> >   while (next != null && next.field().equals("trie")) {
> >     ints.add(NumericUtils.prefixCodedToInt(next.text()));
> >     next = termEnum.next() ? termEnum.term() : null;
> >   }
> >
> >   reader.close();
> >
> >   log.info(ints.toString());
> > }
> >
> > ==> [-2, -1, 0, 1, 2, -16, 0, -256, 0, -4096, 0, -65536, 0, -1048576, 0,
> > -16777216, 0, -268435456, 0]
>
> You can add a check to your while statement that breaks the iteration as
> soon as the next lower precision is reached:
>
> while (next != null && next.field().equals("trie") &&
>     next.text().charAt(0) == NumericUtils.SHIFT_START_INT)...
>
> Use the same constant for float, and SHIFT_START_LONG for long and double.
>
> This should work. Maybe we should add a method to NumericUtils that checks
> this and reports whether a term is of highest precision.
>
> Uwe
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]