Thanks, Michael.
On Tue, Nov 28, 2023 at 11:45 PM Michael Froh <[email protected]> wrote:
> Oh -- of course if you're using IntPoint / LongPoint for your numeric
> fields, they won't be indexed as terms, so loading terms for them won't
> work.
>
> It's not the prettiest solution, but I think the following should let you
> collect the set of distinct point values for an IntPoint field:
>
>
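> final Set<Integer> collectedValues = new TreeSet<>();
> for (LeafReaderContext lrc : reader.leaves()) {
>   LeafReader lr = lrc.reader();
>   // getPointValues returns null when this segment has no points for the field.
>   PointValues pointValues = lr.getPointValues(fieldname);
>   if (pointValues == null) {
>     continue;
>   }
>   PointValues.IntersectVisitor collectingVisitor = new PointValues.IntersectVisitor() {
>     @Override
>     public void visit(int docID) throws IOException {
>       // Not expected to be called: compare() never returns CELL_INSIDE_QUERY.
>     }
>
>     @Override
>     public void visit(int docID, byte[] packedValue) {
>       // Decode the first (and only) dimension of the packed point value.
>       collectedValues.add(IntPoint.decodeDimension(packedValue, 0));
>     }
>
>     @Override
>     public PointValues.Relation compare(byte[] minPackedValue, byte[] maxPackedValue) {
>       // Always "crosses", so every value is passed to visit(docID, packedValue).
>       return PointValues.Relation.CELL_CROSSES_QUERY;
>     }
>   };
>
>   pointValues.intersect(collectingVisitor);
> }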
>
>
>
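
For the archive, in case it helps to see what IntPoint.decodeDimension is
undoing in the visitor above: as far as I understand, each int dimension is
stored as 4 bytes, with the sign bit flipped and the bytes in big-endian
order, so that unsigned byte order matches numeric order. Below is a minimal
pure-JDK sketch of that layout. The class and method names
(SortableIntSketch, encode, decode, distinct) are made up for illustration
and are not Lucene APIs; in real code I would still call
IntPoint.decodeDimension.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustration only: a pure-JDK stand-in for the 4-byte sortable int encoding.
// Assumption: sign bit flipped, then stored big-endian.
class SortableIntSketch {

    // Flip the sign bit so negative values sort before positive ones when
    // the bytes are compared as unsigned, then write the int big-endian.
    static byte[] encode(int value) {
        int flipped = value ^ 0x80000000;
        return new byte[] {
            (byte) (flipped >>> 24),
            (byte) (flipped >>> 16),
            (byte) (flipped >>> 8),
            (byte) flipped
        };
    }

    // Read 4 big-endian bytes starting at the given byte offset and undo the sign flip.
    static int decode(byte[] packed, int offset) {
        int flipped = ((packed[offset] & 0xFF) << 24)
                | ((packed[offset + 1] & 0xFF) << 16)
                | ((packed[offset + 2] & 0xFF) << 8)
                | (packed[offset + 3] & 0xFF);
        return flipped ^ 0x80000000;
    }

    // Collect distinct 1-D values in ascending order, as the TreeSet in the visitor does.
    static Set<Integer> distinct(Iterable<byte[]> packedValues) {
        Set<Integer> out = new TreeSet<>();
        for (byte[] packed : packedValues) {
            out.add(decode(packed, 0));
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> packed = List.of(encode(7), encode(-3), encode(7), encode(0));
        System.out.println(distinct(packed)); // prints [-3, 0, 7]
    }
}
```

The TreeSet does the de-duplication, so a value that occurs in several
documents or segments is still reported only once.
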
> On Tue, Nov 28, 2023 at 1:42 PM Michael Froh <[email protected]> wrote:
>
> > Hello!
> >
> > Instead of MultiFields.getFields(), you can use
> > MultiTerms.getTerms(reader, fieldname) to get the Terms instance.
> >
> > To decode your long / int values, you should be able to use
> > LongPoint/IntPoint.unpack to write the values into an array:
> >
> > long[] val = new long[1]; // Assuming 1-D values
> > LongPoint.unpack(value, 0, val);
> > values.add(val[0]);
> >
> > Hope that helps,
> > Froh
> >
> >
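
Once the raw number has been unpacked as described above, the per-type
dispatch from my original method can probably be kept as a small helper.
This is only a sketch: PointValueConverter and convert are hypothetical
names, it assumes booleans were indexed as 0/1 and dates as epoch
milliseconds (as in my Lucene 6 code), and throwing on an unknown type is my
own choice for the branch I elided there. String fields are still indexed as
terms, so they would not go through this path.

```java
import java.util.Date;

// Illustration only: maps a decoded numeric point value to the Java type
// the caller asked for. Names are hypothetical, not part of Lucene.
class PointValueConverter {

    static Object convert(long raw, Class<?> type) {
        if (type == Long.class) {
            return raw; // autoboxed to Long
        }
        if (type == Integer.class) {
            return Math.toIntExact(raw); // throws ArithmeticException if out of int range
        }
        if (type == Boolean.class) {
            return raw == 1; // assumption: booleans were indexed as 0 / 1
        }
        if (type == Date.class) {
            return new Date(raw); // assumption: dates were indexed as epoch milliseconds
        }
        // Assumption: unsupported types are rejected rather than silently skipped.
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    public static void main(String[] args) {
        System.out.println(convert(42L, Integer.class)); // prints 42
        System.out.println(convert(1L, Boolean.class));  // prints true
    }
}
```
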
> > On Wed, Nov 22, 2023 at 11:09 AM <[email protected]> wrote:
> >
> >> Hello,
> >>
> >> In Lucene 6 I was doing this to get all values for a given field
> >> knowing its type:
> >>
> >> public List<Object> getDistinctValues(IndexReader reader, String fieldname,
> >>     Class<? extends Object> type) throws IOException {
> >>
> >>   List<Object> values = new ArrayList<Object>();
> >>   Fields fields = MultiFields.getFields(reader);
> >>   if (fields == null) return values;
> >>
> >>   Terms terms = fields.terms(fieldname);
> >>   if (terms == null) return values;
> >>
> >>   TermsEnum iterator = terms.iterator();
> >>
> >>   BytesRef value = iterator.next();
> >>
> >>   while (value != null) {
> >>     if (type == Long.class) {
> >>       values.add(LegacyNumericUtils.prefixCodedToLong(value));
> >>     } else if (type == Integer.class) {
> >>       values.add(LegacyNumericUtils.prefixCodedToInt(value));
> >>     } else if (type == Boolean.class) {
> >>       values.add(LegacyNumericUtils.prefixCodedToInt(value) == 1 ? TRUE : FALSE);
> >>     } else if (type == Date.class) {
> >>       values.add(new Date(LegacyNumericUtils.prefixCodedToLong(value)));
> >>     } else if (type == String.class) {
> >>       values.add(value.utf8ToString());
> >>     } else {
> >>       // ...
> >>     }
> >>
> >>     value = iterator.next();
> >>   }
> >>
> >>   return values;
> >> }
> >>
> >> I am trying to upgrade to Lucene 9.
> >> There were two changes over time:
> >> - LegacyNumericUtils has been removed in favor of point-based fields
> >> (IntPoint / LongPoint)
> >> - MultiFields.getFields() has been dropped, and I read that we are
> >> encouraged to avoid Fields in general
> >>
> >> What is the proper way to get the distinct values for a specific
> >> field in a reader?
> >>
> >> Thanks for your help,
> >>
> >> vs
> >>
> >
>