Hi Petko,

Lucene's comparators for numerics do indeed have this limitation. We haven't
had many questions about it in the past, which I would guess is because
most numeric fields do not use the entire long range, specifically
Long.MIN_VALUE and Long.MAX_VALUE, so using either of these as the missing
value works as a way to sort missing values first or last. If you have a
field that may contain Long.MIN_VALUE and Long.MAX_VALUE, we do not have a
comparator that can reliably sort missing values first or last out of the
box.

The easiest option I can think of would be to use the comparator for longs
with MIN_VALUE or MAX_VALUE as the missing value, depending on whether you
want missing values sorted first or last, and chain it with a second
comparator (via a FieldComparatorSource) that acts as a tie-breaker and
sorts missing values before or after existing values. The advantage of this
approach is that you would automatically benefit from some not-so-trivial
features of Lucene's comparator, such as dynamic pruning.
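
To make the ordering concrete, here is a minimal plain-Java sketch of the
two-key idea. It does not use any Lucene classes: MissingFirstSort, Doc and
sortedIds are hypothetical names made up for illustration, and a null price
stands in for a missing value. In Lucene the second key would have to come
from a comparator supplied through a FieldComparatorSource, which is not
shown here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Plain-Java illustration only; it does not use Lucene classes.
class MissingFirstSort {

    // Hypothetical stand-in for a document; price == null means "field missing".
    static final class Doc {
        final int id;
        final Long price;

        Doc(int id, Long price) {
            this.id = id;
            this.price = price;
        }
    }

    // Primary key: the value, with Long.MIN_VALUE substituted for missing values.
    // Secondary key: presence, so a missing value sorts before a real Long.MIN_VALUE.
    static final Comparator<Doc> MISSING_FIRST =
        Comparator.<Doc>comparingLong(d -> d.price == null ? Long.MIN_VALUE : d.price)
            .thenComparing(d -> d.price != null); // Boolean order: false (missing) < true

    // Returns the document ids in ascending order with missing values first.
    static List<Integer> sortedIds(List<Doc> docs) {
        List<Doc> copy = new ArrayList<>(docs);
        copy.sort(MISSING_FIRST);
        List<Integer> ids = new ArrayList<>();
        for (Doc d : copy) {
            ids.add(d.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
            new Doc(1, 0L),
            new Doc(2, null),            // missing
            new Doc(3, Long.MIN_VALUE),  // real value that collides with the substitute
            new Doc(4, Long.MAX_VALUE),
            new Doc(5, null));           // missing
        System.out.println(sortedIds(docs)); // prints [2, 5, 3, 1, 4]
    }
}
```

To sort missing values last instead, the same sketch would substitute
Long.MAX_VALUE in the first key and reverse the second key.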

On Wed, Nov 16, 2022 at 9:16 PM Petko Minkov <pmin...@gmail.com> wrote:

> Hello,
>
> When sorting documents by a NumericDocValuesField, how can documents be
> ordered such that those with missing values can come before anything else
> in ascending sorts? SortField allows to set a missing value:
>
>     var sortField = new SortField("price", SortField.Type.LONG);
>     sortField.setMissingValue(null);
>
> This null is however converted into a long 0 and documents with missing
> values are considered equally ordered with documents with an actual 0
> value. It's possible to set the missing value to Long.MIN_VALUE, but that
> will have the same problem, just for a different long value.
>
> Besides writing a custom comparator, is there any simpler and still
> performant way to achieve this sort?
>
> --Petko
>


-- 
Adrien
