Hi Greg, I think the general issue is with the API: ValueSource
seems really geared toward returning values from single-valued fields.

IMO, for the way the API is used (e.g. sorting), it makes sense to
define a selector that works in O(1) time per-document, and use these
existing valuesources:

https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/MultiValuedIntFieldSource.java
https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/MultiValuedLongFieldSource.java
https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/MultiValuedFloatFieldSource.java
https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/MultiValuedDoubleFieldSource.java

These require that you specify a "selector" as to who will be the
"stuckee" (designated value) for the doc:
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/SortedNumericSelector.java
I strongly recommend "min": each doc's values are stored in sorted
order, so it can just read the first doc value (DV) for each doc.
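
To make the cost argument concrete, here is a toy model in plain Java.
It is not Lucene code: SelectorSketch, DocValuesCursor, selectMin and
selectMax are made-up names, and it assumes every doc has at least one
value. It only mimics a forward-only cursor over one doc's values in
ascending order, which is roughly how the numeric doc values are read:

```java
import java.util.NoSuchElementException;

/** Toy model of per-document sorted numeric values; hypothetical, not Lucene code. */
public class SelectorSketch {

  /** Forward-only cursor over one document's values, stored in ascending order. */
  public static final class DocValuesCursor {
    private final long[] sortedValues;
    private int next = 0;
    public int reads = 0; // number of values pulled so far

    public DocValuesCursor(long[] sortedValues) {
      this.sortedValues = sortedValues;
    }

    public int count() {
      return sortedValues.length;
    }

    public long nextValue() {
      if (next >= sortedValues.length) {
        throw new NoSuchElementException("no more values for this doc");
      }
      reads++;
      return sortedValues[next++];
    }
  }

  /** "min": the first value is the smallest, so a single read is enough. */
  public static long selectMin(DocValuesCursor doc) {
    return doc.nextValue();
  }

  /** "max": with a forward-only cursor, every value is read to reach the last. */
  public static long selectMax(DocValuesCursor doc) {
    long value = doc.nextValue();
    for (int i = 1; i < doc.count(); i++) {
      value = doc.nextValue();
    }
    return value;
  }

  public static void main(String[] args) {
    DocValuesCursor a = new DocValuesCursor(new long[] {3, 8, 15, 42});
    System.out.println(selectMin(a) + " after " + a.reads + " read(s)"); // 3 after 1 read(s)

    DocValuesCursor b = new DocValuesCursor(new long[] {3, 8, 15, 42});
    System.out.println(selectMax(b) + " after " + b.reads + " read(s)"); // 42 after 4 read(s)
  }
}
```

In this model "min" costs one read no matter how many values the doc
has, while "max" grows with the number of values.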

For terms (strings), there is an analogous value source:

https://github.com/apache/lucene/blob/main/lucene/queries/src/java/org/apache/lucene/queries/function/valuesource/SortedSetFieldSource.java

And again, it has available selectors:
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/SortedSetSelector.java
I would still strongly recommend "min", to just read the first DV for each doc.
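
A rough sketch of wiring both kinds of source into a sort, assuming
lucene-core and lucene-queries are on the classpath. The class name
MinSelectorSortSketch and the field names "price" and "tags" are made
up for illustration, and the fields are assumed to be indexed as
sorted-numeric and sorted-set doc values respectively:

```java
import org.apache.lucene.queries.function.valuesource.MultiValuedLongFieldSource;
import org.apache.lucene.queries.function.valuesource.SortedSetFieldSource;
import org.apache.lucene.search.Sort;
import org.apache.lucene.search.SortField;
import org.apache.lucene.search.SortedNumericSelector;
import org.apache.lucene.search.SortedSetSelector;

/** Hypothetical helper: builds sort fields that use the "min" selector. */
public class MinSelectorSortSketch {

  /** Sort on the smallest long value of a multi-valued numeric field. */
  public static SortField numericMinSort(String field) {
    return new MultiValuedLongFieldSource(field, SortedNumericSelector.Type.MIN)
        .getSortField(false); // false = not reversed
  }

  /** Sort on the smallest term of a multi-valued string field. */
  public static SortField termsMinSort(String field) {
    return new SortedSetFieldSource(field, SortedSetSelector.Type.MIN)
        .getSortField(false);
  }

  public static void main(String[] args) {
    // "price" and "tags" are placeholder field names.
    Sort sort = new Sort(numericMinSort("price"), termsMinSort("tags"));
    System.out.println(sort);
  }
}
```

The resulting Sort can be passed to IndexSearcher.search like any other.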

On Tue, Oct 26, 2021 at 7:49 PM Greg Miller <[email protected]> wrote:
>
> Hi folks-
>
> Out of curiosity, is there a reason Lucene doesn't have
> implementations for concepts like DoubleValues / DoubleValuesSource
> that support multiple values per document? Or maybe something like
> this does exist in Lucene that I'm not aware of? I can't believe this
> hasn't been a topic of discussion at least once, but I couldn't turn
> up a past Jira issue.
>
> I ask because most of the faceting implementations in Lucene allow the
> user to provide their own xxValuesSource to use instead of assuming
> the data is in an indexed field, but there's an inherent limitation
> here forcing documents to have a single value. The faceting
> implementations have all been updated to operate correctly for
> multi-valued documents when referencing an indexed field, but there's
> a bit of a gap here if the user wants to supply their own source.
>
> Many thanks!
>
> Cheers,
> -Greg
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
