[jira] [Comment Edited] (SOLR-15880) Introduce Support to K Nearest Neighbors Search

Joel Bernstein (Jira) Tue, 18 Jan 2022 09:10:07 -0800


    [ https://issues.apache.org/jira/browse/SOLR-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477425#comment-17477425 ]


Joel Bernstein edited comment on SOLR-15880 at 1/18/22, 5:09 PM:
-----------------------------------------------------------------

The patch also looks quite good. I'm a +1 to getting this in for 9.0.

Here is how easy it is to do knn regression predictions:

{code:java}
stats(collection1, q="{!knn f=vector topK=10}[0.31254637,0.21707416,0.8859923,0.75535464]", avg(outcome_d))
{code}

In this example the vector field holds the predictor variables and the 
outcome_d field holds the outcome being predicted. The stats streaming 
expression finds the top 10 nearest neighbors of the query vector and averages 
their outcome_d values to make the prediction.
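
To make the arithmetic behind that prediction concrete, here is a minimal Python sketch of knn regression. It is not Solr code: it assumes an exact brute-force search with Euclidean distance over an in-memory list, whereas the Solr feature uses an approximate nearest neighbor index. The function name knn_regress and the sample rows are made up for illustration.

```python
import math

def knn_regress(query, rows, k=10):
    """Predict an outcome as the mean outcome of the k rows nearest to `query`.

    `rows` is a list of (vector, outcome) pairs. Distance is plain Euclidean,
    an assumption made for this sketch only.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    # Sort by distance to the query vector and keep the k closest rows.
    nearest = sorted(rows, key=lambda row: math.dist(query, row[0]))[:k]
    # The prediction is the average outcome of those neighbors.
    return sum(outcome for _, outcome in nearest) / len(nearest)

# Hypothetical data: two rows close to the query, one far away.
rows = [
    ([0.0, 0.0], 1.0),
    ([0.1, 0.0], 3.0),
    ([5.0, 5.0], 100.0),
]
prediction = knn_regress([0.0, 0.1], rows, k=2)
print(prediction)
```

With k=2 the distant row is ignored, so the prediction is the mean of the two nearby outcomes.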

Here is how easy it is to do knn classification predictions:

{code:java}
facet(collection1, q="{!knn f=vector topK=10}[0.31254637,0.21707416,0.8859923,0.75535464]", buckets="label_s", count(*))
{code}

In this example the vector field holds the predictor variables and the label_s 
field holds the class label. The facet streaming expression finds the top 10 
nearest neighbors to the predictor vector and counts them per label_s value; 
the label with the highest count is the classification prediction.
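
The counting step can be sketched in Python as a majority vote over the nearest neighbors. As before this is only an illustration under stated assumptions: exact brute-force Euclidean search rather than Solr's approximate index, with a made-up function name (knn_classify) and made-up sample rows.

```python
import math
from collections import Counter

def knn_classify(query, rows, k=10):
    """Return (predicted_label, counts) from the k rows nearest to `query`.

    `rows` is a list of (vector, label) pairs. Distance is plain Euclidean,
    an assumption made for this sketch only.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    # Keep the k rows closest to the query vector.
    nearest = sorted(rows, key=lambda row: math.dist(query, row[0]))[:k]
    # Count neighbors per label, like facet buckets with count(*).
    counts = Counter(label for _, label in nearest)
    # The most frequent label among the neighbors is the prediction.
    return counts.most_common(1)[0][0], dict(counts)

# Hypothetical data: two "a" rows and one "b" row near the query.
rows = [
    ([0.0, 0.0], "a"),
    ([0.1, 0.0], "a"),
    ([0.2, 0.2], "b"),
    ([5.0, 5.0], "b"),
]
label, counts = knn_classify([0.0, 0.1], rows, k=3)
print(label, counts)
```

The counts dictionary plays the role of the facet buckets; picking the largest bucket is left to the client in the streaming expression above.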



was (Author: joel.bernstein):
The patch also looks quite good. I'm a +1 to getting this in for 9.0.

Here is how easy it is to do knn regression predictions:

{code:java}
stats(collection1, q="{!knn f=vector topK=10}[0.31254637,0.21707416,0.8859923,0.75535464]", avg(outcome_d))
{code}

In this example the vector holds the predictor variables and the outcome_d 
field is the outcome being predicted. The stats streaming expression query 
finds the top 10 nearest neighbors and averages the outcome_d field to make the 
prediction.

Here is how easy it is to do knn classification predictions:

{code:java}
facet(collection1, q="{!knn f=vector topK=10}[0.31254637,0.21707416,0.8859923,0.75535464]", buckets="label_s", count(*))
{code}

In this example the vector is the predictor variables and the label_s field 
holds the class label. The facet streaming expression finds the top 10 nearest 
neighbors to the predictor vector and counts them to make the classification 
prediction. 


> Introduce Support to K Nearest Neighbors Search
> -----------------------------------------------
>
>                 Key: SOLR-15880
>                 URL: https://issues.apache.org/jira/browse/SOLR-15880
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 9.0
>            Reporter: Alessandro Benedetti
>            Assignee: Alessandro Benedetti
>            Priority: Major
>         Attachments: Screen Shot 2022-01-17 at 1.10.14 PM.png, Screen Shot 
> 2022-01-17 at 12.49.52 PM.png
>
>          Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> This contribution introduces in Apache Solr the ability to run vector-based 
> searches using native Apache Lucene data structures optimized for approximate 
> nearest neighbours.
> This is the first milestone in introducing Neural Search in Apache Solr.
> More will come later.
> It builds on top of https://issues.apache.org/jira/browse/LUCENE-9004 to 
> provide:
> - *Dense Vector Field* - currently single-valued; supports stored and indexed
> - *Knn Query Parser* - a simple query parser that runs a knn search for a 
> target vector on a given field
> The Dense Vector field only supports KNN searches (no range queries, no field 
> queries, no existence queries...)
> The Knn query parser can be used as a query, filter query and rerank query.



