[ https://issues.apache.org/jira/browse/SOLR-15880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477425#comment-17477425 ]
Joel Bernstein edited comment on SOLR-15880 at 1/18/22, 5:09 PM:
-----------------------------------------------------------------
The patch also looks quite good. I'm a +1 to getting this in for 9.0.
Here is how easy it is to do knn regression predictions:
{code:java}
stats(collection1, q="{!knn f=vector topK=10}[0.31254637,0.21707416,0.8859923,0.75535464]", avg(outcome_d))
{code}
In this example the vector holds the predictor variables and the outcome_d
field is the outcome being predicted. The stats streaming expression query
finds the top 10 nearest neighbors and averages the outcome_d field to make the
prediction.
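To make the arithmetic behind that expression concrete, here is a minimal Python sketch of kNN regression. It is not how Solr executes the query: it assumes an exact brute-force scan with Euclidean distance, whereas Solr uses Lucene's approximate nearest-neighbor search and whatever similarity function the field is configured with. The function name `knn_regress` and the in-memory `docs` list are hypothetical; only the field names `vector` and `outcome_d` come from the example above.

```python
import math


def knn_regress(query, docs, top_k=10):
    """Predict an outcome as the mean outcome_d of the top_k nearest docs.

    Assumption: exact Euclidean distance over an in-memory list of dicts,
    each with a "vector" (list of floats) and an "outcome_d" (float).
    """
    if not docs:
        raise ValueError("docs must not be empty")
    # Rank every document by its distance to the query vector.
    ranked = sorted(docs, key=lambda d: math.dist(query, d["vector"]))
    # Keep the top_k closest documents (fewer if the collection is small).
    neighbors = ranked[:top_k]
    # The prediction is the plain average of the neighbors' outcomes,
    # which is the role avg(outcome_d) plays in the stats expression.
    return sum(d["outcome_d"] for d in neighbors) / len(neighbors)
```

Calling `knn_regress(query_vector, docs, top_k=10)` mirrors the `topK=10` setting in the query above.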
Here is how easy it is to do knn classification predictions:
{code:java}
facet(collection1, q="{!knn f=vector topK=10}[0.31254637,0.21707416,0.8859923,0.75535464]", buckets="label_s", count(*))
{code}
In this example the vector holds the predictor variables and the label_s field
holds the class label. The facet streaming expression finds the top 10 nearest
neighbors to the predictor vector and counts them by label_s value; the label
with the highest count is the predicted class.
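The same idea can be sketched in Python for classification. As before, this is an illustration under stated assumptions rather than Solr's implementation: it uses an exact Euclidean scan over an in-memory list, and the function name `knn_classify` and the `docs` structure are hypothetical. Only the field names `vector` and `label_s` come from the example above.

```python
import math
from collections import Counter


def knn_classify(query, docs, top_k=10):
    """Return (label, count) pairs for the top_k nearest docs, highest count first.

    Assumption: exact Euclidean distance over an in-memory list of dicts,
    each with a "vector" (list of floats) and a "label_s" (string).
    """
    # Rank every document by its distance to the query vector.
    ranked = sorted(docs, key=lambda d: math.dist(query, d["vector"]))
    # Count how often each label occurs among the top_k neighbors,
    # which is the role of buckets="label_s" with count(*) in the facet expression.
    counts = Counter(d["label_s"] for d in ranked[:top_k])
    # most_common() orders the buckets by descending count.
    return counts.most_common()
```

The first pair in the returned list is the majority label among the neighbors, which would be taken as the predicted class.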
> Introduce Support to K Nearest Neighbors Search
> -----------------------------------------------
>
> Key: SOLR-15880
> URL: https://issues.apache.org/jira/browse/SOLR-15880
> Project: Solr
> Issue Type: New Feature
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 9.0
> Reporter: Alessandro Benedetti
> Assignee: Alessandro Benedetti
> Priority: Major
> Attachments: Screen Shot 2022-01-17 at 1.10.14 PM.png, Screen Shot
> 2022-01-17 at 12.49.52 PM.png
>
> Time Spent: 7h 40m
> Remaining Estimate: 0h
>
> This contribution introduces in Apache Solr the ability to run vector-based
> searches using native Apache Lucene data structures optimized for approximate
> nearest neighbours.
> This is the first milestone in introducing Neural Search in Apache Solr.
> More will come later.
> It builds on top of https://issues.apache.org/jira/browse/LUCENE-9004 to
> provide:
> - *Dense Vector Field* - At the moment single-valued, supporting stored and indexed
> - *Knn Query Parser* - a simple query parser to run knn search on a target vector
> and field
> The Dense Vector field only supports KNN searches (no range queries, no field
> queries, no existence queries...)
> The Knn query parser can be used as a query, filter query and rerank query.