Custom DoubleValuesSource to Read from Multiple Indexed DocValue Fields

2020-07-16 Thread Kevin Manuel
Hi,

I'm trying to write a custom DoubleValuesSource for use with a
FunctionScoreQuery instance.

To generate the final score of a document I need to:
1) Read from three indexed docValue fields and
2) Use the score of the wrapped query passed in to the FunctionScoreQuery
instance

For example, a document A would be scored using a formula like:
((docA's_score_from_wrapped_query * some_value_x)
 + (docA's_field1_value * some_value_y)
 + (docA's_field2_value * some_value_z)) * docA's_field3_value

How do I accomplish this?
Can this be done with a single custom DoubleValuesSource, or do I need a
separate one for each of the indexed docValue fields, combined using
MultiFloatFunction and SumFloatFunction?

Appreciate your time and help.

Thanks,
Kevin
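
The scoring formula in the message above can be read as plain arithmetic. The sketch below is not Lucene API: it is a self-contained Java method with hypothetical names (ScoreFormula, combine, and the parameter names) showing the value that would have to be produced per document, assuming the wrapped query's score and the three field values have already been read for that document. The sample numbers are made up for illustration.

```java
public class ScoreFormula {

    /**
     * Hypothetical helper (not part of Lucene) that applies the formula
     * from the message:
     * ((score * x) + (field1 * y) + (field2 * z)) * field3
     */
    public static double combine(double queryScore,
                                 double field1, double field2, double field3,
                                 double x, double y, double z) {
        double weightedSum = (queryScore * x) + (field1 * y) + (field2 * z);
        return weightedSum * field3;
    }

    public static void main(String[] args) {
        // Made-up inputs: score=2.0, field1=2.0, field2=4.0, field3=0.5,
        // weights x=1.0, y=0.5, z=0.25.
        double result = combine(2.0, 2.0, 4.0, 0.5, 1.0, 0.5, 0.25);
        System.out.println(result);
    }
}
```

In Lucene 8.x, DoubleValuesSource.getValues(LeafReaderContext, DoubleValues) receives the wrapped query's scores as its second argument when needsScores() returns true, and DoubleValuesSource.fromDoubleField(String) reads a numeric docValues field. A single custom source that holds three field sources and combines them as above may therefore be enough, but that is a pointer to check against the javadocs, not an answer given in this thread.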


Re: ANN search current state

2020-07-16 Thread Ali Akhtar
I’m a bit of a layman in this area, but if we are talking about formats for
vectors, I vote for the one used by FastAI word vectors. It’s pretty easy
to work with.

That is, if we are talking about the same or similar things; if not, just ignore me.

On Thu, 16 Jul 2020 at 7:06 PM, Michael Sokolov wrote:

> We have some prototype implementations in the issues you found.  If
> you want to try out the approaches in those issues, you could build
> Lucene from source and patch it, but there is no release containing
> KNN/vector support. We're still working to establish consensus on what
> the best way forward is. I think the most fruitful thing we can do at
> the moment is establish a format for storing and accessing vectors
> that will support different approaches since there is such a rich
> variety of algorithms and approaches in this area. The last issue you
> pointed to is focused on the format.
>
> On Wed, Jul 15, 2020 at 11:20 AM Alex K wrote:
> >
> > Hi Mikhail,
> >
> > I'm not sure about the state of ANN in Lucene proper. Very interested to
> > see the response from others.
> > I've been doing some work on ANN for an Elasticsearch plugin:
> > http://elastiknn.klibisz.com/
> > I think it's possible to extract my custom queries and modeling code so
> > that it's elasticsearch-agnostic and can be used directly in Lucene apps.
> > However I'm much more familiar with Elasticsearch's APIs and usage/testing
> > patterns than I am with raw Lucene, so I'd likely need to get some help
> > from the Lucene community.
> > Please LMK if that sounds interesting to anyone.
> >
> > - Alex
> >
> >
> >
> > On Wed, Jul 15, 2020 at 11:11 AM Mikhail wrote:
> >
> > >
> > > Hi,
> > >
> > > I want to incorporate semantic search in my project, which uses
> > > Lucene. I want to use sentence embeddings and ANN (approximate nearest
> > > neighbor) search. I found the related Lucene issues:
> > > https://issues.apache.org/jira/browse/LUCENE-9004 ,
> > > https://issues.apache.org/jira/browse/LUCENE-9136 ,
> > > https://issues.apache.org/jira/browse/LUCENE-9322 . I see that there
> > > is some related work and related PRs. What is the current state of
> > > this functionality?
> > >
> > > --
> > > Thanks,
> > > Mikhail
> > >
> > >
>
> -
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
>


Re: ANN search current state

2020-07-16 Thread Michael Sokolov
We have some prototype implementations in the issues you found.  If
you want to try out the approaches in those issues, you could build
Lucene from source and patch it, but there is no release containing
KNN/vector support. We're still working to establish consensus on what
the best way forward is. I think the most fruitful thing we can do at
the moment is establish a format for storing and accessing vectors
that will support different approaches since there is such a rich
variety of algorithms and approaches in this area. The last issue you
pointed to is focused on the format.

On Wed, Jul 15, 2020 at 11:20 AM Alex K wrote:
>
> Hi Mikhail,
>
> I'm not sure about the state of ANN in Lucene proper. Very interested to
> see the response from others.
> I've been doing some work on ANN for an Elasticsearch plugin:
> http://elastiknn.klibisz.com/
> I think it's possible to extract my custom queries and modeling code so
> that it's elasticsearch-agnostic and can be used directly in Lucene apps.
> However I'm much more familiar with Elasticsearch's APIs and usage/testing
> patterns than I am with raw Lucene, so I'd likely need to get some help
> from the Lucene community.
> Please LMK if that sounds interesting to anyone.
>
> - Alex
>
>
>
> On Wed, Jul 15, 2020 at 11:11 AM Mikhail wrote:
>
> >
> > Hi,
> >
> > I want to incorporate semantic search in my project, which uses
> > Lucene. I want to use sentence embeddings and ANN (approximate nearest
> > neighbor) search. I found the related Lucene issues:
> > https://issues.apache.org/jira/browse/LUCENE-9004 ,
> > https://issues.apache.org/jira/browse/LUCENE-9136 ,
> > https://issues.apache.org/jira/browse/LUCENE-9322 . I see that there
> > is some related work and related PRs. What is the current state of this
> > functionality?
> >
> > --
> > Thanks,
> > Mikhail
> >
> >
