Re: [MASSMAIL]Re: How to boost query based on result of subquery?

2016-02-20 Thread Jorge Luis Betancourt González
Hi Rajesh,

Have you taken a look at Query Re-Ranking? The idea is a little different from
what you want, but I think it should work: essentially you run your normal
search query and then re-rank the top-n documents using a second query. This
second query could use the position field to influence your ranking.
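
A minimal sketch of what such a re-ranking request might look like, built in Python. The rq, reRankQuery, reRankDocs and reRankWeight parameters belong to Solr's re-rank query parser; the function used as the re-rank query, a numeric "position" field, and the query_position parameter name are assumptions for illustration only. Note that the client still has to supply the position value in this form.

```python
from urllib.parse import urlencode, parse_qs


def build_rerank_params(text, query_position, top_n=100, weight=2.0):
    """Build /select parameters: a normal text query, then re-rank the top N hits."""
    return {
        # Main query: plain text search.
        "q": text,
        # Re-rank the top_n documents using the query referenced by $rqq.
        "rq": "{!rerank reRankQuery=$rqq reRankDocs=%d reRankWeight=%s}" % (top_n, weight),
        # Assumed re-rank query: a function query that scores documents higher
        # the closer their position is to query_position.
        "rqq": "{!func}recip(abs(sub(field(position),$query_position)),1,1,1)",
        # Assumed parameter name, dereferenced inside the function above.
        "query_position": str(query_position),
        "fl": "item_id,date,description",
    }


params = build_rerank_params("description:plate", 5)
query_string = urlencode(params)
print("/select?" + query_string)
```

The re-rank score is added to the original score after being multiplied by reRankWeight, so the weight would need tuning against real data.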

Regards,

- Original Message -
From: "Edward P" 
To: solr-user@lucene.apache.org
Sent: Saturday, February 20, 2016 17:56:52
Subject: [MASSMAIL]Re: How to boost query based on result of subquery?

Hi Rajesh,

Thanks for the input but this does not directly address my problem.
Currently I am not having an issue with loading the "position" values into
my documents. They are indexed as regular fields with the rest of the
document.
It is a good tip about ExternalFileField. I might use that later if my
position data becomes bigger or changes on a different schedule than my
other data. But this isn't currently a problem.

My problem, rather, is that I don't want the client to send a "position"
value. I want the client to send an item_id, and in a single Solr request,
I want (1) to find the position for the item_id, then (2) use the position
in another query's boost function. I need to find a way to chain these 2
queries together in a single Solr request.

thanks,
Ed


Re: How to boost query based on result of subquery?

2016-02-20 Thread Edward P
Hi Rajesh,

Thanks for the input but this does not directly address my problem.
Currently I am not having an issue with loading the "position" values into
my documents. They are indexed as regular fields with the rest of the
document.
It is a good tip about ExternalFileField. I might use that later if my
position data becomes bigger or changes on a different schedule than my
other data. But this isn't currently a problem.

My problem, rather, is that I don't want the client to send a "position"
value. I want the client to send an item_id, and in a single Solr request,
I want (1) to find the position for the item_id, then (2) use the position
in another query's boost function. I need to find a way to chain these 2
queries together in a single Solr request.

thanks,
Ed


On Fri, Feb 19, 2016 at 12:25 PM, Rajesh Hazari 
wrote:

> Hi Ed,
>
> Did you look into the ExternalFileField type (for example, a field named
> position_external_field in your schema)? It can be mapped to your
> field (for example position, hopefully one that does not change very
> often), and you can then use position_external_field in your boost function.
>
> This can be used if you can come up with unique field values for the
> position field, since this is an application-specific field.
> It could be changed to something like the following, if the values are finite:
> position_5=5
> position_25=25
> position_55=55
>
> for ex: =custom_function(field(query_position_external),
> field(position_external))
>
> For more info, refer to the wiki:
> https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes
>
> Pros:
> The value of this field can be refreshed with every newSearcher and
> firstSearcher event
> using
> <listener event="newSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>
> <listener event="firstSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>
>
> Cons: This file has to reside in the data folder of each replica, and
> updating it will have to be done by something like a bash script.
>
> *Please ignore if this may not work for you.*
>
> *Rajesh**.*


Re: How to boost query based on result of subquery?

2016-02-19 Thread Rajesh Hazari
Hi Ed,

Did you look into the ExternalFileField type (for example, a field named
position_external_field in your schema)? It can be mapped to your
field (for example position, hopefully one that does not change very
often), and you can then use position_external_field in your boost function.

This can be used if you can come up with unique field values for the
position field, since this is an application-specific field.
It could be changed to something like the following, if the values are finite:
position_5=5
position_25=25
position_55=55

for ex: =custom_function(field(query_position_external),
field(position_external))

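For more info, refer to the wiki:
https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes

Pros:
The value of this field can be refreshed with every newSearcher and
firstSearcher event
using
<listener event="newSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>
<listener event="firstSearcher" class="org.apache.solr.schema.ExternalFileFieldReloader"/>

Cons: This file has to reside in the data folder of each replica, and
updating it will have to be done by something like a bash script.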
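
As a rough sketch of the update step (in Python rather than bash), the external file could be regenerated along these lines. The external_<fieldname> file name and the key=value line format follow the ExternalFileField documentation; using item_id as the key field, and the helper name write_external_file, are assumptions for illustration.

```python
import os
import tempfile


def write_external_file(data_dir, field_name, values):
    """Write key=value lines to external_<field_name>, swapping the file in atomically."""
    target = os.path.join(data_dir, "external_" + field_name)
    # Write to a temporary file in the same directory first, so a reload
    # never sees a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=data_dir, prefix=".tmp_", suffix=".part")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        for key in sorted(values):
            handle.write("%s=%s\n" % (key, float(values[key])))
    os.replace(tmp_path, target)
    return target


# Demo against a throwaway directory standing in for a replica's data folder.
with tempfile.TemporaryDirectory() as demo_dir:
    path = write_external_file(
        demo_dir,
        "position_external_field",
        {"30d1e667": 5, "3cf18028": 23, "be1b2643": 21},
    )
    with open(path, encoding="utf-8") as handle:
        written = handle.read()

print(written)
```

The same script would have to run against every replica, and the new values only become visible once the file is reloaded (for example on the next searcher).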

*Please ignore if this may not work for you.*

*Rajesh**.*


How to boost query based on result of subquery?

2016-02-19 Thread Edward P
Hello,

I am using Solr 5.4.0, one collection, multiple shards with replication.
Sample documents:
{
"item_id": "30d1e667",
"date": "2014-01-01",
"position": "5",
"description": "automobile license plate holder"
}

{
"item_id": "3cf18028",
"date": "2013-01-01",
"position": "23",
"description": "dinner plate"
}

{
"item_id": "be1b2643",
"date": "2013-06-01",
"position": "21",
"description": "ceramic plate"
}


The client sends 2 queries like this:
(1) /select?q=item_id:30d1e667&fl=position
(2) /select?q=plate_position=5=custom_function($query_position,
$position)=item_id,date,description

The idea is, we have an application-specific data field "position" which we
use to compare 2 items. The client looks up a particular item by item_id,
gets the position data, then sends it back in the 2nd query to influence
the ranking of items when performing a text search for "plate". Our
custom_function is app-specific and may for example derive the boost from
the difference of query_position and document's position.
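
To make the two-step flow concrete, here is a small in-memory sketch in Python using the sample documents above. It does not talk to Solr; the body of custom_function (an inverse-distance boost) and the helper names lookup_position and search are hypothetical stand-ins, since the real function is app-specific.

```python
DOCS = [
    {"item_id": "30d1e667", "date": "2014-01-01", "position": "5",
     "description": "automobile license plate holder"},
    {"item_id": "3cf18028", "date": "2013-01-01", "position": "23",
     "description": "dinner plate"},
    {"item_id": "be1b2643", "date": "2013-06-01", "position": "21",
     "description": "ceramic plate"},
]


def lookup_position(docs, item_id):
    """Step 1: find the position of the reference item by item_id."""
    for doc in docs:
        if doc["item_id"] == item_id:
            return int(doc["position"])
    raise KeyError(item_id)


def custom_function(query_position, position):
    """Hypothetical boost: the closer the two positions, the higher the value."""
    return 1.0 / (1.0 + abs(query_position - position))


def search(docs, text, query_position):
    """Step 2: match a term in description, then rank matches by the boost."""
    hits = [doc for doc in docs if text in doc["description"].split()]
    return sorted(
        hits,
        key=lambda doc: custom_function(query_position, int(doc["position"])),
        reverse=True,
    )


query_position = lookup_position(DOCS, "30d1e667")
ranked = [doc["item_id"] for doc in search(DOCS, "plate", query_position)]
print(ranked)
```

With query item 30d1e667 (position 5), the item itself ranks first, followed by be1b2643 (position 21) and then 3cf18028 (position 23).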

My need is: I want to combine these into one query, so the client will only
have to send something like:

/select?query_item_id=30d1e667_text=plate={… use of Solr nested
queries, boost functions etc …}=item_id,date,description

I want this to be one request so that both queries are executed against the
same searcher (because index updates may change the position values) and so
the details of using the "position" field are abstracted from the client.

I have considered the query(subquery,default) function. This is close, but
not exactly what I need because it returns the subquery score, not document
values.

The join query parser is also close to what I need, but I can't see how to
use it to direct the results of a subquery into the boost function of
another.

So how can I, in a single Solr request, extract a value from the result
document of one subquery, and pass that value into a boost function for a
2nd query, all using the same underlying searcher? If it's not possible
with existing nested/sub-queries, then should I explore writing a custom
SearchComponent, QParser, or some other plugin?

thanks,
Ed