Re: How to combine third party search data as top results ?

2017-02-06 Thread shamik
Charlie, this looks very close to what I'm looking for. Just wondering if
you've made this available as a jar, or whether it can be built from
source? Our Solr distribution is not built from source, so I can only use
an external jar. I'd appreciate it if you could let me know.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/How-to-combine-third-party-search-data-as-top-results-tp4318116p4319101.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: How to combine third party search data as top results ?

2017-02-01 Thread Joel Bernstein
Also, this presentation discusses the RankQuery (starting on slide 16):
http://www.slideshare.net/lucidworks/managed-search-presented-by-jacob-graves-getty-images

Joel Bernstein
http://joelsolr.blogspot.com/



Re: How to combine third party search data as top results ?

2017-02-01 Thread Joel Bernstein
This type of ranking behavior is what the RankQuery is designed to do. A
RankQuery allows you to inject your own TopDocs collector into the query
and take full control of the ranking. It's more complex to implement
though. Here is an example RankQuery implementation:

https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/ReRankQParserPlugin.java

And the base class this extends:

https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/AbstractReRankQuery.java
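
For illustration only: writing a custom RankQuery means Java code inside Solr, but the stock re-rank parser linked above can be driven purely from request parameters. The sketch below builds such parameters in Python. The field name `url`, the helper name `rerank_params`, and the default values are assumptions made for the example. Note that re-ranking only re-scores the top `reRankDocs` hits of the main query, and this sketch does not by itself reproduce the third party's ordering.

```python
from typing import Dict, List


def quote_term(value: str) -> str:
    # Quote a value for a Solr phrase query, escaping backslashes and double quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def rerank_params(user_query: str, urls: List[str], url_field: str = "url",
                  rerank_docs: int = 200, rerank_weight: int = 1000) -> Dict[str, str]:
    # Build request parameters that re-rank the top hits of the main query
    # using a second query matching the externally supplied URLs.
    # url_field, rerank_docs and rerank_weight are hypothetical defaults.
    clause = " OR ".join(quote_term(u) for u in urls)
    return {
        "q": user_query,
        "rq": f"{{!rerank reRankQuery=$rqq reRankDocs={rerank_docs} "
              f"reRankWeight={rerank_weight}}}",
        "rqq": f"{url_field}:({clause})",
    }
```

The returned dictionary can be passed as query parameters to a select request by whatever HTTP client is in use.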

Joel Bernstein
http://joelsolr.blogspot.com/



Re: How to combine third party search data as top results ?

2017-02-01 Thread Doug Turnbull
I was going to say what Charlie said! I would trust Flax's work in this
area :)

-Doug



Re: How to combine third party search data as top results ?

2017-02-01 Thread shamik
Charlie, thanks for sharing the information. I'm going to take a look and get
back to you.





Re: How to combine third party search data as top results ?

2017-02-01 Thread Charlie Hull


Hi Shamik,

I'm not sure if this will help, but we built a plugin for 'XJoin', 
allowing you to use results from an external system with Solr. Here are 
two blog posts about it:

http://www.flax.co.uk/blog/2016/01/25/xjoin-solr-part-1-filtering-using-price-discount-data/
http://www.flax.co.uk/blog/2016/01/29/xjoin-solr-part-2-click-example/

Cheers

Charlie

--
Charlie Hull
Flax - Open Source Enterprise Search

tel/fax: +44 (0)8700 118334
mobile:  +44 (0)7767 825828
web: www.flax.co.uk


Re: How to combine third party search data as top results ?

2017-01-31 Thread shamik
Thanks, John.

The title is not unique, so I can't really rely on it. Also, keeping an
external mapping for url and id might not be feasible, as we are talking
about possibly millions of documents.

URLs are unique in our case. Unfortunately, they can't be used as part of
the Query Elevation Component, since it only accepts ids. As you've
mentioned, I can probably apply a huge boost factor to each of these urls
(through "bq") and see if they appear at the top in order.
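
For illustration, the "bq" idea could look roughly like the sketch below: one boost clause per URL, with the boost shrinking by rank so that the external order has a chance of being preserved. The field name `url`, the helper name `boost_params`, and the boost values are assumptions made for the example. Since "bq" boosts are added to the normal relevance score, this makes the desired order likely rather than guaranteed.

```python
from typing import Dict, List


def boost_params(user_query: str, urls: List[str], url_field: str = "url",
                 top_boost: int = 1_000_000, factor: int = 10) -> Dict[str, object]:
    # Build edismax parameters with one bq clause per URL.
    # The first URL gets top_boost, each following URL a boost smaller by `factor`.
    # url_field, top_boost and factor are hypothetical defaults.
    clauses = []
    for rank, url in enumerate(urls):
        escaped = url.replace("\\", "\\\\").replace('"', '\\"')
        boost = max(1, top_boost // (factor ** rank))
        clauses.append(f'{url_field}:"{escaped}"^{boost}')
    # A list value stands for a repeated request parameter (bq=...&bq=...).
    return {"defType": "edismax", "q": user_query, "bq": clauses}
```

With the standard library, `urllib.parse.urlencode(params, doseq=True)` turns the list into repeated `bq` parameters.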

I was hoping for an elegant solution to this :-)






Re: How to combine third party search data as top results ?

2017-01-31 Thread John Bickerstaff
Some random thoughts...  In case any of them are helpful...

If the URLs are unique you might be able to elevate them "as is" with a
boost or some other method...
If the titles are unique, you might be able to do the same (this might
require storing the EXACT title in another, non-tokenized field)
If you stored the URL --> ID mapping in another data store (or a separate
collection) it might shorten the query time (although it won't eliminate
the step)
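
For illustration, the separate URL --> ID store could be as simple as a keyed table. The sketch below uses an in-memory SQLite database as a stand-in for whatever store would really be used. The table name `url_to_id` and the helper names are assumptions made for the example.

```python
import sqlite3
from typing import Iterable, List, Tuple


def build_store(pairs: Iterable[Tuple[str, str]]) -> sqlite3.Connection:
    # Create an in-memory url -> id table (stand-in for a real data store).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE url_to_id (url TEXT PRIMARY KEY, id TEXT NOT NULL)")
    conn.executemany("INSERT INTO url_to_id (url, id) VALUES (?, ?)", pairs)
    conn.commit()
    return conn


def lookup_ids(conn: sqlite3.Connection, urls: List[str]) -> List[str]:
    # Return the ids for the given URLs in the same order as `urls`,
    # silently skipping URLs that are not in the store.
    if not urls:
        return []
    placeholders = ",".join("?" for _ in urls)
    rows = conn.execute(
        f"SELECT url, id FROM url_to_id WHERE url IN ({placeholders})", urls
    ).fetchall()
    by_url = dict(rows)
    return [by_url[u] for u in urls if u in by_url]
```

The lookup is a single keyed read for all five URLs, so the extra step remains but no longer goes through the search index.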



How to combine third party search data as top results ?

2017-01-31 Thread Shamik Bandopadhyay
Hi,

  I'm trying to integrate results from a third party source with our
existing search. The idea is to include the top 5 results from this source
as the top results of our search. Though the external data is indexed in our
system, the use case dictates that we use their ranking (by getting the top
five results). The problem is, their results return only text, title, and url.
To construct the final response, I need to include a bunch of metadata
fields which are only available in our index. Here are the steps:
1. Query external source, get top five results.
2. Query our index based on url from each result, retrieve their
corresponding id.
3. Query our index and pass the ids as elevateIds (dynamic query elevation).

This probably isn't a clean solution as it adds the overhead of an
additional query to retrieve document ids. Just wondering if there's a
better way to handle this situation, perhaps a way to combine step 2 and 3
in a single query or a different approach altogether?
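
For illustration, steps 2 and 3 above can be sketched as request parameters built in Python. This still issues two queries, but step 2 fetches all five ids in a single request. The field names `url` and `id`, the helper names, and the use of `enableElevation` / `forceElevation` alongside `elevateIds` are assumptions made for the example.

```python
from typing import Dict, List


def quote_term(value: str) -> str:
    # Quote a value for a Solr phrase query, escaping backslashes and double quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def id_lookup_params(urls: List[str], url_field: str = "url") -> Dict[str, str]:
    # Step 2: one query that fetches the ids for all external URLs at once.
    clause = " OR ".join(quote_term(u) for u in urls)
    return {
        "q": f"{url_field}:({clause})",
        "fl": f"id,{url_field}",
        "rows": str(len(urls)),
    }


def ids_in_external_order(urls: List[str], docs: List[Dict[str, str]],
                          url_field: str = "url") -> List[str]:
    # Reorder the looked-up ids to follow the third party's ranking;
    # URLs that were not found in the index are skipped.
    by_url = {doc[url_field]: doc["id"] for doc in docs}
    return [by_url[u] for u in urls if u in by_url]


def elevated_search_params(user_query: str, ids: List[str]) -> Dict[str, str]:
    # Step 3: the main search, with the ids passed as elevateIds.
    return {
        "q": user_query,
        "enableElevation": "true",
        "forceElevation": "true",
        "elevateIds": ",".join(ids),
    }
```

Here `docs` stands for the document list returned by the step 2 query; the reordering matters because that query returns matches in its own relevance order, not in the external ranking.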

Any pointers will be appreciated.

-Thanks,
Shamik