[ 
https://issues.apache.org/jira/browse/SOLR-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167034#comment-15167034
 ] 

Tom Winch commented on SOLR-8234:
---------------------------------

That would be another approach, I guess. You'd still have to write the
(custom) merge code, though. With the approach of this JIRA you get back Solr
results as usual, and it's a plugin that reuses the existing distributed
search mechanisms for requesting the top N unique ids from each server,
merge-ranking them, and so on.

> Federated Search (new) - DJoin
> ------------------------------
>
>                 Key: SOLR-8234
>                 URL: https://issues.apache.org/jira/browse/SOLR-8234
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Tom Winch
>            Priority: Minor
>              Labels: federated_search
>             Fix For: 4.10.3
>
>         Attachments: SOLR-8234.patch, SOLR-8234.patch, SOLR-8234.patch
>
>
> This issue describes a MergeStrategy implementation (DJoin) to facilitate
> federated search - that is, distributed search over documents stored in
> separate instances of Solr (for example, one server per continent), where a
> single document (identified by an agreed, common unique id) may be stored in
> more than one server instance, with (possibly) differing fields and data.
> When the MergeStrategy is used in a request handler (via the included 
> QParser) in combination with distributed search (shards=), documents having 
> an id that has already been seen are not discarded (as per the default 
> behaviour) but, instead, are collected and returned as a group of documents 
> all with the same id taking a single position in the result set (this is 
> implemented using parent/child documents, with an indicator field in the 
> parent - see example output, below).
> Documents are sorted in the result set based on the highest ranking document
> with the same id. It is possible for a document ranking high in one shard to
> rank very low on another shard. As a consequence, all shards must be asked
> to return the fields for every document id in the result set (not just for
> the documents they returned themselves), so that all the component parts of
> each document in the search result set are returned.
> As usual, search parameters are passed on to each shard. So that the shards
> do not need any additional configuration in their definition of the /select
> request handler, we use the FilterQParserSearchComponent, which is configured
> to filter out the {!djoin} search parameter; otherwise, the target request
> handler complains about the missing query parser definition. See the example
> config, below.
> This issue combines with others to provide full federated search support. See 
> also SOLR-8235 and SOLR-8236.
> Note that this is part of a new implementation of federated search as opposed 
> to the older issues SOLR-3799 through SOLR-3805.
> --
> Example request handler configuration:
> {code:xml}
>   <searchComponent name="filter"
>       class="org.apache.solr.search.federated.FilterDJoinQParserSearchComponent" />
>
>   <queryParser name="djoin"
>       class="org.apache.solr.search.federated.DJoinQParserPlugin" />
>   <requestHandler name="djoin" class="solr.SearchHandler">
>     <lst name="defaults">
>       <str name="shards">http://shard1/solr/core,http://shard2/solr/core,http://shard3/solr/core</str>
>       <bool name="shards.tolerant">true</bool>
>       <str name="rq">{!djoin}</str>
>     </lst>
>     <arr name="last-components">
>       <str>filter</str>
>     </arr>
>   </requestHandler> 
> {code}
> Example output:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>   <lst name="responseHeader">
>     <int name="status">0</int>
>     <int name="QTime">33</int>
>     <lst name="params">
>       <str name="q">*:*</str>
>       <str name="shards">http://shard1/solr/core,http://shard2/solr/core,http://shard3/solr/core</str>
>       <str name="shards.tolerant">true</str>
>       <str name="wt">xml</str>
>       <str name="rq">{!djoin}</str>
>       <str name="fl">*,[shard]</str>
>     </lst>
>   </lst>
>   <result name="response" numFound="5" start="0" maxScore="1.0">
>     <doc>
>       <bool name="__merge_parent__">true</bool>
>       <doc>
>         <int name="id">200</int>
>         <int name="value">1973</int>
>         <str name="[shard]">http://shard2/solr/core</str>
>         <long name="_version_">1515645309629235200</long>
>       </doc>
>       <doc>
>         <int name="id">200</int>
>         <int name="value">2015</int>
>         <str name="[shard]">http://shard1/solr/core</str>
>         <long name="_version_">1515645309682712576</long>
>       </doc>
>     </doc>
>     <doc>
>       <bool name="__merge_parent__">true</bool>
>       <doc>
>         <int name="id">100</int>
>         <int name="value">873</int>
>         <str name="[shard]">http://shard1/solr/core</str>
>         <long name="_version_">1515645309629254124</long>
>       </doc>
>       <doc>
>         <int name="id">100</int>
>         <int name="value">2001</int>
>         <str name="[shard]">http://shard3/solr/core</str>
>         <long name="_version_">1515645309682792852</long>
>       </doc>
>     </doc>
>     <doc>
>       <bool name="__merge_parent__">true</bool>
>       <doc>
>         <int name="id">300</int>
>         <int name="value">1492</int>
>         <str name="[shard]">http://shard2/solr/core</str>
>         <long name="_version_">1515645309629251252</long>
>       </doc>
>     </doc>
>   </result>
> </response>
> {code}
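
The grouping and ranking behaviour described in the issue above (documents
with the same id are collected into one group, and groups are ordered by
their highest ranking member) can be sketched roughly as follows. This is
only an illustration and not the code from the attached patch: the names
DJoinMergeSketch, ShardDoc, MergedGroup and merge are hypothetical, the
sketch does not use Solr's MergeStrategy API, and it assumes that scores
from different shards are directly comparable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DJoinMergeSketch {

    /** One hit as returned by a single shard (hypothetical type, not a Solr class). */
    record ShardDoc(String id, String shard, float score) {}

    /** All hits sharing one id; the group is ranked by the best score among them. */
    record MergedGroup(String id, float bestScore, List<ShardDoc> parts) {}

    /**
     * Groups hits by id instead of discarding repeated ids, then orders the
     * groups by their highest scoring member and keeps the top rows groups.
     */
    static List<MergedGroup> merge(List<ShardDoc> hits, int rows) {
        // Collect every hit under its id, preserving first-seen order.
        Map<String, List<ShardDoc>> byId = new LinkedHashMap<>();
        for (ShardDoc hit : hits) {
            byId.computeIfAbsent(hit.id(), k -> new ArrayList<>()).add(hit);
        }

        // Each group takes a single position, ranked by its best member.
        List<MergedGroup> groups = new ArrayList<>();
        for (Map.Entry<String, List<ShardDoc>> entry : byId.entrySet()) {
            float best = Float.NEGATIVE_INFINITY;
            for (ShardDoc doc : entry.getValue()) {
                best = Math.max(best, doc.score());
            }
            groups.add(new MergedGroup(entry.getKey(), best, List.copyOf(entry.getValue())));
        }

        // Highest score first; the sort is stable, so ties keep first-seen order.
        groups.sort(Comparator.comparingDouble((MergedGroup g) -> g.bestScore()).reversed());
        return new ArrayList<>(groups.subList(0, Math.min(rows, groups.size())));
    }

    public static void main(String[] args) {
        // Made-up hits, loosely modelled on the ids in the example output.
        List<ShardDoc> hits = List.of(
                new ShardDoc("100", "http://shard1/solr/core", 0.4f),
                new ShardDoc("200", "http://shard2/solr/core", 0.9f),
                new ShardDoc("200", "http://shard1/solr/core", 0.1f),
                new ShardDoc("300", "http://shard2/solr/core", 0.2f));
        for (MergedGroup group : merge(hits, 10)) {
            System.out.println(group.id() + " -> " + group.parts().size() + " part(s)");
        }
    }
}
```

In the sketch every shard's hits are already available in one list. In the
approach described in the issue, the parts of a group that a shard did not
itself return would still have to be requested from that shard by id, which
the sketch leaves out.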



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
