[jira] [Commented] (SOLR-8234) Federated Search (new) - DJoin

2016-02-25 Thread Tom Winch (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15167034#comment-15167034
 ] 

Tom Winch commented on SOLR-8234:
-

That would be another approach, I guess. You'd still have to write the
(custom) merge code, though. With the approach in this JIRA you get back Solr
results as usual: it's a plugin that uses the existing distributed search
mechanisms to request the top N unique ids from each server, merge-rank
them, and so on.
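
To make the merge-ranking step concrete, here is a minimal Python sketch of the idea (not the plugin's actual Java code). It assumes each shard returns (id, score) pairs for its top N. The scores, the function name `djoin_merge`, and the `merge_parent` key are made up for illustration; only the shard URLs and ids come from the example output in the issue.

```python
from collections import defaultdict


def djoin_merge(shard_results, rows):
    """Group hits that share an id and rank each group by its best hit.

    shard_results maps a shard URL to a list of (doc_id, score) pairs.
    The shapes and names here are illustrative assumptions.
    """
    groups = defaultdict(list)
    for shard, hits in shard_results.items():
        for doc_id, score in hits:
            # Duplicates are kept and collected, not discarded.
            groups[doc_id].append({"shard": shard, "score": score})

    # Order the members of each group, best score first.
    for children in groups.values():
        children.sort(key=lambda hit: hit["score"], reverse=True)

    # A group's position is decided by its highest-scoring member.
    ranked = sorted(
        groups.items(),
        key=lambda item: item[1][0]["score"],
        reverse=True,
    )

    # Each id takes a single position; per-shard hits become children.
    return [
        {"id": doc_id, "merge_parent": True, "children": children}
        for doc_id, children in ranked[:rows]
    ]


# Hypothetical scores, chosen only to exercise the grouping.
shard_results = {
    "http://shard1/solr/core": [("200", 0.4), ("100", 0.9)],
    "http://shard2/solr/core": [("200", 1.2), ("300", 0.1)],
    "http://shard3/solr/core": [("100", 0.3)],
}
merged = djoin_merge(shard_results, rows=10)
```

With these made-up scores, id 200 ranks first because of its shard2 hit, even though the same id scores low on shard1.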

> Federated Search (new) - DJoin
> --
>
>         Key: SOLR-8234
>         URL: https://issues.apache.org/jira/browse/SOLR-8234
>     Project: Solr
>  Issue Type: New Feature
>    Reporter: Tom Winch
>    Priority: Minor
>      Labels: federated_search
>     Fix For: 4.10.3
>
> Attachments: SOLR-8234.patch, SOLR-8234.patch, SOLR-8234.patch
>
>
> This issue describes a MergeStrategy implementation (DJoin) to facilitate
> federated search - that is, distributed search over documents stored in
> separate instances of Solr (for example, one server per continent), where a
> single document (identified by an agreed, common unique id) may be stored in
> more than one server instance, with (possibly) differing fields and data.
>
> When the MergeStrategy is used in a request handler (via the included
> QParser) in combination with distributed search (shards=), documents whose
> id has already been seen are not discarded (as they are by default).
> Instead, they are collected and returned as a group of documents sharing
> that id and taking a single position in the result set. This is
> implemented using parent/child documents, with an indicator field in the
> parent - see example output, below.
>
> Each group is positioned in the result set according to its highest-ranking
> document. A document that ranks high in one shard may rank very low in
> another. Consequently, every shard must be asked to return the fields for
> every document id in the result set (not just for the documents that shard
> returned), so that all the component parts of each document in the result
> set are returned.
>
> As usual, search parameters are passed on to each shard. So that the shards
> do not need any additional configuration in their definition of the /select
> request handler, we use the FilterQParserSearchComponent, which is configured
> to filter out the {!djoin} search parameter - otherwise, the target request
> handler complains about the missing query parser definition. See the example
> config, below.
>
> This issue combines with others to provide full federated search support. See
> also SOLR-8235 and SOLR-8236.
>
> Note that this is part of a new implementation of federated search, as opposed
> to the older issues SOLR-3799 through SOLR-3805.
> --
> Example request handler configuration:
> {code:xml}
>class="org.apache.solr.search.federated.FilterDJoinQParserSearchComponent" />
>   
>class="org.apache.solr.search.federated.DJoinQParserPlugin" />
>   
> 
>name="shards">http://shard1/solr/core,http://shard2/solr/core,http://shard3/solr/core
>   true
>   {!djoin}
> 
> 
>   filter
> 
>
> {code}
> Example output:
> {code:xml}
> 
> 
>   
> 0
> 33
> 
>   *:*
>name="shards">http://shard1/solr/core,http://shard2/solr/core,http://shard3/solr/core
>   true
>   xml
>   {!djoin}
>   *,[shard]
> 
>   
>   
> 
>   true
>   
> 200
> 1973
> http://shard2/solr/core
> 1515645309629235200
>   
>   
> 200
> 2015
> http://shard1/solr/core
> 1515645309682712576
>   
> 
> 
>   true
>   
> 100
> 873
> http://shard1/solr/core
> 1515645309629254124
>   
>   
> 100
> 2001
> http://shard3/solr/core
> 1515645309682792852
>   
> 
> 
>   true
>   
> 300
> 1492
> http://shard2/solr/core
> 1515645309629251252
>   
> 
>   
> 
> {code}
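
The issue text above makes two points about what is sent to the shards: the {!djoin} parameter is filtered out before forwarding, and every shard is asked for the fields of every id in the merged result. The following Python sketch illustrates both under stated assumptions. The parameter names `rq` and `ids` and both function names are assumptions for illustration, not taken from the patch.

```python
def strip_djoin(params):
    """Return a copy of params without any value that selects the djoin parser."""
    return {
        name: value
        for name, value in params.items()
        if not str(value).startswith("{!djoin")
    }


def field_fetch_requests(shards, merged_ids, params):
    """Build one field-fetch request per shard.

    Every shard is asked for every id in the merged result set, not only
    the ids it returned itself, so that all parts of a document are found.
    """
    forwarded = strip_djoin(params)
    ids = ",".join(merged_ids)  # 'ids' as the parameter name is an assumption
    return {shard: {**forwarded, "ids": ids} for shard in shards}


shards = [
    "http://shard1/solr/core",
    "http://shard2/solr/core",
    "http://shard3/solr/core",
]
# 'rq' as the carrier of {!djoin} is an assumption for illustration.
params = {"q": "*:*", "rq": "{!djoin}", "fl": "*,[shard]"}
requests = field_fetch_requests(shards, ["200", "100", "300"], params)
```

In this sketch shard3 is asked for ids 200 and 300 even though it only matched id 100. That is the extra cost of returning every component part of each document.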



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8234) Federated Search (new) - DJoin

2015-11-20 Thread Dennis Gove (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15019153#comment-15019153
 ] 

Dennis Gove commented on SOLR-8234:
---

Could the use-case here be solved using Streaming Aggregation (SOLR-7082) and 
in particular the joins added in SOLR-7584? With SA you can perform federated 
searches across multiple collections (even in other clouds) with joins, merges, 
uniqueness, ranking (top N), faceting, aggregations (group by). And you can do 
all of these in parallel across multiple workers like one would do in a 
map-reduce approach.


> Federated Search (new) - DJoin
> --
>
>         Key: SOLR-8234
>         URL: https://issues.apache.org/jira/browse/SOLR-8234
>     Project: Solr
>  Issue Type: New Feature
>    Reporter: Tom Winch
>    Priority: Minor
>      Labels: federated_search
>     Fix For: 4.10.3
>
> Attachments: SOLR-8234.patch, SOLR-8234.patch
>
>
> --
> Example request handler configuration:
> {code:xml}
>class="org.apache.solr.search.djoin.FilterDJoinQParserSearchComponent" />
>   
>class="org.apache.solr.search.djoin.DJoinQParserPlugin" />
>   
>  class="org.apache.solr.search.djoin.LocalShardHandlerFactory" />
> 
>name="shards">http://shard1/solr/core,http://shard2/solr/core,http://shard3/solr/core
>   true
>   {!djoin}
> 
> 
>   filter
> 
>
> {code}


