[jira] [Commented] (SOLR-2592) Pluggable shard lookup mechanism for SolrCloud

Michael Garski (JIRA) Sat, 23 Jun 2012 10:34:44 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399999#comment-13399999
 ]


Michael Garski commented on SOLR-2592:
--------------------------------------

One of the use cases I have is identical to yours, Andy: shard membership 
is used to ensure the accuracy of numGroups in the response for a distributed 
grouping query.

The challenge in that use case is that during an update a document could 
potentially move from one shard to another, requiring deleting it from its 
current shard along with adding it to the shard where it will now reside. If 
the previous value of the shardKey is not known, the same delete by query 
operation you have in 'Now' would have to be broadcast to all shards to ensure 
there are no duplicate unique ids in the collection. It looks like that would 
result in the same overhead as using the composite id. Do you have any ideas on 
how to handle that during an update?
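
To make the update problem concrete, here is a minimal sketch of the routing 
decision described above. It is not taken from Solr or from any of the attached 
patches: the class ShardKeyRouter, its methods, and the hashing scheme 
(String.hashCode() modulo the shard count) are all assumptions made purely for 
illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative and do not exist in Solr.
final class ShardKeyRouter {
    private final int numShards;

    ShardKeyRouter(int numShards) {
        if (numShards <= 0) {
            throw new IllegalArgumentException("numShards must be positive");
        }
        this.numShards = numShards;
    }

    // Assumed hashing scheme: String.hashCode() reduced to [0, numShards).
    int shardFor(String shardKey) {
        return Math.floorMod(shardKey.hashCode(), numShards);
    }

    // Returns the shards that must receive a delete for the document's
    // unique id before the new version is added to its target shard.
    // A null previousShardKey models "previous value not known".
    List<Integer> shardsToDeleteFrom(String previousShardKey, String newShardKey) {
        List<Integer> targets = new ArrayList<>();
        if (previousShardKey == null) {
            // Old location unknown: broadcast the delete to all shards.
            for (int shard = 0; shard < numShards; shard++) {
                targets.add(shard);
            }
            return targets;
        }
        int oldShard = shardFor(previousShardKey);
        if (oldShard != shardFor(newShardKey)) {
            // The document moves: delete it only from the shard it leaves.
            targets.add(oldShard);
        }
        return targets;
    }
}
```

When the previous shardKey is available, at most one shard receives a delete; 
when it is not, every shard does, which is the broadcast overhead in question.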

Adding a separate shardKey definition to the schema would also cascade the 
change to the real-time get handler, which currently only uses the unique 
document ids as an input.

Regarding date-based sharding, I see that as being handled differently. 
With hashing, a document is assigned to a specific shard from a set of 
known shards, whereas with date-based sharding I would imagine one would want 
to bring up a new shard for a specific time period, perhaps daily or hourly. I 
can imagine that in some cases it might also be desirable to merge shards with 
older date ranges together, if the use case favors recent updates at search 
time.
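
As a rough illustration of the difference, a date-based lookup could derive the 
shard from the document's timestamp instead of hashing into a fixed set. The 
sketch below assumes daily shards in UTC; the class name DailyShardNamer and 
the "shard_" naming convention are hypothetical, not part of Solr.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical sketch: maps a timestamp to the name of a daily shard.
final class DailyShardNamer {
    private DailyShardNamer() {
    }

    // Assumed convention: one shard per UTC calendar day, named
    // "shard_" followed by the ISO-8601 date.
    static String shardFor(Instant timestamp) {
        LocalDate day = timestamp.atZone(ZoneOffset.UTC).toLocalDate();
        return "shard_" + day;
    }
}
```

Unlike the hashing case, the set of shard names here is open-ended: each new 
day yields a name that did not exist before, so a shard would have to be 
brought up for it.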
                
> Pluggable shard lookup mechanism for SolrCloud
> ----------------------------------------------
>
>                 Key: SOLR-2592
>                 URL: https://issues.apache.org/jira/browse/SOLR-2592
>             Project: Solr
>          Issue Type: New Feature
>          Components: SolrCloud
>    Affects Versions: 4.0
>            Reporter: Noble Paul
>         Attachments: SOLR-2592.patch, dbq_fix.patch, 
> pluggable_sharding.patch, pluggable_sharding_V2.patch
>
>
> If the data in a cloud can be partitioned on some criteria (say range, hash, 
> attribute value, etc.), it will be easy to narrow the search down to a smaller 
> subset of shards and, in effect, achieve a more efficient search.


        
