[
https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399999#comment-13399999
]
Michael Garski commented on SOLR-2592:
--------------------------------------
One of my use cases is identical to yours, Andy: shard membership is used to
ensure the accuracy of numGroups in the response to a distributed grouping
query.
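
As a rough illustration of why shard membership matters for numGroups, here is
a toy Python model, not Solr code. It assumes the distributed count is
obtained by summing per-shard group counts, and the names shard_for and
summed_group_count are made up for this sketch. The sum matches the true
number of groups only when every document of a group lands on the same shard.

```python
import zlib
from collections import defaultdict


def shard_for(key, num_shards):
    """Hypothetical router: hash a string key onto one of num_shards shards."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def summed_group_count(docs, num_shards, route_field):
    """Route docs by route_field, then add up the distinct groups per shard."""
    groups_per_shard = defaultdict(set)
    for doc in docs:
        shard = shard_for(doc[route_field], num_shards)
        groups_per_shard[shard].add(doc["group"])
    return sum(len(groups) for groups in groups_per_shard.values())


# Twelve documents spread over three groups.
docs = [{"id": f"doc{i}", "group": f"g{i % 3}"} for i in range(12)]
true_groups = len({doc["group"] for doc in docs})

# Routing on the group value keeps each group on one shard, so the sum is exact.
by_group = summed_group_count(docs, 4, "group")
# Routing on the unique id can split a group across shards and over-count it.
by_id = summed_group_count(docs, 4, "id")
```

Here by_group always equals true_groups, while by_id can only be equal to or
larger than it.
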
The challenge in that use case is that an update could move a document from
one shard to another, which requires deleting it from its current shard as
well as adding it to the shard where it will now reside. If the previous value
of the shardKey is not known, the same delete-by-query operation you have in
'Now' would have to be broadcast to all shards to ensure there are no
duplicate unique ids in the collection. It looks like that would result in the
same overhead as using the composite id. Do you have any ideas on how to
handle that during an update?
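
To make the overhead concrete, here is a small Python sketch of the two cases.
It is a toy in-memory model under my own assumptions, not the SolrCloud update
path: ToyCollection, previous_shard_key and delete_requests are hypothetical
names, and routing is a plain CRC32 hash of the shardKey.

```python
import zlib


class ToyCollection:
    """In-memory stand-in for a sharded collection (illustrative only)."""

    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]
        self.delete_requests = 0  # counts deletes sent to individual shards

    def _shard_for(self, shard_key):
        return zlib.crc32(shard_key.encode("utf-8")) % len(self.shards)

    def _delete_from(self, shard_index, doc_id):
        self.delete_requests += 1
        self.shards[shard_index].pop(doc_id, None)

    def update(self, doc_id, shard_key, fields, previous_shard_key=None):
        target = self._shard_for(shard_key)
        if previous_shard_key is not None:
            # Old key known: at most one targeted delete is needed.
            old = self._shard_for(previous_shard_key)
            if old != target:
                self._delete_from(old, doc_id)
        else:
            # Old key unknown: every other shard must be asked to delete the id.
            for index in range(len(self.shards)):
                if index != target:
                    self._delete_from(index, doc_id)
        # Adding to the target shard overwrites any copy already stored there.
        self.shards[target][doc_id] = dict(fields, shardKey=shard_key)

    def copies_of(self, doc_id):
        return sum(1 for shard in self.shards if doc_id in shard)
```

With N shards, an update that does not carry the previous shardKey costs N - 1
delete requests in this model, while one that does costs at most one.
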
Adding a separate shardKey definition to the schema would also cascade the
change to the real-time get handler, which currently only uses the unique
document ids as an input.
Regarding date-based sharding, I see that as being handled differently. With
hashing, a document is assigned to a specific shard from a set of known
shards, whereas with date-based sharding I would imagine one would want to
bring up a new shard for each time period, perhaps daily or hourly. I can also
imagine that in some cases it might be desirable to merge shards with older
date ranges together, if the use case favors recent updates at search time.
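
A minimal Python sketch of that idea, assuming daily shards named by ISO date.
DateShardedIndex, merge_older_than and the "archive" shard name are invented
for illustration and do not correspond to anything in the attached patches.

```python
from datetime import datetime


class DateShardedIndex:
    """Toy model: one shard per day, created the first time it is needed."""

    def __init__(self):
        self.shards = {}  # shard name -> list of document ids

    @staticmethod
    def shard_name(timestamp):
        # ISO dates sort lexicographically in chronological order.
        return timestamp.strftime("%Y-%m-%d")

    def add(self, doc_id, timestamp):
        name = self.shard_name(timestamp)
        # Unlike hashing over a fixed set, a new period brings up a new shard.
        self.shards.setdefault(name, []).append(doc_id)
        return name

    def merge_older_than(self, cutoff, merged_name="archive"):
        """Fold every daily shard dated before cutoff into one merged shard."""
        cutoff_name = self.shard_name(cutoff)
        old = sorted(
            name for name in self.shards
            if name != merged_name and name < cutoff_name
        )
        for name in old:
            self.shards.setdefault(merged_name, []).extend(self.shards.pop(name))
        return old
```

Recent days stay in their own small shards, and only the merged shard has to
be consulted for older data.
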
> Pluggable shard lookup mechanism for SolrCloud
> ----------------------------------------------
>
> Key: SOLR-2592
> URL: https://issues.apache.org/jira/browse/SOLR-2592
> Project: Solr
> Issue Type: New Feature
> Components: SolrCloud
> Affects Versions: 4.0
> Reporter: Noble Paul
> Attachments: SOLR-2592.patch, dbq_fix.patch,
> pluggable_sharding.patch, pluggable_sharding_V2.patch
>
>
> If the data in a cloud can be partitioned on some criterion (say range, hash,
> attribute value, etc.), it will be easy to narrow the search down to a smaller
> subset of shards and, in effect, achieve a more efficient search.