[
https://issues.apache.org/jira/browse/SOLR-8586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137248#comment-15137248
]
Yonik Seeley commented on SOLR-8586:
------------------------------------
bq. The first thing I see is to compare hashes between the shards and if there
is a difference use the ComplementStream to determine which id's are missing.

Implementing eventual consistency with this is problematic in a general sense:
if one shard has an ID and another doesn't, you don't know which state is
correct (the add may have been missed on one side, or a delete missed on the
other).

The other general issue is that an arbitrary document cannot necessarily be
retrieved from the index (that requires all source fields to be stored).

It may still be useful for add-only systems that do store all source fields,
but in that case we could make things much more efficient by using hash trees
to drastically narrow the set of ids that need to be communicated.
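
To make the hash-tree idea concrete, here is a minimal sketch in Python. It is
an illustration only, not code from the attached patch or from Solr: the
helper names (build_tree, differing_buckets, bucket_of), the use of BLAKE2b,
and the fixed tree depth are all assumptions. Each shard buckets its
(id, version) pairs by a hash of the id, hashes each bucket
order-independently, and builds a binary tree over the buckets. Two shards
then compare trees top-down and only exchange ids for the leaf buckets that
differ.

```python
import hashlib

MASK64 = (1 << 64) - 1


def _h64(data: bytes) -> int:
    """Stable 64-bit hash (BLAKE2b truncated to 8 bytes)."""
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def bucket_of(doc_id: str, depth: int) -> int:
    """Hypothetical bucketing rule: top `depth` bits of the id hash."""
    return _h64(doc_id.encode()) >> (64 - depth)


def build_tree(docs: dict, depth: int) -> list:
    """Build a hash tree over {id: version} in 1-based heap layout.

    tree[1] is the root; the leaves occupy tree[2**depth : 2**(depth + 1)].
    """
    leaves = 1 << depth
    tree = [0] * (2 * leaves)
    # Leaf = order-independent sum of per-document hashes in that bucket.
    for doc_id, version in docs.items():
        i = leaves + bucket_of(doc_id, depth)
        tree[i] = (tree[i] + _h64(f"{doc_id}:{version}".encode())) & MASK64
    # Internal node = hash of its two children.
    for i in range(leaves - 1, 0, -1):
        pair = tree[2 * i].to_bytes(8, "big") + tree[2 * i + 1].to_bytes(8, "big")
        tree[i] = _h64(pair)
    return tree


def differing_buckets(a: list, b: list) -> list:
    """Walk both trees from the root, descending only into unequal subtrees."""
    leaves = len(a) // 2
    out, stack = [], [1]
    while stack:
        i = stack.pop()
        if a[i] == b[i]:
            continue  # identical subtree: nothing below needs to be compared
        if i >= leaves:
            out.append(i - leaves)
        else:
            stack.extend((2 * i, 2 * i + 1))
    return sorted(out)
```

With a depth of 6 (64 buckets), two shards that differ in a single document
would only need to exchange the ids that fall into one bucket rather than the
full id list. A real implementation would have to choose the depth from the
index size and decide how the tree nodes are requested over the wire.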
> Implement hash over all documents to check for shard synchronization
> --------------------------------------------------------------------
>
> Key: SOLR-8586
> URL: https://issues.apache.org/jira/browse/SOLR-8586
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Reporter: Yonik Seeley
> Fix For: 5.5, Trunk
>
> Attachments: SOLR-8586.patch, SOLR-8586.patch, SOLR-8586.patch,
> SOLR-8586.patch
>
>
> An order-independent hash across all of the versions in the index should
> suffice. The hash itself is pretty easy, but we need to figure out
> when/where to do this check (for example, I think PeerSync is currently used
> in multiple contexts and this check would perhaps not be appropriate for all
> PeerSync calls?)
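
One way to get an order-independent hash over versions is to mix each version
through a bijective 64-bit function and add the results modulo 2^64; addition
is commutative, so the order in which documents are visited does not matter.
The sketch below is in Python and uses the MurmurHash3 64-bit finalizer as the
mixer. The function names are hypothetical and this is not presented as the
code in the attached patch.

```python
MASK64 = (1 << 64) - 1


def fmix64(k: int) -> int:
    """MurmurHash3 64-bit finalizer: a bijective bit mixer on 64-bit values."""
    k &= MASK64  # also maps negative (signed) versions into 0..2**64-1
    k ^= k >> 33
    k = (k * 0xFF51AFD7ED558CCD) & MASK64
    k ^= k >> 33
    k = (k * 0xC4CEB9FE1A85EC53) & MASK64
    k ^= k >> 33
    return k


def versions_hash(versions) -> int:
    """Sum of mixed versions modulo 2**64; independent of iteration order."""
    total = 0
    for v in versions:
        total = (total + fmix64(v)) & MASK64
    return total
```

Two replicas that hold the same set of versions produce the same value no
matter how their segments are ordered, so comparing a single 64-bit number
(ideally together with a document count) is enough to detect most divergence.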