[jira] [Comment Edited] (SOLR-4260) Inconsistent numDocs between leader and replica

Hari Sekhon (JIRA) Wed, 15 Apr 2015 04:17:36 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496055#comment-14496055
 ]


Hari Sekhon edited comment on SOLR-4260 at 4/15/15 11:16 AM:
-------------------------------------------------------------

I've seen much larger discrepancies between leader and followers on newer 
versions of Solr than those reported in this ticket: numDocs differing by tens 
to hundreds of thousands during bulk online indexing jobs (hundreds of millions 
of docs) from Hive. I'm not sure whether it's related, but it seemed likely to 
be marked as a duplicate if I raised it separately. I observed this on Solr 
4.7.2 and Solr 4.10.3.


was (Author: harisekhon):
I've seen discrepancies between leader and followers of much higher numbers on 
newer versions of Solr than in this ticket - tens to hundreds of thousands of 
numDocs difference when doing bulk online indexing jobs (hundreds of millions 
of docs) from Hive.

> Inconsistent numDocs between leader and replica
> -----------------------------------------------
>
>                 Key: SOLR-4260
>                 URL: https://issues.apache.org/jira/browse/SOLR-4260
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>         Environment: 5.0.0.2013.01.04.15.31.51
>            Reporter: Markus Jelsma
>            Assignee: Mark Miller
>            Priority: Critical
>             Fix For: 4.6.1, Trunk
>
>         Attachments: 192.168.20.102-replica1.png, 
> 192.168.20.104-replica2.png, SOLR-4260.patch, clusterstate.png, 
> demo_shard1_replicas_out_of_sync.tgz
>
>
> After wiping all cores and reindexing some 3.3 million docs from Nutch using 
> CloudSolrServer we see inconsistencies between the leader and replica for 
> some shards.
> Each core holds about 3.3k documents. For some reason 5 out of 10 shards have 
> a small deviation in the number of documents. The leader and slave deviate 
> by roughly 10-20 documents, not more.
> Results hopping ranks in the result set for identical queries got my 
> attention: there were small IDF differences for exactly the same record, 
> causing it to shift positions in the result set. During those tests no 
> records were indexed. Consecutive catch-all queries also return different 
> numDocs values.
> We're running a 10 node test cluster with 10 shards and a replication factor 
> of two and frequently reindex using a fresh build from trunk. I've not seen 
> this issue for quite some time until a few days ago.
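
One way to check for the deviation described above is to ask each core for its own document count and compare the counts per shard. The following is a minimal sketch, not the method used by the reporter: it assumes the standard `/select` handler with `distrib=false`, so that a match-all query is answered only by the core addressed, and it treats the returned `numFound` as that core's document count. The helper names and any host or core URLs passed in are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def count_url(core_url):
    """Build a match-all query URL that only the addressed core answers."""
    params = urlencode({"q": "*:*", "rows": 0, "wt": "json", "distrib": "false"})
    return f"{core_url.rstrip('/')}/select?{params}"


def fetch_num_found(core_url, timeout=10):
    """Return numFound as reported by a single core (needs a live Solr node)."""
    with urlopen(count_url(core_url), timeout=timeout) as resp:
        return json.load(resp)["response"]["numFound"]


def find_mismatches(counts_by_shard):
    """Return {shard: spread} for every shard whose cores disagree.

    counts_by_shard maps a shard name to {core_url: document_count}.
    The spread is the difference between the largest and smallest count.
    """
    mismatches = {}
    for shard, counts in counts_by_shard.items():
        values = list(counts.values())
        if values and max(values) != min(values):
            mismatches[shard] = max(values) - min(values)
    return mismatches
```

The counts should be collected while no indexing is running and after a hard commit. Otherwise, documents still in flight can make healthy replicas look different from their leader.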



