In that case, it seems to be a reporting issue, not a fundamental replication problem. Whew!
So it's worth raising a JIRA, but since you report the indexes are identical, I'm not sure how high a priority it would be. If you do raise a JIRA, you should reference this discussion.

Best,
Erick

On Mon, Nov 12, 2018 at 3:00 AM Vadim Ivanov <vadim.iva...@spb.ntk-intourist.ru> wrote:
>
> Hi, Erick
> I have about 1300 cores in my test environment for 159 collections.
> Today I wrote a script to check all of them.
> For 138 out of 1300 cores, the "generation" and "indexversion" information
> returned by mbeans and by the ReplicationHandler do not match.
> Most of the replicas have a gap of more than 1 in generation (for example,
> 14 returned by mbeans, 6 returned by RH), so it's surely not caused by
> ongoing indexing.
> None of these 138 replicas is the leader of its shard.
> All of these 138 replicas, when queried with &distrib=false, returned
> exactly the same documents as their leaders.
> I've checked some replicas for segments: yes, they have the same segments as
> their leaders, with exactly the same sizes in bytes.
>
> It seems to me this issue does not affect indexing or searching... it's just
> a curious misreport of some information that I came across.
>
> My autocommit is:
>
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:300000}</maxTime>
> </autoSoftCommit>
>
> --
> BR, Vadim
>
>
> > -----Original Message-----
> > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > Sent: Sunday, November 11, 2018 9:51 PM
> > To: solr-user
> > Subject: Re: Replicationhandler with TLOG replicas
> >
> > Vadim:
> >
> > The next time you see this, is it possible to check that the replicas
> > showing different index versions have the same documents?
> > Actually, it should be sufficient to verify that they have the same
> > segments in their data/index directory, and they should match the
> > segments on the leader _assuming_ you're not actively indexing and you
> > stopped indexing more than the polling interval ago.
> >
> > If you are actively indexing, it should be sufficient to check that
> > the questionable replica's index files are changing over time; that
> > would mean that replication is happening.
> >
> > And what's your commit interval? The polling interval on the followers is:
> > 1> 1/2 the hard commit interval if defined to be > -1. If not
> > 2> 1/2 the soft commit interval if defined to be > -1. If not
> > 3> 3000ms
> >
> > There are two possibilities here as I see it.
> > 1> this is just a reporting error, which we should still address but
> > doesn't worry me much.
> > 2> the TLOG/PULL replication process has some bug and the indexes are,
> > indeed, different
> > 2a> when you reloaded the collection, it's possible that the startup
> > process kicked off a replication,
> > and if there's really a bug, reloading just masked it.
> >
> > Best,
> > Erick
> > On Sun, Nov 11, 2018 at 2:34 AM Vadim Ivanov
> > <vadim.iva...@spb.ntk-intourist.ru> wrote:
> > >
> > > Reloading the collection helps!
> > > After reloading the collection, the generation and indexversion returned
> > > by ReplicationHandler catch up with the leader.
> > >
> > >
> > > > -----Original Message-----
> > > > From: Vadim Ivanov [mailto:vadim.iva...@spb.ntk-intourist.ru]
> > > > Sent: Sunday, November 11, 2018 1:09 PM
> > > > To: solr-user@lucene.apache.org
> > > > Subject: RE: Replicationhandler with TLOG replicas
> > > >
> > > > Thanks, Shawn
> > > > I anticipated the answer about the information returned by
> > > > ReplicationHandler.
> > > > What baffled me is that on most replicas the indexversion and
> > > > generation returned by ReplicationHandler are right and increase
> > > > with commits.
> > > > But on some replicas they are not: they stopped changing at some
> > > > moment in the past, forever.
> > > > For example, I have 5 TLOG replicas.
> > > > For the leader (and all 3 good replicas),
> > > > http://host_n:8983/solr/core_n/replication?command=indexversion
> > > > returns
> > > > {
> > > >   "responseHeader":{
> > > >     "status":0,
> > > >     "QTime":0},
> > > >   "indexversion":1541885907200,
> > > >   "generation":1704}
> > > >
> > > > But for one replica:
> > > > {
> > > >   "responseHeader":{
> > > >     "status":0,
> > > >     "QTime":0},
> > > >   "indexversion":1540842454653,
> > > >   "generation":1216}
> > > >
> > > > Could it be a sign of some hidden issue? Where is that information
> > > > stored, and why does it stop changing at some moment?
> > > > No indexing is going on in that collection at the moment of the
> > > > request. I'm "deltaimporting" that collection once per hour and only
> > > > if needed.
> > > > So usually there are only 5-10 commits per day.
> > > > It's not a crucial issue for my use case, as I have adequate
> > > > information about indexversion and generation returned by mbeans;
> > > > I'm just curious about that strange behavior.
> > > >
> > > > > -----Original Message-----
> > > > > From: Shawn Heisey [mailto:apa...@elyograg.org]
> > > > > Sent: Saturday, November 10, 2018 6:46 PM
> > > > > To: solr-user@lucene.apache.org
> > > > > Subject: Re: Replicationhandler with TLOG replicas
> > > > >
> > > > > On 11/10/2018 8:05 AM, Vadim Ivanov wrote:
> > > > > > It seems the latter gets some wrong information, as its
> > > > > > indexversion and generation are far behind the leader.
> > > > > > But the core index seems up to date and healthy.
> > > > > > Why could such things happen on some replicas? (Most of the
> > > > > > replicas returned the same information by both commands.)
> > > > > > Is the information returned by Replicationhandler not applicable
> > > > > > to tlog/pull replicas, and not reliable?
> > > > >
> > > > > SolrCloud does not use the replication handler in the same way that
> > > > > master/slave replication does. It "manually" initiates any
> > > > > replication that takes place -- the replication handler is not in
> > > > > charge. You cannot be sure that the indexes the replication handler
> > > > > thinks are master and slave are in fact the indexes that will be
> > > > > replicated next.
> > > > > Just ignore anything that the replication handler tells you. It may
> > > > > have absolutely no bearing on what's happening.
> > > > >
> > > > > Was indexing happening when you looked, or was it entirely stopped?
> > > > > If indexing is ongoing, you may have seen the difference in the
> > > > > index versions in between data being indexed on the leader and the
> > > > > time that the replication is initiated.
> > > > >
> > > > > Thanks,
> > > > > Shawn
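
For anyone who wants to repeat the checks discussed in this thread, here is a minimal Python sketch. It is not taken from Solr's code. It only restates the polling-interval rule quoted above (half the hard commit interval, else half the soft commit interval, else 3000 ms) and compares two `command=indexversion` responses of the shape shown in the thread. The function names, the use of integer division, and the `generation_gap` field are assumptions made for illustration.

```python
import json

DEFAULT_POLL_MS = 3000  # fallback value quoted in the thread


def polling_interval_ms(hard_commit_ms=-1, soft_commit_ms=-1):
    """Follower polling interval, following the rule quoted in the thread.

    Hypothetical helper: integer halving is an assumption, not Solr's code.
    """
    if hard_commit_ms > -1:
        return hard_commit_ms // 2
    if soft_commit_ms > -1:
        return soft_commit_ms // 2
    return DEFAULT_POLL_MS


def parse_index_state(response_text):
    """Return (indexversion, generation) from an indexversion JSON response."""
    body = json.loads(response_text)
    return body["indexversion"], body["generation"]


def compare_states(leader_text, replica_text):
    """Compare what the leader and a replica report; purely a reporting check."""
    leader = parse_index_state(leader_text)
    replica = parse_index_state(replica_text)
    return {
        "match": leader == replica,
        "generation_gap": leader[1] - replica[1],
        "leader": leader,
        "replica": replica,
    }


# The two responses quoted in the thread.
LEADER = ('{"responseHeader":{"status":0,"QTime":0},'
          '"indexversion":1541885907200,"generation":1704}')
REPLICA = ('{"responseHeader":{"status":0,"QTime":0},'
           '"indexversion":1540842454653,"generation":1216}')

if __name__ == "__main__":
    # autoCommit maxTime 60000 and autoSoftCommit maxTime 300000, as posted.
    print(polling_interval_ms(60000, 300000))
    print(compare_states(LEADER, REPLICA))
```

A mismatch reported this way says nothing about the index itself. As noted above, the segment files and the results of queries with &distrib=false are what show whether a replica is actually in sync.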