Re: Replicationhandler with TLOG replicas

Erick Erickson Mon, 12 Nov 2018 15:51:48 -0800

In that case, it seems to be a reporting issue, not a fundamental
replication problem. Whew!


So it's worth raising a JIRA, but since you report the indexes are
identical, I'm not sure how high a priority it would be. If you do
raise a JIRA, you should reference this discussion.

Best,
Erick
On Mon, Nov 12, 2018 at 3:00 AM Vadim Ivanov
<vadim.iva...@spb.ntk-intourist.ru> wrote:
>
> Hi, Erick
> I have about 1300 cores in my test environment for 159 collections.
> Today I wrote a script to check all of them.
> For 138 out of 1300 cores, the "generation" and "indexversion" values
> returned by mbeans and by the ReplicationHandler do not match.
> Most of these replicas are more than 1 generation apart (for example, 14
> returned by mbeans vs. 6 returned by RH), so it's not due to indexing for sure.
> None of these 138 replicas is the leader of its shard.
> All of these 138 replicas, when queried with &distrib=false, returned
> exactly the same documents as their leaders.
> I've checked some replicas for segments: yes, they have the same segments as
> their leaders, with exactly the same sizes in bytes.
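
A check like the one described above could be sketched roughly as follows.
This is not the script that was actually used. The function names are made
up for illustration, the replication URL follows the form quoted further
down in this thread with wt=json added as an assumption, and the mbeans
values are taken as plain input because the layout of the mbeans response
is not shown here.

```python
import json
from typing import Dict
from urllib.request import urlopen


def fetch_rh_version(core_url: str, timeout: float = 10.0) -> Dict[str, int]:
    """Ask a core's ReplicationHandler for its indexversion and generation.

    core_url is assumed to look like http://host_n:8983/solr/core_n
    """
    url = f"{core_url}/replication?command=indexversion&wt=json"
    with urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    return {"indexversion": body["indexversion"], "generation": body["generation"]}


def find_mismatches(
    mbeans_by_core: Dict[str, Dict[str, int]],
    rh_by_core: Dict[str, Dict[str, int]],
) -> Dict[str, int]:
    """Return {core: generation gap} for cores where the two sources disagree.

    Both arguments map a core name to a dict with "generation" and
    "indexversion" keys. Cores missing from rh_by_core are skipped.
    """
    mismatches = {}
    for core, mbeans in mbeans_by_core.items():
        rh = rh_by_core.get(core)
        if rh is None:
            continue
        same = (
            mbeans["generation"] == rh["generation"]
            and mbeans["indexversion"] == rh["indexversion"]
        )
        if not same:
            mismatches[core] = mbeans["generation"] - rh["generation"]
    return mismatches
```

With the example numbers above (14 from mbeans, 6 from RH), such a core
would be reported with a generation gap of 8.
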
>
> It seems to me this issue does not affect indexing or searching... it's just a
> curious misreporting of some information that I ran into.
>
> My autocommit is:
>
> <autoCommit>
>   <maxTime>${solr.autoCommit.maxTime:60000}</maxTime>
>   <openSearcher>false</openSearcher>
> </autoCommit>
>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:300000}</maxTime>
> </autoSoftCommit>
>
> --
> BR, Vadim
>
>
> > -----Original Message-----
> > From: Erick Erickson [mailto:erickerick...@gmail.com]
> > Sent: Sunday, November 11, 2018 9:51 PM
> > To: solr-user
> > Subject: Re: Replicationhandler with TLOG replicas
> >
> > Vadim:
> >
> > The next time you see this, is it possible to check that the replicas
> > showing different index versions have the same documents? Actually, it
> > should be sufficient to verify that they have the same segments in
> > their data/index directory, and they should match the segments on the
> > leader _assuming_ you're not actively indexing and you stopped
> > indexing more than the polling interval ago.
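
The segment check could be done along these lines. This is only a sketch:
it assumes both data/index directories are readable from one machine (for
example copied or mounted locally), and the function names are
hypothetical. It compares file names and sizes only, not file contents.

```python
import os
from typing import Dict, List, Tuple


def list_index_files(index_dir: str) -> Dict[str, int]:
    """Map each regular file in index_dir to its size in bytes."""
    files = {}
    with os.scandir(index_dir) as entries:
        for entry in entries:
            if entry.is_file():
                files[entry.name] = entry.stat().st_size
    return files


def diff_index_dirs(
    leader_dir: str, replica_dir: str
) -> Tuple[List[str], List[str], List[str]]:
    """Return (only on leader, only on replica, same name but different size)."""
    leader = list_index_files(leader_dir)
    replica = list_index_files(replica_dir)
    only_leader = sorted(set(leader) - set(replica))
    only_replica = sorted(set(replica) - set(leader))
    size_mismatch = sorted(
        name for name in set(leader) & set(replica) if leader[name] != replica[name]
    )
    return only_leader, only_replica, size_mismatch
```

Three empty lists would mean the two directories hold the same files with
the same sizes.
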
> >
> > If you are actively indexing, it should be sufficient to check that
> > the questionable replica's index files are changing over time; that
> > would mean that replication is happening.
> >
> > And what's your commit interval? The polling interval on the followers is:
> > 1> 1/2 the hard commit interval if defined to be > -1. If not
> > 2> 1/2 the soft commit interval if defined to be > -1. If not
> > 3> 3000ms
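
The three cases above can be written out as a small function. It only
restates the rule as listed here and is not Solr's actual code. The
function name and the use of None for "not defined" are assumptions made
for the illustration.

```python
from typing import Optional

DEFAULT_POLL_MS = 3000  # fallback when neither commit interval is usable


def follower_poll_interval_ms(
    hard_commit_ms: Optional[int], soft_commit_ms: Optional[int]
) -> int:
    """Apply the three cases in order: hard commit, soft commit, default."""
    if hard_commit_ms is not None and hard_commit_ms > -1:
        return hard_commit_ms // 2
    if soft_commit_ms is not None and soft_commit_ms > -1:
        return soft_commit_ms // 2
    return DEFAULT_POLL_MS
```

For example, a hard commit interval of 60000 ms gives a polling interval of
30000 ms, whatever the soft commit interval is.
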
> >
> > There are two possibilities here as I see it.
> > 1> this is just a reporting error, which we should still address but
> > doesn't worry me much.
> > 2> the TLOG/PULL replication process has some bug and the indexes are,
> > indeed, different
> > 2a> when you reloaded the collection, it's possible that the startup
> > process kicked off a replication, and if there's really a bug,
> > reloading just masked it.
> >
> > Best,
> > Erick
> > On Sun, Nov 11, 2018 at 2:34 AM Vadim Ivanov
> > <vadim.iva...@spb.ntk-intourist.ru> wrote:
> > >
> > > Reloading the collection helps!
> > > After reloading the collection, the generation and indexversion returned by
> > > ReplicationHandler catch up with the leader.
> > >
> > >
> > > > -----Original Message-----
> > > > From: Vadim Ivanov [mailto:vadim.iva...@spb.ntk-intourist.ru]
> > > > Sent: Sunday, November 11, 2018 1:09 PM
> > > > To: solr-user@lucene.apache.org
> > > > Subject: RE: Replicationhandler with TLOG replicas
> > > >
> > > > Thanks, Shawn
> > > > I had anticipated the answer about the information returned by
> > > > ReplicationHandler.
> > > > What baffled me is that on most replicas the indexversion and generation
> > > > returned by ReplicationHandler are right and increase with commits.
> > > > But on some replicas they are not: they stopped changing at some moment
> > > > in the past and never changed again.
> > > > For example, I have 5 TLOG replicas.
> > > > For the leader (and the 3 good replicas),
> > > > http://host_n:8983/solr/core_n/replication?command=indexversion
> > > > returns:
> > > > {
> > > >   "responseHeader":{
> > > >     "status":0,
> > > >     "QTime":0},
> > > >   "indexversion":1541885907200,
> > > >   "generation":1704}
> > > >
> > > > But for one replica:
> > > > {
> > > >   "responseHeader":{
> > > >     "status":0,
> > > >     "QTime":0},
> > > >   "indexversion":1540842454653,
> > > >   "generation":1216}
> > > >
> > > > Could it be a sign of some hidden issue? Where is that information stored,
> > > > and why does it stop changing at some moment?
> > > > No indexing is going on in that collection at the moment of the request. I'm
> > > > "deltaimporting" that collection once per hour, and only if needed.
> > > > So usually there are only 5-10 commits per day.
> > > > It's not a crucial issue for my use case, as I have adequate indexversion
> > > > and generation information returned by mbeans; I'm just curious about
> > > > that strange behavior.
> > > >
> > > > > -----Original Message-----
> > > > > From: Shawn Heisey [mailto:apa...@elyograg.org]
> > > > > Sent: Saturday, November 10, 2018 6:46 PM
> > > > > To: solr-user@lucene.apache.org
> > > > > Subject: Re: Replicationhandler with TLOG replicas
> > > > >
> > > > > On 11/10/2018 8:05 AM, Vadim Ivanov wrote:
> > > > > > > It seems the latter gets some wrong information, as its indexversion
> > > > > > > and generation are far behind the leader's.
> > > > > > > But the core index seems up to date and healthy.
> > > > > > > Why could such things happen on some replicas? (Most of the replicas
> > > > > > > returned the same information for both commands.)
> > > > > > > Is the information returned by ReplicationHandler not applicable to
> > > > > > > tlog/pull replicas, and not reliable?
> > > > >
> > > > > > SolrCloud does not use the replication handler in the same way that
> > > > > > master/slave replication does.  It "manually" initiates any replication
> > > > > > that takes place -- the replication handler is not in charge.  You
> > > > > > cannot be sure that the indexes the replication handler thinks are
> > > > > > master and slave are in fact the indexes that will be replicated next.
> > > > > > Just ignore anything that the replication handler tells you.  It may
> > > > > > have absolutely no bearing on what's happening.
> > > > > >
> > > > > > Was indexing happening when you looked, or was it entirely stopped?  If
> > > > > > indexing is ongoing, you may have seen the difference in the index
> > > > > > versions in between data being indexed on the leader and the time that
> > > > > > the replication is initiated.
> > > > >
> > > > > Thanks,
> > > > > Shawn
> > >
>
