[jira] [Comment Edited] (SOLR-6286) TestReplicationHandler.doTestReplicateAfterCoreReload reliably reproducing seed failures comparing master commits before/after reload
[ https://issues.apache.org/jira/browse/SOLR-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440132#comment-16440132 ]

Steve Rowe edited comment on SOLR-6286 at 4/16/18 11:19 PM:

[~hossman] wrote:
{quote}i don't understand the point of this test at all ... it doesn't compare anything between master/slave except after a commit – so where does the "AfterCoreReload" part come into play?
{quote}
Explained here: https://issues.apache.org/jira/browse/SOLR-2705?focusedCommentId=13082578=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13082578

Rationale from [~markrmil...@gmail.com] for checking that commits are the same before/after a core reload, also from SOLR-2705:
{quote}
bq. I saw in the patch that you've added getIndexversion and getCommits, but haven't used them.

Right - they where good for debugging and I figured I would use them to investigate SOLR-2326.

bq. Can Junit check for indexversion/commits to be same right before and right after coreReload?

I've committed this check.
{quote}
I don't see how comparing commits before/after reload is providing useful information - as Hoss noted in a comment above in response to [~shalinmangar]'s earlier commit:
{quote}
bq. ... I'd expect that since there were no pending changes, there's be no need to write a new segment. ...

That seems like a naive assumption given the randomized merge settings – there could easily be background merges in other threads, or the randomized merge scheduler could decide to do an arbitrary/useless merge on commit (IIRC)
{quote}
I'll attach a patch to remove this check from the test. I'll wait a couple days before committing in case there are objections.

was (Author: steve_rowe):
[~hossman] wrote:
{quote}i don't understand the point of this test at all ... it doesn't compare anything between master/slave except after a commit – so where does the "AfterCoreReload" part come into play?
{quote}
Explained here: https://issues.apache.org/jira/browse/SOLR-2705?focusedCommentId=13082578=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13082578

Rationale from [~markrmil...@gmail.com] for checking that commits are the same before/after a core reload, also from SOLR-2705:
{quote}
bq. I saw in the patch that you've added getIndexversion and getCommits, but haven't used them.

Right - they where good for debugging and I figured I would use them to investigate SOLR-2326.

bq. Can Junit check for indexversion/commits to be same right before and right after coreReload?

I've committed this check.

I don't see how comparing commits before/after reload is providing useful information - as Hoss noted in a comment above in response to [~shalinmangar]'s earlier commit:
{quote}
bq. ... I'd expect that since there were no pending changes, there's be no need to write a new segment. ...

That seems like a naive assumption given the randomized merge settings – there could easily be background merges in other threads, or the randomized merge scheduler could decide to do an arbitrary/useless merge on commit (IIRC)
{quote}
I'll attach a patch to remove this check from the test. I'll wait a couple days before committing in case there are objections.

> TestReplicationHandler.doTestReplicateAfterCoreReload reliably reproducing
> seed failures comparing master commits before/after reload
> -
>
> Key: SOLR-6286
> URL: https://issues.apache.org/jira/browse/SOLR-6286
> Project: Solr
> Issue Type: Bug
> Components: Tests
> Reporter: Shalin Shekhar Mangar
> Assignee: Shalin Shekhar Mangar
> Priority: Major
> Fix For: 4.10, 6.0
>
> Attachments: SOLR-6286.patch
>
>
> There have been a few failures on jenkins.
> {code}
> 3 tests failed.
> REGRESSION:
> org.apache.solr.handler.TestReplicationHandler.doTestReplicateAfterCoreReload
> Error Message:
> expected:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt,
> _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs,
> _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe,
> _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si,
> _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_2]}]> but
> was:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx,
> _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
>
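To illustrate the point about comparing commits before/after a reload, here is a minimal Java sketch (Java 9+), not the actual test code. The class `CommitComparisonSketch` and all of its methods are hypothetical; the maps only mimic the shape of the entries in the failure message (indexVersion, generation, filelist), and the file lists are abbreviated. It assumes a merge wrote a second commit point (generation 3) around the reload, which is what the complete failure output in this thread appears to show.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the commit data the test compares; not Solr code.
class CommitComparisonSketch {

    // One commit entry, shaped like the entries in the failure message.
    static Map<String, Object> commit(long indexVersion, long generation, List<String> files) {
        return Map.of("indexVersion", indexVersion, "generation", generation, "filelist", files);
    }

    // Commits before the reload: a single commit point (file list abbreviated).
    static List<Map<String, Object>> commitsBeforeReload() {
        return List.of(
            commit(1406477990053L, 2, List.of("_bta.si", "_yok.si", "segments_2")));
    }

    // Commits after the reload, ASSUMING a merge wrote a second commit point.
    static List<Map<String, Object>> commitsAfterReload() {
        return List.of(
            commit(1406477990053L, 2, List.of("_bta.si", "_yok.si", "segments_2")),
            commit(1406477990053L, 3, List.of("_bta.si", "_ypc.si", "segments_3")));
    }

    // The strict check: both commit lists must be identical.
    static boolean sameCommits(List<Map<String, Object>> before,
                               List<Map<String, Object>> after) {
        return before.equals(after);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> before = commitsBeforeReload();
        List<Map<String, Object>> after = commitsAfterReload();

        // Strict equality fails as soon as an extra commit point exists.
        System.out.println("same commits: " + sameCommits(before, after));

        // The original commit point is still present; only a new one was added.
        System.out.println("original commit retained: " + after.containsAll(before));
    }
}
```

Under that assumption the strict comparison reports a difference even though nothing was indexed between the two reads, which is consistent with the argument that the check does not provide useful information under randomized merge settings.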
[jira] [Comment Edited] (SOLR-6286) TestReplicationHandler.doTestReplicateAfterCoreReload reliably reproducing seed failures comparing master commits before/after reload
[ https://issues.apache.org/jira/browse/SOLR-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440146#comment-16440146 ]

Steve Rowe edited comment on SOLR-6286 at 4/16/18 11:12 PM:

Responding to another of [~hossman]'s comments:
{quote}
At this line where this test fails, a non-nightly run won't have indexed a single doc – so this particular failure will only be observable with -Dtests.nightly=true ...
{code:java}int docs = TEST_NIGHTLY ? 20 : 0;{code}
{quote}
I suspect that this was a typo when [~markrmil...@gmail.com] committed it on SOLR-4032 ([http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/test/org/apache/solr/handler/TestReplicationHandler.java?r1=1414428=1414427=1414428]) - likely the intent was for the non-nightly doc count to remain at the {{10}} docs it was before this change.

was (Author: steve_rowe):
Responding to another of [~hossman]'s comments:
{quote}
At this line where this test fails, a non-nightly run won't have indexed a single doc – so this particular failure will only be observable with -Dtests.nightly=true ...

bq. int docs = TEST_NIGHTLY ? 20 : 0;
{quote}
I suspect that this was a typo when [~markrmil...@gmail.com] committed it on SOLR-4032 ([http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/test/org/apache/solr/handler/TestReplicationHandler.java?r1=1414428=1414427=1414428]) - likely the intent was for the non-nightly doc count to remain at the {{10}} docs it was before this change.

> TestReplicationHandler.doTestReplicateAfterCoreReload reliably reproducing
> seed failures comparing master commits before/after reload
> -
>
> Key: SOLR-6286
> URL: https://issues.apache.org/jira/browse/SOLR-6286
> Project: Solr
> Issue Type: Bug
> Components: Tests
> Reporter: Shalin Shekhar Mangar
> Assignee: Shalin Shekhar Mangar
> Priority: Major
> Fix For: 4.10, 6.0
>
> Attachments: SOLR-6286.patch
>
>
> There have been a few failures on jenkins.
> {code}
> 3 tests failed.
> REGRESSION:
> org.apache.solr.handler.TestReplicationHandler.doTestReplicateAfterCoreReload
> Error Message:
> expected:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt,
> _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs,
> _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe,
> _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si,
> _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_2]}]> but
> was:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx,
> _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs,
> _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe,
> _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si,
> _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_2]},
> {indexVersion=1406477990053,generation=3,filelist=[_bta.fdt, _bta.fdx,
> _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _ypc.cfe, _ypc.cfs, _ypc.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_3]}]>
> Stack Trace:
> java.lang.AssertionError:
> expected:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt,
> _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs,
> _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe,
> _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si,
> _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_2]}]> but
> was:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx,
> _bta.fnm, _bta.si,
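The doc-count expression discussed in the comment above can be looked at in isolation with the following sketch. `DocCountSketch` and its method names are hypothetical; only the committed expression `TEST_NIGHTLY ? 20 : 0` comes from the thread, and the value 10 for non-nightly runs is the intent Steve Rowe suspects, not a confirmed fix.

```java
// Hypothetical illustration of the quoted expression; not the actual test code.
class DocCountSketch {

    // The expression as committed, per the quoted line:
    // non-nightly runs index no documents at all.
    static int docsAsCommitted(boolean testNightly) {
        return testNightly ? 20 : 0;
    }

    // ASSUMED intent: keep the 10 docs that non-nightly runs reportedly
    // indexed before the change.
    static int docsAsLikelyIntended(boolean testNightly) {
        return testNightly ? 20 : 10;
    }

    public static void main(String[] args) {
        for (boolean nightly : new boolean[] {false, true}) {
            System.out.println("nightly=" + nightly
                + " committed=" + docsAsCommitted(nightly)
                + " likelyIntended=" + docsAsLikelyIntended(nightly));
        }
    }
}
```

With the committed expression, a run without -Dtests.nightly=true indexes zero docs, which matches the observation that the failure is only observable in nightly runs.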
[jira] [Comment Edited] (SOLR-6286) TestReplicationHandler.doTestReplicateAfterCoreReload reliably reproducing seed failures comparing master commits before/after reload
[ https://issues.apache.org/jira/browse/SOLR-6286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16440146#comment-16440146 ]

Steve Rowe edited comment on SOLR-6286 at 4/16/18 11:11 PM:

Responding to another of [~hossman]'s comments:
{quote}
At this line where this test fails, a non-nightly run won't have indexed a single doc – so this particular failure will only be observable with -Dtests.nightly=true ...

bq. int docs = TEST_NIGHTLY ? 20 : 0;
{quote}
I suspect that this was a typo when [~markrmil...@gmail.com] committed it on SOLR-4032 ([http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/test/org/apache/solr/handler/TestReplicationHandler.java?r1=1414428=1414427=1414428]) - likely the intent was for the non-nightly doc count to remain at the {{10}} docs it was before this change.

was (Author: steve_rowe):
Responding to another of [~hossman]'s comments:
{quote}
bq. At this line where this test fails, a non-nightly run won't have indexed a single doc – so this particular failure will only be observable with -Dtests.nightly=true ...

int docs = TEST_NIGHTLY ? 20 : 0;
{quote}
I suspect that this was a typo when [~markrmil...@gmail.com] committed it on SOLR-4032 ([http://svn.apache.org/viewvc/lucene/dev/trunk/solr/core/src/test/org/apache/solr/handler/TestReplicationHandler.java?r1=1414428=1414427=1414428]) - likely the intent was for the non-nightly doc count to remain at the {{10}} docs it was before this change.

> TestReplicationHandler.doTestReplicateAfterCoreReload reliably reproducing
> seed failures comparing master commits before/after reload
> -
>
> Key: SOLR-6286
> URL: https://issues.apache.org/jira/browse/SOLR-6286
> Project: Solr
> Issue Type: Bug
> Components: Tests
> Reporter: Shalin Shekhar Mangar
> Assignee: Shalin Shekhar Mangar
> Priority: Major
> Fix For: 4.10, 6.0
>
> Attachments: SOLR-6286.patch
>
>
> There have been a few failures on jenkins.
> {code}
> 3 tests failed.
> REGRESSION:
> org.apache.solr.handler.TestReplicationHandler.doTestReplicateAfterCoreReload
> Error Message:
> expected:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt,
> _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs,
> _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe,
> _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si,
> _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_2]}]> but
> was:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx,
> _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs,
> _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe,
> _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si,
> _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_2]},
> {indexVersion=1406477990053,generation=3,filelist=[_bta.fdt, _bta.fdx,
> _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _ypc.cfe, _ypc.cfs, _ypc.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_3]}]>
> Stack Trace:
> java.lang.AssertionError:
> expected:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt,
> _bta.fdx, _bta.fnm, _bta.si, _bta_Lucene41_0.doc, _bta_Lucene41_0.tim,
> _bta_Lucene41_0.tip, _bta_nrm.cfe, _bta_nrm.cfs, _nik.cfe, _nik.cfs, _nik.si,
> _yok.cfe, _yok.cfs, _yok.si, _yp3.cfe, _yp3.cfs, _yp3.si, _yp4.cfe, _yp4.cfs,
> _yp4.si, _yp5.cfe, _yp5.cfs, _yp5.si, _yp6.cfe, _yp6.cfs, _yp6.si, _yp7.cfe,
> _yp7.cfs, _yp7.si, _yp8.cfe, _yp8.cfs, _yp8.si, _yp9.cfe, _yp9.cfs, _yp9.si,
> _ypa.cfe, _ypa.cfs, _ypa.si, _ypu.cfe, _ypu.cfs, _ypu.si, _ypv.cfe, _ypv.cfs,
> _ypv.si, _ypw.cfe, _ypw.cfs, _ypw.si, _ypx.cfe, _ypx.cfs, _ypx.si, _ypy.cfe,
> _ypy.cfs, _ypy.si, segments_2]}]> but
> was:<[{indexVersion=1406477990053,generation=2,filelist=[_bta.fdt, _bta.fdx,
> _bta.fnm, _bta.si,