[
https://issues.apache.org/jira/browse/SOLR-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699081#comment-14699081
]
Shalin Shekhar Mangar commented on SOLR-7936:
---------------------------------------------
Some of those recent bogus deletion failures are likely caused by SOLR-5756.
> Bogus failure when deleting collections.
> ----------------------------------------
>
> Key: SOLR-7936
> URL: https://issues.apache.org/jira/browse/SOLR-7936
> Project: Solr
> Issue Type: Bug
> Reporter: Erick Erickson
> Assignee: Erick Erickson
>
> When looking at the CDCR test failures, we began to wonder whether the
> problem was
> 1> the cdcr code itself
> 2> the test framework
> 3> Solr
> Some of the failures seem to be "impossible" assuming collection
> creation/deletion work OK.
> So I wrote a little program to exercise collection creation/deletion outside
> the test framework by just adding and deleting the same collection over and
> over again, and it started failing regularly in
> OverseerCollectionMessageHandler.deleteCollection. At about line 780 it would
> throw the "Could not fully remove collection" exception:
> {code}
> TimeOut timeout = new TimeOut(30, TimeUnit.SECONDS);
> boolean removed = false;
> while (!timeout.hasTimedOut()) {
>   Thread.sleep(100);
>   // WORKS SO FAR IF UNCOMMENTED zkStateReader.updateClusterState();
>   removed = !zkStateReader.getClusterState().hasCollection(collection);
>   if (removed) {
>     Thread.sleep(500); // just a bit of time so it's more likely other
>                        // readers see on return
>     break;
>   }
> }
> if (!removed) {
>   throw new SolrException(ErrorCode.SERVER_ERROR,
>       "Could not fully remove collection: " + collection);
> }
> {code}
> However, the collection is really gone from clusterstate. When I put the
> updateClusterState() in above, it doesn't seem to fail. Is it as simple as
> the updateClusterState() call?
> Without the update in place, it failed within 20 reps very regularly. So far,
> with the update in place we're at 132 and counting. Any comments?
> If this runs 1,000 times tonight, I'll check it in if there are no
> objections. I don't know what it means for CDCR yet though.
> I'm also suspicious of the 500ms sleep. Anyone have a clue what that's in
> there for?
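
One possible reading of the behaviour described in the issue, as a minimal sketch only: if the reader answers from a locally cached snapshot that is refreshed only on demand, a polling loop can time out even though the collection is already gone from the authoritative state. Everything here (CachedStateReader, refresh(), waitForRemoval, the in-memory set standing in for ZooKeeper) is a hypothetical stand-in and not a Solr API. The real ZkStateReader is, as far as I know, normally kept current through ZooKeeper watches, which this sketch does not model.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

public class StaleStateDemo {

    /** Hypothetical stand-in for a state reader that serves a cached snapshot. */
    static final class CachedStateReader {
        private final Set<String> source;      // authoritative state (stands in for ZooKeeper)
        private volatile Set<String> snapshot; // local copy, only updated by refresh()

        CachedStateReader(Set<String> source) {
            this.source = source;
            this.snapshot = Collections.unmodifiableSet(new HashSet<>(source));
        }

        /** Re-reads the authoritative state (assumed analogue of a forced update). */
        void refresh() {
            snapshot = Collections.unmodifiableSet(new HashSet<>(source));
        }

        boolean hasCollection(String name) {
            return snapshot.contains(name);
        }
    }

    /** Polls until the collection disappears from the reader's view or the timeout expires. */
    static boolean waitForRemoval(CachedStateReader reader, String collection,
                                  long timeoutMs, boolean forceRefresh) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (deadline - System.nanoTime() > 0) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            if (forceRefresh) {
                reader.refresh(); // without this, the loop only ever sees the old snapshot
            }
            if (!reader.hasCollection(collection)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> zk = ConcurrentHashMap.newKeySet();
        zk.add("test");
        CachedStateReader reader = new CachedStateReader(zk);
        zk.remove("test"); // the collection is really gone from the source

        System.out.println(waitForRemoval(reader, "test", 200, false)); // prints false
        System.out.println(waitForRemoval(reader, "test", 200, true));  // prints true
    }
}
```

In this toy model, the poll without a refresh times out and the poll with a refresh succeeds on its first iteration. That matches the symptom reported above, but it does not show that a stale cache is the actual cause in Solr.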