[
https://issues.apache.org/jira/browse/SOLR-8129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972658#comment-14972658
]
Mark Miller commented on SOLR-8129:
-----------------------------------
I've just finished beasting HdfsChaosMonkeyNothingIsSafeTest for 100 runs. The
only change I made was to interrupt the update executor on shutdown. Mostly
this was just a test to see if this helped and what else might still pop up.
In 100 runs, I saw 4 or 5 fails, but no replica inconsistency.
* I saw the control collection off from the cloud collection, by 1 and by more
than 1. (There is a known test issue here, but we don't know that it covers
all cases.)
* I saw a hang in CUSC#blockUntilFinished - I think I already have an open JIRA
issue for something like that.
* Before these 100 runs I saw a strange fail where the connection pool for the
test client is shut down while we still try to use it. I removed the
try/finally that wraps the test and prints the zklayout on failure. I saw that
printout but could not work out the real issue, just hanging threads because
we didn't actually wait for indexing threads to stop. Since removing that, I
have not seen this, but I could just be lucky. It seemed like the true fail
reason was being swallowed somehow.
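To illustrate the last point: one general way a try/finally can swallow the
true fail reason (not necessarily what happened in this test) is when the
finally block itself throws, which replaces the exception already in flight.
A minimal sketch follows; runTest and printLayout are hypothetical stand-ins,
not the actual test code.

```java
class SwallowedFailure {
    // Hypothetical stand-in for the test body: fails with the "real" reason.
    static void runTest() {
        throw new IllegalStateException("real failure");
    }

    // Hypothetical stand-in for the diagnostic printout, which itself fails
    // (for example because a client it depends on was already shut down).
    static void printLayout() {
        throw new RuntimeException("connection pool shut down");
    }

    // The exception thrown inside finally replaces the original one.
    static String masked() {
        try {
            try {
                runTest();
            } finally {
                printLayout();
            }
        } catch (RuntimeException e) {
            return e.getMessage();
        }
        return "no failure";
    }

    // Keep the original failure primary and attach the secondary one to it.
    static String preserved() {
        try {
            try {
                runTest();
            } catch (RuntimeException primary) {
                try {
                    printLayout();
                } catch (RuntimeException secondary) {
                    primary.addSuppressed(secondary);
                }
                throw primary;
            }
        } catch (RuntimeException e) {
            return e.getMessage() + " (suppressed: " + e.getSuppressed().length + ")";
        }
        return "no failure";
    }
}
```

Here masked() reports only the secondary error, while preserved() reports the
original failure and carries the secondary one as a suppressed exception.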
I'll keep it running for now, while we work out the best way to cut things off
faster.
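
For anyone following along, "interrupt the update executor on shutdown" in
plain java.util.concurrent terms looks roughly like the sketch below. This is
not the actual Solr change; the class, the executor, and the sleeping task are
made up for illustration. The idea is that shutdownNow() interrupts running
tasks, where shutdown() would let them run to completion.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class InterruptOnShutdown {
    // Returns true if the running task was interrupted and the pool terminated.
    static boolean demo() {
        ExecutorService updateExecutor = Executors.newFixedThreadPool(1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch interrupted = new CountDownLatch(1);

        updateExecutor.execute(() -> {
            started.countDown();
            try {
                Thread.sleep(60_000); // stand-in for a long blocking update
            } catch (InterruptedException e) {
                interrupted.countDown();
                Thread.currentThread().interrupt(); // restore interrupt status
            }
        });

        try {
            started.await();
            // shutdownNow() interrupts worker threads that are running tasks.
            updateExecutor.shutdownNow();
            boolean terminated =
                updateExecutor.awaitTermination(5, TimeUnit.SECONDS);
            return terminated && interrupted.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

This only helps for tasks that are blocked in interruptible calls or that
check the interrupt flag; a task that ignores interrupts will still hang.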
> HdfsChaosMonkeyNothingIsSafeTest failures
> -----------------------------------------
>
> Key: SOLR-8129
> URL: https://issues.apache.org/jira/browse/SOLR-8129
> Project: Solr
> Issue Type: Bug
> Reporter: Yonik Seeley
> Attachments: fail.151005_064958, fail.151005_080319
>
>
> New HDFS chaos test in SOLR-8123 hits a number of types of failures,
> including shard inconsistency.