[
https://issues.apache.org/jira/browse/SOLR-12313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682039#comment-16682039
]
Hoss Man commented on SOLR-12313:
---------------------------------
[~caomanhdat] ...
RecoveryAfterSoftCommitTest has been failing roughly 50% of the time the past
few days - but only on master, and git bisect identifies your
[13a8356|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=13a8356]
commit as the cause...
Here is an example of a seed from jenkins that reproduces reliably for me (and
fails at the same place every time: {{RecoveryAfterSoftCommitTest.java:87}} ) ...
{noformat}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=RecoveryAfterSoftCommitTest -Dtests.method=test -Dtests.seed=9AB4E0C0AB3BEF87 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=ru -Dtests.timezone=America/Indiana/Tell_City -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
[junit4] ERROR 78.5s | RecoveryAfterSoftCommitTest.test <<<
[junit4] > Throwable #1: org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:52448/ol_wuc/collection1
[junit4] > at __randomizedtesting.SeedInfo.seed([9AB4E0C0AB3BEF87:12E0DF1A05C7827F]:0)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:654)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
[junit4] > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:483)
[junit4] > at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:413)
[junit4] > at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1107)
[junit4] > at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:884)
[junit4] > at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:817)
[junit4] > at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1260)
[junit4] > at org.apache.solr.cloud.RecoveryAfterSoftCommitTest.test(RecoveryAfterSoftCommitTest.java:87)
[junit4] > at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:1010)
[junit4] > at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:985)
[junit4] > at java.lang.Thread.run(Thread.java:748)
[junit4] > Caused by: java.net.SocketTimeoutException: Read timed out
[junit4] > at java.net.SocketInputStream.socketRead0(Native Method)
[junit4] > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
[junit4] > at java.net.SocketInputStream.read(SocketInputStream.java:171)
[junit4] > at java.net.SocketInputStream.read(SocketInputStream.java:141)
[junit4] > at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137)
[junit4] > at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153)
[junit4] > at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282)
[junit4] > at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138)
[junit4] > at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56)
[junit4] > at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259)
[junit4] > at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163)
[junit4] > at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165)
[junit4] > at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273)
[junit4] > at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
[junit4] > at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272)
[junit4] > at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
[junit4] > at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
[junit4] > at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
[junit4] > at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
[junit4] > at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
[junit4] > at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
[junit4] > at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:542)
[junit4] > ... 50 more
{noformat}
> TestInjection#waitForInSyncWithLeader needs improvement.
> --------------------------------------------------------
>
> Key: SOLR-12313
> URL: https://issues.apache.org/jira/browse/SOLR-12313
> Project: Solr
> Issue Type: Test
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Mark Miller
> Priority: Major
>
> This really should have some doc for why it would be used.
> I also think it causes BasicDistributedZkTest to take forever sometimes,
> and perhaps other tests too?
> I think checking for uncommitted data is probably a race condition and should
> be removed.
> Checking index versions should follow the rules that replication does - if
> the slave is higher than the leader, it's in sync, being equal is not
> required. If it's expected for a test it should be a specific test that
> fails. This just introduces massive delays.
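
The in-sync rule proposed in the description above (a slave whose index version is equal to or higher than the leader's counts as in sync) could be sketched roughly as follows. This is only an illustration under assumptions: the class name IndexVersionCheck and the method isInSync are hypothetical, the versions are treated as plain long values, and none of this is the actual TestInjection#waitForInSyncWithLeader code.

```java
// Hypothetical sketch of the comparison rule from the issue description.
// Not Solr code: the class and method names are invented for illustration.
final class IndexVersionCheck {

    private IndexVersionCheck() {
        // static helper only, no instances
    }

    /**
     * Returns true when the slave should be treated as in sync with the
     * leader: its index version is greater than or equal to the leader's.
     * Strict equality is deliberately not required.
     */
    static boolean isInSync(long slaveVersion, long leaderVersion) {
        return slaveVersion >= leaderVersion;
    }

    public static void main(String[] args) {
        // Made-up version numbers, purely to exercise the three cases.
        System.out.println(isInSync(10L, 10L)); // equal versions
        System.out.println(isInSync(11L, 10L)); // slave ahead of leader
        System.out.println(isInSync(9L, 10L));  // slave behind leader
    }
}
```

With a check like this, a wait loop would only keep polling while the slave is strictly behind the leader, rather than until the two versions match exactly.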