[ 
https://issues.apache.org/jira/browse/SOLR-13236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763937#comment-16763937
 ] 

Hoss Man commented on SOLR-13236:
---------------------------------

Examples of some of the types of failures i've observed in jenkins logs...

----


This error occurs inside of a catch block while trying to log some info about 
the state of the election when the Error/Exception happened.  The original 
exception is completely lost in the logs because of this 
IllegalArgumentException, which arises from calling zkClient().getChildren() on 
the hardcoded string 
{{"/collections/allReplicasInLIR/leader_elect/shard1/election/"}} -- which, as 
the error indicates, is completely illegal, and indicates that this code path 
was never sanity checked when the test was written.
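
A minimal sketch of the two problems described above: the trailing slash that 
the path validation rejects, and a diagnostic call inside a catch block 
replacing the exception it was meant to describe.  Everything here is 
hypothetical and self-contained: {{describeChildren}} is only a stand-in that 
mimics the "must not end with /" check, and none of this is the actual test 
code or the SolrZkClient API.

```java
public class ElectionLogSketch {

    /** Drop trailing slashes; only the root path "/" may end with one. */
    public static String stripTrailingSlash(String path) {
        String p = path;
        while (p.length() > 1 && p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    /** Hypothetical stand-in for a getChildren() call that validates its path. */
    public static String describeChildren(String path) {
        if (path.length() > 1 && path.endsWith("/")) {
            throw new IllegalArgumentException("Path must not end with / character");
        }
        return "children of " + path;
    }

    /**
     * Gather diagnostics about a failure without losing it: if the diagnostic
     * call itself throws, attach that as a suppressed exception and hand the
     * original back so the caller can rethrow it.
     */
    public static Throwable withDiagnostics(Throwable original, String electionPath) {
        try {
            System.err.println(describeChildren(electionPath));
        } catch (RuntimeException diagFailure) {
            original.addSuppressed(diagFailure);
        }
        return original;
    }

    public static void main(String[] args) {
        String bad = "/collections/allReplicasInLIR/leader_elect/shard1/election/";
        // Placeholder for whatever the test originally failed with.
        Throwable original = new IllegalStateException("original failure (placeholder)");
        withDiagnostics(original, bad);
        System.out.println(original.getMessage()
            + ", suppressed: " + original.getSuppressed().length);
        // With the trailing slash removed, the diagnostic call goes through.
        System.out.println(describeChildren(stripTrailingSlash(bad)));
    }
}
```

Either change alone would have kept the original exception visible in the 
logs; together they also make the diagnostic output useful.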

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=LIROnShardRestartTest -Dtests.method=testAllReplicasInLIR 
-Dtests.seed=10B31070AB4A4496 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true -Dtests.badapples=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-7.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=sv-SE -Dtests.timezone=Africa/Lusaka -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] ERROR    144s J2 | LIROnShardRestartTest.testAllReplicasInLIR <<<
   [junit4]    > Throwable #1: java.lang.IllegalArgumentException: Path must 
not end with / character
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([10B31070AB4A4496:4A2B2AB6D5CA2371]:0)
   [junit4]    >        at 
org.apache.zookeeper.common.PathUtils.validatePath(PathUtils.java:58)
   [junit4]    >        at 
org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1523)
   [junit4]    >        at 
org.apache.solr.common.cloud.SolrZkClient.lambda$getChildren$4(SolrZkClient.java:346)
   [junit4]    >        at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:71)
   [junit4]    >        at 
org.apache.solr.common.cloud.SolrZkClient.getChildren(SolrZkClient.java:346)
   [junit4]    >        at 
org.apache.solr.cloud.LIROnShardRestartTest.testAllReplicasInLIR(LIROnShardRestartTest.java:168)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
{noformat}

This is a failure in the last line of the test, after all assertions have 
passed, when it tries to delete the collection -- i believe because the check 
that is "waiting for replicas rejoin election" doesn't first wait to see all 
the nodes disconnected from jetty and marked "down" -- so the election may not 
have even happened yet by the time the test finishes; it may just be getting to 
the point where all the solr nodes are marked "down" when it tries to clean up...
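
The kind of explicit wait that seems to be missing can be sketched as a 
generic poll-with-timeout helper.  This is only an illustration: {{waitFor}} 
and the simulated condition are hypothetical names, not helpers from the Solr 
test framework, and a real test would poll the cluster state instead (first 
for all replicas "down", then for a leader to be elected) before cleaning up.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class WaitForSketch {

    /**
     * Poll the condition until it holds, or fail loudly once the timeout
     * elapses. The condition is always checked at least once.
     */
    public static void waitFor(String what, long timeoutMs, long pollMs,
                               BooleanSupplier condition)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new TimeoutException("Timed out waiting for: " + what);
            }
            Thread.sleep(pollMs);
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated state: the condition only becomes true on the third poll,
        // standing in for "all replicas have been marked down".
        AtomicInteger polls = new AtomicInteger();
        waitFor("all replicas marked down", 5_000, 10,
                () -> polls.incrementAndGet() >= 3);
        System.out.println("condition met after " + polls.get() + " polls");
    }
}
```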

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=LIROnShardRestartTest -Dtests.method=testAllReplicasInLIR 
-Dtests.seed=10B31070AB4A4496 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true -Dtests.badapples=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-7.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=sv-SE -Dtests.timezone=Africa/Lusaka -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] ERROR   94.6s J1 | LIROnShardRestartTest.testAllReplicasInLIR <<<
   [junit4]    > Throwable #1: 
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([10B31070AB4A4496:4A2B2AB6D5CA2371]:0)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:461)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1110)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:884)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:817)
   [junit4]    >        at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
   [junit4]    >        at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
   [junit4]    >        at 
org.apache.solr.cloud.LIROnShardRestartTest.testAllReplicasInLIR(LIROnShardRestartTest.java:175)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
{noformat}


This is a (similar) failure in the first line of another test method, where it 
creates the collection it wants to use, which can happen if the former test 
fails (or passes) and the next test method is started before all the nodes have 
a chance to re-connect to zk...

{noformat}
  [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=LIROnShardRestartTest -Dtests.method=testSeveralReplicasInLIR 
-Dtests.seed=10B31070AB4A4496 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true -Dtests.badapples=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-7.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=sv-SE -Dtests.timezone=Africa/Lusaka -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.60s J1 | LIROnShardRestartTest.testSeveralReplicasInLIR 
<<<
   [junit4]    > Throwable #1: 
org.apache.solr.client.solrj.SolrServerException: No live SolrServers available 
to handle this request
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([10B31070AB4A4496:96E987448B1009CF]:0)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:461)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1110)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:884)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:817)
   [junit4]    >        at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
   [junit4]    >        at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
   [junit4]    >        at 
org.apache.solr.cloud.LIROnShardRestartTest.testSeveralReplicasInLIR(LIROnShardRestartTest.java:190)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
{noformat}

Here is another error showing how the effects of one test method may not be 
adequately cleaned up by the time the next test method starts...

{noformat}
   [junit4]   2> NOTE: reproduce with: ant test  
-Dtestcase=LIROnShardRestartTest -Dtests.method=testSeveralReplicasInLIR 
-Dtests.seed=10B31070AB4A4496 -Dtests.multiplier=2 -Dtests.nightly=true 
-Dtests.slow=true -Dtests.badapples=true 
-Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-BadApples-NightlyTests-7.x/test-data/enwiki.random.lines.txt
 -Dtests.locale=sv-SE -Dtests.timezone=Africa/Lusaka -Dtests.asserts=true 
-Dtests.file.encoding=UTF-8
   [junit4] ERROR   0.54s J2 | LIROnShardRestartTest.testSeveralReplicasInLIR 
<<<
   [junit4]    > Throwable #1: 
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
from server at http://127.0.0.1:44441/solr: Cannot create collection 
severalReplicasInLIR. Value of maxShardsPerNode is 1, and the number of nodes 
currently live or live and part of your createNodeSet is 2. This allows a 
maximum of 2 to be created. Value of numShards is 1, value of nrtReplicas is 3, 
value of tlogReplicas is 0 and value of pullReplicas is 0. This requires 3 
shards to be created (higher than the allowed number)
   [junit4]    >        at 
__randomizedtesting.SeedInfo.seed([10B31070AB4A4496:96E987448B1009CF]:0)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:484)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:414)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:1110)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:884)
   [junit4]    >        at 
org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:817)
   [junit4]    >        at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
   [junit4]    >        at 
org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:211)
   [junit4]    >        at 
org.apache.solr.cloud.LIROnShardRestartTest.testSeveralReplicasInLIR(LIROnShardRestartTest.java:190)
   [junit4]    >        at java.lang.Thread.run(Thread.java:748)
{noformat}
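
The arithmetic behind that last error message can be spelled out as follows.  
This is just my reading of the numbers in the message, written as a 
hypothetical helper, not the actual placement logic in Solr: only 2 of the 3 
nodes were live again when the collection was created, so 3 replicas cannot 
fit.

```java
public class ReplicaCapacitySketch {

    /** Most cores that can be placed: maxShardsPerNode on every live node. */
    public static int maxAllowed(int maxShardsPerNode, int liveNodes) {
        return maxShardsPerNode * liveNodes;
    }

    /** Cores the request needs: every shard gets every requested replica. */
    public static int required(int numShards, int nrt, int tlog, int pull) {
        return numShards * (nrt + tlog + pull);
    }

    /** True if the requested collection fits on the currently live nodes. */
    public static boolean fits(int maxShardsPerNode, int liveNodes,
                               int numShards, int nrt, int tlog, int pull) {
        return required(numShards, nrt, tlog, pull)
            <= maxAllowed(maxShardsPerNode, liveNodes);
    }

    public static void main(String[] args) {
        // Values taken from the error message: maxShardsPerNode=1, 2 live
        // nodes, numShards=1, nrtReplicas=3, tlogReplicas=0, pullReplicas=0.
        System.out.println("allowed:  " + maxAllowed(1, 2));
        System.out.println("required: " + required(1, 3, 0, 0));
        System.out.println("fits with 2 live nodes: " + fits(1, 2, 1, 3, 0, 0));
        System.out.println("fits with 3 live nodes: " + fits(1, 3, 1, 3, 0, 0));
    }
}
```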


> numerous problems with LIROnShardRestartTest
> --------------------------------------------
>
>                 Key: SOLR-13236
>                 URL: https://issues.apache.org/jira/browse/SOLR-13236
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Priority: Major
>
> LIROnShardRestartTest is a frequent cause of jenkins failures -- but only on 
> the 7x jenkins jobs, because it was removed from master/8x as part of 
> SOLR-11812 since the underlying implementation being tested was deprecated 
> and removed in 8x.
> I spent some time looking into trying to fix this test, but the amount of 
> work it appears it would take to fix doesn't seem worth the effort given its 
> deprecated status.  So i'm filing this issue purely for tracking purposes 
> with the plan to disable the test and resolve this jira as "Won't Fix" -- if 
> anyone else is interested in working on it they can feel free to re-open.


