[
https://issues.apache.org/jira/browse/SOLR-13240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906293#comment-16906293
]
Christine Poerschke commented on SOLR-13240:
--------------------------------------------
Thanks [~goodman] revising the patch, this time it passed QA (yay!) and I've
gone ahead and pulled out two small separate bits from it into the commit above
(to potentially make for an easier diff on the remaining changes).
So as a next step then I was trying to better understand the {{TestPolicy}}
changes:
* for TestPolicy.testReplicaCountSuggestions just a comment seems sufficient
* for TestPolicy.testNodeLostMultipleReplica it was a bit tricky to understand
first of all what the existing test actually does; with that in mind the
just-attached patch includes two small refactors to surface the sequential
nature of part of the test – would appreciate any thoughts on that.
With the small refactor in place we can more easily see that two sub-tests are
being changed:
* "lets change the policy such that all replicas that were on node1 can now
fit on node2"
* "now lets change the policy such that a node can have 2 shard2 replicas"
Now to the tricky bit. My interpretion of the revised test expectations is that
now then the "a node can have 2 shardX replicas" condition is no longer being
met i.e. previously r3 and r5 were expected to be on node2 but now r3 and r5
are expected on different nodes. So whilst technically speaking the test
passes, the coverage it intended to provide has been lost and so further
adjustments to that sub-test might be needed to preserve the intended coverage.
What do you think?
> UTILIZENODE action results in an exception
> ------------------------------------------
>
> Key: SOLR-13240
> URL: https://issues.apache.org/jira/browse/SOLR-13240
> Project: Solr
> Issue Type: Bug
> Components: AutoScaling
> Affects Versions: 7.6
> Reporter: Hendrik Haddorp
> Priority: Major
> Attachments: SOLR-13240.patch, SOLR-13240.patch, SOLR-13240.patch,
> SOLR-13240.patch, SOLR-13240.patch, SOLR-13240.patch, SOLR-13240.patch,
> solr-solrj-7.5.0.jar
>
>
> When I invoke the UTILIZENODE action the REST call fails like this after it
> moved a few replicas:
> {
> "responseHeader":{
> "status":500,
> "QTime":40220},
> "Operation utilizenode caused
> exception:":"java.lang.IllegalArgumentException:java.lang.IllegalArgumentException:
> Comparison method violates its general contract!",
> "exception":{
> "msg":"Comparison method violates its general contract!",
> "rspCode":-1},
> "error":{
> "metadata":[
> "error-class","org.apache.solr.common.SolrException",
> "root-error-class","org.apache.solr.common.SolrException"],
> "msg":"Comparison method violates its general contract!",
> "trace":"org.apache.solr.common.SolrException: Comparison method violates
> its general contract!\n\tat
> org.apache.solr.client.solrj.SolrResponse.getException(SolrResponse.java:53)\n\tat
>
> org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:274)\n\tat
>
> org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:246)\n\tat
>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)\n\tat
>
> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:734)\n\tat
> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:715)\n\tat
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:496)\n\tat
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:377)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:323)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1634)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)\n\tat
>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)\n\tat
>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1317)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)\n\tat
>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1219)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
> org.eclipse.jetty.server.Server.handle(Server.java:531)\n\tat
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:352)\n\tat
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)\n\tat
>
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:281)\n\tat
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)\n\tat
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)\n\tat
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)\n\tat
>
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:762)\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:680)\n\tat
> java.lang.Thread.run(Thread.java:748)\n",
> "code":500}}
> The logs show this additional exception:
> 2019-02-10 00:09:00.539 ERROR
> (OverseerThreadFactory-1268-thread-38-processing-n:agent2:9151_solr) [ ]
> o.a.s.c.a.c.OverseerCollectionMessageHandler Operation utilizenode
> failed:java.lang.IllegalArgumentException: Comparison method violates its
> general contract!
> at java.util.TimSort.mergeLo(TimSort.java:777)
> at java.util.TimSort.mergeAt(TimSort.java:514)
> at java.util.TimSort.mergeCollapse(TimSort.java:439)
> at java.util.TimSort.sort(TimSort.java:245)
> at java.util.Arrays.sort(Arrays.java:1512)
> at java.util.ArrayList.sort(ArrayList.java:1462)
> at
> org.apache.solr.client.solrj.cloud.autoscaling.MoveReplicaSuggester.tryEachNode(MoveReplicaSuggester.java:50)
> at
> org.apache.solr.client.solrj.cloud.autoscaling.MoveReplicaSuggester.init(MoveReplicaSuggester.java:38)
> at
> org.apache.solr.client.solrj.cloud.autoscaling.Suggester.getSuggestion(Suggester.java:187)
> at
> org.apache.solr.cloud.api.collections.UtilizeNodeCmd.call(UtilizeNodeCmd.java:100)
> at
> org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:259)
> at
> org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:478)
> at
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> I suspect this to be caused by this comparator:
> https://github.com/apache/lucene-solr/blob/master/solr/solrj/src/java/org/apache/solr/client/solrj/cloud/autoscaling/MoveReplicaSuggester.java#L98
> In case it is possible that both compared replicas are leaders the result
> would not be correct.
> see also the mail thread about this:
> https://www.mail-archive.com/[email protected]&q=subject:%22Re%5C%3A+CloudSolrClient+getDocCollection%22&o=newest&f=1
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]