[ 
https://issues.apache.org/jira/browse/SOLR-12075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrzej Bialecki  resolved SOLR-12075.
--------------------------------------
    Resolution: Fixed

Fixed as a part of SOLR-12923.

> TestLargeCluster is too flaky
> -----------------------------
>
>                 Key: SOLR-12075
>                 URL: https://issues.apache.org/jira/browse/SOLR-12075
>             Project: Solr
>          Issue Type: Bug
>          Components: AutoScaling
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>            Priority: Major
>
> This test fails frequently in Jenkins builds, with two types of failures:
>  * specific test method failures - these may be caused by bugs in the 
> autoscaling code, bugs in the simulator, or timing issues. It should be 
> possible to narrow down the cause by running with different speeds of 
> simulated time.
>  * suite-level failures due to leaked threads - most of these failures 
> show Policy calculations that are still in progress, e.g.:
> {code}
> com.carrotsearch.randomizedtesting.ThreadLeakError: 1 thread leaked from SUITE scope at org.apache.solr.cloud.autoscaling.sim.TestLargeCluster: 
>   1) Thread[id=21406, name=AutoscalingActionExecutor-7277-thread-1, state=RUNNABLE, group=TGRP-TestLargeCluster]
>        at java.util.ArrayList.iterator(ArrayList.java:834)
>        at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:131)
>        at org.apache.solr.common.util.Utils.makeDeepCopy(Utils.java:110)
>        at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:92)
>        at org.apache.solr.common.util.Utils.makeDeepCopy(Utils.java:108)
>        at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:92)
>        at org.apache.solr.common.util.Utils.getDeepCopy(Utils.java:74)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Row.copy(Row.java:91)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.lambda$getMatrixCopy$1(Policy.java:297)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session$$Lambda$466/1757323495.apply(Unknown Source)
>        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
>        at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
>        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
>        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
>        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.getMatrixCopy(Policy.java:298)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Policy$Session.copy(Policy.java:287)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Row.removeReplica(Row.java:156)
>        at org.apache.solr.client.solrj.cloud.autoscaling.MoveReplicaSuggester.tryEachNode(MoveReplicaSuggester.java:60)
>        at org.apache.solr.client.solrj.cloud.autoscaling.MoveReplicaSuggester.init(MoveReplicaSuggester.java:34)
>        at org.apache.solr.client.solrj.cloud.autoscaling.Suggester.getSuggestion(Suggester.java:129)
>        at org.apache.solr.cloud.autoscaling.ComputePlanAction.process(ComputePlanAction.java:98)
>        at org.apache.solr.cloud.autoscaling.ScheduledTriggers.lambda$null$3(ScheduledTriggers.java:307)
>        at org.apache.solr.cloud.autoscaling.ScheduledTriggers$$Lambda$439/951218654.run(Unknown Source)
>        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:188)
>        at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$9/1677458082.run(Unknown Source)
>        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>        at java.lang.Thread.run(Thread.java:748)
>        at __randomizedtesting.SeedInfo.seed([C6FA0364D13DAFCC]:0)
> {code}
> It's possible that an InterruptedException is caught somewhere and not 
> propagated, so the Policy calculations don't terminate when the thread is 
> interrupted while the parent components are being closed.
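
The suspected failure mode can be illustrated with a minimal sketch. This is not Solr code: the class InterruptPropagationSketch, the methods blockingStep and runSteps, and their parameters are hypothetical names made up for illustration. The sketch assumes a calculation loop that polls the thread's interrupt flag between steps, and it simulates an interrupt arriving from outside by having the thread interrupt itself at a chosen step. When the step swallows the InterruptedException, the flag is cleared and the loop keeps running; when the step restores the flag, the loop stops after the current step.

```java
class InterruptPropagationSketch {

    // Hypothetical stand-in for one blocking step of a long calculation.
    static void blockingStep(boolean restoreFlag) {
        try {
            // Thread.sleep throws immediately if the interrupt flag is
            // already set, and clears the flag when it throws.
            Thread.sleep(1);
        } catch (InterruptedException e) {
            if (restoreFlag) {
                // Re-set the flag so callers further up can still see it.
                Thread.currentThread().interrupt();
            }
            // Otherwise the interrupt is silently dropped (the suspected bug).
        }
    }

    // Runs up to maxSteps steps, polling the interrupt flag before each one.
    // At step interruptAtStep the thread interrupts itself, standing in for
    // an interrupt sent by a parent component that is shutting down.
    // Returns the number of steps that were executed.
    static int runSteps(boolean restoreFlag, int interruptAtStep, int maxSteps) {
        int steps = 0;
        while (steps < maxSteps && !Thread.currentThread().isInterrupted()) {
            if (steps == interruptAtStep) {
                Thread.currentThread().interrupt();
            }
            blockingStep(restoreFlag);
            steps++;
        }
        // Clear the flag so the calling thread is not left interrupted.
        Thread.interrupted();
        return steps;
    }

    public static void main(String[] args) {
        System.out.println("swallowed: " + runSteps(false, 2, 10) + " steps");
        System.out.println("restored:  " + runSteps(true, 2, 10) + " steps");
    }
}
```

With the interrupt swallowed, the loop runs all 10 steps; with the flag restored, it stops after 3. In a real executor thread the first case would correspond to a task that keeps running after shutdownNow(), which is the kind of leak the test framework reports. Whether this is what actually happens in the Policy calculations is not confirmed here.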


