On Fri, Apr 20, 2018 at 9:43 AM Thomas SEGISMONT <tsegism...@gmail.com> wrote:
> I tried our test suite on a slower machine (iMac from 2011). It passes
> consistently there.
>
> On my laptop, I keep seeing this from time to time (in different tests):
>
> 2018-04-19T19:53:09.513 WARN [Context=org.infinispan.LOCKS] ISPN000320:
> After merge (or coordinator change), cache still hasn't recovered a
> majority of members and must stay in degraded mode. Current members are
> [sombrero-19385], lost members are [sombrero-42917], stable members are
> [sombrero-42917, sombrero-19385]

I would expect the nodes to leave gracefully, which shouldn't cause a
merge, so I am not sure how your test is producing one. Can you produce a
TRACE log and a JIRA for it?

However, if there is a merge, a partition reduced to a single node will
always end up in DEGRADED mode when partition handling is enabled. This is
due to not having a simple majority, as described in
http://infinispan.org/docs/stable/user_guide/user_guide.html#partition_handling

> It happens when we shutdown nodes one after the other (even when waiting
> for cluster status to be "healthy" plus an extra 2 seconds).
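For reference, "simple majority" here is strict: a surviving partition must retain more than half of the members of the last stable topology. A minimal, self-contained sketch of that rule (hypothetical helper, not the actual Infinispan implementation):

```java
import java.util.List;
import java.util.Set;

public class MajorityCheck {
    // Hypothetical helper mirroring the partition-handling rule from the
    // user guide: a partition stays available only if it retains strictly
    // more than half of the last stable topology's members.
    static boolean hasMajority(Set<String> currentMembers, List<String> stableMembers) {
        long retained = stableMembers.stream().filter(currentMembers::contains).count();
        return retained * 2 > stableMembers.size();
    }

    public static void main(String[] args) {
        // The situation from the log above: one survivor out of two stable members.
        System.out.println(hasMajority(
                Set.of("sombrero-19385"),
                List.of("sombrero-42917", "sombrero-19385"))); // prints false
    }
}
```

With two stable members, a lone survivor holds exactly half, never a majority, which is why a two-node cluster that loses one member through a merge stays DEGRADED.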
> After that, the nodes remain blocked in DefaultCacheManager.stop:
>
> 2018-04-19T19:49:29.242 AVERTISSEMENT Thread
> Thread[vert.x-worker-thread-5,5,main] has been blocked for 60774 ms, time
> limit is 60000
> io.vertx.core.VertxException: Thread blocked
>     at sun.misc.Unsafe.park(Native Method)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1693)
>     at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
>     at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1729)
>     at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:1934)
>     at org.infinispan.manager.DefaultCacheManager.terminate(DefaultCacheManager.java:688)
>     at org.infinispan.manager.DefaultCacheManager.stopCaches(DefaultCacheManager.java:734)
>     at org.infinispan.manager.DefaultCacheManager.stop(DefaultCacheManager.java:711)
>     at io.vertx.ext.cluster.infinispan.InfinispanClusterManager.lambda$leave$5(InfinispanClusterManager.java:285)
>     at io.vertx.ext.cluster.infinispan.InfinispanClusterManager$$Lambda$421/578931659.handle(Unknown Source)
>     at io.vertx.core.impl.ContextImpl.lambda$executeBlocking$1(ContextImpl.java:265)
>     at io.vertx.core.impl.ContextImpl$$Lambda$27/1330754528.run(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>     at java.lang.Thread.run(Thread.java:748)

This looks like the exact issue that Radim mentioned in
https://issues.jboss.org/browse/ISPN-8859.
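The trace shows stop() parked in CompletableFuture.join(), which waits without any bound on a Vert.x worker thread. Until ISPN-8859 is resolved, a defensive caller-side pattern is to wait with a timeout so a stuck shutdown surfaces as an error instead of a hang. A sketch (hypothetical helper, plain JDK, not Infinispan or Vert.x API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedStop {
    // Hypothetical guard: wait on a shutdown future with a bound instead of
    // join(), so a stuck stop() is reported rather than blocking forever.
    static boolean awaitStop(CompletableFuture<Void> stopFuture, long timeoutSeconds) {
        try {
            stopFuture.get(timeoutSeconds, TimeUnit.SECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // stop did not finish in time; caller can log and escalate
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false; // stop itself failed
        }
    }

    public static void main(String[] args) {
        // A never-completing future stands in for a stop() that deadlocks.
        System.out.println(awaitStop(new CompletableFuture<>(), 1)); // prints false
    }
}
```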
> 2018-04-18 17:00 GMT+02:00 Thomas SEGISMONT <tsegism...@gmail.com>:
>
>> So here's the Circular Referenced Suppressed Exception:
>>
>> [stateTransferExecutor-thread--p221-t33] 2018-04-18T16:15:06.662+02:00
>> WARN [org.infinispan.statetransfer.InboundTransferTask] ISPN000210: Failed
>> to request state of cache __vertx.subs from node sombrero-25286, segments {0}
>> org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400:
>> Node sombrero-25286 was suspected
>>     at org.infinispan.remoting.transport.ResponseCollectors.remoteNodeSuspected(ResponseCollectors.java:33)
>>     at org.infinispan.remoting.transport.impl.SingleResponseCollector.targetNotFound(SingleResponseCollector.java:31)
>>     at org.infinispan.remoting.transport.impl.SingleResponseCollector.targetNotFound(SingleResponseCollector.java:17)
>>     at org.infinispan.remoting.transport.ValidSingleResponseCollector.addResponse(ValidSingleResponseCollector.java:23)
>>     at org.infinispan.remoting.transport.impl.SingleTargetRequest.receiveResponse(SingleTargetRequest.java:51)
>>     at org.infinispan.remoting.transport.impl.SingleTargetRequest.onNewView(SingleTargetRequest.java:42)
>>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.addRequest(JGroupsTransport.java:921)
>>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeCommand(JGroupsTransport.java:815)
>>     at org.infinispan.remoting.transport.jgroups.JGroupsTransport.invokeCommand(JGroupsTransport.java:123)
>>     at org.infinispan.remoting.rpc.RpcManagerImpl.invokeCommand(RpcManagerImpl.java:138)
>>     at org.infinispan.statetransfer.InboundTransferTask.startTransfer(InboundTransferTask.java:134)
>>     at org.infinispan.statetransfer.InboundTransferTask.requestSegments(InboundTransferTask.java:113)
>>     at org.infinispan.conflict.impl.StateReceiverImpl$SegmentRequest.lambda$requestState$2(StateReceiverImpl.java:164)
>>     at org.infinispan.executors.LimitedExecutor.lambda$executeAsync$1(LimitedExecutor.java:101)
>>     at org.infinispan.executors.LimitedExecutor.runTasks(LimitedExecutor.java:144)
>>     at org.infinispan.executors.LimitedExecutor.access$100(LimitedExecutor.java:33)
>>     at org.infinispan.executors.LimitedExecutor$Runner.run(LimitedExecutor.java:174)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>     at java.lang.Thread.run(Thread.java:748)
>>     Suppressed: java.util.concurrent.ExecutionException:
>> org.infinispan.remoting.transport.jgroups.SuspectException: ISPN000400:
>> Node sombrero-25286 was suspected
>>         at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>>         at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>>         at org.infinispan.util.concurrent.CompletableFutures.await(CompletableFutures.java:82)
>>         at org.infinispan.remoting.rpc.RpcManagerImpl.blocking(RpcManagerImpl.java:260)
>>         ... 10 more
>>     [CIRCULAR REFERENCE: org.infinispan.remoting.transport.jgroups.SuspectException:
>> ISPN000400: Node sombrero-25286 was suspected]
>>
>> It does not happen with 9.2.0.Final and prevents using ISPN embedded
>> with logback. Do you want me to file an issue?
>>
>> 2018-04-18 11:45 GMT+02:00 Thomas SEGISMONT <tsegism...@gmail.com>:
>>
>>> Hi folks,
>>>
>>> Sorry, I've been busy on other things and couldn't get back to you
>>> earlier.
>>>
>>> I tried running the vertx-infinispan test suite with 9.2.1.Final today.
>>> There are some problems still, but I can't say which ones yet because I hit:
>>> https://jira.qos.ch/browse/LOGBACK-1027
>>>
>>> We use logback for test logs and all I get is:
>>>
>>> 2018-04-18 11:37:46,678 [stateTransferExecutor-thread--p4453-t24] ERROR
>>> o.i.executors.LimitedExecutor - Exception in task
>>> java.lang.StackOverflowError: null
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:54)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:60)
>>>     at ch.qos.logback.classic.spi.ThrowableProxy.<init>(ThrowableProxy.java:72)
>>>     ... and so on
>>>
>>> I will run the suite again without logback and tell you what the actual
>>> problem is.
>>>
>>> Regards,
>>> Thomas
>>>
>>> 2018-03-27 11:15 GMT+02:00 Pedro Ruivo <pe...@infinispan.org>:
>>>
>>>> JIRA: https://issues.jboss.org/browse/ISPN-8994
>>>>
>>>> On 27-03-2018 10:08, Pedro Ruivo wrote:
>>>> >
>>>> > On 27-03-2018 09:03, Sebastian Laskawiec wrote:
>>>> >> At the moment, the cluster health status checker enumerates all caches
>>>> >> in the cache manager [1] and checks whether those caches are running
>>>> >> and not in degraded mode [2].
>>>> >>
>>>> >> I'm not sure how counter caches have been implemented.
>>>> >> One thing is for sure - they should be taken into account in this
>>>> >> loop [3].
>>>> >
>>>> > The private caches aren't listed by CacheManager.getCacheNames(). We
>>>> > have to check them via InternalCacheRegistry.getInternalCacheNames().
>>>> >
>>>> > I'll open a JIRA if you don't mind :)
>>>> >
>>>> >> [1]
>>>> >> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L22
>>>> >>
>>>> >> [2]
>>>> >> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/CacheHealthImpl.java#L25
>>>> >>
>>>> >> [3]
>>>> >> https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/health/impl/ClusterHealthImpl.java#L23-L24
>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> infinispan-dev@lists.jboss.org
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
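To make the fix discussed above (ISPN-8994) concrete: a self-contained model of the corrected health loop. All names here are stand-ins, not the real ClusterHealthImpl code; the two name sets stand in for EmbeddedCacheManager.getCacheNames() and InternalCacheRegistry.getInternalCacheNames(), and the predicate stands in for "cache is running and not degraded".

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

public class ClusterHealthSketch {
    // Hypothetical model of the fix: walk both public and internal cache
    // names, since CacheManager.getCacheNames() does not list private caches.
    static boolean clusterHealthy(Set<String> publicNames,
                                  Set<String> internalNames,
                                  Predicate<String> cacheHealthy) {
        Set<String> all = new LinkedHashSet<>(publicNames);
        all.addAll(internalNames); // the missing step: include internal caches
        return all.stream().allMatch(cacheHealthy);
    }

    public static void main(String[] args) {
        // With internal caches included, a degraded counter cache now makes
        // the cluster report unhealthy instead of being silently skipped.
        System.out.println(clusterHealthy(
                Set.of("__vertx.subs"),
                Set.of("org.infinispan.COUNTER"),
                name -> !name.contains("COUNTER"))); // prints false
    }
}
```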
