Thanks Dan! Seems to work fine now. I still don't like the exceptions being logged when a node is shutting down, but they are harmless.
Cheers,
Sanne

On 8 February 2012 10:17, Dan Berindei <[email protected]> wrote:
> Sanne,
>
> I was able to run LiveRunningTest as well after I removed
> TestableJGroupsTransport from the Infinispan configuration, and I
> disabled queueing in the SHARED_LOOPBACK OOB thread pool:
>
> <SHARED_LOOPBACK
>     thread_pool.enabled="true"
>     thread_pool.min_threads="2"
>     thread_pool.max_threads="30"
>     thread_pool.keep_alive_time="60000"
>     thread_pool.queue_enabled="false"
>     thread_pool.queue_max_size="100"
>     thread_pool.rejection_policy="Discard"
>
>     oob_thread_pool.enabled="true"
>     oob_thread_pool.min_threads="2"
>     oob_thread_pool.max_threads="30"
>     oob_thread_pool.keep_alive_time="60000"
>     oob_thread_pool.queue_enabled="false"
>     oob_thread_pool.queue_max_size="100"
>     oob_thread_pool.rejection_policy="Discard"
> />
>
>
> I think the test fails with queuing enabled and core thread pool size
> 2 because the coordinator sends a PREPARE_VIEW command and several
> APPLY_STATE commands (at least one for each cache) at approximately
> the same time. If two APPLY_STATE commands get to the other node
> before the PREPARE_VIEW command, they will be stuck waiting for state
> transfer to start.
>
> FD also sends messages using OOB, so if the OOB thread pool stops
> processing messages FD on other members will soon suspect the stuck
> member and kick it out of the cluster.
>
> For now I think increasing the number of available threads is the only
> solution. For 5.2 I'm thinking of moving both the sending of state and
> the handling of state to a separate thread, so that OOB threads won't
> have to block waiting for the state transfer to start.
>
> Cheers
> Dan
>
>
> On Wed, Feb 8, 2012 at 9:59 AM, Dan Berindei <[email protected]> wrote:
>> Hi Sanne
>>
>> I got the sources and even TwoNodesTest hang for me every time.
>>
>> I think the problem is that your TestableJGroupsTransport is trying to
>> modify the cluster name during startup - which is no longer supported.
>>
>> I have also created https://issues.jboss.org/browse/ISPN-1852 to fix
>> startup so that after an error like this another getCache() call
>> doesn't block forever. Ideally it should report the same error,
>> whether we attempt to start the component again or we save the
>> exception somewhere.
>>
>> Cheers
>> Dan
>>
>>
>> On Tue, Feb 7, 2012 at 6:15 PM, Sanne Grinovero <[email protected]> wrote:
>>> Dan,
>>> you can easily checkout Hibernate Search, it's a Maven project and you
>>> should be able to set it up in your IDE quickly.
>>>
>>> git clone git://github.com/Sanne/hibernate-search.git
>>> git checkout componentsUpdates
>>>
>>> Then the failing test is in the module "hibernate-search-infinispan"..
>>> which is just a couple of classes.
>>>
>>> Sanne
>>>
>>>
>>>
>>> On 7 February 2012 16:10, Dan Berindei <[email protected]> wrote:
>>>> Rado, is there a specific test in the AS7 test suite that is failing?
>>>> Is it only in Jenkins or on your machine as well?
>>>>
>>>> I only know about https://issues.jboss.org/browse/ISPN-1806, but Paul
>>>> said that he doesn't see it any more in CI runs (he never managed to
>>>> reproduce it on his machine).
>>>>
>>>> Cheers
>>>> Dan
>>>>
>>>>
>>>> On Tue, Feb 7, 2012 at 3:13 PM, Radoslav Husar <[email protected]> wrote:
>>>>> I am also seeing this/similar exception in AS7 during session
>>>>> replication even with 5.1.1.FINAL :-(
>>>>>
>>>>> On 02/07/2012 01:54 PM, Dan Berindei wrote:
>>>>>> Sanne, this sounds very similar to
>>>>>> https://issues.jboss.org/browse/ISPN-1814, but I thought I had fixed
>>>>>> that for 5.1.1.FINAL.
>>>>>>
>>>>>> I see CacheViewsManagerImpl is trying to install a view with 6 nodes,
>>>>>> should there be 6 nodes in the cluster or should there be less nodes?
>>>>>> Do you have DEBUG logs for org.infinispan and org.jgroups?
>>>>>>
>>>>>> Cheers
>>>>>> Dan
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 7, 2012 at 12:58 PM, Sanne Grinovero<[email protected]>
>>>>>> wrote:
>>>>>>> Can anyone explain this error?
>>>>>>>
>>>>>>> I'm updating Hibernate Search, and having a simple test which in a loop
>>>>>>> does:
>>>>>>>
>>>>>>> - write to shared index
>>>>>>> - add a node / remove a node
>>>>>>> - wait for joins
>>>>>>> - verifies index state
>>>>>>>
>>>>>>> This is expected to work, as it already did with all previous
>>>>>>> Infinispan versions.
>>>>>>>
>>>>>>> Using Infinispan 5.1.1.FINAL and JGroups 3.0.5.Final.
>>>>>>>
>>>>>>> 2012-02-07 10:42:38,668 WARN [CacheViewControlCommand]
>>>>>>> (OOB-4,sanne-20017) ISPN000071: Caught exception when handling command
>>>>>>> CacheViewControlCommand{cache=LuceneIndexesMetadata,
>>>>>>> type=PREPARE_VIEW, sender=sanne-3158, newViewId=8,
>>>>>>> newMembers=[sanne-3158, sanne-63971, sanne-20017, sanne-2794,
>>>>>>> sanne-25511, sanne-30075], oldViewId=7, oldMembers=[sanne-3158,
>>>>>>> sanne-63971, sanne-20017, sanne-2794, sanne-25511]}
>>>>>>> java.util.concurrent.ExecutionException:
>>>>>>> org.infinispan.remoting.transport.jgroups.SuspectException: One or
>>>>>>> more nodes have left the cluster while replicating command
>>>>>>> StateTransferControlCommand{cache=LuceneIndexesMetadata,
>>>>>>> type=APPLY_STATE, sender=sanne-20017, viewId=8, state=4}
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> [email protected]
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
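
For readers of the archive, Dan's explanation of why the test hangs with queueing enabled and a core pool size of 2 can be illustrated with a plain java.util.concurrent.ThreadPoolExecutor. This is only a sketch under assumptions: it assumes the JGroups OOB pool behaves like a standard ThreadPoolExecutor backed by a bounded queue when queue_enabled="true" and by a direct hand-off when it is "false". The class name OobPoolSketch, the method threadsStarted and the latch standing in for "state transfer has started" are hypothetical and not taken from Infinispan or JGroups. Three handlers are submitted that all block, as the two APPLY_STATE commands and the later PREPARE_VIEW would.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not Infinispan or JGroups code.
class OobPoolSketch {

    // Submits `tasks` handlers that all block on one latch and returns how many
    // worker threads the pool had created once every handler was submitted.
    static int threadsStarted(boolean queueEnabled, int tasks) {
        BlockingQueue<Runnable> queue = queueEnabled
                ? new LinkedBlockingQueue<Runnable>(100) // like queue_max_size="100"
                : new SynchronousQueue<Runnable>();      // direct hand-off, no queueing
        // min_threads=2, max_threads=30, keep_alive_time=60000, rejection_policy=Discard
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 30, 60000, TimeUnit.MILLISECONDS, queue,
                new ThreadPoolExecutor.DiscardPolicy());

        // Stand-in for "state transfer has started"; every handler waits on it.
        CountDownLatch stateTransferStarted = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    stateTransferStarted.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // With a queue, work beyond the 2 core threads waits in the queue;
        // without one, the pool starts extra threads up to the maximum.
        int started = pool.getPoolSize();

        // Release the handlers and shut the pool down cleanly.
        stateTransferStarted.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return started;
    }

    public static void main(String[] args) {
        System.out.println("queue enabled:  " + threadsStarted(true, 3));  // prints 2
        System.out.println("queue disabled: " + threadsStarted(false, 3)); // prints 3
    }
}
```

With the queue enabled the third handler sits in the queue behind two blocked threads, which matches the situation Dan describes for PREPARE_VIEW arriving after two APPLY_STATE commands. With the queue disabled the pool creates a third thread, so the command that unblocks the others can still run.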
