Hi,

Ignite uses a ring topology. Almost all network exchange is done via TcpDiscoverySpi [1] (note port 47500 in your config). Topology updates and cluster heartbeats use it. The failure detection timeout is the time window within which every node must send an update to the NEXT node in the topology via discovery. Ignite also allows nodes to communicate with each other directly via CommunicationSpi (port 47100 by default).
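To check whether both of the ports mentioned above are reachable from another host, a plain TCP connect is usually enough. Below is a minimal sketch using only the JDK. The class name PortCheck and the 2-second timeout are made up for illustration, and 47500/47100 are simply the default ports named above, so adjust them if your nodes use different ones.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults: local host, default discovery and communication ports.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int[] ports = {47500, 47100};
        String[] names = {"discovery", "communication"};

        for (int i = 0; i < ports.length; i++) {
            boolean ok = isReachable(host, ports[i], 2000);
            System.out.printf("%s port %d on %s: %s%n",
                names[i], ports[i], host, ok ? "reachable" : "NOT reachable");
        }
    }
}
```

A successful connect only shows that something is listening and the firewall lets the connection through. It does not prove that the listener is a healthy Ignite node.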
From the stack trace you can see that the connection to the communication port failed. Sometimes users forget to open the communication ports for nodes and keep only the discovery ports open. This can cause grid operations to hang, as nodes are able neither to exchange data nor to leave the topology. So if a node cannot be reached via communication for some time, it should be kicked out of the topology. That is what you see in the logs.

[1] https://ignite.apache.org/releases/mobile/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html

On Thu, Apr 20, 2017 at 6:31 AM, <[email protected]> wrote:

> Hi Team,
>
> I want to configure failureDetectionTimeout so that I can customize after
> how long the clients get disconnected, in case of server failure.
>
> Just for testing purposes, I brought up one server and one client in a
> cluster, and had the below property set.
>
> Here's my config:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <beans xmlns="http://www.springframework.org/schema/beans"
>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>        xmlns:util="http://www.springframework.org/schema/util"
>        xsi:schemaLocation="
>            http://www.springframework.org/schema/beans
>            http://www.springframework.org/schema/beans/spring-beans-2.5.xsd
>            http://www.springframework.org/schema/util
>            http://www.springframework.org/schema/util/spring-util-2.0.xsd">
>     <bean class="org.apache.ignite.configuration.IgniteConfiguration">
>         <!-- Set to true to enable grid-aware class loading for examples, default is false. -->
>         <property name="peerClassLoadingEnabled" value="true"/>
>         <property name="failureDetectionTimeout" value="20000"/>
>
>         <!-- Enable events for examples. -->
>         <property name="includeEventTypes">
>             <util:constant static-field="org.apache.ignite.events.EventType.EVTS_ALL"/>
>         </property>
>
>         <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
>         <property name="discoverySpi">
>             <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
>                 <property name="ipFinder">
>                     <!-- Uncomment multicast IP finder to enable multicast-based discovery of initial nodes. -->
>                     <!--<bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">-->
>                     <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
>                         <property name="addresses">
>                             <list>
>                                 <!-- In distributed environment, replace with actual host IP address. -->
>                                 <value>127.0.0.1:47500</value>
>                             </list>
>                         </property>
>                     </bean>
>                 </property>
>             </bean>
>         </property>
>
>         <property name="cacheConfiguration">
>             <bean class="org.apache.ignite.configuration.CacheConfiguration">
>                 <property name="name" value="test_NextcacheLocalStore"/>
>                 <property name="cacheMode" value="PARTITIONED"/>
>             </bean>
>         </property>
>     </bean>
> </beans>
>
> What's happening is that when I bring down my server, the client gets
> disconnected before the failureDetectionTimeout has passed.
>
> I brought down the server @ 8:42:50, and the client gets disconnected
> within 10 seconds.
> Here are the logs (from client):
>
> Apr 20, 2017 8:52:52 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Connect timed out (consider increasing 'failureDetectionTimeout' configuration property) [addr=/0:0:0:0:0:0:0:1:47100, failureDetectionTimeout=20000]
> Apr 20, 2017 8:52:53 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Connect timed out (consider increasing 'failureDetectionTimeout' configuration property) [addr=/127.0.0.1:47100, failureDetectionTimeout=20000]
> Apr 20, 2017 8:52:54 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Connect timed out (consider increasing 'failureDetectionTimeout' configuration property) [addr=NYKDWMVDI012486.INTRANET.BARCAPINT.com/10.136.138.135:47100, failureDetectionTimeout=20000]
> Apr 20, 2017 8:52:54 AM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Failed to connect to a remote node (make sure that destination node is alive and operating system firewall is disabled on local and remote hosts) [addrs=[/0:0:0:0:0:0:0:1:47100, /127.0.0.1:47100, NYKDWMVDI012486.INTRANET.BARCAPINT.com/10.136.138.135:47100]]
> Apr 20, 2017 8:52:58 AM org.apache.ignite.logger.java.JavaLogger error
> SEVERE: Failed to reconnect to cluster (consider increasing 'networkTimeout' configuration property) [networkTimeout=5000]
> Apr 20, 2017 8:53:03 AM org.apache.ignite.logger.java.JavaLogger info
> INFO:
>
> >>> +---------------------------------------------------------------------------------+
> >>> Ignite ver. 1.7.3#20161110-sha1:10582ae13b52d679a5827b409328a452ead2f1aa stopped OK
> >>> +---------------------------------------------------------------------------------+
> >>> Grid uptime: 00:00:21:509
>
> javax.cache.CacheException: class org.apache.ignite.IgniteClientDisconnectedException: Failed to ping node, client node disconnected.
>     at org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1507)
>     at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.cacheException(IgniteCacheProxy.java:2138)
>     at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1338)
>     at org.gridgain.examples.Smriti.CacheLocalstore.CachePut.addEmpToCache(CachePut.java:68)
>     at org.gridgain.examples.Smriti.CacheLocalstore.CachePut.main(CachePut.java:34)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:601)
>     at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
> Caused by: class org.apache.ignite.IgniteClientDisconnectedException: Failed to ping node, client node disconnected.
>     at org.apache.ignite.internal.util.IgniteUtils$15.apply(IgniteUtils.java:841)
>     at org.apache.ignite.internal.util.IgniteUtils$15.apply(IgniteUtils.java:839)
>     ... 10 more
> Caused by: class org.apache.ignite.internal.IgniteClientDisconnectedCheckedException: Failed to ping node, client node disconnected.
>     at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.pingNode(GridDiscoveryManager.java:1423)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:846)
>     at org.apache.ignite.internal.processors.cache.GridCacheIoManager.send(GridCacheIoManager.java:990)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.mapSingle(GridNearAtomicAbstractUpdateFuture.java:269)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:504)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:434)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:209)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$23.apply(GridDhtAtomicCache.java:1150)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$23.apply(GridDhtAtomicCache.java:1148)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.asyncOp(GridDhtAtomicCache.java:846)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAsync0(GridDhtAtomicCache.java:1148)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.putAsync0(GridDhtAtomicCache.java:618)
>     at org.apache.ignite.internal.processors.cache.GridCacheAdapter.putAsync(GridCacheAdapter.java:2541)
>     at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put(GridDhtAtomicCache.java:595)
>     at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2215)
>     at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1331)
>     ... 7 more
>
> Smriti.

--
Best regards,
Andrey V. Mashenkov
