Hello!

I can see the following stack trace:

"rest-#49" #96 prio=5 os_prio=0 tid=0x00007fe4f4006000 nid=0x573b runnable [0x00007fe4de5f4000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.Net.poll(Native Method)
    at sun.nio.ch.SocketChannelImpl.poll(SocketChannelImpl.java:954)
    - locked <0x00000000eb361618> (a java.lang.Object)
    at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:110)
    - locked <0x00000000eb361608> (a java.lang.Object)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3299)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2987)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2870)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2713)
    at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2672)

The thread is stuck in a socket connect, so it seems that the nodes in your
cluster can't reach each other via the communication port (usually 47100).

Regards,
--
Ilya Kasnacheev


Thu, May 30, 2019 at 23:47, Jay Fernandez <[email protected]>:

> Hello, attached the threaddump for the one node.
>
> The client LoadCaches is throwing this warning when I turn on verbose mode
> over and over again.
>
> WARNING: Failed to wait for initial partition map exchange. Possible
> reasons are:
> ^-- Transactions in deadlock.
> ^-- Long running transactions (ignore if this is the case).
> ^-- Unreleased explicit locks.
> May 30, 2019 3:55:59 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Still waiting for initial partition map exchange
> [fut=GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent
> [evtNode=TcpDiscoveryNode [id=aac48b1a-1a69-4046-a570-ca1346149a5b,
> addrs=[0:0:0:0:0:0:0:1, 10.0.164.68, 127.0.0.1],
> sockAddrs=[GNLT-T580Jfernandez.boston.gryphonnetworks.com/10.0.164.68:0,
> /0:0:0:0:0:0:0:1:0, /127.0.0.1:0], discPort=0, order=2, intOrder=0,
> lastExchangeTime=1559246117333, loc=true, ver=2.7.0#20181130-sha1:256ae401,
> isClient=true], topVer=2, nodeId8=aac48b1a, msg=null, type=NODE_JOINED,
> tstamp=1559246119378], crd=TcpDiscoveryNode
> [id=da20f8f5-3889-4aed-a394-c789d75f336a, addrs=[0:0:0:0:0:0:0:1%lo,
> 10.128.0.10, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500,
> /127.0.0.1:47500, /10.128.0.10:47500], discPort=47500, order=1,
> intOrder=1, lastExchangeTime=1559246119213, loc=false,
> ver=2.7.0#20181130-sha1:256ae401, isClient=false],
> exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion
> [topVer=2, minorTopVer=0], discoEvt=DiscoveryEvent
> [evtNode=TcpDiscoveryNode [id=aac48b1a-1a69-4046-a570-ca1346149a5b,
> addrs=[0:0:0:0:0:0:0:1, 10.0.164.68, 127.0.0.1],
> sockAddrs=[GNLT-T580Jfernandez.boston.gryphonnetworks.com/10.0.164.68:0,
> /0:0:0:0:0:0:0:1:0, /127.0.0.1:0], discPort=0, order=2, intOrder=0,
> lastExchangeTime=1559246117333, loc=true, ver=2.7.0#20181130-sha1:256ae401,
> isClient=true], topVer=2, nodeId8=aac48b1a, msg=null, type=NODE_JOINED,
> tstamp=1559246119378], nodeId=aac48b1a, evt=NODE_JOINED], added=true,
> initFut=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null,
> hash=641664202], init=false, lastVer=null, partReleaseFut=null,
> exchActions=ExchangeActions [startCaches=null, stopCaches=null,
> startGrps=[], stopGrps=[], resetParts=null, stateChangeRequest=null],
> affChangeMsg=null, initTs=1559246119400, centralizedAff=false,
> forceAffReassignment=false, exchangeLocE=null,
> cacheChangeFailureMsgSent=false, done=false, state=CLIENT,
> registerCachesFuture=null, partitionsSent=false, partitionsReceived=false,
> delayedLatestMsg=null, afterLsnrCompleteFut=GridFutureAdapter
> [ignoreInterrupts=false, state=INIT, res=null, hash=12139181], evtLatch=0,
> remaining=[da20f8f5-3889-4aed-a394-c789d75f336a], super=GridFutureAdapter
> [ignoreInterrupts=false, state=INIT, res=null, hash=1103017075]]]
>
>
> On Thu, May 30, 2019 at 5:25 AM Ilya Kasnacheev <[email protected]> wrote:
>
>> Hello!
>>
>> Can you collect thread dumps from all nodes in the cluster and share
>> those with us?
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Thu, May 30, 2019 at 00:31, Jay Fernandez <[email protected]>:
>>
>>> This did stop the error from being logged. However, when I start the
>>> loadCaches program, nothing is logged and it seems to just hang. The
>>> Ignite logs show that a client connected but nothing after that. In
>>> addition, the web console heap size monitoring jumps up right away and
>>> then stops monitoring immediately after.
>>>
>>> On Tue, May 28, 2019 at 9:42 AM Jay Fernandez <[email protected]> wrote:
>>>
>>>> Thanks for the reply Denis. Is this the correct way to disable the
>>>> checker?
>>>>
>>>> <property name="systemWorkerBlockedTimeout" value="#{-1}"/>
>>>>
>>>> On Fri, May 24, 2019 at 5:59 PM Denis Magda <[email protected]> wrote:
>>>>
>>>>> Hi Jay,
>>>>>
>>>>> Could you please try to disable the "critical workers checker"?
>>>>>
>>>>> https://apacheignite.readme.io/docs/critical-failures-handling#section-critical-workers-health-check
>>>>>
>>>>> It will be disabled by default in Ignite 2.7.5 since it requires more
>>>>> automation and tuning.
>>>>>
>>>>> Let us know if it doesn't work.
>>>>>
>>>>> -
>>>>> Denis
>>>>>
>>>>>
>>>>> On Fri, May 24, 2019 at 9:57 AM jay.fernandez <[email protected]> wrote:
>>>>>
>>>>>> Hello, very new to Ignite and excited about using the application.
>>>>>> I have installed one Apache Ignite 2.7 node on a GCP VM. I have the web
>>>>>> agent running locally and I am using Gridgain's Web Console. I am
>>>>>> getting an error trying to run the LoadCaches java application that the
>>>>>> Gridgain Web Console generated based on my MySQL database.
>>>>>>
>>>>>> Logs from Ignite Server:
>>>>>>
>>>>>> May 24 16:54:50 gdw-mysql57 service.sh[26542]: [16:54:50] Ignite node started OK (id=1b7f4add)
>>>>>> May 24 16:54:50 gdw-mysql57 service.sh[26542]: [16:54:50] Topology snapshot [ver=1, locNode=1b7f4add, servers=1, clients=0, state=ACTIVE, CPUs=2, offheap=1.5GB, heap=1.0GB]
>>>>>> May 24 16:55:03 gdw-mysql57 service.sh[26542]: [16:55:03] Topology snapshot [ver=2, locNode=1b7f4add, servers=1, clients=1, state=ACTIVE, CPUs=10, offheap=1.5GB, heap=8.1GB]
>>>>>>
>>>>>>
>>>>>> Error from the Java project below, any help would be appreciated.
>>>>>>
>>>>>> May 24, 2019 12:53:02 PM java.util.logging.LogManager$RootLogger log
>>>>>> WARNING: Failed to resolve default logging config file: config/java.util.logging.properties
>>>>>> [12:53:02] __________ ________________
>>>>>> [12:53:02] / _/ ___/ |/ / _/_ __/ __/
>>>>>> [12:53:02] _/ // (7 7 // / / / / _/
>>>>>> [12:53:02] /___/\___/_/|_/___/ /_/ /___/
>>>>>> [12:53:02]
>>>>>> [12:53:02] ver. 2.7.0#20181130-sha1:256ae401
>>>>>> [12:53:02] 2018 Copyright(C) Apache Software Foundation
>>>>>> [12:53:02]
>>>>>> [12:53:02] Ignite documentation: http://ignite.apache.org
>>>>>> [12:53:02]
>>>>>> [12:53:02] Quiet mode.
>>>>>> [12:53:02] ^-- Logging by 'JavaLogger [quiet=true, config=null]'
>>>>>> [12:53:02] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or "-v" to ignite.{sh|bat}
>>>>>> [12:53:02]
>>>>>> [12:53:02] OS: Windows 10 10.0 amd64
>>>>>> [12:53:02] VM information: Java(TM) SE Runtime Environment 1.8.0_201-b09 Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.201-b09
>>>>>> [12:53:02] Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
>>>>>> [12:53:02] Initial heap size is 510MB (should be no less than 512MB, use -Xms512m -Xmx512m).
>>>>>> [12:53:02] Configured plugins:
>>>>>> [12:53:02] ^-- None
>>>>>> [12:53:02]
>>>>>> [12:53:02] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]]]
>>>>>> [12:53:03] Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
>>>>>> [12:53:03] Security status [authentication=off, tls/ssl=off]
>>>>>> [12:53:03] REST protocols do not start on client node. To start the protocols on client node set '-DIGNITE_REST_START_ON_CLIENT=true' system property.
>>>>>> log4j:WARN No appenders could be found for logger (org.springframework.beans.factory.support.DefaultListableBeanFactory).
>>>>>> log4j:WARN Please initialize the log4j system properly.
>>>>>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
>>>>>> May 24, 2019 12:53:18 PM org.apache.ignite.logger.java.JavaLogger error
>>>>>> SEVERE: Blocked system-critical thread has been detected.
>>>>>> This can lead to cluster-wide undefined behaviour [threadName=partition-exchanger, blockedFor=12s]
>>>>>> May 24, 2019 12:53:18 PM java.util.logging.LogManager$RootLogger log
>>>>>> SEVERE: Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]], failureCtx=FailureContext [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException: GridWorker [name=partition-exchanger, igniteInstanceName=ImportedCluster, finished=false, heartbeatTs=1558716785615]]]
>>>>>> class org.apache.ignite.IgniteException: GridWorker [name=partition-exchanger, igniteInstanceName=ImportedCluster, finished=false, heartbeatTs=1558716785615]
>>>>>>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
>>>>>>     at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
>>>>>>     at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
>>>>>>     at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
>>>>>>     at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
>>>>>>     at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>>>>>     at java.lang.Thread.run(Thread.java:748)
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>>>>>
>>>>>
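
P.S. One way to test the guess about the communication port is a plain TCP connect from one machine to the other. The sketch below uses only the JDK and is not part of Ignite. The class name PortCheck is hypothetical. The default host 10.128.0.10 is the server address taken from the log above and 47100 is the usual communication port, so both are assumptions that may need to be adjusted to your setup. Since either side may open the connection, it is probably worth running the check in both directions (client to server and server to client).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not an Ignite API: checks whether a TCP port accepts connections.
class PortCheck {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, unresolved host or no route: treat all as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults: server address from the log and the usual communication port.
        String host = args.length > 0 ? args[0] : "10.128.0.10";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 47100;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 3000));
    }
}
```

If this reports the port as unreachable while the discovery port 47500 is reachable, a firewall rule or NAT between the GCP VM and the local machine would be the first thing to look at.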
