Re: Error Running Gridgain's LoadCaches java application

Ilya Kasnacheev Fri, 31 May 2019 06:40:49 -0700

Hello!

I can see the following stack trace:


"rest-#49" #96 prio=5 os_prio=0 tid=0x00007fe4f4006000 nid=0x573b runnable [0x00007fe4de5f4000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.Net.poll(Native Method)
        at sun.nio.ch.SocketChannelImpl.poll(SocketChannelImpl.java:954)
        - locked <0x00000000eb361618> (a java.lang.Object)
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:110)
        - locked <0x00000000eb361608> (a java.lang.Object)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3299)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2987)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2870)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2713)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2672)

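The thread is stuck in TcpCommunicationSpi.createTcpClient, waiting on a
socket connect. It seems that the nodes in your cluster can't reach each
other via the communication port (47100 by default).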
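
As a quick check, something like the following could be run from each
machine against the other's address. This is only a minimal sketch:
PortProbe is a hypothetical helper, not part of Ignite, and 47100 is
assumed to be the communication port (the default unless it was changed).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not part of Ignite): checks whether a TCP port accepts connections. */
public class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolved host: treat all as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults; pass the other node's address and port as arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 47100;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

It is worth running in both directions (client to server and server to
client), since the stack trace above shows a node trying to open the
connection itself rather than waiting for one.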

Regards,
-- 
Ilya Kasnacheev


On Thu, May 30, 2019 at 23:47, Jay Fernandez <[email protected]> wrote:

> Hello, attached is the thread dump for the one node.
>
> The LoadCaches client throws this warning over and over again when I turn
> on verbose mode.
>
> WARNING: Failed to wait for initial partition map exchange. Possible
> reasons are:
>   ^-- Transactions in deadlock.
>   ^-- Long running transactions (ignore if this is the case).
>   ^-- Unreleased explicit locks.
> May 30, 2019 3:55:59 PM org.apache.ignite.logger.java.JavaLogger warning
> WARNING: Still waiting for initial partition map exchange
> [fut=GridDhtPartitionsExchangeFuture [firstDiscoEvt=DiscoveryEvent
> [evtNode=TcpDiscoveryNode [id=aac48b1a-1a69-4046-a570-ca1346149a5b,
> addrs=[0:0:0:0:0:0:0:1, 10.0.164.68, 127.0.0.1], sockAddrs=[
> GNLT-T580Jfernandez.boston.gryphonnetworks.com/10.0.164.68:0,
> /0:0:0:0:0:0:0:1:0, /127.0.0.1:0], discPort=0, order=2, intOrder=0,
> lastExchangeTime=1559246117333, loc=true, ver=2.7.0#20181130-sha1:256ae401,
> isClient=true], topVer=2, nodeId8=aac48b1a, msg=null, type=NODE_JOINED,
> tstamp=1559246119378], crd=TcpDiscoveryNode
> [id=da20f8f5-3889-4aed-a394-c789d75f336a, addrs=[0:0:0:0:0:0:0:1%lo,
> 10.128.0.10, 127.0.0.1], sockAddrs=[/0:0:0:0:0:0:0:1%lo:47500, /
> 127.0.0.1:47500, /10.128.0.10:47500], discPort=47500, order=1,
> intOrder=1, lastExchangeTime=1559246119213, loc=false,
> ver=2.7.0#20181130-sha1:256ae401, isClient=false],
> exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion
> [topVer=2, minorTopVer=0], discoEvt=DiscoveryEvent
> [evtNode=TcpDiscoveryNode [id=aac48b1a-1a69-4046-a570-ca1346149a5b,
> addrs=[0:0:0:0:0:0:0:1, 10.0.164.68, 127.0.0.1], sockAddrs=[
> GNLT-T580Jfernandez.boston.gryphonnetworks.com/10.0.164.68:0,
> /0:0:0:0:0:0:0:1:0, /127.0.0.1:0], discPort=0, order=2, intOrder=0,
> lastExchangeTime=1559246117333, loc=true, ver=2.7.0#20181130-sha1:256ae401,
> isClient=true], topVer=2, nodeId8=aac48b1a, msg=null, type=NODE_JOINED,
> tstamp=1559246119378], nodeId=aac48b1a, evt=NODE_JOINED], added=true,
> initFut=GridFutureAdapter [ignoreInterrupts=false, state=INIT, res=null,
> hash=641664202], init=false, lastVer=null, partReleaseFut=null,
> exchActions=ExchangeActions [startCaches=null, stopCaches=null,
> startGrps=[], stopGrps=[], resetParts=null, stateChangeRequest=null],
> affChangeMsg=null, initTs=1559246119400, centralizedAff=false,
> forceAffReassignment=false, exchangeLocE=null,
> cacheChangeFailureMsgSent=false, done=false, state=CLIENT,
> registerCachesFuture=null, partitionsSent=false, partitionsReceived=false,
> delayedLatestMsg=null, afterLsnrCompleteFut=GridFutureAdapter
> [ignoreInterrupts=false, state=INIT, res=null, hash=12139181], evtLatch=0,
> remaining=[da20f8f5-3889-4aed-a394-c789d75f336a], super=GridFutureAdapter
> [ignoreInterrupts=false, state=INIT, res=null, hash=1103017075]]]
>
>
> On Thu, May 30, 2019 at 5:25 AM Ilya Kasnacheev <[email protected]>
> wrote:
>
>> Hello!
>>
>> Can you collect thread dumps from all nodes in the cluster and share
>> those with us?
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Thu, May 30, 2019 at 00:31, Jay Fernandez <[email protected]> wrote:
>>
>>> This did stop the error from being logged.   However, when I start the
>>> loadCaches program, nothing is logged and it seems to just hang.  The
>>> ignite logs show that a client connected but nothing after that.  In
>>> addition, the web console heap size monitoring jumps up right away and then
>>> stops monitoring immediately after.
>>>
>>> On Tue, May 28, 2019 at 9:42 AM Jay Fernandez <[email protected]>
>>> wrote:
>>>
>>>> Thanks for the reply, Denis.  Is this the correct way to disable the checker?
>>>>
>>>> <property name="systemWorkerBlockedTimeout" value="#{-1}"/>
>>>>
>>>> On Fri, May 24, 2019 at 5:59 PM Denis Magda <[email protected]> wrote:
>>>>
>>>>> Hi Jay,
>>>>>
>>>>> Could you please try to disable the "critical workers checker"?
>>>>>
>>>>> https://apacheignite.readme.io/docs/critical-failures-handling#section-critical-workers-health-check
>>>>>
>>>>> It will be disabled by default in Ignite 2.7.5 since it requires more
>>>>> automation and tuning.
>>>>>
>>>>> Let us know if it doesn't work.
>>>>>
>>>>> -
>>>>> Denis
>>>>>
>>>>>
>>>>> On Fri, May 24, 2019 at 9:57 AM jay.fernandez <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hello, very new to Ignite and excited about using the application.  I
>>>>>> have installed one Apache Ignite 2.7 node on a GCP VM.  I have the web
>>>>>> agent running locally and I am using Gridgain's Web Console.  I am
>>>>>> getting an error trying to run the LoadCaches java application that the
>>>>>> Gridgain Web Console generated based on my MySQL database.
>>>>>>
>>>>>> Logs from Ignite Server:
>>>>>>
>>>>>> May 24 16:54:50 gdw-mysql57 service.sh[26542]: [16:54:50] Ignite node
>>>>>> started OK (id=1b7f4add)
>>>>>> May 24 16:54:50 gdw-mysql57 service.sh[26542]: [16:54:50] Topology
>>>>>> snapshot
>>>>>> [ver=1, locNode=1b7f4add, servers=1, clients=0, state=ACTIVE, CPUs=2,
>>>>>> offheap=1.5GB, heap=1.0GB]
>>>>>> May 24 16:55:03 gdw-mysql57 service.sh[26542]: [16:55:03] Topology
>>>>>> snapshot
>>>>>> [ver=2, locNode=1b7f4add, servers=1, clients=1, state=ACTIVE, CPUs=10,
>>>>>> offheap=1.5GB, heap=8.1GB]
>>>>>>
>>>>>>
>>>>>> Error from the Java project below, any help would be appreciated.
>>>>>>
>>>>>> May 24, 2019 12:53:02 PM java.util.logging.LogManager$RootLogger log
>>>>>> WARNING: Failed to resolve default logging config file:
>>>>>> config/java.util.logging.properties
>>>>>> [12:53:02]    __________  ________________
>>>>>> [12:53:02]   /  _/ ___/ |/ /  _/_  __/ __/
>>>>>> [12:53:02]  _/ // (7 7    // /  / / / _/
>>>>>> [12:53:02] /___/\___/_/|_/___/ /_/ /___/
>>>>>> [12:53:02]
>>>>>> [12:53:02] ver. 2.7.0#20181130-sha1:256ae401
>>>>>> [12:53:02] 2018 Copyright(C) Apache Software Foundation
>>>>>> [12:53:02]
>>>>>> [12:53:02] Ignite documentation: http://ignite.apache.org
>>>>>> [12:53:02]
>>>>>> [12:53:02] Quiet mode.
>>>>>> [12:53:02]   ^-- Logging by 'JavaLogger [quiet=true, config=null]'
>>>>>> [12:53:02]   ^-- To see **FULL** console log here add
>>>>>> -DIGNITE_QUIET=false
>>>>>> or "-v" to ignite.{sh|bat}
>>>>>> [12:53:02]
>>>>>> [12:53:02] OS: Windows 10 10.0 amd64
>>>>>> [12:53:02] VM information: Java(TM) SE Runtime Environment
>>>>>> 1.8.0_201-b09
>>>>>> Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.201-b09
>>>>>> [12:53:02] Please set system property
>>>>>> '-Djava.net.preferIPv4Stack=true' to
>>>>>> avoid possible problems in mixed environments.
>>>>>> [12:53:02] Initial heap size is 510MB (should be no less than 512MB,
>>>>>> use
>>>>>> -Xms512m -Xmx512m).
>>>>>> [12:53:02] Configured plugins:
>>>>>> [12:53:02]   ^-- None
>>>>>> [12:53:02]
>>>>>> [12:53:02] Configured failure handler:
>>>>>> [hnd=StopNodeOrHaltFailureHandler
>>>>>> [tryStop=false, timeout=0, super=AbstractFailureHandler
>>>>>> [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]]]
>>>>>> [12:53:03] Message queue limit is set to 0 which may lead to
>>>>>> potential OOMEs
>>>>>> when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due
>>>>>> to
>>>>>> message queues growth on sender and receiver sides.
>>>>>> [12:53:03] Security status [authentication=off, tls/ssl=off]
>>>>>> [12:53:03] REST protocols do not start on client node. To start the
>>>>>> protocols on client node set '-DIGNITE_REST_START_ON_CLIENT=true'
>>>>>> system
>>>>>> property.
>>>>>> log4j:WARN No appenders could be found for logger
>>>>>>
>>>>>> (org.springframework.beans.factory.support.DefaultListableBeanFactory).
>>>>>> log4j:WARN Please initialize the log4j system properly.
>>>>>> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig
>>>>>> for
>>>>>> more info.
>>>>>> May 24, 2019 12:53:18 PM org.apache.ignite.logger.java.JavaLogger
>>>>>> error
>>>>>> SEVERE: Blocked system-critical thread has been detected. This can
>>>>>> lead to
>>>>>> cluster-wide undefined behaviour [threadName=partition-exchanger,
>>>>>> blockedFor=12s]
>>>>>> May 24, 2019 12:53:18 PM java.util.logging.LogManager$RootLogger log
>>>>>> SEVERE: Critical system error detected. Will be handled accordingly to
>>>>>> configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false,
>>>>>> timeout=0, super=AbstractFailureHandler
>>>>>> [ignoredFailureTypes=[SYSTEM_WORKER_BLOCKED]]],
>>>>>> failureCtx=FailureContext
>>>>>> [type=SYSTEM_WORKER_BLOCKED, err=class o.a.i.IgniteException:
>>>>>> GridWorker
>>>>>> [name=partition-exchanger, igniteInstanceName=ImportedCluster,
>>>>>> finished=false, heartbeatTs=1558716785615]]]
>>>>>> class org.apache.ignite.IgniteException: GridWorker
>>>>>> [name=partition-exchanger, igniteInstanceName=ImportedCluster,
>>>>>> finished=false, heartbeatTs=1558716785615]
>>>>>>         at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1831)
>>>>>>         at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$2.apply(IgnitionEx.java:1826)
>>>>>>         at org.apache.ignite.internal.worker.WorkersRegistry.onIdle(WorkersRegistry.java:233)
>>>>>>         at org.apache.ignite.internal.util.worker.GridWorker.onIdle(GridWorker.java:297)
>>>>>>         at org.apache.ignite.internal.processors.timeout.GridTimeoutProcessor$TimeoutWorker.body(GridTimeoutProcessor.java:221)
>>>>>>         at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
>>>>>>         at java.lang.Thread.run(Thread.java:748)
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>>>>>>
>>>>>
