Hello,
I'm running an Ignite 2.9.0 cluster with 2 nodes. I tried to use GridGain's
ControlCenterAgent to investigate a slowdown.
After I removed the agent files from the servers (I'd rather not have to
deploy them on every client), the second node can no longer join the
cluster when I start it.
If I start node A, then node B, node B fails; if I start node B, then
node A, node A fails. Whichever node starts second is the one that fails.
If I put the agent files back, then all nodes can start, but clients
fail because they don't have the agent classes themselves.
When a node fails to start, it prints this log:
[17:52:45,265][INFO][tcp-disco-sock-reader-[2f3f6f3a
192.168.43.29:39675]-#6%ClusterWA%-#50%ClusterWA%][TcpDiscoverySpi] Initialized
connection with remote server node
[nodeId=2f3f6f3a-accb-4708-a5cc-26d324a07816, rmtAddr=/192.168.43.29:39675]
[17:52:45,268][SEVERE][main][IgniteKernal%ClusterWA] Failed to start manager:
GridManagerAdapter [enabled=true,
name=o.a.i.i.managers.discovery.GridDiscoveryManager]
class org.apache.ignite.IgniteCheckedException: Failed to start SPI:
TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000,
marsh=JdkMarshaller
[clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@39a8e2fa],
reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=5,
forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null,
skipAddrsRandomization=false]
at
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:302)
at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:967)
at
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1935)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1298)
at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2046)
at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1698)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1114)
at
org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1032)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:918)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:817)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:687)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:656)
at org.apache.ignite.Ignition.start(Ignition.java:353)
at
org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:300)
Caused by: class org.apache.ignite.spi.IgniteSpiException: Unable to unmarshal
key=metastorage.cluster.id.tag
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.checkFailedError(TcpDiscoverySpi.java:2018)
at
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1189)
at
org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:462)
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2120)
at
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:299)
... 13 more
[17:52:45,271][SEVERE][main][IgniteKernal%ClusterWA] Got exception while
starting (will rollback startup routine).
class org.apache.ignite.IgniteCheckedException: Failed to start manager:
GridManagerAdapter [enabled=true,
name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
at
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1940)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1298)
at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2046)
at
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1698)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1114)
at
org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1032)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:918)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:817)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:687)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:656)
at org.apache.ignite.Ignition.start(Ignition.java:353)
at
org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:300)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start SPI:
TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000,
marsh=JdkMarshaller
[clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@39a8e2fa],
reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=5,
forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null,
skipAddrsRandomization=false]
at
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:302)
at
org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:967)
at
org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1935)
... 11 more
Caused by: class org.apache.ignite.spi.IgniteSpiException: Unable to unmarshal
key=metastorage.cluster.id.tag
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.checkFailedError(TcpDiscoverySpi.java:2018)
at
org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1189)
at
org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:462)
at
org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2120)
at
org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:299)
... 13 more
[17:52:45,271][INFO][tcp-disco-sock-reader-[2f3f6f3a
192.168.43.29:39675]-#6%ClusterWA%-#50%ClusterWA%][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/192.168.43.29:39675, rmtPort=39675
The node that is already running logs this:
[17:52:45,223][INFO][tcp-disco-sock-reader-[9a3233c6
192.168.43.30:54951]-#4%ClusterWA%-#55%ClusterWA%][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/192.168.43.30:54951, rmtPort=54951
[17:52:45,246][INFO][tcp-disco-msg-worker-[crd]-#2%ClusterWA%-#46%ClusterWA%][GridEncryptionManager]
Joining node doesn't have stored group keys
[node=9a3233c6-3a6c-4be0-b5e7-19cdff30f69e]
[17:52:45,266][WARNING][disco-pool-#56%ClusterWA%][TcpDiscoverySpi] Unable to
unmarshal key=metastorage.cluster.id.tag
If I start the nodes in the reverse order, the running node logs this:
[17:56:52,426][INFO][tcp-disco-sock-reader-[4b8b92f5
192.168.43.29:42557]-#4%ClusterWA%-#53%ClusterWA%][TcpDiscoverySpi] Finished
serving remote node connection [rmtAddr=/192.168.43.29:42557, rmtPort=42557
[17:56:52,446][INFO][tcp-disco-msg-worker-[crd]-#2%ClusterWA%-#46%ClusterWA%][GridEncryptionManager]
Joining node doesn't have stored group keys
[node=4b8b92f5-1753-4b1b-9902-476c925fa49d]
[17:56:52,466][WARNING][disco-pool-#54%ClusterWA%][TcpDiscoverySpi] Unable to
unmarshal key=metastorage.cluster.id.tag
Is there a way to recover?
Thanks,
--
Bastien Durel
DATA
Enterprise data integration,
Business intelligence systems.
tel : +33 (0) 1 57 19 59 28
fax : +33 (0) 1 57 19 59 73
45 avenue Carnot, 94230 CACHAN France
www.data.fr