Re: Ignite visor timeout when calling node command on thick client in Kubernetes cluster.

John Smith Fri, 21 Jul 2023 06:57:02 -0700

Never mind, my Kubernetes Service wasn't getting any endpoints. Weirdly
enough, there was still some sort of connection going on.
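
For anyone who hits the same timeout: one quick way to see which of the
addresses listed in the error is actually reachable from the Visor host is a
plain TCP probe. The sketch below is only an illustration and is not part of
Ignite; the class name AddrProbe and its methods are made up for this example,
and it assumes the addresses are copied from the log in the "/host:port" form.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not an Ignite API: probes the addresses that
// TcpCommunicationSpi lists in its "Failed to connect to node" message.
public class AddrProbe {
    /** Parses "/host:port", "host:port" or "name/host:port" as printed in the log. */
    static InetSocketAddress parse(String s) {
        String t = s.trim();
        // Drop a leading "/" or a "hostname/" prefix.
        t = t.substring(t.lastIndexOf('/') + 1);
        int i = t.lastIndexOf(':');
        if (i < 0)
            throw new IllegalArgumentException("Expected host:port, got: " + s);
        return InetSocketAddress.createUnresolved(
            t.substring(0, i), Integer.parseInt(t.substring(i + 1)));
    }

    /** Returns true if a TCP connection can be opened within timeoutMs. */
    static boolean canConnect(InetSocketAddress addr, int timeoutMs) {
        try (Socket sock = new Socket()) {
            sock.connect(new InetSocketAddress(addr.getHostString(), addr.getPort()), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String arg : args) {
            InetSocketAddress addr = parse(arg);
            System.out.println(arg + " reachable=" + canConnect(addr, 2000));
        }
    }
}
```

Run it from the Visor host with the real (unmasked) addresses from the error
as arguments. A Service with no endpoints would typically show up here as the
NodePort address being unreachable.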


On Thu, Jul 20, 2023 at 9:16 PM John Smith <java.dev....@gmail.com> wrote:

> So the client is exposed via NodePorts, and I have been able to provide the
> proper ports back to the client and cluster...
>
> When I look at the node details I see...
>
> | Address (0)                 | 10.xxx.xxx.xxx  |  <---- Kubernetes internal IP
> | Address (1)                 | 127.0.0.1       |
>
> So it only knows the two addresses, but somehow the timeout is aware of a
> third address, see below:
>
> addrs=[/10.xxx.xxx.xxx:47100, /127.0.0.1:47100, /172.xxx.xxx.xxx:30524]]
> <----- 172.xxx.xxx.xxx is where the thick client is exposed as a NodePort.
> So it somehow knows it?
>
> Even on the client I can see that Ignite Visor connected:
>
> Completed partition exchange
> [localNode=c9b86d24-0f0d-4198-98c5-59ce677669f8,
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion
> [topVer=434, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
> [id=c16d4ff0-e37e-4a18-a2ae-ec770ad25c39,
> consistentId=0:0:0:0:0:0:0:1%lo,127.0.0.1,172.xxx.xxx.xxx:47500,
> addrs=ArrayList [0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.xxx.xxx.xxx],
> sockAddrs=HashSet [0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500,
> xxxxxx-visor-0001/172.xxx.xxx.xxx:47500], discPort=47500, order=434,
> intOrder=227, lastExchangeTime=1689901276893, loc=false,
> ver=2.12.0#20220108-sha1:b1289f75, isClient=false], rebalanced=true,
> done=true, newCrdFut=null], topVer=AffinityTopologyVersion [topVer=434,
> minorTopVer=0]]
> AffinityTopologyVersion [topVer=434, minorTopVer=0], evt=NODE_JOINED,
> evtNode=c16d4ff0-e37e-4a18-a2ae-ec770ad25c39, client=true]
>
> On Ignite Visor we see the error below:
>
> [00:29:07,053][SEVERE][main][TcpCommunicationSpi] Failed to send message
> to remote node [node=TcpDiscoveryNode
> [id=c9b86d24-0f0d-4198-98c5-59ce677669f8,
> consistentId=c9b86d24-0f0d-4198-98c5-59ce677669f8, addrs=ArrayList
> [10.xxx.xxx.xxx, 127.0.0.1], sockAddrs=HashSet [/10.xxx.xxx.xxx:0,
> /127.0.0.1:0], discPort=0, order=429, intOrder=224,
> lastExchangeTime=1689899260716, loc=false,
> ver=2.12.0#20220108-sha1:b1289f75, isClient=true], msg=GridIoMessage
> [plc=3, topic=TOPIC_JOB, topicOrd=0, ordered=false, timeout=0,
> skipOnTimeout=false, msg=GridJobExecuteRequest
> [sesId=fd758d57981-a43b0db8-3b02-4506-ac69-412e46736682,
> jobId=0e758d57981-a43b0db8-3b02-4506-ac69-412e46736682,
> startTaskTime=1689899286905, timeout=9223372036854775807,
> taskName=org.apache.ignite.internal.visor.node.VisorNodeDataCollectorTask,
> userVer=0,
> taskClsName=org.apache.ignite.internal.visor.node.VisorNodeDataCollectorTask,
> ldrParticipants=null, cpSpi=null, createTime=1689899286989,
> clsLdrId=90558d57981-a43b0db8-3b02-4506-ac69-412e46736682,
> depMode=ISOLATED, dynamicSiblings=false, forceLocDep=true,
> sesFullSup=false, internal=true, topPred=null, part=-1, topVer=null,
> execName=null]]]
> class org.apache.ignite.IgniteCheckedException: Failed to connect to node
> (is node still alive?). Make sure that each ComputeTask and cache
> Transaction has a timeout set in order to prevent parties from waiting
> forever in case of network issues
> [nodeId=c9b86d24-0f0d-4198-98c5-59ce677669f8, addrs=[/10.xxx.xxx.xxx:47100,
> /127.0.0.1:47100, /172.xxx.xxx.xxx:30524]]
>
