OK. I have found steps to reproduce.

For reproducing we need Ignite with slow disk. Or I known how to emulate
slow hard disk:

Add to partitions cycle in
GridCacheDatabaseSharedManager#restorePartitionStates:
//...
            for (int i = 0; i < grp.affinity().partitions(); i++) {
                try {
                    log.error("Wait");
                    Thread.sleep(10000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
//...

- Now we can start server
- Then start client
- Wait message "Join cluster while cluster state transition is in
progress, waiting when transition finish."
- Kill server
- Wait repeatable exception java.net.ConnectException: Connection
refused (Connection refused)
- Start server and I have 100% chance to reproduce on my computer

Thank you.

On 11/29/18 14:03, Stanislav Lukyanov wrote:
>
> Hi,
>
>  
>
> The interesting (and disappointing) part is the NPE:
>
> java.lang.NullPointerException: null
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:666)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:546)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.access$900(ClientImpl.java:128)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.tryJoin(ClientImpl.java:1846)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1757)
>
>  
>
> Which version do you use?
>
> Is this reproducible? Every time?
>
>  
>
> Thanks,
>
> Stan
>
>  
>
>  
>
> *From: *Dmitry Lazurkin <mailto:dila...@gmail.com>
> *Sent: *20 ноября 2018 г. 15:44
> *To: *user@ignite.apache.org <mailto:user@ignite.apache.org>
> *Subject: *Client stucks and doesn't connect
>
>  
>
> Hello.
>
>  
>
> Ignite client stops connecting to server after exception:
>
>  
>
> 2018-11-19 16:00:49,257 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Resolved addresses from IP finder:
>
> [/10.48.14.1:47500]
>
> 2018-11-19 16:00:49,257 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Send join request
>
> [addr=/10.48.14.1:47500, reconnect=true,
>
> locNodeId=cd323c53-d1de-4608-8eec-e373b1f68b71]
>
> 2018-11-19 16:00:49,258 [tcp-client-disco-reconnector-#5] ERROR
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Exception on joining: Connection refused
>
> (Connection refused)
>
> java.net.ConnectException: Connection refused (Connection refused)
>
>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>
>         at
>
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>
>         at
>
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>
>         at
>
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>
>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>
>         at java.net.Socket.connect(Socket.java:589)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1450)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1413)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:637)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:546)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.access$900(ClientImpl.java:128)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1408)
>
>         at
>
> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>
> 2018-11-19 16:00:49,258 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Failed to join to address
>
> [addr=/10.48.14.1:47500, recon=true, errs=[java.net.ConnectException:
>
> Connection refused (Connection refused)]]
>
> 2018-11-19 16:00:51,344 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Resolved addresses from IP finder:
>
> [/10.48.14.1:47500]
>
> 2018-11-19 16:00:51,344 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Send join request
>
> [addr=/10.48.14.1:47500, reconnect=true,
>
> locNodeId=cd323c53-d1de-4608-8eec-e373b1f68b71]
>
> 2018-11-19 16:00:51,364 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Message has been sent to address
>
> [msg=TcpDiscoveryClientReconnectMessage
>
> [routerNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20,
>
> lastMsgId=c42b1bc2761-bbc61e78-97e8-49cd-844e-2dc3e8aacc68,
>
> super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>
> id=48722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>
> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>
> isClient=true]], addr=/10.48.14.1:47500,
>
> rmtNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20]
>
> 2018-11-19 16:00:51,365 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Received response to join request
>
> [addr=/10.48.14.1:47500, res=100]
>
> 2018-11-19 16:00:51,365 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Will wait before retry join.
>
> 2018-11-19 16:00:53,365 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Send join request
>
> [addr=/10.48.14.1:47500, reconnect=true,
>
> locNodeId=cd323c53-d1de-4608-8eec-e373b1f68b71]
>
> 2018-11-19 16:00:53,368 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Message has been sent to address
>
> [msg=TcpDiscoveryClientReconnectMessage
>
> [routerNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20,
>
> lastMsgId=c42b1bc2761-bbc61e78-97e8-49cd-844e-2dc3e8aacc68,
>
> super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>
> id=a8722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>
> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>
> isClient=true]], addr=/10.48.14.1:47500,
>
> rmtNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20]
>
> 2018-11-19 16:00:53,368 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Received response to join request
>
> [addr=/10.48.14.1:47500, res=1]
>
> 2018-11-19 16:00:53,372 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Received reconnect response
>
> [success=false, msg=TcpDiscoveryClientReconnectMessage
>
> [routerNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20,
>
> lastMsgId=c42b1bc2761-bbc61e78-97e8-49cd-844e-2dc3e8aacc68,
>
> super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>
> id=a8722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>
> verifierNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20, topVer=0,
>
> pendingIdx=0, failedNodes=null, isClient=true]]]
>
> 2018-11-19 16:00:53,372 [tcp-client-disco-reconnector-#5] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Grid runnable finished normally:
>
> tcp-client-disco-reconnector-#5
>
> 2018-11-19 16:00:53,373 [tcp-client-disco-msg-worker-#4] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Failed to restore closed connection,
>
> will try to reconnect [networkTimeout=360000, joinTimeout=0, failMsg=null]
>
> 2018-11-19 16:00:53,373 [tcp-client-disco-msg-worker-#4] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Discovery notification
>
> [node=TcpDiscoveryNode [id=cd323c53-d1de-4608-8eec-e373b1f68b71,
>
> addrs=[10.37.92.216],
>
> sockAddrs=[kafka-ignite-streamer-5bdb96f7d6-dfbhc/10.37.92.216:0],
>
> discPort=0, order=5, intOrder=0, lastExchangeTime=1542643135842,
>
> loc=true, ver=2.6.0#20180710-sha1:669feacc, isClient=true],
>
> type=CLIENT_NODE_DISCONNECTED, topVer=6]
>
> 2018-11-19 16:00:53,443 [tcp-client-disco-msg-worker-#4] INFO 
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Client node disconnected from cluster,
>
> will try to reconnect with new id
>
> [newId=af2e6c3b-7d68-44f6-ad64-e403e560b800,
>
> prevId=cd323c53-d1de-4608-8eec-e373b1f68b71, locNode=TcpDiscoveryNode
>
> [id=cd323c53-d1de-4608-8eec-e373b1f68b71, addrs=[10.37.92.216],
>
> sockAddrs=[kafka-ignite-streamer-5bdb96f7d6-dfbhc/10.37.92.216:0],
>
> discPort=0, order=5, intOrder=0, lastExchangeTime=1542643135842,
>
> loc=true, ver=2.6.0#20180710-sha1:669feacc, isClient=true]]
>
> 2018-11-19 16:00:53,482 [tcp-client-disco-msg-worker-#4] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Resolved addresses from IP finder:
>
> [/10.48.14.1:47500]
>
> 2018-11-19 16:00:53,482 [tcp-client-disco-msg-worker-#4] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Send join request
>
> [addr=/10.48.14.1:47500, reconnect=false,
>
> locNodeId=af2e6c3b-7d68-44f6-ad64-e403e560b800]
>
> 2018-11-19 16:00:53,553 [tcp-client-disco-msg-worker-#4] ERROR
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Runtime error caught during grid
>
> runnable execution: IgniteSpiThread [name=tcp-client-disco-msg-worker-#4]
>
> java.lang.NullPointerException: null
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:666)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:546)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl.access$900(ClientImpl.java:128)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.tryJoin(ClientImpl.java:1846)
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1757)
>
>         at
>
> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>
> 2018-11-19 16:00:53,553 [tcp-client-disco-msg-worker-#4] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Grid runnable finished normally:
>
> tcp-client-disco-msg-worker-#4
>
> 2018-11-19 16:06:27,487 [tcp-client-disco-sock-writer-#2] DEBUG
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Failed to get acknowledge for message,
>
> will try to reconnect [msg=TcpDiscoveryClientMetricsUpdateMessage
>
> [super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>
> id=47722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>
> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>
> isClient=true]], failureDetectionTimeout=360000]
>
> 2018-11-19 16:06:27,487 [tcp-client-disco-sock-writer-#2] ERROR
>
> o.a.i.s.d.tcp.TcpDiscoverySpi - Failed to send message: null
>
> java.io.IOException: Failed to get acknowledge for message:
>
> TcpDiscoveryClientMetricsUpdateMessage
>
> [super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>
> id=47722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>
> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>
> isClient=true]]
>
>         at
>
> org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1302)
>
>         at
>
> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>
>  
>
>  
>

Reply via email to