Issue with patch: https://issues.apache.org/jira/browse/IGNITE-10513

On 12/3/18 17:43, Dmitry Lazurkin wrote:
> OK. I have found steps to reproduce.
>
> For reproducing we need Ignite with slow disk. Or I known how to
> emulate slow hard disk:
>
> Add to partitions cycle in
> GridCacheDatabaseSharedManager#restorePartitionStates:
> //...
>             for (int i = 0; i < grp.affinity().partitions(); i++) {
>                 try {
>                     log.error("Wait");
>                     Thread.sleep(10000);
>                 } catch (InterruptedException e) {
>                     e.printStackTrace();
>                 }
> //...
>
> - Now we can start server
> - Then start client
> - Wait message "Join cluster while cluster state transition is in
> progress, waiting when transition finish."
> - Kill server
> - Wait repeatable exception java.net.ConnectException: Connection
> refused (Connection refused)
> - Start server and I have 100% chance to reproduce on my computer
>
> Thank you.
>
> On 11/29/18 14:03, Stanislav Lukyanov wrote:
>>
>> Hi,
>>
>>  
>>
>> The interesting (and disappointing) part is the NPE:
>>
>> java.lang.NullPointerException: null
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:666)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:546)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.access$900(ClientImpl.java:128)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.tryJoin(ClientImpl.java:1846)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1757)
>>
>>  
>>
>> Which version do you use?
>>
>> Is this reproducible? Every time?
>>
>>  
>>
>> Thanks,
>>
>> Stan
>>
>>  
>>
>>  
>>
>> *From: *Dmitry Lazurkin <mailto:dila...@gmail.com>
>> *Sent: *20 ноября 2018 г. 15:44
>> *To: *user@ignite.apache.org <mailto:user@ignite.apache.org>
>> *Subject: *Client stucks and doesn't connect
>>
>>  
>>
>> Hello.
>>
>>  
>>
>> Ignite client stops connecting to server after exception:
>>
>>  
>>
>> 2018-11-19 16:00:49,257 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Resolved addresses from IP finder:
>>
>> [/10.48.14.1:47500]
>>
>> 2018-11-19 16:00:49,257 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Send join request
>>
>> [addr=/10.48.14.1:47500, reconnect=true,
>>
>> locNodeId=cd323c53-d1de-4608-8eec-e373b1f68b71]
>>
>> 2018-11-19 16:00:49,258 [tcp-client-disco-reconnector-#5] ERROR
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Exception on joining: Connection refused
>>
>> (Connection refused)
>>
>> java.net.ConnectException: Connection refused (Connection refused)
>>
>>         at java.net.PlainSocketImpl.socketConnect(Native Method)
>>
>>         at
>>
>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
>>
>>         at
>>
>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
>>
>>         at
>>
>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
>>
>>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>>
>>         at java.net.Socket.connect(Socket.java:589)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1450)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.openSocket(TcpDiscoverySpi.java:1413)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:637)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:546)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.access$900(ClientImpl.java:128)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl$Reconnector.body(ClientImpl.java:1408)
>>
>>         at
>>
>> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>>
>> 2018-11-19 16:00:49,258 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Failed to join to address
>>
>> [addr=/10.48.14.1:47500, recon=true, errs=[java.net.ConnectException:
>>
>> Connection refused (Connection refused)]]
>>
>> 2018-11-19 16:00:51,344 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Resolved addresses from IP finder:
>>
>> [/10.48.14.1:47500]
>>
>> 2018-11-19 16:00:51,344 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Send join request
>>
>> [addr=/10.48.14.1:47500, reconnect=true,
>>
>> locNodeId=cd323c53-d1de-4608-8eec-e373b1f68b71]
>>
>> 2018-11-19 16:00:51,364 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Message has been sent to address
>>
>> [msg=TcpDiscoveryClientReconnectMessage
>>
>> [routerNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20,
>>
>> lastMsgId=c42b1bc2761-bbc61e78-97e8-49cd-844e-2dc3e8aacc68,
>>
>> super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>>
>> id=48722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>>
>> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>>
>> isClient=true]], addr=/10.48.14.1:47500,
>>
>> rmtNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20]
>>
>> 2018-11-19 16:00:51,365 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Received response to join request
>>
>> [addr=/10.48.14.1:47500, res=100]
>>
>> 2018-11-19 16:00:51,365 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Will wait before retry join.
>>
>> 2018-11-19 16:00:53,365 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Send join request
>>
>> [addr=/10.48.14.1:47500, reconnect=true,
>>
>> locNodeId=cd323c53-d1de-4608-8eec-e373b1f68b71]
>>
>> 2018-11-19 16:00:53,368 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Message has been sent to address
>>
>> [msg=TcpDiscoveryClientReconnectMessage
>>
>> [routerNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20,
>>
>> lastMsgId=c42b1bc2761-bbc61e78-97e8-49cd-844e-2dc3e8aacc68,
>>
>> super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>>
>> id=a8722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>>
>> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>>
>> isClient=true]], addr=/10.48.14.1:47500,
>>
>> rmtNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20]
>>
>> 2018-11-19 16:00:53,368 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Received response to join request
>>
>> [addr=/10.48.14.1:47500, res=1]
>>
>> 2018-11-19 16:00:53,372 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Received reconnect response
>>
>> [success=false, msg=TcpDiscoveryClientReconnectMessage
>>
>> [routerNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20,
>>
>> lastMsgId=c42b1bc2761-bbc61e78-97e8-49cd-844e-2dc3e8aacc68,
>>
>> super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>>
>> id=a8722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>>
>> verifierNodeId=9c68d70f-883e-4a21-938b-05f3f6f98d20, topVer=0,
>>
>> pendingIdx=0, failedNodes=null, isClient=true]]]
>>
>> 2018-11-19 16:00:53,372 [tcp-client-disco-reconnector-#5] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Grid runnable finished normally:
>>
>> tcp-client-disco-reconnector-#5
>>
>> 2018-11-19 16:00:53,373 [tcp-client-disco-msg-worker-#4] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Failed to restore closed connection,
>>
>> will try to reconnect [networkTimeout=360000, joinTimeout=0,
>> failMsg=null]
>>
>> 2018-11-19 16:00:53,373 [tcp-client-disco-msg-worker-#4] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Discovery notification
>>
>> [node=TcpDiscoveryNode [id=cd323c53-d1de-4608-8eec-e373b1f68b71,
>>
>> addrs=[10.37.92.216],
>>
>> sockAddrs=[kafka-ignite-streamer-5bdb96f7d6-dfbhc/10.37.92.216:0],
>>
>> discPort=0, order=5, intOrder=0, lastExchangeTime=1542643135842,
>>
>> loc=true, ver=2.6.0#20180710-sha1:669feacc, isClient=true],
>>
>> type=CLIENT_NODE_DISCONNECTED, topVer=6]
>>
>> 2018-11-19 16:00:53,443 [tcp-client-disco-msg-worker-#4] INFO 
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Client node disconnected from cluster,
>>
>> will try to reconnect with new id
>>
>> [newId=af2e6c3b-7d68-44f6-ad64-e403e560b800,
>>
>> prevId=cd323c53-d1de-4608-8eec-e373b1f68b71, locNode=TcpDiscoveryNode
>>
>> [id=cd323c53-d1de-4608-8eec-e373b1f68b71, addrs=[10.37.92.216],
>>
>> sockAddrs=[kafka-ignite-streamer-5bdb96f7d6-dfbhc/10.37.92.216:0],
>>
>> discPort=0, order=5, intOrder=0, lastExchangeTime=1542643135842,
>>
>> loc=true, ver=2.6.0#20180710-sha1:669feacc, isClient=true]]
>>
>> 2018-11-19 16:00:53,482 [tcp-client-disco-msg-worker-#4] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Resolved addresses from IP finder:
>>
>> [/10.48.14.1:47500]
>>
>> 2018-11-19 16:00:53,482 [tcp-client-disco-msg-worker-#4] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Send join request
>>
>> [addr=/10.48.14.1:47500, reconnect=false,
>>
>> locNodeId=af2e6c3b-7d68-44f6-ad64-e403e560b800]
>>
>> 2018-11-19 16:00:53,553 [tcp-client-disco-msg-worker-#4] ERROR
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Runtime error caught during grid
>>
>> runnable execution: IgniteSpiThread [name=tcp-client-disco-msg-worker-#4]
>>
>> java.lang.NullPointerException: null
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.sendJoinRequest(ClientImpl.java:666)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.joinTopology(ClientImpl.java:546)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl.access$900(ClientImpl.java:128)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.tryJoin(ClientImpl.java:1846)
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1757)
>>
>>         at
>>
>> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>>
>> 2018-11-19 16:00:53,553 [tcp-client-disco-msg-worker-#4] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Grid runnable finished normally:
>>
>> tcp-client-disco-msg-worker-#4
>>
>> 2018-11-19 16:06:27,487 [tcp-client-disco-sock-writer-#2] DEBUG
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Failed to get acknowledge for message,
>>
>> will try to reconnect [msg=TcpDiscoveryClientMetricsUpdateMessage
>>
>> [super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>>
>> id=47722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>>
>> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>>
>> isClient=true]], failureDetectionTimeout=360000]
>>
>> 2018-11-19 16:06:27,487 [tcp-client-disco-sock-writer-#2] ERROR
>>
>> o.a.i.s.d.tcp.TcpDiscoverySpi - Failed to send message: null
>>
>> java.io.IOException: Failed to get acknowledge for message:
>>
>> TcpDiscoveryClientMetricsUpdateMessage
>>
>> [super=TcpDiscoveryAbstractMessage [sndNodeId=null,
>>
>> id=47722bc2761-cd323c53-d1de-4608-8eec-e373b1f68b71,
>>
>> verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null,
>>
>> isClient=true]]
>>
>>         at
>>
>> org.apache.ignite.spi.discovery.tcp.ClientImpl$SocketWriter.body(ClientImpl.java:1302)
>>
>>         at
>>
>> org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)
>>
>>  
>>
>>  
>>
>

Reply via email to