[
https://issues.apache.org/jira/browse/IGNITE-6071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16215150#comment-16215150
]
ASF GitHub Bot commented on IGNITE-6071:
----------------------------------------
GitHub user alamar opened a pull request:
https://github.com/apache/ignite/pull/2904
IGNITE-6071 White list of exceptions to suppress in createTcpClient.
Also add wait in discovery infinite loop to avoid grind
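
As an illustration of the two ideas in this description (suppress only a white list of exceptions, and wait between iterations of a retry loop so it does not spin), here is a minimal standalone sketch. It is not the code from this pull request: the class name, the method names, the contents of the white list, and the pause length are all assumptions made for the example.

```java
import java.net.ConnectException;
import java.net.SocketTimeoutException;
import java.util.concurrent.Callable;

public class WhitelistRetry {
    /** Exception types treated as transient (hypothetical list, not taken from the patch). */
    static final Class<?>[] SUPPRESSED = {ConnectException.class, SocketTimeoutException.class};

    /** Returns true if the exception is an instance of a white-listed type. */
    static boolean isSuppressed(Exception e) {
        for (Class<?> cls : SUPPRESSED) {
            if (cls.isInstance(e))
                return true;
        }
        return false;
    }

    /**
     * Calls the action up to maxAttempts times. White-listed exceptions are
     * suppressed and retried after a pause; any other exception is rethrown
     * immediately. If every attempt fails, the last suppressed exception is thrown.
     */
    static <T> T retry(Callable<T> action, int maxAttempts, long pauseMs) throws Exception {
        if (maxAttempts < 1)
            throw new IllegalArgumentException("maxAttempts must be at least 1");

        Exception last = null;

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            }
            catch (Exception e) {
                if (!isSuppressed(e))
                    throw e; // Not on the white list: surface it to the caller.

                last = e;
            }

            // Wait before the next attempt so the loop does not grind the CPU.
            if (attempt < maxAttempts)
                Thread.sleep(pauseMs);
        }

        throw last;
    }
}
```

A caller would wrap the connection attempt in a `Callable` and pass it to `retry`; errors outside the white list then surface immediately instead of being retried.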
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gridgain/apache-ignite ignite-6071m8
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/ignite/pull/2904.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2904
----
commit 014161427fb603b6df7c8ecc3c0904f5df47a21d
Author: Denis Magda <[email protected]>
Date: 2017-02-14T01:33:32Z
IGNITE-4159: Kubernetes IP finder.
(cherry picked from commit 37c0a22)
commit 1db238402f11c67d2b28bfb7ff47955415f00c25
Author: Denis Magda <[email protected]>
Date: 2017-02-16T04:37:26Z
IGNITE-4159: fixing logging
(cherry picked from commit 06908d2)
(cherry picked from commit fa27ee3)
commit 5dfe16f7e91374008b9f6dfbb899364f5b2e1164
Author: Denis Magda <[email protected]>
Date: 2017-02-14T06:03:30Z
IGNITE-4159: using logger instead of system.out.println
(cherry picked from commit b9bf77c)
commit 6e596d1ef426b66abd866d011a8f5cf5c5c25124
Author: Andrey V. Mashenkov <[email protected]>
Date: 2017-04-06T11:43:50Z
IGNITE-4832: Prevent service deployment on client by default when
configuration is provided on startup. This closes #1748.
(cherry picked from commit b7ab273)
commit 443ac9a7aa82af1359a03bcfc8f9212b108300e4
Author: Andrey V. Mashenkov <[email protected]>
Date: 2017-04-05T12:01:02Z
IGNITE-4917: Fixed failure when accessing BinaryObjectBuilder field value
serialized with OptimizedMarshaller. This closes #1736.
commit 05f3c747921aed6838804d2f5f2c8d2bd7985337
Author: Andrey V. Mashenkov <[email protected]>
Date: 2017-04-05T12:01:02Z
IGNITE-4917: Fixed failure when accessing BinaryObjectBuilder field value
serialized with OptimizedMarshaller. This closes #1736.
(cherry picked from commit 443ac9a)
commit 3be4e00373ec5a2b49788d70eb0aebccc3cb6ccf
Author: Alexander Fedotov <[email protected]>
Date: 2017-04-07T11:59:00Z
Merge branch ignite-1.6.5 into ignite-1.8.5-p1
commit d81548d3a4e384e1a9b4adacf1fb487444bbfd33
Author: Alexander Fedotov <[email protected]>
Date: 2017-04-07T12:33:08Z
Merge branch ignite-1.6.6-p1 into ignite-1.8.5-p1
commit 6954ff0c85f2f75507ee0bd306c879f490b4201a
Author: Alexander Fedotov <[email protected]>
Date: 2017-04-07T12:44:48Z
Merge branch ignite-1.6.12 into ignite-1.8.5-p1
commit 62dbba81c009170ff6243a28d3ef12fa75b96225
Author: Alexander Fedotov <[email protected]>
Date: 2017-04-07T12:46:11Z
Merge branch ignite-1.7.4-p1 into ignite-1.8.5-p1
commit 4fce28054bc325741f65035ae384f9b4b9c3fee8
Author: Alexander Fedotov <[email protected]>
Date: 2017-04-07T13:06:34Z
Merge branch ignite-1.8.4-p1 into ignite-1.8.5-p1
# Conflicts:
#
modules/core/src/main/java/org/apache/ignite/internal/managers/discovery/GridDiscoveryManager.java
commit 3d616799efb472227b3b313516e6b40729654631
Author: dkarachentsev <[email protected]>
Date: 2017-04-10T07:36:11Z
IGNITE-2466 - Use current NIO back pressure mechanism to limit received
messages. Mark them process only when backups acknowledged.
(backport from 1.9.2)
(cherry picked from commit 220db882b466c03eadd148b3a19a0bf70d82d4a6)
commit 2a88a7a7581465ff0a6f8733550e6d42d7f71a6c
Author: dkarachentsev <[email protected]>
Date: 2017-04-10T07:54:37Z
IGNITE-4667 - Throw exception on starting client cache when indexed types
cannot be loaded
commit ba6227be49c8a395a5632e9841a6acb65ae340b6
Author: dkarachentsev <[email protected]>
Date: 2017-04-10T08:40:17Z
IGNITE-2466 - Disable back-pressure for sender data nodes to avoid deadlock.
commit 315ff38eeef96f12954d6ff39c84d58b2b959667
Author: Andrey V. Mashenkov <[email protected]>
Date: 2017-04-06T11:43:50Z
IGNITE-4879: Fixed System pool starvation while partition evicting.
commit 89e9dbe484312c251f02c9fbe9698c3ac2e03df8
Author: Alexander Fedotov <[email protected]>
Date: 2017-04-10T13:36:33Z
Fix org.apache.ignite.internal.processors.cache.expiry
.IgniteCacheExpiryPolicyAbstractTest#testNearExpiresWithCacheStore
commit 02b194268071b179d291b28472cef5d587e7558a
Author: Alexander Fedotov <[email protected]>
Date: 2017-04-11T09:00:59Z
Fix missing test resource directory for
org.apache.ignite.spi.discovery.tcp
.TcpDiscoveryNodeAttributesUpdateOnReconnectTest.testReconnect
commit 20016a20f780eb3c21f249d3cb74d08018c4eea5
Author: Alexander Fedotov <[email protected]>
Date: 2017-04-11T11:54:06Z
Fix org.apache.ignite.internal.processors.cache.expiry
.IgniteCacheExpiryPolicyAbstractTest#testNearExpiresWithCacheStore
commit 465084da5b00dcfc056d338f5d0a24875ca2af08
Author: Andrey V. Mashenkov <[email protected]>
Date: 2017-04-12T10:01:25Z
IGNITE-4907: Fixed excessive service instances can be started with dynamic
deployment. This closes #1766.
(cherry picked from commit 0f7ef74)
commit a20c307df1dd000309a273ef93231fdc41a2a81c
Author: dkarachentsev <[email protected]>
Date: 2017-04-13T06:31:17Z
IGNITE-4891 - Fix. Key is deserialized during transactional get() even if
withKeepBinary is set
(Backport from master)
commit 630558dfeb373f237057e394e8f2f63230d59dab
Author: vladisav <[email protected]>
Date: 2017-04-13T10:24:42Z
ignite-4173 IgniteSemaphore with failoverSafe enabled doesn't release
permits in case permits owner node left topology
Backport from master.
(cherry picked from commit 76485fc)
commit 870b752c095ed3776e91a65b99763142b9f2ebc0
Author: Vladisav Jelisavcic <[email protected]>
Date: 2017-04-11T11:09:12Z
ignite-1977 - fixed IgniteSemaphore fault tolerance.
Backport from master.
(cherry picked from commit 902bf42)
commit cd0b92950c6691c6fc1a26cb4f7e55f5ee459298
Author: Yakov Zhdanov <[email protected]>
Date: 2017-04-13T12:52:20Z
ignite-4946 GridCacheP2PUndeploySelfTest became failed
(cherry picked from commit d298e75)
commit 405ce563fb7c35627c6e1bb0b68f423ba089c6f2
Author: Dmitriy Shabalin <[email protected]>
Date: 2017-04-14T10:55:38Z
IGNITE-4068 Added common primitive for buttons group. Refactored existing
button groups.
(cherry picked from commit e5200c2)
commit 60cf48dc175fa288cd74d1189f0e992c9a16dc99
Author: Vasiliy Sisko <[email protected]>
Date: 2017-04-14T11:00:47Z
IGNITE-4886 Catch all errors.
(cherry picked from commit 7e8d9e8)
commit 81c3ed4c0511841f7056677db6063b4eb8d2def0
Author: Alexey Kuznetsov <[email protected]>
Date: 2017-04-14T11:18:23Z
IGNITE-4896 Reworked GridClientNodeBean serialization.
(cherry picked from commit a025268)
commit 4a1415ad01ff9fde30d5c7c02e6d938f1515178d
Author: Andrey V. Mashenkov <[email protected]>
Date: 2017-04-12T10:01:25Z
IGNITE-4907: Fixed excessive service instances can be started with dynamic
deployment. This closes #1766.
(cherry picked from commit 0f7ef74)
commit e206b9f1fd3dbf927f940d558144a4796479ed5d
Author: vsisko <[email protected]>
Date: 2017-04-14T11:32:30Z
IGNITE-4871 Added Kubernetes IP finder to Cluster configuration screen.
(cherry picked from commit f978ff2)
commit b22738080101536a8af1ed60e70d693897e9bc7c
Author: dkarachentsev <[email protected]>
Date: 2017-04-14T14:54:02Z
ignite-4173 Fix test. Permits must be released on node fail.
(cherry picked from commit 1f867c6)
commit 41c5288606710b9c42983780ac7464a746d09eb0
Author: dkarachentsev <[email protected]>
Date: 2017-04-14T14:56:25Z
Merge remote-tracking branch 'origin/ignite-1.8.6' into ignite-1.8.6
----
> Client may detect necessity for reconnect for too long
> ------------------------------------------------------
>
> Key: IGNITE-6071
> URL: https://issues.apache.org/jira/browse/IGNITE-6071
> Project: Ignite
> Issue Type: Bug
> Affects Versions: 2.1
> Reporter: Yakov Zhdanov
> Assignee: Ilya Kasnacheev
>
> A GC pause on the client caused the servers to drop it because they could
> not establish a TCP communication connection. The client then took some time
> to detect that it had been dropped. During that time it repeatedly
> attempted to connect to the server, as can be seen in the logs. After the
> client detected the drop and reconnected, the servers fired a node added
> event and the log flood stopped.
> We need to find out why the client kept reconnecting via communication and
> did not detect the drop for such a long time.
> I hope this can be reproduced in a test:
> * start 2 servers
> * start a client
> * suspend all client threads with Thread.suspend() - just filter the threads
> of the current JVM by name and suspend the ones belonging to the client.
> {noformat}
> [10:12:24,785][WARNING][disco-event-worker-#71%null%][GridDiscoveryManager]
> Node FAILED: TcpDiscoveryNode [id=dd71479c-41ba-443e-b25c-3803a2a94f4f,
> addrs=[10.44.3.14, 127.0.0.1], sockAddrs=[/127.0.0.1:0,
> XXX.com/10.44.3.14:0], discPort=0, order=2, intOrder=2,
> lastExchangeTime=1502269008673, loc=false, ver=2.1.1#20170618-sha1:09ce29e0,
> isClient=true]
> [10:12:24,785][INFO][disco-event-worker-#71%null%][GridDiscoveryManager]
> Topology snapshot [ver=5, servers=2, clients=1, CPUs=144, heap=76.0GB]
> [10:12:24,794][INFO][exchange-worker-#72%null%][time] Started exchange init
> [topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], crd=false, evt=12,
> node=TcpDiscoveryNode [id=98c1fdf7-09db-4fa0-bb01-8ca7f046643d,
> addrs=[10.44.3.11, 127.0.0.1], sockAddrs=[/127.0.0.1:47500,
> XXX.com/10.44.3.11:47500], discPort=47500, order=3, intOrder=3,
> lastExchangeTime=1502269944782, loc=true, ver=2.1.1#20170618-sha1:09ce29e0,
> isClient=false], evtNode=TcpDiscoveryNode
> [id=98c1fdf7-09db-4fa0-bb01-8ca7f046643d, addrs=[10.44.3.11, 127.0.0.1],
> sockAddrs=[/127.0.0.1:47500, XXX.com/10.44.3.11:47500], discPort=47500,
> order=3, intOrder=3, lastExchangeTime=1502269944782, loc=true,
> ver=2.1.1#20170618-sha1:09ce29e0, isClient=false], customEvt=null]
> [10:12:24,813][INFO][exchange-worker-#72%null%][time] Finished exchange init
> [topVer=AffinityTopologyVersion [topVer=5, minorTopVer=0], crd=false]
> [10:12:24,819][INFO][exchange-worker-#72%null%][GridCachePartitionExchangeManager]
> Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
> [topVer=5, minorTopVer=0], evt=NODE_FAILED,
> node=dd71479c-41ba-443e-b25c-3803a2a94f4f]
> [10:12:28,344][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52474]
> [10:12:28,348][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52482]
> [10:12:28,356][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52506]
> [10:12:28,362][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52522]
> [10:12:28,368][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52538]
> [10:12:28,374][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52554]
> [10:12:28,380][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52570]
> [10:12:28,386][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52586]
> [10:12:28,392][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52602]
> [10:12:28,397][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52618]
> [10:12:28,402][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52634]
> [10:12:28,407][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52650]
> [10:12:28,412][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:52666]
> ...
> [10:18:32,684][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:43604]
> [10:18:32,690][INFO][grid-nio-worker-tcp-comm-1-#58%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:43620]
> [10:18:32,695][INFO][grid-nio-worker-tcp-comm-0-#57%null%][TcpCommunicationSpi]
> Accepted incoming communication connection [locAddr=/10.44.3.11:47100,
> rmtAddr=/10.44.3.14:43636]
> [10:18:42,831][INFO][disco-event-worker-#71%null%][GridDiscoveryManager]
> Added new node to topology: TcpDiscoveryNode
> [id=2e80b0f0-21db-451d-a264-34ba16e00ffa, addrs=[10.44.3.14, 127.0.0.1],
> sockAddrs=[/127.0.0.1:0,
> gbrdsr000002837.intranet.barcapint.com/10.44.3.14:0], discPort=0, order=6,
> intOrder=5, lastExchangeTime=1502270322805, loc=false,
> ver=2.1.1#20170618-sha1:09ce29e0, isClient=true]
> [10:18:42,832][INFO][disco-event-worker-#71%null%][GridDiscoveryManager]
> Topology snapshot [ver=6, servers=2, clients=2, CPUs=144, heap=90.0GB]
> [10:18:42,833][INFO][exchange-worker-#72%null%][time] Started exchange init
> [topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], crd=false, evt=10,
> node=TcpDiscoveryNode [id=98c1fdf7-09db-4fa0-bb01-8ca7f046643d,
> addrs=[10.44.3.11, 127.0.0.1], sockAddrs=[/127.0.0.1:47500,
> XXX.com/10.44.3.11:47500], discPort=47500, order=3, intOrder=3,
> lastExchangeTime=1502270322815, loc=true, ver=2.1.1#20170618-sha1:09ce29e0,
> isClient=false], evtNode=TcpDiscoveryNode
> [id=98c1fdf7-09db-4fa0-bb01-8ca7f046643d, addrs=[10.44.3.11, 127.0.0.1],
> sockAddrs=[/127.0.0.1:47500, XXX.com/10.44.3.11:47500], discPort=47500,
> order=3, intOrder=3, lastExchangeTime=1502270322815, loc=true,
> ver=2.1.1#20170618-sha1:09ce29e0, isClient=false], customEvt=null]
> [10:18:42,851][INFO][exchange-worker-#72%null%][time] Finished exchange init
> [topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], crd=false]
> [10:18:42,855][INFO][exchange-worker-#72%null%][GridCachePartitionExchangeManager]
> Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
> [topVer=6, minorTopVer=0], evt=NODE_JOINED,
> node=2e80b0f0-21db-451d-a264-34ba16e00ffa]
> {noformat}
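
The last reproduction step in the quoted issue (filter the threads of the current JVM by name) could be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, and the name marker depends on how the client node is named in the test (the logs above show thread names ending in the instance name, for example `%null%`).

```java
import java.util.ArrayList;
import java.util.List;

public class ThreadsByName {
    /**
     * Returns the live threads of the current JVM whose name contains the
     * given marker, excluding the calling thread.
     */
    static List<Thread> threadsWithNameContaining(String marker) {
        List<Thread> res = new ArrayList<>();

        // getAllStackTraces() is keyed by every live thread in this JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t != Thread.currentThread() && t.getName().contains(marker))
                res.add(t);
        }

        return res;
    }
}
```

A test would then call Thread.suspend() on each returned thread, as the issue suggests. Note that Thread.suspend() is deprecated and recent JDK releases no longer support it, so this approach assumes an older runtime.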