ff zhou created IGNITE-13311:
--------------------------------
Summary: Put operation may hang if the cache is reconnected
Key: IGNITE-13311
URL: https://issues.apache.org/jira/browse/IGNITE-13311
Project: Ignite
Issue Type: Bug
Components: cache
Affects Versions: 2.8.1
Environment: # Single-node client (fat node)
# single-node server
# use replicated persistence mode
Reporter: ff zhou
Problem description:
1. Single-node client (fat node), single-node server, and use replicated
persistence mode
2) Run the client program and perform the put operation.
3) Kill the server process.
4) The client captures the IgniteClientDisconnectedException exception and
invokes cause.reconnectFuture().get(); to wait for reconnection.
5) Start the server process.
6) After the execution(cause.reconnectFuture().get())is completed, perform the
put operation immediately. The put operation is hung on the checkProxyIsValid
interface.
7)if i performed sleep operation after reconnectFuture.get() is completed,then
do the put operation, it will not hang.(This is a temporary workaround.)
hang stack:
"mq-topic-11-3-4" #253 prio=5 os_prio=0 tid=0x00007fa70275e000 nid=0x2abb
waiting on condition [0x00007fa4ad29a000]"mq-topic-11-3-4" #253 prio=5
os_prio=0 tid=0x00007fa70275e000 nid=0x2abb waiting on condition
[0x00007fa4ad29a000] java.lang.Thread.State: WAITING (parking) at
sun.misc.Unsafe.park(Native Method) at
java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at
org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
at
org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
at
org.apache.ignite.internal.processors.cache.GridCacheProcessor.publicJCache(GridCacheProcessor.java:4417)
at
org.apache.ignite.internal.processors.cache.GridCacheProcessor.publicJCache(GridCacheProcessor.java:4388)
at
org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.checkProxyIsValid(GatewayProtectedCacheProxy.java:1602)
at
org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.onEnter(GatewayProtectedCacheProxy.java:1619)
at
org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.put(GatewayProtectedCacheProxy.java:853)
at
com.huawei.donau.master.common.repository.storage.cache.impl.ignite.EventCacheDaoIgniteImpl.insertEvent(EventCacheDaoIgniteImpl.java:105)
at
com.huawei.donau.master.core.repository.service.EventService.addEvent(EventService.java:28)
at
com.huawei.donau.master.core.jobmanager.task.DonauTask.addEvent(DonauTask.java:234)
at
com.huawei.donau.master.core.jobmanager.task.TaskManager.recordTaskStateChangeEvent(TaskManager.java:1290)
at
com.huawei.donau.master.core.jobmanager.task.TaskManager.recordAndPersistStateChange(TaskManager.java:986)
at
com.huawei.donau.master.core.jobmanager.task.TaskManager.onTaskStateChange(TaskManager.java:941)
at
com.huawei.donau.master.core.jobmanager.task.TaskManager.onTaskUpdate(TaskManager.java:333)
at
com.huawei.donau.master.core.jobmanager.task.TaskManager.handleTaskUpdateMessage(TaskManager.java:251)
at
com.huawei.donau.master.core.jobmanager.task.TaskManager$$Lambda$536/577636327.onReceive(Unknown
Source) at
com.huawei.donau.common.mq.ConsumerClient$ConsumerThread.run(ConsumerClient.java:141)
Locked ownable synchronizers: - None
client debug log:
Jul 27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger infoJul 27,
2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger infoINFO: Router node:
TcpDiscoveryNode [id=98a557e1-0638-486d-8939-4e09ce32726a,
consistentId=8c821ac6-4bf2-4426-8098-e182f31fd168, addrs=ArrayList
[0:0:0:0:0:0:0:1%lo, 10.29.88.134, 127.0.0.1, 192.168.122.1], sockAddrs=HashSet
[/192.168.122.1:47500, hghphicprd08301/10.29.88.134:47500,
/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=47500, order=1,
intOrder=1, lastExchangeTime=1595831945590, loc=false,
ver=2.8.1#20200521-sha1:86422096, isClient=false]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Stopped cache
[cacheName=ignite-sys-cache]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Assigned mvcc coordinator
[crd=MvccCoordinator [topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0],
nodeId=98a557e1-0638-486d-8939-4e09ce32726a, ver=1595831945461, local=false,
initialized=false]]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Stopped cache
[cacheName=t_migration_jobid]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger warningWARNING: Client node was
reconnected after it was already considered failed by the server topology (this
could happen after all servers restarted or due to a long network outage
between the client and servers). All continuous queries and remote event
listeners created by this client will be unsubscribed, consider listening to
EVT_CLIENT_NODE_RECONNECTED event to restore them.Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Stopped cache
[cacheName=t_submit_id]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Stopped cache
[cacheName=tevent]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Client node reconnected to
topology: TcpDiscoveryNode [id=2d923f40-61de-4baa-b8c3-e2d7abba8582,
consistentId=71df3af8-1955-4047-baa4-9978e4f50e5f, addrs=ArrayList
[0:0:0:0:0:0:0:1%lo, 10.29.88.134, 127.0.0.1, 192.168.122.1], sockAddrs=HashSet
[/192.168.122.1:0, /0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0,
hghphicprd08301/10.29.88.134:0], discPort=0, order=2, intOrder=0,
lastExchangeTime=1595831847434, loc=true, ver=2.8.1#20200521-sha1:86422096,
isClient=true]Jul 27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger
infoINFO: Topology snapshot [ver=2, locNode=2d923f40, servers=1, clients=1,
state=ACTIVE, CPUs=4, offheap=1.0GB, heap=13.0GB]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: ^-- Baseline [id=0,
size=1, online=1, offline=0]2020-07-27 14:39:05.639 [mq-topic-11-3-14:103288] -
[WARN] [EventCacheDaoIgniteImpl:] [] |ignite reconnect
successfully||||2020-07-27 14:39:05.640 [mq-topic-11-3-18:103289] - [WARN]
[EventCacheDaoIgniteImpl:] [] |ignite reconnect successfully||||2020-07-27
14:39:05.640 [mq-topic-11-3-16:103289] - [WARN] [EventCacheDaoIgniteImpl:] []
|ignite reconnect successfully||||2020-07-27 14:39:05.640
[mq-topic-11-3-15:103289] - [WARN] [EventCacheDaoIgniteImpl:] [] |ignite
reconnect successfully||||2020-07-27 14:39:05.640 [mq-topic-14-7-9:103289] -
[WARN] [EventCacheDaoIgniteImpl:] [] |ignite reconnect successfully||||Jul 27,
2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger errorSEVERE: Failed to
process custom exchange task: ClientCacheChangeDummyDiscoveryMessage
[reqId=4796b72f-8291-425f-a9d0-59dbbbe049e6, cachesToClose=null, startCaches=
[tevent]]java.lang.NullPointerException at
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesRequests(CacheAffinitySharedManager.java:665)
at
org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:405)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2997)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3145)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3063)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
Jul 27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger errorSEVERE:
Failed to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage
[reqId=a2f1a276-8a93-4a03-a6f4-af1ae15034f4, cachesToClose=null, startCaches=
[tevent]]java.lang.NullPointerException at
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesRequests(CacheAffinitySharedManager.java:665)
at
org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:405)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2997)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3145)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3063)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
Jul 27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger errorSEVERE:
Failed to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage
[reqId=fe75171b-3cb7-41ed-9a4a-4da448a82de9, cachesToClose=null, startCaches=
[tevent]]java.lang.NullPointerException at
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesRequests(CacheAffinitySharedManager.java:665)
at
org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:405)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2997)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3145)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3063)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
Jul 27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger errorSEVERE:
Failed to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage
[reqId=9ad699e3-89b3-41f1-8d0d-6c61be10f461, cachesToClose=null, startCaches=
[tevent]]java.lang.NullPointerException at
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesRequests(CacheAffinitySharedManager.java:665)
at
org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:405)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2997)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3145)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3063)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
Jul 27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger errorSEVERE:
Failed to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage
[reqId=85d54eef-897c-4a3e-b9f5-3923afe8dde4, cachesToClose=null, startCaches=
[tevent]]java.lang.NullPointerException at
org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesRequests(CacheAffinitySharedManager.java:665)
at
org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:405)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2997)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3145)
at
org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3063)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
at java.lang.Thread.run(Thread.java:748)
Jul 27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger infoINFO:
Received state change finish message: trueJul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Started exchange init
[topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], crd=false,
evt=NODE_JOINED, evtNode=2d923f40-61de-4baa-b8c3-e2d7abba8582, customEvt=null,
allowMerge=true, exchangeFreeSwitch=false]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Components activation
performed in 0 ms.Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Started cache
[name=ignite-sys-cache, id=-2100569601, dataRegionName=sysMemPlc,
mode=REPLICATED, atomicity=TRANSACTIONAL, backups=2147483647, mvcc=false]Jul
27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger infoINFO: Starting
caches on local join performed in 0 ms.Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Established outgoing
communication connection [locAddr=/0:0:0:0:0:0:0:1:35108,
rmtAddr=/0:0:0:0:0:0:0:1%lo:47100]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Finished exchange init
[topVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], crd=false]Jul 27,
2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger infoINFO: Received
full message, will finish exchange [node=98a557e1-0638-486d-8939-4e09ce32726a,
resVer=AffinityTopologyVersion [topVer=2, minorTopVer=0]]Jul 27, 2020 2:39:05
PM org.apache.ignite.logger.java.JavaLogger infoINFO: Finish exchange future
[startVer=AffinityTopologyVersion [topVer=2, minorTopVer=0],
resVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], err=null,
rebalanced=true, wasRebalanced=false]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Finish proxy initialization,
cacheName=ignite-sys-cache, localNodeId=2d923f40-61de-4baa-b8c3-e2d7abba8582Jul
27, 2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger infoINFO:
Completed partition exchange [localNode=2d923f40-61de-4baa-b8c3-e2d7abba8582,
exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion
[topVer=2, minorTopVer=0], evt=NODE_JOINED, evtNode=TcpDiscoveryNode
[id=2d923f40-61de-4baa-b8c3-e2d7abba8582,
consistentId=71df3af8-1955-4047-baa4-9978e4f50e5f, addrs=ArrayList
[0:0:0:0:0:0:0:1%lo, 10.29.88.134, 127.0.0.1, 192.168.122.1], sockAddrs=HashSet
[/192.168.122.1:0, /0:0:0:0:0:0:0:1%lo:0, /127.0.0.1:0,
hghphicprd08301/10.29.88.134:0], discPort=0, order=2, intOrder=0,
lastExchangeTime=1595831847434, loc=true, ver=2.8.1#20200521-sha1:86422096,
isClient=true], done=true, newCrdFut=null], topVer=AffinityTopologyVersion
[topVer=2, minorTopVer=0]]Jul 27, 2020 2:39:05 PM
org.apache.ignite.logger.java.JavaLogger infoINFO: Exchange timings
[startVer=AffinityTopologyVersion [topVer=2, minorTopVer=0],
resVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], stage="Waiting in
exchange queue" (0 ms), stage="Exchange parameters initialization" (0 ms),
stage="Components activation" (0 ms), stage="Determine exchange type" (19 ms),
stage="Waiting for Full message" (65 ms), stage="Affinity recalculation" (3
ms), stage="Full map updating" (1 ms), stage="Detect lost partitions" (0 ms),
stage="Exchange done" (0 ms), stage="Total time" (88 ms)]Jul 27, 2020 2:39:05
PM org.apache.ignite.logger.java.JavaLogger infoINFO: Exchange longest local
stages [startVer=AffinityTopologyVersion [topVer=2, minorTopVer=0],
resVer=AffinityTopologyVersion [topVer=2, minorTopVer=0], stage="Affinity
initialization (local join) [grp=tevent]" (1 ms) (parent=Affinity
recalculation), stage="Affinity initialization (local join) [grp=t_submit_id]"
(0 ms) (parent=Affinity recalculation), stage="Affinity initialization (local
join) [grp=t_migration_jobid]" (0 ms) (parent=Affinity recalculation)]Jul 27,
2020 2:39:05 PM org.apache.ignite.logger.java.JavaLogger infoINFO: Skipping
rebalancing (no affinity changes) [top=AffinityTopologyVersion [topVer=2,
minorTopVer=0], rebTopVer=AffinityTopologyVersion [topVer=-1, minorTopVer=0],
evt=NODE_JOINED, evtNode=2d923f40-61de-4baa-b8c3-e2d7abba8582, client=true]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)