[
https://issues.apache.org/jira/browse/IGNITE-17542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vyacheslav Koptilin updated IGNITE-17542:
-----------------------------------------
Affects Version/s: (was: 2.14)
> Test CacheLateAffinityAssignmentTest.testAffinitySimpleNoCacheOnCoordinator2
> became flaky after IGNITE-17507
> ------------------------------------------------------------------------------------------------------------
>
> Key: IGNITE-17542
> URL: https://issues.apache.org/jira/browse/IGNITE-17542
> Project: Ignite
> Issue Type: Bug
> Reporter: Vyacheslav Koptilin
> Assignee: Vyacheslav Koptilin
> Priority: Major
> Fix For: 2.14
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> The test
> CacheLateAffinityAssignmentTest.testAffinitySimpleNoCacheOnCoordinator2
> became flaky due to IGNITE-17507.
> The root cause of the issue is that the _CacheAffinityChangeMessage_ is
> mutated outside the _disco-notifier_ thread, which may lead to the
> following exception:
> {noformat}
> [2022-08-16T21:10:32,133][ERROR][tcp-disco-msg-worker-[0448095b
> 127.0.0.1:47502]-#5308%distributed.CacheLateAffinityAssignmentTest3%-#98199%distributed.CacheLateAffinityAssignmentTest3%][TestTcpDiscoverySpi]
> TcpDiscoverSpi's message worker thread failed abnormally. Stopping the node
> in order to prevent cluster wide instability.
> org.apache.ignite.IgniteException: Failed to marshal mutable discovery
> message: CacheAffinityChangeMessage
> [id=ea31ffaa281-0286b465-6baf-4ad8-9e3b-3f8cb755d1dd, topVer=null,
> exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=6,
> minorTopVer=0], discoEvt=null, nodeId=f9f9faf0, evt=NODE_LEFT],
> partsMsg=GridDhtPartitionsFullMessage [parts=HashMap
> {-2100569601=GridDhtPartitionFullMap
> {f57cbb85-44ba-40d1-814e-937f96c00003=GridDhtPartitionMap [moving=0,
> top=AffinityTopologyVersion [topVer=6, minorTopVer=0], updateSeq=111,
> size=100], 0448095b-02d8-470c-ab90-6a5bcf800002=GridDhtPartitionMap
> [moving=0, top=AffinityTopologyVersion [topVer=6, minorTopVer=0],
> updateSeq=116, size=100]}, 1251687457=GridDhtPartitionFullMap
> {f57cbb85-44ba-40d1-814e-937f96c00003=GridDhtPartitionMap [moving=0,
> top=AffinityTopologyVersion [topVer=6, minorTopVer=0], updateSeq=1035,
> size=1024], 0448095b-02d8-470c-ab90-6a5bcf800002=GridDhtPartitionMap
> [moving=0, top=AffinityTopologyVersion [topVer=4, minorTopVer=0],
> updateSeq=3, size=0]}}, partCntrs=IgniteDhtPartitionCountersMap [],
> partCntrs2=null, partHistSuppliers=IgniteDhtPartitionHistorySuppliersMap [],
> partsToReload=IgniteDhtPartitionsToReloadMap [],
> topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], errs=null,
> resTopVer=null, flags=0, partCnt=2, super=GridDhtPartitionsAbstractMessage
> [exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=6,
> minorTopVer=0], discoEvt=null, nodeId=f9f9faf0, evt=NODE_LEFT],
> lastVer=GridCacheVersion [topVer=0, order=1660673425660, nodeOrder=0,
> dataCenterId=0], super=GridCacheMessage [msgId=-1, depInfo=null,
> lastAffChangedTopVer=null, err=null, skipPrepare=false]]],
> exchangeNeeded=false, stopProc=false]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.notifyDiscoveryListener(ServerImpl.java:6423)
> ~[classes/:?]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processCustomMessage(ServerImpl.java:6243)
> ~[classes/:?]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:3260)
> ~[classes/:?]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.processMessage(ServerImpl.java:2918)
> ~[classes/:?]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorker.body(ServerImpl.java:8058)
> ~[classes/:?]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.body(ServerImpl.java:3089)
> [classes/:?]
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
> [classes/:?]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$MessageWorkerThread.body(ServerImpl.java:7989)
> [classes/:?]
> at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:58)
> [classes/:?]
> Caused by: org.apache.ignite.IgniteCheckedException: Failed to serialize
> object: CacheAffinityChangeMessage
> [id=ea31ffaa281-0286b465-6baf-4ad8-9e3b-3f8cb755d1dd, topVer=null,
> exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=6,
> minorTopVer=0], discoEvt=null, nodeId=f9f9faf0, evt=NODE_LEFT],
> partsMsg=GridDhtPartitionsFullMessage [parts=HashMap
> {-2100569601=GridDhtPartitionFullMap
> {f57cbb85-44ba-40d1-814e-937f96c00003=GridDhtPartitionMap [moving=0,
> top=AffinityTopologyVersion [topVer=6, minorTopVer=0], updateSeq=111,
> size=100], 0448095b-02d8-470c-ab90-6a5bcf800002=GridDhtPartitionMap
> [moving=0, top=AffinityTopologyVersion [topVer=6, minorTopVer=0],
> updateSeq=116, size=100]}, 1251687457=GridDhtPartitionFullMap
> {f57cbb85-44ba-40d1-814e-937f96c00003=GridDhtPartitionMap [moving=0,
> top=AffinityTopologyVersion [topVer=6, minorTopVer=0], updateSeq=1035,
> size=1024], 0448095b-02d8-470c-ab90-6a5bcf800002=GridDhtPartitionMap
> [moving=0, top=AffinityTopologyVersion [topVer=4, minorTopVer=0],
> updateSeq=3, size=0]}}, partCntrs=IgniteDhtPartitionCountersMap [],
> partCntrs2=null, partHistSuppliers=IgniteDhtPartitionHistorySuppliersMap [],
> partsToReload=IgniteDhtPartitionsToReloadMap [],
> topVer=AffinityTopologyVersion [topVer=6, minorTopVer=0], errs=null,
> resTopVer=null, flags=0, partCnt=2, super=GridDhtPartitionsAbstractMessage
> [exchId=GridDhtPartitionExchangeId [topVer=AffinityTopologyVersion [topVer=6,
> minorTopVer=0], discoEvt=null, nodeId=f9f9faf0, evt=NODE_LEFT],
> lastVer=GridCacheVersion [topVer=0, order=1660673425660, nodeOrder=0,
> dataCenterId=0], super=GridCacheMessage [msgId=-1, depInfo=null,
> lastAffChangedTopVer=null, err=null, skipPrepare=false]]],
> exchangeNeeded=false, stopProc=false]
> at
> org.apache.ignite.marshaller.jdk.JdkMarshaller.marshal0(JdkMarshaller.java:102)
> ~[classes/:?]
> at
> org.apache.ignite.marshaller.jdk.JdkMarshaller.marshal0(JdkMarshaller.java:109)
> ~[classes/:?]
> at
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:56)
> ~[classes/:?]
> at
> org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10827)
> [classes/:?]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.notifyDiscoveryListener(ServerImpl.java:6420)
> ~[classes/:?]
> ... 8 more
> Caused by: java.util.ConcurrentModificationException
> at java.util.HashMap$HashIterator.nextNode(HashMap.java:1493) ~[?:?]
> at java.util.HashMap$EntryIterator.next(HashMap.java:1526) ~[?:?]
> at java.util.HashMap$EntryIterator.next(HashMap.java:1524) ~[?:?]
> at
> org.apache.ignite.internal.util.IgniteUtils.writeMap(IgniteUtils.java:5706)
> ~[classes/:?]
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionFullMap.writeExternal(GridDhtPartitionFullMap.java:188)
> ~[classes/:?]
> at
> java.io.ObjectOutputStream.writeExternalData(ObjectOutputStream.java:1460)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431)
> ~[?:?]
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
> ~[?:?]
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:349)
> ~[?:?]
> at java.util.HashMap.internalWriteEntries(HashMap.java:1840) ~[?:?]
> at java.util.HashMap.writeObject(HashMap.java:1411) ~[?:?]
> at jdk.internal.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
> ~[?:?]
> at
> jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:?]
> at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
> at
> java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1145)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1497)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
> ~[?:?]
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
> ~[?:?]
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
> ~[?:?]
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
> ~[?:?]
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
> ~[?:?]
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
> ~[?:?]
> at
> java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1553)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1510)
> ~[?:?]
> at
> java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1433)
> ~[?:?]
> at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1179)
> ~[?:?]
> at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:349)
> ~[?:?]
> at
> org.apache.ignite.marshaller.jdk.JdkMarshaller.marshal0(JdkMarshaller.java:97)
> ~[classes/:?]
> at
> org.apache.ignite.marshaller.jdk.JdkMarshaller.marshal0(JdkMarshaller.java:109)
> ~[classes/:?]
> at
> org.apache.ignite.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:56)
> ~[classes/:?]
> at
> org.apache.ignite.internal.util.IgniteUtils.marshal(IgniteUtils.java:10827)
> [classes/:?]
> at
> org.apache.ignite.spi.discovery.tcp.ServerImpl$RingMessageWorker.notifyDiscoveryListener(ServerImpl.java:6420)
> ~[classes/:?]
> ... 8 more
> {noformat}
> In my understanding, a discovery message should not be changed in any way
> outside the _disco-notifier_ thread. At a minimum, the
> _CacheAffinityChangeMessage.partitionsMessage()_ method should always return
> a deep copy of _GridDhtPartitionsFullMessage_ to avoid the issue.
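The deep-copy idea proposed above can be illustrated with a minimal, self-contained sketch. This is not Ignite code: the class `DeepCopyMessageSketch`, its nested map layout, and the methods `partitionsCopy()`, `nodeCount()` and `mutationDuringIterationFails()` are hypothetical stand-ins chosen for illustration. The sketch assumes the message state can be modelled as a `HashMap` of `HashMap`s, and it reproduces the fail-fast iterator behaviour in a single thread rather than the actual race between threads.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a discovery message that carries partition maps. */
final class DeepCopyMessageSketch {
    /** Group id -> (node id -> partition count). Illustrative layout only. */
    private final Map<Integer, Map<String, Integer>> parts = new HashMap<>();

    /** Records a partition count for a node within a group. */
    void put(int grpId, String nodeId, int size) {
        parts.computeIfAbsent(grpId, k -> new HashMap<>()).put(nodeId, size);
    }

    /** Returns a deep copy, so callers never touch the maps that get marshalled. */
    Map<Integer, Map<String, Integer>> partitionsCopy() {
        Map<Integer, Map<String, Integer>> copy = new HashMap<>();

        for (Map.Entry<Integer, Map<String, Integer>> e : parts.entrySet())
            copy.put(e.getKey(), new HashMap<>(e.getValue()));

        return copy;
    }

    /** Number of nodes recorded for a group in the message's own state. */
    int nodeCount(int grpId) {
        Map<String, Integer> m = parts.get(grpId);

        return m == null ? 0 : m.size();
    }

    /** Single-threaded reproduction: a structural change during iteration throws. */
    static boolean mutationDuringIterationFails() {
        Map<String, Integer> shared = new HashMap<>();

        shared.put("node-a", 100);
        shared.put("node-b", 100);

        try {
            for (String key : shared.keySet())
                shared.put(key + "-new", 0); // Adds a new key while iterating.

            return false;
        }
        catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        DeepCopyMessageSketch msg = new DeepCopyMessageSketch();

        msg.put(1, "node-a", 100);

        // Mutating the copy leaves the message's own maps untouched.
        msg.partitionsCopy().get(1).put("node-b", 100);

        System.out.println("nodes in message: " + msg.nodeCount(1));
        System.out.println("iteration failed: " + mutationDuringIterationFails());
    }
}
```

With an accessor of this shape, a caller on another thread can modify what it receives while the original maps are being serialized. The copy itself must still be taken at a point where nothing else is modifying the original maps.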