[
https://issues.apache.org/jira/browse/IGNITE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17135618#comment-17135618
]
Ivan Daschinskiy edited comment on IGNITE-5564 at 6/15/20, 9:20 AM:
--------------------------------------------------------------------
[~agoncharuk] I think this ticket is still relevant. I accidentally hit this
during DML (UPDATE or DELETE) on an atomic cache with persistence enabled when
a client reconnects to the cluster. A reproducer is attached
([^TestReconnectClientToRestartedCluster.java]); it fails in about 4 of 20
runs on my laptop.
I got this stack trace:
{code:java}
Caused by: java.lang.AssertionError: Invalid version for inner update
[isNew=false, entry=GridDhtCacheEntry [rdrs=ReaderId[] [], part=100,
super=GridDistributedCacheEntry [super=GridCacheMapEntry
[key=KeyCacheObjectImpl [part=100, val=100, hasValBytes=true],
val=PointOfInterest [idHash=1265448417, hash=1492433019, name=POI_100,
latitude=null, longitude=null, NAME=POI_100], ver=GridCacheVersion
[topVer=203354504, order=1591874500062, nodeOrder=1], hash=100, extras=null,
flags=0]]], newVer=GridCacheVersion [topVer=203354503, order=1591874501171,
nodeOrder=1]]
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.versionCheck(GridCacheMapEntry.java:6703)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:6090)
at org.apache.ignite.internal.processors.cache.GridCacheMapEntry$AtomicCacheUpdateClosure.call(GridCacheMapEntry.java:5863)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.invokeClosure(BPlusTree.java:3994)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree$Invoke.access$5700(BPlusTree.java:3888)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invokeDown(BPlusTree.java:2020)
at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.invoke(BPlusTree.java:1898)
... 26 more
{code}
Version: current master (2.9.0-SNAPSHOT)
was (Author: ivandasch):
[~agoncharuk] I think this ticket is still relevant. I accidentally hit this
during DML (UPDATE or DELETE) on an atomic cache with persistence enabled when
a client reconnects to the cluster. A reproducer is attached
([^TestReconnectClientToRestartedCluster.java]); it fails in about 4 of 20
runs on my laptop.
> Race between read-through and topology version update
> -----------------------------------------------------
>
> Key: IGNITE-5564
> URL: https://issues.apache.org/jira/browse/IGNITE-5564
> Project: Ignite
> Issue Type: Bug
> Components: cache
> Affects Versions: 1.7
> Reporter: Alexey Goncharuk
> Assignee: Dmitry Karachentsev
> Priority: Major
> Attachments:
> GridCachePartitionEvictionDuringReadThroughSelfTest.java,
> TestReconnectClientToRestartedCluster.java
>
>
> I occasionally observe the following assertion when working with an ATOMIC
> cache with a cache store while the topology is changing:
> {code}
> java.lang.AssertionError: Invalid version for inner update [isNew=false,
> entry=GridDhtAtomicCacheEntry [super=GridDhtCacheEntry [rdrs=[],
> locPart=GridDhtLocalPartition [id=157,
> map=org.apache.ignite.internal.processors.cache.GridCacheConcurrentMapImpl@7a99d0af,
> rmvQueue=GridCircularBuffer [sizeMask=31, idxGen=0], cntr=8,
> shouldBeRenting=false, state=OWNING, reservations=0, empty=false,
> createTime=06/21/2017 09:59:03], super=GridDistributedCacheEntry
> [super=GridCacheMapEntry [key=KeyCacheObjectImpl [val=1181,
> hasValBytes=true], val=CacheObjectImpl [val=1181, hasValBytes=true],
> startVer=1498028394357, ver=GridCacheVersion [topVer=109508344,
> time=1498028344708, order=1498028394358, nodeOrder=1], hash=1181,
> extras=GridCacheTtlEntryExtras [ttl=60000, expireTime=1498028404707],
> flags=0]]]], newVer=GridCacheVersion [topVer=109508343, time=1498028344709,
> order=1498028394369, nodeOrder=1]]
> at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2311)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2485)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1887)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1727)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.mapSingle(GridNearAtomicAbstractUpdateFuture.java:264)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.map(GridNearAtomicSingleUpdateFuture.java:494)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicSingleUpdateFuture.mapOnTopology(GridNearAtomicSingleUpdateFuture.java:436)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridNearAtomicAbstractUpdateFuture.map(GridNearAtomicAbstractUpdateFuture.java:209)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update0(GridDhtAtomicCache.java:1242)
> at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.put0(GridDhtAtomicCache.java:675)
> at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2294)
> at org.apache.ignite.internal.processors.cache.distributed.near.GridNearAtomicCache.put(GridNearAtomicCache.java:437)
> at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2271)
> at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1379)
> {code}
> The assertion fails because of a race between these events:
> 1) An update is mapped on topology version N
> 2) The topology changes and discovery updates the version to N+1, but the
> event is not yet processed by the exchange future
> 3) A read-through request comes in and performs the read. Inside the
> {{versionedValue()}} call, a new entry version is generated. Since the
> discovery version is already updated, the new entry version is based on
> topVer=N+1
> 4) The update request proceeds and read-locks the topology. Since the exchange
> future is not yet initialized, the request does not attempt to remap and
> proceeds with version N
> 5) The next entry version is generated using the request topology version N
> 6) Inside the entry update method, we assert that the new version is greater
> than the old version. In this scenario it is not, so the assertion fails
> Attached is a test reproducing the issue (see
> testConcurrentReadThroughUpdate())
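For illustration, the six steps above can be replayed sequentially in a small standalone sketch. This is not Ignite code: {{Ver}}, {{VerGen}} and {{updateVersionIsNewer}} are hypothetical names, and the sketch assumes versions are ordered by topVer first and by order second, which is consistent with the values in both assertion messages (the new version has a lower topVer but a higher order).

```java
import java.util.concurrent.atomic.AtomicLong;

public class VersionRaceSketch {
    /** Hypothetical stand-in for an entry version (assumption: topVer is compared before order). */
    public static final class Ver implements Comparable<Ver> {
        final long topVer;
        final long order;

        public Ver(long topVer, long order) {
            this.topVer = topVer;
            this.order = order;
        }

        @Override public int compareTo(Ver o) {
            int c = Long.compare(topVer, o.topVer);
            return c != 0 ? c : Long.compare(order, o.order);
        }
    }

    /** Hypothetical generator: order grows monotonically, topVer is whatever the caller passes in. */
    public static final class VerGen {
        private final AtomicLong order = new AtomicLong();

        public Ver next(long topVer) {
            return new Ver(topVer, order.incrementAndGet());
        }
    }

    /** Replays steps 1-6 in order; returns true if the update's version is greater than the entry's. */
    public static boolean updateVersionIsNewer(long n) {
        VerGen gen = new VerGen();

        long reqTopVer = n;                    // 1) update is mapped on topology version N
        long discoTopVer = n + 1;              // 2) discovery already reports N+1
        Ver entryVer = gen.next(discoTopVer);  // 3) read-through stamps the entry with topVer=N+1
        Ver newVer = gen.next(reqTopVer);      // 4-5) update proceeds without remap, uses topVer=N

        return newVer.compareTo(entryVer) > 0; // 6) the condition the assertion checks
    }

    public static void main(String[] args) {
        System.out.println(updateVersionIsNewer(5));
    }
}
```

Under these assumptions {{updateVersionIsNewer}} returns false for any N: the update's version was generated later (higher order) but still compares as older, because its topVer is one behind the entry's.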
--
This message was sent by Atlassian Jira
(v8.3.4#803005)