[
https://issues.apache.org/jira/browse/IGNITE-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pavel Pereslegin updated IGNITE-16118:
--------------------------------------
Attachment: image(1).png
Description:
In a rare case, you might observe a confusing {{GridCacheEntryRemovedException
"Failed to send TTL update request"}} in logs while *reading a non-expired*
cache value.
{noformat}
[ERROR][sys-#258%expiry.EntryRemovedOnReadTest2%][root] <default> Failed to
send TTL update request.
org.apache.ignite.internal.processors.cache.GridCacheEntryRemovedException
at
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.checkObsolete(GridCacheMapEntry.java:3052)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReadersLocked(GridDhtCacheEntry.java:732)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReaders(GridDhtCacheEntry.java:708)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.readers(GridDhtCacheEntry.java:416)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$8.run(GridDhtCacheAdapter.java:1122)
at
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7329)
at
org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
It looks like there is a race between {{sendTtlUpdateRequest(UUID,
GridCacheTtlUpdateRequest)}} and heap entry eviction.
This situation is currently not handled correctly, resulting in a hidden
NullPointerException that aborts TTL updates.
Due to the nature of the TTL updates in Ignite, this problem can only occur
when the TTL is updated from the backup node.
!image(1).png!
Thus, in a very rare case, we may notice a lack of updates in the near-cache
and on "non-initiator" backups
was:
In a rare case, you might observe a confusing {{GridCacheEntryRemovedException
"Failed to send TTL update request"}} in logs while *reading a non-expired*
cache value.
{noformat}
[ERROR][sys-#258%expiry.EntryRemovedOnReadTest2%][root] <default> Failed to
send TTL update request.
org.apache.ignite.internal.processors.cache.GridCacheEntryRemovedException
at
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.checkObsolete(GridCacheMapEntry.java:3052)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReadersLocked(GridDhtCacheEntry.java:732)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReaders(GridDhtCacheEntry.java:708)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.readers(GridDhtCacheEntry.java:416)
at
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$8.run(GridDhtCacheAdapter.java:1122)
at
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7329)
at
org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
at
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{noformat}
It looks like there is a race between {{sendTtlUpdateRequest(UUID,
GridCacheTtlUpdateRequest)}} and heap entry eviction. In this case, the
keys-loop is interrupted by the NPE ( which is currently not outputting
anywhere).
Reproducer:
{code:java}
public class EntryRemovedOnReadTest extends GridCommonAbstractTest {
private final ListeningTestLogger log = new
ListeningTestLogger(GridAbstractTest.log);
@Override protected IgniteConfiguration getConfiguration(String
igniteInstanceName) throws Exception {
return super.getConfiguration(igniteInstanceName)
.setGridLogger(log)
.setCacheConfiguration(new CacheConfiguration<Integer,
Integer>(DEFAULT_CACHE_NAME)
.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)
.setCacheMode(REPLICATED)
);
}
@Test
public void test() throws Exception {
LogListener lsnr = LogListener.matches("Failed to send TTL update
request").build();
log.registerListener(lsnr);
startGridsMultiThreaded(4);
Map<Integer, Integer> vals = new TreeMap<>();
for (int i = 1; i < 10; i++)
vals.put(i, i);
jcache(0).putAll(vals);
assertFalse(lsnr.check());
long timeout = 20_000L;
long stopTime = System.currentTimeMillis() + timeout;
int iter = 0;
IgniteCache<Object, Object> cache = jcache(1).withExpiryPolicy(new
ExpiryPolicy() {
@Override public Duration getExpiryForAccess() {
return new Duration(TimeUnit.MILLISECONDS, timeout);
}
@Override public Duration getExpiryForCreation() { return null; }
@Override public Duration getExpiryForUpdate() { return null; }
});
while (System.currentTimeMillis() < stopTime) {
cache.getAll(vals.keySet());
assertFalse("iter=" + ++iter, lsnr.check());
}
}
}
{code}
> Near cache entry can miss a TTL update if an NPE occurs on the primary node.
> ----------------------------------------------------------------------------
>
> Key: IGNITE-16118
> URL: https://issues.apache.org/jira/browse/IGNITE-16118
> Project: Ignite
> Issue Type: Bug
> Reporter: Pavel Pereslegin
> Assignee: Pavel Pereslegin
> Priority: Minor
> Labels: ise
> Fix For: 2.13
>
> Attachments: image(1).png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In a rare case, you might observe a confusing
> {{GridCacheEntryRemovedException "Failed to send TTL update request"}} in
> logs while *reading a non-expired* cache value.
> {noformat}
> [ERROR][sys-#258%expiry.EntryRemovedOnReadTest2%][root] <default> Failed to
> send TTL update request.
> org.apache.ignite.internal.processors.cache.GridCacheEntryRemovedException
> at
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.checkObsolete(GridCacheMapEntry.java:3052)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReadersLocked(GridDhtCacheEntry.java:732)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReaders(GridDhtCacheEntry.java:708)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.readers(GridDhtCacheEntry.java:416)
> at
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$8.run(GridDhtCacheAdapter.java:1122)
> at
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7329)
> at
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
> at
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {noformat}
> It looks like there is a race between {{sendTtlUpdateRequest(UUID,
> GridCacheTtlUpdateRequest)}} and heap entry eviction.
> This situation is currently not handled correctly, resulting in a hidden
> NullPointerException that aborts TTL updates.
> Due to the nature of the TTL updates in Ignite, this problem can only occur
> when the TTL is updated from the backup node.
> !image(1).png!
> Thus, in a very rare case, we may notice a lack of updates in the near-cache
> and on "non-initiator" backups
--
This message was sent by Atlassian Jira
(v8.20.1#820001)