Pavel Pereslegin created IGNITE-16118: -----------------------------------------
Summary: GridCacheEntryRemovedException appears in server log when performing concurrent cache read with expiry policy. Key: IGNITE-16118 URL: https://issues.apache.org/jira/browse/IGNITE-16118 Project: Ignite Issue Type: Bug Reporter: Pavel Pereslegin Assignee: Pavel Pereslegin In a rare case, you might observe a confusing {{GridCacheEntryRemovedException "Failed to send TTL update request"}} in logs while concurrently *reading a non-expired* cache value. {noformat} [ERROR][sys-#258%expiry.EntryRemovedOnReadTest2%][root] <default> Failed to send TTL update request. org.apache.ignite.internal.processors.cache.GridCacheEntryRemovedException at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.checkObsolete(GridCacheMapEntry.java:3052) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReadersLocked(GridDhtCacheEntry.java:732) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReaders(GridDhtCacheEntry.java:708) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.readers(GridDhtCacheEntry.java:416) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$8.run(GridDhtCacheAdapter.java:1122) at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7329) at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) {noformat} It looks like there is a race between {{sendTtlUpdateRequest(UUID, GridCacheTtlUpdateRequest)}} and heap entry eviction, in this case, the keys-loop is interrupted by the NPE, which is currently not outputting anywhere. Since ttl updates are not ordered (see IGNITE-305), it is difficult to reproduce the real-world problem that this issue could cause, but we shouldn't log a weird exception anyway and continue the loop without NPE. Reproducer: {code:java} public class EntryRemovedOnReadTest extends GridCommonAbstractTest { private final ListeningTestLogger log = new ListeningTestLogger(GridAbstractTest.log); @Override protected IgniteConfiguration getConfiguration(String igniteInstanceName) throws Exception { return super.getConfiguration(igniteInstanceName) .setGridLogger(log) .setCacheConfiguration(new CacheConfiguration<Integer, Integer>(DEFAULT_CACHE_NAME) .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL) .setCacheMode(REPLICATED) ); } @Test public void test() throws Exception { LogListener lsnr = LogListener.matches("Failed to send TTL update request").build(); log.registerListener(lsnr); startGridsMultiThreaded(4); Map<Integer, Integer> vals = new TreeMap<>(); for (int i = 1; i < 10; i++) vals.put(i, i); jcache(0).putAll(vals); assertFalse(lsnr.check()); long timeout = 20_000L; long stopTime = System.currentTimeMillis() + timeout; int iter = 0; IgniteCache<Object, Object> cache = jcache(1).withExpiryPolicy(new ExpiryPolicy() { @Override public Duration getExpiryForAccess() { return new Duration(TimeUnit.MILLISECONDS, timeout); } @Override public Duration getExpiryForCreation() { return null; } @Override public Duration getExpiryForUpdate() { return null; } }); while (System.currentTimeMillis() < stopTime) { cache.getAll(vals.keySet()); assertFalse("iter=" + ++iter, lsnr.check()); } } } {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)