[ 
https://issues.apache.org/jira/browse/IGNITE-16118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Pereslegin updated IGNITE-16118:
--------------------------------------
     Attachment: image(1).png
    Description: 
In a rare case, you might observe a confusing {{GridCacheEntryRemovedException 
"Failed to send TTL update request"}} in logs while *reading a non-expired* 
cache value.

{noformat}
[ERROR][sys-#258%expiry.EntryRemovedOnReadTest2%][root] <default> Failed to 
send TTL update request.
org.apache.ignite.internal.processors.cache.GridCacheEntryRemovedException
        at 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.checkObsolete(GridCacheMapEntry.java:3052)
        at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReadersLocked(GridDhtCacheEntry.java:732)
        at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReaders(GridDhtCacheEntry.java:708)
        at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.readers(GridDhtCacheEntry.java:416)
        at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$8.run(GridDhtCacheAdapter.java:1122)
        at 
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7329)
        at 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
        at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{noformat}

It looks like there is a race between {{sendTtlUpdateRequest(UUID, 
GridCacheTtlUpdateRequest)}} and heap entry eviction.

This situation is currently not handled correctly, resulting in a hidden 
NullPointerException that aborts TTL updates.

Due to the nature of the TTL updates in Ignite, this problem can only occur 
when the TTL is updated from the backup node.

 !image(1).png! 

Thus, in a very rare case, we may notice a lack of updates in the near-cache 
and on "non-initiator" backups 

  was:
In a rare case, you might observe a confusing {{GridCacheEntryRemovedException 
"Failed to send TTL update request"}} in logs while *reading a non-expired* 
cache value.

{noformat}
[ERROR][sys-#258%expiry.EntryRemovedOnReadTest2%][root] <default> Failed to 
send TTL update request.
org.apache.ignite.internal.processors.cache.GridCacheEntryRemovedException
        at 
org.apache.ignite.internal.processors.cache.GridCacheMapEntry.checkObsolete(GridCacheMapEntry.java:3052)
        at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReadersLocked(GridDhtCacheEntry.java:732)
        at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReaders(GridDhtCacheEntry.java:708)
        at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.readers(GridDhtCacheEntry.java:416)
        at 
org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$8.run(GridDhtCacheAdapter.java:1122)
        at 
org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7329)
        at 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
        at 
org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{noformat}

It looks like there is a race between {{sendTtlUpdateRequest(UUID, 
GridCacheTtlUpdateRequest)}} and heap entry eviction. In this case, the 
keys-loop is interrupted by the NPE ( which is currently not outputting 
anywhere). 

Reproducer:

{code:java}
public class EntryRemovedOnReadTest extends GridCommonAbstractTest {
    private final ListeningTestLogger log = new 
ListeningTestLogger(GridAbstractTest.log);

    @Override protected IgniteConfiguration getConfiguration(String 
igniteInstanceName) throws Exception {
        return super.getConfiguration(igniteInstanceName)
            .setGridLogger(log)
            .setCacheConfiguration(new CacheConfiguration<Integer, 
Integer>(DEFAULT_CACHE_NAME)
                .setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL)
                .setCacheMode(REPLICATED)
            );
    }

    @Test
    public void test() throws Exception {
        LogListener lsnr = LogListener.matches("Failed to send TTL update 
request").build();

        log.registerListener(lsnr);

        startGridsMultiThreaded(4);

        Map<Integer, Integer> vals = new TreeMap<>();

        for (int i = 1; i < 10; i++)
            vals.put(i, i);

        jcache(0).putAll(vals);

        assertFalse(lsnr.check());

        long timeout = 20_000L;
        long stopTime = System.currentTimeMillis() + timeout;
        int iter = 0;

        IgniteCache<Object, Object> cache = jcache(1).withExpiryPolicy(new 
ExpiryPolicy() {
            @Override public Duration getExpiryForAccess() {
                return new Duration(TimeUnit.MILLISECONDS, timeout);
            }

            @Override public Duration getExpiryForCreation() {  return null; }
            @Override public Duration getExpiryForUpdate() { return null; }
        });

        while (System.currentTimeMillis() < stopTime) {
            cache.getAll(vals.keySet());

            assertFalse("iter=" + ++iter, lsnr.check());
        }
    }
}
{code}



> Near cache entry can miss a TTL update if an NPE occurs on the primary node.
> ----------------------------------------------------------------------------
>
>                 Key: IGNITE-16118
>                 URL: https://issues.apache.org/jira/browse/IGNITE-16118
>             Project: Ignite
>          Issue Type: Bug
>            Reporter: Pavel Pereslegin
>            Assignee: Pavel Pereslegin
>            Priority: Minor
>              Labels: ise
>             Fix For: 2.13
>
>         Attachments: image(1).png
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a rare case, you might observe a confusing 
> {{GridCacheEntryRemovedException "Failed to send TTL update request"}} in 
> logs while *reading a non-expired* cache value.
> {noformat}
> [ERROR][sys-#258%expiry.EntryRemovedOnReadTest2%][root] <default> Failed to 
> send TTL update request.
> org.apache.ignite.internal.processors.cache.GridCacheEntryRemovedException
>       at 
> org.apache.ignite.internal.processors.cache.GridCacheMapEntry.checkObsolete(GridCacheMapEntry.java:3052)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReadersLocked(GridDhtCacheEntry.java:732)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.checkReaders(GridDhtCacheEntry.java:708)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry.readers(GridDhtCacheEntry.java:416)
>       at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter$8.run(GridDhtCacheAdapter.java:1122)
>       at 
> org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7329)
>       at 
> org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
>       at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> {noformat}
> It looks like there is a race between {{sendTtlUpdateRequest(UUID, 
> GridCacheTtlUpdateRequest)}} and heap entry eviction.
> This situation is currently not handled correctly, resulting in a hidden 
> NullPointerException that aborts TTL updates.
> Due to the nature of the TTL updates in Ignite, this problem can only occur 
> when the TTL is updated from the backup node.
>  !image(1).png! 
> Thus, in a very rare case, we may notice a lack of updates in the near-cache 
> and on "non-initiator" backups 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to