Denis Magda created IGNITE-2465:
-----------------------------------

             Summary: Assertion in load cache closure
                 Key: IGNITE-2465
                 URL: https://issues.apache.org/jira/browse/IGNITE-2465
             Project: Ignite
          Issue Type: Bug
          Components: cache
    Affects Versions: 1.5.0.final
            Reporter: Denis Magda
            Assignee: Artem Shutak
            Priority: Blocker
             Fix For: 1.6

This is a tricky one. Every once in a while I get an assertion error caused by a null cache instance. It's difficult to reproduce, but the reason is more or less clear.

First, here's the sequence of events:
1) Node N0 starts a cache with GridGain's LocalCacheStore configured (see the cache config below).
2) N0 also registers a listener for the Ignite DISCO_EVENTS.
3) Node N1 joins the cluster.
4) N0 receives a discovery event (EVT_NODE_JOINED) and triggers cache loading using IgniteCache.loadCache(null).
5) N1 throws an AssertionError because the Ignite.cache("persistent-cache") call returns null.

From the log snippet below you can see that the exception is reported first, and a millisecond later GridCacheProcessor reports that the cache was started. This means that the cache load closure starts executing on node N1 a bit too early, while the cache is still being started. I believe Ignite must be able to handle such a race properly.

{noformat}
9319 [pub-#212%N1%] ERROR GridJobWorker - Failed to execute job due to unexpected runtime exception [jobId=0af5bad7251-9f7af4ba-6a64-4de4-b5d2-81d59be05303, ses=GridJobSessionImpl [ses=GridTaskSessionImpl [taskName=o.a.i.i.processors.cache.GridCacheAdapter$LoadCacheClosure, dep=LocalDeployment [super=GridDeployment [ts=1453807336443, depMode=SHARED, clsLdr=sun.misc.Launcher$AppClassLoader@15db9742, clsLdrId=46f5bad7251-9f7af4ba-6a64-4de4-b5d2-81d59be05303, userVer=0, loc=true, sampleClsName=java.lang.String, pendingUndeploy=false, undeployed=false, usage=0]], taskClsName=o.a.i.i.processors.cache.GridCacheAdapter$LoadCacheClosure, sesId=e9f5bad7251-1edbab1e-37bf-424e-a9e1-0c866b95009d, startTime=1453807336874, endTime=9223372036854775807, taskNodeId=1edbab1e-37bf-424e-a9e1-0c866b95009d, clsLdr=sun.misc.Launcher$AppClassLoader@15db9742, closed=false, cpSpi=null, failSpi=null, loadSpi=null, usage=1, fullSup=false, subjId=1edbab1e-37bf-424e-a9e1-0c866b95009d, mapFut=IgniteFuture [orig=GridFutureAdapter [resFlag=0, 
res=null, startTime=1453807336894, endTime=0, ignoreInterrupts=false, lsnr=null, state=INIT]]], jobId=0af5bad7251-9f7af4ba-6a64-4de4-b5d2-81d59be05303]]
java.lang.AssertionError: persistent-cache
	at org.apache.ignite.internal.processors.cache.GridCacheAdapter$LoadCacheClosure.call(GridCacheAdapter.java:5788)
	at org.apache.ignite.internal.processors.cache.GridCacheAdapter$LoadCacheClosure.call(GridCacheAdapter.java:5740)
	at org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1789)
	at org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:509)
	at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6397)
	at org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:503)
	at org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:456)
	at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
	at org.apache.ignite.internal.processors.job.GridJobProcessor.processJobExecuteRequest(GridJobProcessor.java:1166)
	at org.apache.ignite.internal.processors.job.GridJobProcessor$JobExecutionListener.onMessage(GridJobProcessor.java:1770)
	at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:821)
	at org.apache.ignite.internal.managers.communication.GridIoManager.access$1600(GridIoManager.java:103)
	at org.apache.ignite.internal.managers.communication.GridIoManager$5.run(GridIoManager.java:784)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
9320 [Thread-19] INFO GridCacheProcessor - Started cache [name=persistent-cache, mode=REPLICATED]
{noformat}

For reference, here's the code that configures the cache:

{noformat}
CacheConfiguration<K, V> config = new CacheConfiguration<>("persistent-cache");
config.setCacheMode(CacheMode.REPLICATED);
config.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
config.setRebalanceMode(CacheRebalanceMode.SYNC);
config.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
config.setStartSize(1024);
config.setCacheStoreFactory(new LocalCacheStoreFactory(somepath));
config.setWriteThrough(true);
{noformat}
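
To make the race in this report concrete, here is a minimal plain-Java sketch of the timing problem and of one possible way a load closure could tolerate it: waiting for the cache start to complete instead of asserting that the cache is already there. Everything in it is hypothetical and for illustration only (the CacheStartRace and CacheRegistry classes, the awaitCache and onCacheStarted methods, the 5 second timeout). It does not use Ignite's API and is not a description of how Ignite works internally or of how this issue should be fixed.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustration only: simulates a load job that arrives before the cache has started. */
final class CacheStartRace {

    /** Hypothetical stand-in for a node's cache registry; not an Ignite class. */
    static final class CacheRegistry {
        private final Map<String, CompletableFuture<Map<String, String>>> starts =
            new ConcurrentHashMap<>();

        private CompletableFuture<Map<String, String>> slot(String name) {
            return starts.computeIfAbsent(name, n -> new CompletableFuture<>());
        }

        /** Plain lookup: returns null while the cache is still starting. */
        Map<String, String> cache(String name) {
            return slot(name).getNow(null);
        }

        /** Called by the (simulated) startup thread once the cache is ready. */
        void onCacheStarted(String name) {
            slot(name).complete(new ConcurrentHashMap<>());
        }

        /** Blocks until the cache has started or the timeout expires. */
        Map<String, String> awaitCache(String name, long timeoutMs) {
            try {
                return slot(name).get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for cache: " + name, e);
            } catch (ExecutionException | TimeoutException e) {
                throw new IllegalStateException("Cache did not start in time: " + name, e);
            }
        }
    }

    /** Load closure that waits for the cache instead of asserting it is non-null. */
    static int loadCache(CacheRegistry registry, String name,
                         Map<String, String> store, long timeoutMs) {
        Map<String, String> cache = registry.awaitCache(name, timeoutMs);
        cache.putAll(store);
        return cache.size();
    }

    public static void main(String[] args) throws InterruptedException {
        CacheRegistry registry = new CacheRegistry();
        Map<String, String> store = Map.of("k1", "v1", "k2", "v2");

        // The job arrives first: a plain lookup still sees no cache.
        System.out.println("before start: " + registry.cache("persistent-cache"));

        // The cache finishes starting shortly afterwards, on another thread.
        Thread starter = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            registry.onCacheStarted("persistent-cache");
        });
        starter.start();

        int loaded = loadCache(registry, "persistent-cache", store, 5_000);
        System.out.println("loaded entries: " + loaded);
        starter.join();
    }
}
```

In this sketch the plain lookup corresponds to the Ignite.cache("persistent-cache") call from step 5, which can observe the cache before it is started, while the waiting variant only proceeds once the start has been signalled. Whether waiting, retrying, or rejecting the job is the right behaviour for the real closure is a separate design decision.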