Hi everyone, We use Ignite 1.9, three node cluster at the moment(partitioned cache with 1 backup, and eviction max 500MB, cache expiry = 15minutes), and we can see following behavior.
Following [1] logs are from one JVM. Cluster started on 2018-09-26 2 AM. And this issue started happening predictably after couple of days.(we see this many time over the past months). And when recycled the JVM everything comes back normal. Here you can see three cache puts happening few seconds apart. And If you compare time stamp, after every time the put happens, it's followed by an eviction. We can see the same pattern of logs for every cache key coming in to the system. Meaning when the second request comes in few seconds later, it's been evicted and cache miss. The only cache operations we use entirely is Put, get, contains and remove. For the whole time from server startup there wasn't any errors from ignite side. Not even a timeout from a tcp com spi. Only significant thing I could find was, few hours before this behavior (which is eventually the cache hit rates going down because of this eviction behavior), taken place, there was 36,000 "CACHE_ENTRY_EVICTED" events fired within one second. (2018-09-27 11:51:18 to 2018-09-27 11:51:19). And 4084 and 4056 are the cache sizes one minute before and after above scenario as per ignite metrics. Looks very suspicious to me, but could not find anything to relate to this. No error logs, heap, cpu is fine all the time. [1] CACHE_OBJECT_PUT 2018-09-28 12:48:41,844 Event received: CacheEvent [cacheName=mycache, part=214, key=06cda95e28ba8266e1f2d94cfbbcbeb308cccd1904fcac4f54a1d48a0e224752, xid=null, lockId=GridCacheVersion [topVer=149325845, time=3076306530984, order=1537944629823, nodeOrder=13], newVal=com.main.CacheableResponse @47386d2b, oldVal=null, hasOldVal=false, hasNewVal=true, near=false, subjId=37f25eae-79dd-4de7-90a6-0ee34a0f5c9c, cloClsName=null, taskName=null, nodeId8=cc93a6bb, evtNodeId8=4d018b28, msg=Cache event., type=CACHE_OBJECT_PUT, tstamp=1538153321836] 2018-09-28 12:48:43,643 Event received: CacheEvent [cacheName=mycache, part=214, key=06cda95e28ba8266e1f2d94cfbbcbeb308cccd1904fcac4f54a1d48a0e224752, xid=null, lockId=GridCacheVersion [topVer=149325845, time=3076306532784, order=1537944629892, nodeOrder=13], newVal= com.main.CacheableResponse @28bf326c, oldVal=null, hasOldVal=false, hasNewVal=true, near=false, subjId=4d018b28-5e69-49f7-9d5e-a0c3c098b71b, cloClsName=null, taskName=null, nodeId8=cc93a6bb, evtNodeId8=4d018b28, msg=Cache event., type=CACHE_OBJECT_PUT, tstamp=1538153323637] 2018-09-28 12:48:59,290 Event received: CacheEvent [cacheName=mycache, part=214, key=06cda95e28ba8266e1f2d94cfbbcbeb308cccd1904fcac4f54a1d48a0e224752, xid=null, lockId=GridCacheVersion [topVer=149325845, time=3076306668429, order=1537944630394, nodeOrder=13], newVal= com.main.CacheableResponse @154a3c61, oldVal=null, hasOldVal=false, hasNewVal=true, near=false, subjId=cc93a6bb-0736-423b-92e9-91ae7883ec93, cloClsName=null, taskName=null, nodeId8=cc93a6bb, evtNodeId8=4d018b28, msg=Cache event., type=CACHE_OBJECT_PUT, tstamp=1538153339280] CACHE_ENTRY_EVICTED 2018-09-28 12:48:42,177 Event received: CacheEvent [cacheName=mycache, part=214, key=06cda95e28ba8266e1f2d94cfbbcbeb308cccd1904fcac4f54a1d48a0e224752, xid=null, lockId=null, newVal=null, oldVal= com.main.CacheableResponse@717a7062, hasOldVal=true, hasNewVal=false, near=false, subjId=null, cloClsName=null, taskName=null, nodeId8=cc93a6bb, evtNodeId8=cc93a6bb, msg=Cache event., type=CACHE_ENTRY_EVICTED, tstamp=1538153322171] 2018-09-28 12:48:53,202 Event received: CacheEvent [cacheName=mycache, part=214, key=06cda95e28ba8266e1f2d94cfbbcbeb308cccd1904fcac4f54a1d48a0e224752, xid=null, lockId=null, newVal=null, oldVal= com.main.CacheableResponse @46434293, hasOldVal=true, hasNewVal=false, near=false, subjId=null, cloClsName=null, taskName=null, nodeId8=cc93a6bb, evtNodeId8=cc93a6bb, msg=Cache event., type=CACHE_ENTRY_EVICTED, tstamp=1538153333200] 2018-09-28 12:49:00,552 Event received: CacheEvent [cacheName= mycache, part=214, key=06cda95e28ba8266e1f2d94cfbbcbeb308cccd1904fcac4f54a1d48a0e224752, xid=null, lockId=null, newVal=null, oldVal= com.main.CacheableResponse @51d4e8e4, hasOldVal=true, hasNewVal=false, near=false, subjId=null, cloClsName=null, taskName=null, nodeId8=cc93a6bb, evtNodeId8=cc93a6bb, msg=Cache event., type=CACHE_ENTRY_EVICTED, tstamp=1538153340544]
