I think it would be fairer and simpler to configure distributed
expiration as a flag in the cache configuration.
By the way, we still have to store an ordered set of expirable entries
on every node. Once https://issues.apache.org/jira/browse/IGNITE-5874
is merged, we can do the following: if distributed expiration is
enabled, the primary node will scan PendingEntriesTree and generate
remove requests; if it's disabled, every node will clear its own
PendingEntriesTree.
This will allow the user to switch distributed expiration on/off after
a grid restart.
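The two modes described above can be sketched roughly as follows. This
is only a toy model, not Ignite code: Node, pollExpired and expire are
hypothetical names, a TreeMap stands in for PendingEntriesTree, and
"remove requests" are modelled as direct method calls instead of
network messages.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the proposed flag; none of these names are Ignite APIs. */
final class ExpirationSketch {
    static final class Node {
        final boolean primary;
        final long clock; // this node's local time
        final Map<String, String> data = new HashMap<>();
        // Stand-in for PendingEntriesTree: expire time -> keys, ordered by time.
        final TreeMap<Long, List<String>> pending = new TreeMap<>();

        Node(boolean primary, long clock) {
            this.primary = primary;
            this.clock = clock;
        }

        void put(String key, String val, long expireTime) {
            data.put(key, val);
            pending.computeIfAbsent(expireTime, t -> new ArrayList<>()).add(key);
        }

        /** Removes from the pending tree and returns all keys due by the local clock. */
        List<String> pollExpired() {
            List<String> keys = new ArrayList<>();
            Map<Long, List<String>> due = pending.headMap(clock, true);
            for (List<String> ks : due.values())
                keys.addAll(ks);
            due.clear();
            return keys;
        }

        /** Applies a remove request: drops the value and its pending record. */
        void remove(String key) {
            data.remove(key);
            pending.values().removeIf(ks -> { ks.remove(key); return ks.isEmpty(); });
        }
    }

    static void expire(List<Node> nodes, boolean distributed) {
        for (Node n : nodes) {
            if (distributed) {
                if (!n.primary)
                    continue; // backups only react to remove requests
                for (String key : n.pollExpired())
                    for (Node target : nodes)
                        target.remove(key);
            }
            else {
                // Local mode: every node clears its own tree by its own clock.
                for (String key : n.pollExpired())
                    n.data.remove(key);
            }
        }
    }
}
```

With a primary whose clock is at 100, a backup whose clock is at 90 and
an entry expiring at 95, local mode leaves the entry on the backup
only, while distributed mode removes it from both nodes at once.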
Best Regards,
Ivan Rakov
On 24.04.2018 17:02, Alexey Goncharuk wrote:
1.
Ivan,
Agree about the use case when we have a read-write-through store. However,
we allow using Ignite in-memory caches even without 3rd party stores; in
this case the same issue is still present. Maybe we can keep local expire
for read-through caches and have strongly consistent expire for other modes?
2018-04-24 16:51 GMT+03:00 Ivan Rakov <ivan.glu...@gmail.com>:
Alexey,
Distributed expire will result in serious performance overhead, mostly at
the network level.
I think the main use case of TTL is in-memory caches that accelerate
access to a slower third-party data source. In such a case nothing is broken
if data is missing; strong consistency guarantees are not needed. I think
that's why we should keep "local expiration" at least for in-memory caches.
Our in-memory page eviction works in the same way.
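To illustrate why a missing entry is harmless in that use case, here is
a minimal sketch of a read-through cache with local TTL. It is plain
Java, not the Ignite API: ReadThroughTtlCache, store and storeLoads are
hypothetical names, and the store is assumed to be a simple function
from key to value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy read-through cache with local TTL; not an Ignite class. */
final class ReadThroughTtlCache {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> expireTimes = new HashMap<>();
    private final Function<String, String> store; // the slower data source
    private final long ttl;

    /** Number of times the store was consulted. */
    int storeLoads;

    ReadThroughTtlCache(Function<String, String> store, long ttl) {
        this.store = store;
        this.ttl = ttl;
    }

    String get(String key, long now) {
        Long exp = expireTimes.get(key);
        if (exp == null || exp <= now) {
            // Miss or locally expired: reload from the store.
            // The only cost of an early expire is an extra load.
            values.put(key, store.apply(key));
            expireTimes.put(key, now + ttl);
            storeLoads++;
        }
        return values.get(key);
    }
}
```

Whenever a node drops the entry, the next read returns the same value
from the store, so readers never observe a wrong result, only a slower
one.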
Best Regards,
Ivan Rakov
On 24.04.2018 16:05, Alexey Goncharuk wrote:
Igniters,
We recently experienced some issues with TTL when persistence is enabled;
the issues were related to persistence implementation details. However,
when we were adding tests to cover more cases, we found more failures,
which, I think, reveal some fundamental issues with the expire mechanism.
In short, the root cause of the issue is that we expire entries on primary
and backup nodes independently, which means:
1) Partition sizes may differ between nodes at a given moment, which will
trigger false-negative results in the recently added checks on partition
map exchange
2) More importantly, this may lead to inconsistent primary and backup node
values when EntryProcessor is used, because an entry processor may observe
a non-null value on one node and a null value on another node.
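The second point can be reproduced with a small simulation. This is a
simplified model, not Ignite code: Replica and invoke are hypothetical
names, each replica expires entries by its own clock, and the entry
processor is assumed to be applied on every replica to whatever value
that replica currently observes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Toy model of independent expiry on primary and backup; not Ignite code. */
final class EntryProcessorDivergence {
    static final class Replica {
        final long clock; // this replica's local time
        final Map<String, Integer> values = new HashMap<>();
        final Map<String, Long> expireTimes = new HashMap<>();

        Replica(long clock) {
            this.clock = clock;
        }

        void put(String key, int val, long expireTime) {
            values.put(key, val);
            expireTimes.put(key, expireTime);
        }

        /** Applies the processor to the locally observed value (null if expired here). */
        Integer invoke(String key, UnaryOperator<Integer> proc) {
            Long exp = expireTimes.get(key);
            if (exp != null && exp <= clock) {
                // Local expiry: this replica no longer sees the entry.
                values.remove(key);
                expireTimes.remove(key);
            }
            Integer updated = proc.apply(values.get(key));
            values.put(key, updated);
            return updated;
        }
    }

    public static void main(String[] args) {
        Replica primary = new Replica(100); // entry already expired here
        Replica backup = new Replica(90);   // entry still alive here
        primary.put("k", 5, 95);
        backup.put("k", 5, 95);

        // "Increment, or start from 1 if there is no value".
        UnaryOperator<Integer> inc = v -> v == null ? 1 : v + 1;

        System.out.println("primary: " + primary.invoke("k", inc));
        System.out.println("backup:  " + backup.invoke("k", inc));
    }
}
```

The same update produces different results on the two replicas because
one of them has already expired the entry, and nothing reconciles the
copies afterwards.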
In my opinion, the second issue is critical and we must change the expiry
mechanics to run expiry in a distributed mode, with cache mode semantics
for entry remove.
Thoughts?