Hi Denis, should we treat the current behavior as a bug that needs to be fixed ASAP, or as a known limitation for now? As it stands, an IgniteOOM means that the whole cluster has to be restarted.
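
For illustration, the failure mode described in the original report quoted below can be modeled with a toy sketch. This is not Ignite code: the class `ToyPageMemory`, its methods, and the constant `ENTRIES_PER_META_PAGE` are all hypothetical, and the only assumption taken from the report is that putting a page on the free list may itself require a fresh page.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model only, not Ignite code. All names here are hypothetical. */
public class ToyPageMemory {
    /** Stand-in for IgniteOOM: the fixed-size region has no pages left. */
    public static class ToyOutOfMemoryException extends RuntimeException {
        public ToyOutOfMemoryException(String msg) { super(msg); }
    }

    /** Assumed number of free-list entries that fit in one metadata page. */
    private static final int ENTRIES_PER_META_PAGE = 2;

    private final int capacity;   // total pages in the region
    private int allocated;        // fresh pages handed out so far
    private int metaPages;        // pages owned by the free list itself
    private final Deque<Integer> freeList = new ArrayDeque<>();

    public ToyPageMemory(int capacity) { this.capacity = capacity; }

    /** Reuses a page from the free list if possible, else takes a fresh one. */
    public int allocatePage() {
        if (!freeList.isEmpty()) return freeList.pop();
        return allocateFresh();
    }

    private int allocateFresh() {
        if (allocated == capacity)
            throw new ToyOutOfMemoryException("no free pages left");
        return allocated++;
    }

    /** Freeing a page may need a fresh metadata page for the free list. */
    public void releasePage(int pageId) {
        if (freeList.size() == metaPages * ENTRIES_PER_META_PAGE) {
            allocateFresh();      // throws when the region is already full
            metaPages++;
        }
        freeList.push(pageId);
    }

    public static void main(String[] args) {
        ToyPageMemory mem = new ToyPageMemory(4);
        int[] pages = new int[4];
        for (int i = 0; i < 4; i++) pages[i] = mem.allocatePage();
        try {
            mem.releasePage(pages[0]);   // removal needs a page it cannot get
            System.out.println("released");
        } catch (ToyOutOfMemoryException e) {
            System.out.println("cannot free a page: " + e.getMessage());
        }
    }
}
```

With the region completely full, the first `releasePage` call fails because the free list has nowhere to record the released page, which mirrors why removals, rollbacks, and eviction after rebalancing all hit the same wall in the report.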

Thanks,
Mikhail.

On Thu, Dec 14, 2017 at 2:03 AM, Denis Magda <[email protected]> wrote:

> Hello Mikhail,
>
> This problem is related to the discussion around Ignite internal problems
> and their possible resolution:
> http://apache-ignite-developers.2346864.n4.nabble.com/Internal-problems-requiring-graceful-node-shutdown-reboot-etc-td24856.html
>
> Referring to that discussion, I would define a special IgniteFailureAction
> in response to IgniteOOM (IgniteFailureCause in terms of the new API). The
> action can purge or wipe out the page memory, or take other extra steps.
>
> --
> Denis
>
> > On Dec 13, 2017, at 9:14 AM, Mikhail Cherkasov <[email protected]>
> > wrote:
> >
> > Hi all,
> >
> > I ran into a problem: if Ignite runs out of memory and IgniteOOM is
> > thrown, there is no way to continue working with the cluster.
> >
> > You cannot remove part of the data to free some space, because during
> > removal Ignite tries to move pages to a free list, and the free list
> > tries to acquire more pages, but there is no space left for this.
> >
> > Ignite cannot roll back transactions properly for the same reason.
> > If IgniteOOM occurs during a transaction, Ignite will try to revert the
> > already applied changes and, as a result, will move some pages to the
> > free list, which hits the same problem as above: no space for the free
> > list.
> >
> > You cannot even add more nodes, because after rebalancing Ignite will
> > try to evict pages, which again requires space for the free list:
> > https://issues.apache.org/jira/browse/IGNITE-7019
> >
> > Do you have any ideas on how we can handle this properly?
> >
> > --
> > Thanks,
> > Mikhail.

--
Thanks,
Mikhail.
