Yakov,

Thanks for the response. I definitely like the idea of detecting Java-level deadlocks.
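For reference, the JVM already exposes deadlock detection through ThreadMXBean, so the check itself could be fairly small. Below is a minimal sketch under that assumption. The class name DeadlockProbe and the method findDeadlocks are hypothetical and not part of Ignite. How the result would feed into a user-configurable policy is left open here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper for illustration only; not an Ignite API.
final class DeadlockProbe {
    /** Returns info for threads that are currently deadlocked, or an empty array if there are none. */
    static ThreadInfo[] findDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        // Prefer the check that also covers java.util.concurrent locks when the JVM supports it;
        // otherwise fall back to monitor-only (synchronized) deadlocks.
        long[] ids = mx.isSynchronizerUsageSupported()
            ? mx.findDeadlockedThreads()
            : mx.findMonitorDeadlockedThreads();

        // Both methods return null when no deadlock exists.
        if (ids == null)
            return new ThreadInfo[0];

        // Request lock details only where the JVM supports them, to avoid UnsupportedOperationException.
        return mx.getThreadInfo(ids, mx.isObjectMonitorUsageSupported(), mx.isSynchronizerUsageSupported());
    }

    public static void main(String[] args) {
        for (ThreadInfo info : findDeadlocks()) {
            // Defensive: entries are null for threads that are no longer alive.
            if (info != null)
                System.err.println("Deadlocked thread: " + info);
        }
    }
}
```

Something like this could run periodically on a background thread, with the configured policy (log, notify, stop the node) applied whenever the returned array is non-empty.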

As for hangs caused by Ignite internal problems, do we have a ticket for this
as well? Do you have any idea about how this should be implemented?

-Val

On Mon, Jul 24, 2017 at 3:55 AM, Yakov Zhdanov <yzhda...@apache.org> wrote:

> Val, it seems you spotted an issue. Please file a ticket - I would suggest
> removing the exceptions entirely, as in my understanding timeout logic for
> atomic operations will bring additional overhead, while most of the time
> atomic operations are instant. From the timeout perspective, what
> distinguishes an atomic operation from a transaction is that you cannot
> predict when the user releases a lock acquired inside a transaction, but an
> atomic operation should have a predictable timeout.
>
> As for your example: currently, this will lead to a Java-level deadlock on
> synchronized sections for the cache entries (but when we move to pure
> thread-per-partition for atomic caches this will not be an issue any more
> https://issues.apache.org/jira/browse/IGNITE-4506). I would suggest we file
> a ticket to implement detection of Java-level deadlocks and allow the user
> to configure a policy to take appropriate action on deadlock wherever it
> happens - https://issues.apache.org/jira/browse/IGNITE-5811
>
> Any other hang of an atomic operation seems to be caused by issues in
> Ignite's internal machinery - either a hung exchange or problems in message
> processing on some node (e.g. all threads are busy and/or in deadlock) -
> which again should result in notifying the user and stopping the node (by
> default).
>
> --Yakov