[jira] [Commented] (IGNITE-6893) Java Deadlocks monitoring
[ https://issues.apache.org/jira/browse/IGNITE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17319650#comment-17319650 ]

Stanislav Lukyanov commented on IGNITE-6893:
--------------------------------------------

We don't only check for starvation; we check for any hanging Ignite threads using thread heartbeats.
The only way we can miss a deadlock is if it is in a user's thread. I don't think Ignite should be responsible for finding deadlocks in threads it doesn't manage.

> Java Deadlocks monitoring
> -------------------------
>
>                 Key: IGNITE-6893
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6893
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Anton Vinogradov (Obsolete, actual is "av")
>            Priority: Major
>              Labels: iep-7
>
> Java Level Deadlocks
> Description
> This situation occurs when user code or Ignite runs into a Java-level deadlock
> due to a bug in the code: reverse-order synchronized(mux1) {synchronized (mux2) {}}
> sections, reverse-order reentrant locks, etc.
> Detection and Solution
> This most likely cannot be resolved automatically and will require a JVM
> restart.
> We can implement periodic thread dump analysis to detect the deadlock.
> Report
> The deadlock should be reported to the logs.
> Web Console should fire an alert on Java deadlock detection and display a
> warning in the UI.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
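To illustrate the heartbeat idea from the comment above, here is a minimal sketch. It is not Ignite's actual implementation: the class `HeartbeatRegistry` and its methods are hypothetical names chosen for this example. Each managed worker records a timestamp as it makes progress, and a checker flags any worker whose last timestamp is older than a timeout, regardless of why it stopped (deadlock, starvation, or a long blocking call).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical registry of worker heartbeats; not an Ignite class. */
final class HeartbeatRegistry {
    /** Last heartbeat timestamp (milliseconds) per worker name. */
    private final Map<String, Long> lastBeat = new ConcurrentHashMap<>();

    /** Called by a worker whenever it makes progress. */
    void beat(String worker, long nowMs) {
        lastBeat.put(worker, nowMs);
    }

    /** Returns the workers whose last heartbeat is older than timeoutMs. */
    List<String> stalled(long nowMs, long timeoutMs) {
        List<String> res = new ArrayList<>();

        for (Map.Entry<String, Long> e : lastBeat.entrySet()) {
            if (nowMs - e.getValue() > timeoutMs)
                res.add(e.getKey());
        }

        return res;
    }
}
```

Passing the current time as a parameter keeps the sketch deterministic. A thread that never registers a heartbeat, such as a user's thread, is invisible to this check, which matches the limitation described in the comment.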
[jira] [Commented] (IGNITE-6893) Java Deadlocks monitoring
[ https://issues.apache.org/jira/browse/IGNITE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163574#comment-17163574 ]

Anton Vinogradov commented on IGNITE-6893:
------------------------------------------

[~slukyanov], a Java deadlock can be caused by a long list of problems.
If we check only for pool starvation, we lose this important level of guarantee.
[jira] [Commented] (IGNITE-6893) Java Deadlocks monitoring
[ https://issues.apache.org/jira/browse/IGNITE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17162843#comment-17162843 ]

Stanislav Lukyanov commented on IGNITE-6893:
--------------------------------------------

I believe we don't need this particular feature now that we have Failure Handlers. If there is a Java deadlock, it is almost guaranteed to lead to a blocked system thread, which will trigger the FH.
How about we close this, [~avinogradov] [~andrey-kuznetsov]?
[jira] [Commented] (IGNITE-6893) Java Deadlocks monitoring
[ https://issues.apache.org/jira/browse/IGNITE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456305#comment-16456305 ]

Andrey Kuznetsov commented on IGNITE-6893:
------------------------------------------

{{ThreadMXBean::findDeadlockedThreads}} doesn't care whether a lock acquisition has a timeout or not. In both cases, any cycle found will be reported as a deadlock. We can analyze the report more thoroughly by using thread state information (WAITING vs TIMED_WAITING).

> Java Deadlocks monitoring
> -------------------------
>
>                 Key: IGNITE-6893
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6893
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Anton Vinogradov
>            Assignee: Andrey Kuznetsov
>            Priority: Major
>              Labels: iep-7
>             Fix For: 2.6
>
> Java Level Deadlocks
> Description
> This situation occurs when user code or Ignite runs into a Java-level deadlock
> due to a bug in the code: reverse-order synchronized(mux1) {synchronized (mux2) {}}
> sections, reverse-order reentrant locks, etc.
> Detection and Solution
> This most likely cannot be resolved automatically and will require a JVM
> restart.
> We can implement periodic thread dump analysis to detect the deadlock.
> Report
> The deadlock should be reported to the logs.
> Web Console should fire an alert on Java deadlock detection and display a
> warning in the UI.
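The state-based analysis suggested in the comment above could look roughly like the following sketch. The class `DeadlockClassifier` and the labels it returns are assumptions made for this example, not part of Ignite. It uses only the standard `java.lang.management` API: threads in a reported cycle are tagged as being in an untimed wait (BLOCKED on a monitor, or WAITING on a lock) or a timed wait (TIMED_WAITING), the latter being a cycle that may resolve itself once the timeout expires.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; labels are illustrative only. */
final class DeadlockClassifier {
    /** Maps a thread state to a coarse label for the deadlock report. */
    static String classify(Thread.State state) {
        switch (state) {
            case BLOCKED:  // waiting to enter a synchronized section
            case WAITING:  // parked without a timeout, e.g. in Lock.lock()
                return "untimed wait";
            case TIMED_WAITING: // parked with a timeout, e.g. in tryLock(timeout)
                return "timed wait";
            default:
                return "other";
        }
    }

    /** Returns one line per thread found in a lock cycle; empty if none. */
    static List<String> report() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        // findDeadlockedThreads() requires synchronizer usage monitoring support.
        long[] ids = mx.isSynchronizerUsageSupported()
            ? mx.findDeadlockedThreads()
            : mx.findMonitorDeadlockedThreads();

        List<String> lines = new ArrayList<>();

        if (ids == null)
            return lines; // no cycle found

        for (ThreadInfo ti : mx.getThreadInfo(ids)) {
            if (ti == null)
                continue; // thread terminated between the two calls

            lines.add(ti.getThreadName() + " waits for " + ti.getLockName()
                + " [" + classify(ti.getThreadState()) + "]");
        }

        return lines;
    }
}
```

Under this scheme, a cycle in which every thread is in a timed wait could be logged as a warning rather than treated as a fatal deadlock; that policy is a design choice and is not specified in the thread.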
[jira] [Commented] (IGNITE-6893) Java Deadlocks monitoring
[ https://issues.apache.org/jira/browse/IGNITE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456022#comment-16456022 ]

Andrey Kuznetsov commented on IGNITE-6893:
------------------------------------------

I've run some benchmarks for {{ThreadMXBean::findDeadlockedThreads}} on both Linux and Windows, and the results are good enough. A program with 200 active threads continuously contending for 50 synchronization aids (monitors or locks) checks itself for deadlocks every 10 ms. At worst, deadlock detection consumes 0.1% of a single hardware thread.

So we can use it in a separate thread in Ignite, checking for deadlocks periodically and calling the failure handler if necessary.
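A minimal sketch of the periodic check proposed in the comment above, under stated assumptions: `DeadlockWatchdog` is a hypothetical class, and the `Consumer<long[]>` callback merely stands in for Ignite's failure handler, whose real interface is not shown in this thread. A single daemon thread calls the standard `ThreadMXBean` deadlock search at a fixed delay and passes the IDs of the deadlocked threads to the callback.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical watchdog; the callback is a stand-in for a failure handler. */
final class DeadlockWatchdog implements AutoCloseable {
    private final ThreadMXBean mx = ManagementFactory.getThreadMXBean();

    /** Daemon thread, so the watchdog never keeps the JVM alive. */
    private final ScheduledExecutorService exec =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "deadlock-watchdog");
            t.setDaemon(true);
            return t;
        });

    private final Consumer<long[]> onDeadlock;

    DeadlockWatchdog(Consumer<long[]> onDeadlock) {
        this.onDeadlock = onDeadlock;
    }

    /** Starts checking every periodMs milliseconds. */
    void start(long periodMs) {
        exec.scheduleWithFixedDelay(this::checkOnce, periodMs, periodMs, TimeUnit.MILLISECONDS);
    }

    /** Runs one check; returns true if a deadlock was found and reported. */
    boolean checkOnce() {
        long[] ids = mx.isSynchronizerUsageSupported()
            ? mx.findDeadlockedThreads()
            : mx.findMonitorDeadlockedThreads();

        if (ids == null)
            return false;

        onDeadlock.accept(ids);

        return true;
    }

    @Override public void close() {
        exec.shutdownNow();
    }
}
```

As written, the callback fires on every check for as long as the deadlock persists, and an exception thrown by the callback stops further scheduled checks. A real integration would likely report once and then let the failure handler decide what to do with the node.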
[jira] [Commented] (IGNITE-6893) Java Deadlocks monitoring
[ https://issues.apache.org/jira/browse/IGNITE-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16432429#comment-16432429 ]

Andrey Kuznetsov commented on IGNITE-6893:
------------------------------------------

The unified failure handling implemented in [1] can be used to handle deadlocks. As for detection, the most tempting approach is to use {{ThreadMXBean::findDeadlockedThreads}}, but we need to estimate its overhead first, especially for a large number of active threads.
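One rough way to estimate the overhead mentioned in the comment above is to time repeated calls in the target JVM. The sketch below is an assumption-laden illustration, not the benchmark used in this thread: `DeadlockCheckTimer` is a hypothetical name, and a simple `System.nanoTime()` loop gives only a coarse figure compared with a proper harness. The number is meaningful only when measured with a realistic count of active, contending threads.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical timing helper for a coarse overhead estimate. */
final class DeadlockCheckTimer {
    /** Returns the average wall-clock nanoseconds per deadlock check. */
    static long averageNanos(int iterations) {
        if (iterations <= 0)
            throw new IllegalArgumentException("iterations must be positive");

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean sync = mx.isSynchronizerUsageSupported();

        long start = System.nanoTime();

        for (int i = 0; i < iterations; i++) {
            long[] ids = sync ? mx.findDeadlockedThreads() : mx.findMonitorDeadlockedThreads();

            // A real deadlock would distort the measurement, so stop early.
            if (ids != null)
                throw new IllegalStateException("deadlock detected during measurement");
        }

        return (System.nanoTime() - start) / iterations;
    }
}
```

Dividing the result by the intended check period gives an approximate share of one hardware thread spent on detection.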