[ https://issues.apache.org/jira/browse/IGNITE-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16078100#comment-16078100 ]
ASF GitHub Bot commented on IGNITE-5473:
----------------------------------------

GitHub user AMashenkov opened a pull request:

    https://github.com/apache/ignite/pull/2262

    IGNITE-5473: Create ignite troubleshooting logger.

    Partial fix.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gridgain/apache-ignite ignite-5473

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/ignite/pull/2262.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2262

----
commit 6abe8ec92f016a18883332de2bc177fbab30a4c1
Author: Alexey Goncharuk <alexey.goncha...@gmail.com>
Date:   2017-06-14T18:37:54Z

    WIP.

commit 4ab6d52cfa2f0dda6170ad1dff80d4a42c2a0706
Author: Andrey V. Mashenkov <andrey.mashen...@gmail.com>
Date:   2017-06-30T11:45:18Z

    WIP.

----

> Create ignite troubleshooting logger
> ------------------------------------
>
>                 Key: IGNITE-5473
>                 URL: https://issues.apache.org/jira/browse/IGNITE-5473
>             Project: Ignite
>          Issue Type: Improvement
>          Components: general
>   Affects Versions: 2.0
>            Reporter: Alexey Goncharuk
>            Priority: Critical
>              Labels: important, observability
>             Fix For: 2.2
>
>
> Currently, we have two extremes of logging: either INFO, which logs almost
> nothing, or DEBUG, which pollutes logs with overly verbose messages.
> We should create a 'troubleshooting' logger, which should be easily enabled
> (via a system property, for example) and log all stability-critical node and
> cluster events:
> * Connection events (both communication and discovery), handshake status
> * ALL ignored messages and skipped actions (even those we assume are safe to
> ignore)
> * Partition exchange stages and timings
> * Verbose discovery state changes (this should make it easy to understand
> the reason for 'Node has not been connected to the topology')
> * Transaction failover stages and actions
> * All unlogged exceptions
> * Responses that took more than N milliseconds when they should normally
> return right away
> * Long processing times for discovery SPI messages
> * Managed service deployment stages
> * Marshaller mappings registration and notification
> * Binary metadata registration and notification
> * Continuous query registration / notification
> (add more)
> The amount of logging should be chosen carefully so that it is safe to
> enable this logger in production clusters.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
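
For illustration only, the request quoted above (a logger that is switched on by a system property and reports slow responses over N milliseconds) could be sketched roughly as follows. This is not the code from pull request #2262 and not an Ignite API: the class name `TroubleshootingLog`, both property names, and the 500 ms default are hypothetical, and plain `java.util.logging` stands in for Ignite's own logger.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch of a property-gated troubleshooting logger. */
final class TroubleshootingLog {
    /** Assumed property name that switches the logger on (illustrative only). */
    static final String ENABLED_PROP = "IGNITE_TROUBLESHOOTING_LOG";

    /** Assumed property name holding the slow-response threshold N, in milliseconds. */
    static final String SLOW_MS_PROP = "IGNITE_TROUBLESHOOTING_SLOW_RESPONSE_MS";

    private static final Logger LOG = Logger.getLogger("troubleshooting");

    private TroubleshootingLog() {
        // Static helpers only.
    }

    /** Reads the flag on every call so it can be toggled without a restart. */
    static boolean enabled() {
        return Boolean.getBoolean(ENABLED_PROP);
    }

    /** Returns the threshold N; 500 ms is an arbitrary default for this sketch. */
    static long slowThresholdMs() {
        return Long.getLong(SLOW_MS_PROP, 500L);
    }

    /** Logs a stability-critical event; returns true if a record was written. */
    static boolean event(String category, String msg) {
        if (!enabled())
            return false;

        LOG.log(Level.INFO, "[{0}] {1}", new Object[] {category, msg});
        return true;
    }

    /** Logs a response only if it took longer than the threshold N. */
    static boolean slowResponse(String operation, long elapsedMs) {
        if (!enabled() || elapsedMs <= slowThresholdMs())
            return false;

        LOG.log(Level.WARNING, "[slow-response] {0} took {1} ms",
            new Object[] {operation, elapsedMs});
        return true;
    }

    public static void main(String[] args) {
        System.setProperty(ENABLED_PROP, "true");
        event("discovery", "handshake completed");
        slowResponse("partition-exchange", 1_200L);
    }
}
```

With the flag unset, both methods return immediately without formatting a message, which is one way to keep the logger cheap enough to leave compiled into production builds.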