[jira] [Commented] (CASSANDRA-10241) Keep a separate production debug log for troubleshooting

Paulo Motta (JIRA) Mon, 28 Sep 2015 16:14:24 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934310#comment-14934310
 ]


Paulo Motta commented on CASSANDRA-10241:
-----------------------------------------

Now that we have the basic capability committed, I'd like to follow up on this 
by introducing a simple logging guideline for future system logging statements, 
based on the discussions of this thread and current practices. This guideline 
could help external and new contributors to understand the logging practices, 
and current contributors to review tickets related to logging using the new 
framework.

I've drafted an initial version for review, presented below:

*INFO*: General cluster status, operations overview. At this level a beginner 
user or operator should be able to understand most messages. 
Examples:
* Node startup and shutdown information
* User or system triggered operations overview
** Repair start and finish state
** Cleanup start and finish state
** Bootstrap start and finish state
** Index rebuild start and finish state

*DEBUG*: Low-frequency state changes or message passing. Non-critical-path logs 
on operation details, performance measurements or general troubleshooting 
information. At this level, an advanced operator or system developer should 
have enough information to investigate or detect erroneous conditions or 
performance bottlenecks, extract reproduction steps or inspect advanced 
operational information.
Examples:
* SSTable flushing
* Compactions in progress
* Gossip or schema state changes
* Intermediate steps of operations
** Repair steps
** Stream session message exchanges

*WARN*: Use of suboptimal parameters or deprecated options, detection of 
degraded performance, capability limitations or missing dependencies. General 
optimization tips. At this level, an operator should be able to detect an 
imminent error condition, use of suboptimal parameters or non-critical 
configuration errors.
Examples:
* Use of chunk_length_in_kb property instead of chunk_length
* GC above threshold warnings
* OpenJDK not recommended notice
* Small SSTable size warning (testing done for CASSANDRA-5727 indicates that 
performance improves up to 160MB)

*ERROR*: An unexpected error condition has occurred. Non-critical, transient or 
recovered errors might be reported at DEBUG level instead so they don't pollute 
system.log.
Examples:
* Critical errors in general (corrupted disk, read error, etc.)
* Leak detection

*TRACE*: High-frequency state changes or message passing, critical-path logs, 
testing or development information. This level is disabled by default, so 
anything that does not fit the previous levels, along with highly verbose 
output, must be kept at TRACE level.
Examples:
* Failure detector checks
* Gossip digests
* CassandraServer.insert()
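
To make the effect of the level choice concrete, here is a minimal Java sketch 
of where a message at each level would end up. It is an illustration only, not 
the committed configuration: the class name LogLevelGuideline, the 
destinations() helper and the two thresholds (system.log accepting INFO and 
above, debug.log accepting DEBUG and above) are assumptions made for this 
example. The sample events are taken from the lists above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LogLevelGuideline {
    // Levels ordered from most verbose to most severe.
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR }

    // Assumed thresholds (hypothetical, for illustration only).
    static final Level SYSTEM_LOG_THRESHOLD = Level.INFO;
    static final Level DEBUG_LOG_THRESHOLD = Level.DEBUG;

    // Returns the log files that would receive a message at the given level.
    static List<String> destinations(Level level) {
        List<String> files = new ArrayList<>();
        if (level.compareTo(SYSTEM_LOG_THRESHOLD) >= 0) {
            files.add("system.log");
        }
        if (level.compareTo(DEBUG_LOG_THRESHOLD) >= 0) {
            files.add("debug.log");
        }
        return files;
    }

    public static void main(String[] args) {
        // One example event per level, taken from the guideline draft.
        Map<String, Level> examples = new LinkedHashMap<>();
        examples.put("Repair start and finish state", Level.INFO);
        examples.put("SSTable flushing", Level.DEBUG);
        examples.put("GC above threshold warning", Level.WARN);
        examples.put("Leak detection", Level.ERROR);
        examples.put("Gossip digests", Level.TRACE);

        for (Map.Entry<String, Level> e : examples.entrySet()) {
            System.out.println(e.getValue() + " \"" + e.getKey() + "\" -> "
                    + destinations(e.getValue()));
        }
    }
}
```

Under these assumed thresholds, a DEBUG message reaches only debug.log, INFO 
and above reach both files, and a TRACE message reaches neither unless the 
level is explicitly enabled.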

What do you think [~aweisberg]? After review and suggestions, if there are no 
objections, I will add this to the wiki and send an e-mail to the dev list.

After this, the next step would be to groom the current logs in a separate 
ticket so they follow the guideline.

> Keep a separate production debug log for troubleshooting
> --------------------------------------------------------
>
>                 Key: CASSANDRA-10241
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10241
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Config
>            Reporter: Jonathan Ellis
>            Assignee: Paulo Motta
>             Fix For: 2.2.x, 3.0.0 rc2
>
>         Attachments: 2.2-debug.log, 2.2-system.log, 3.0-debug.log, 
> 3.0-system.log
>
>
> [~aweisberg] had the suggestion to keep a separate debug log for aid in 
> troubleshooting, not intended for regular human consumption but where we can 
> log things that might help if something goes wrong.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
