[
https://issues.apache.org/jira/browse/CASSANDRA-10241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934310#comment-14934310
]
Paulo Motta commented on CASSANDRA-10241:
-----------------------------------------
Now that we have the basic capability committed, I'd like to follow up on this
by introducing a simple logging guideline for future system logging statements,
based on the discussions of this thread and current practices. This guideline
could help external and new contributors to understand the logging practices,
and current contributors to review tickets related to logging using the new
framework.
I've drafted an initial version for review, presented below:
*INFO*: General cluster status, operations overview. At this level a beginner
user or operator should be able to understand most messages.
Examples:
* Node startup and shutdown information
* User or system triggered operations overview
** Repair start and finish state
** Cleanup start and finish state
** Bootstrap start and finish state
** Index rebuild start and finish state
*DEBUG*: Low frequency state changes or message passing. Non-critical path logs
on operation details, performance measurements or general troubleshooting
information. At this level, an advanced operator or system developer should have
enough detail to investigate or detect erroneous conditions or performance
bottlenecks, extract reproduction steps, or inspect advanced operational
information.
Examples:
* SSTable flushing
* Compactions in progress
* Gossip or schema state changes
* Intermediate steps of operations
** Repair steps
** Stream session message exchanges
*WARN*: Use of suboptimal parameters or deprecated options, detection of
degraded performance, capability limitations or missing dependencies. General
optimization tips. At this level, an operator should be able to detect an
imminent error condition, use of suboptimal parameters or non-critical
configuration errors.
Examples:
* Use of chunk_length_in_kb property instead of chunk_length
* GC above threshold warnings
* OpenJDK not recommended notice
* Small sstable size warning (Testing done for CASSANDRA-5727 indicates that
performance improves up to 160MB)
*ERROR*: An unexpected error condition has occurred. Non-critical, transient or
recovered errors might be reported at DEBUG level instead so they don't pollute
system.log.
Examples:
* Critical errors in general (corrupted disk, read error, etc.)
* Leak detection
*TRACE*: High frequency state changes or message passing, critical path logs,
testing or development information. This level is disabled by default, so
everything that does not fit in the previous levels and highly verbose stuff
must be kept at TRACE level.
Examples:
* Failure detector checks
* Gossip digests
* CassandraServer.insert()
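To make the draft easier to apply during review, the five levels can be read as a small decision function. The following is only a sketch of that reading: the class LogLevelGuide, the Trait names and the order in which the traits are checked are illustrative assumptions, not part of Cassandra or of the guideline itself.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper (not Cassandra code): one possible reading of the draft guideline.
final class LogLevelGuide {
    enum Level { ERROR, WARN, INFO, DEBUG, TRACE }

    // Traits of the event being logged; the names are assumptions made for this sketch.
    enum Trait {
        ERROR_CONDITION,        // e.g. corrupted disk, read error, leak detection
        TRANSIENT_OR_RECOVERED, // the error is non-critical, transient or already recovered
        DEGRADED_OR_SUBOPTIMAL, // e.g. deprecated option, GC above threshold
        HIGH_FREQUENCY,         // e.g. gossip digests, failure detector checks
        CRITICAL_PATH,          // e.g. per-request code such as CassandraServer.insert()
        OPERATION_OVERVIEW      // e.g. repair start/finish, node startup/shutdown
    }

    // The precedence of the checks below is an assumption; the draft does not define one.
    static Level choose(Set<Trait> traits) {
        if (traits.contains(Trait.ERROR_CONDITION)) {
            // Non-critical, transient or recovered errors stay out of system.log.
            return traits.contains(Trait.TRANSIENT_OR_RECOVERED) ? Level.DEBUG : Level.ERROR;
        }
        if (traits.contains(Trait.DEGRADED_OR_SUBOPTIMAL)) return Level.WARN;
        if (traits.contains(Trait.HIGH_FREQUENCY) || traits.contains(Trait.CRITICAL_PATH)) {
            return Level.TRACE;
        }
        if (traits.contains(Trait.OPERATION_OVERVIEW)) return Level.INFO;
        // Low-frequency details and general troubleshooting information.
        return Level.DEBUG;
    }

    public static void main(String[] args) {
        System.out.println(choose(EnumSet.of(Trait.OPERATION_OVERVIEW)));
        System.out.println(choose(EnumSet.of(Trait.HIGH_FREQUENCY)));
        System.out.println(choose(EnumSet.noneOf(Trait.class)));
    }
}
```

An event with no special trait falls through to DEBUG here, which matches the idea that DEBUG carries low-frequency operational detail while anything frequent or on the critical path is pushed down to TRACE.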
What do you think [~aweisberg]? After review and suggestions, if there are no
objections, I will add this to the wiki and send an e-mail to the dev list.
After this, the next step would be to groom the current logs in a separate
ticket so they follow the guideline.
> Keep a separate production debug log for troubleshooting
> --------------------------------------------------------
>
> Key: CASSANDRA-10241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10241
> Project: Cassandra
> Issue Type: New Feature
> Components: Config
> Reporter: Jonathan Ellis
> Assignee: Paulo Motta
> Fix For: 2.2.x, 3.0.0 rc2
>
> Attachments: 2.2-debug.log, 2.2-system.log, 3.0-debug.log,
> 3.0-system.log
>
>
> [~aweisberg] had the suggestion to keep a separate debug log for aid in
> troubleshooting, not intended for regular human consumption but where we can
> log things that might help if something goes wrong.
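The idea of a separate debug log can be sketched as one logger feeding two sinks with different thresholds. The example below is an illustration only: it uses java.util.logging as a stand-in so that it runs without dependencies (FINE plays the role of DEBUG and FINEST the role of TRACE), whereas Cassandra itself logs through SLF4J and logback. The names TwoLogSketch and MemoryLog are hypothetical, and the in-memory lists merely stand in for system.log and debug.log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch: one logger, two sinks with different thresholds.
final class TwoLogSketch {
    // In-memory stand-in for a log file; keeps only records at or above its threshold.
    static final class MemoryLog extends Handler {
        final List<String> lines = new ArrayList<>();

        MemoryLog(Level threshold) {
            setLevel(threshold);
        }

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                lines.add(record.getLevel() + " " + record.getMessage());
            }
        }

        @Override
        public void flush() {}

        @Override
        public void close() {}
    }

    // The logger itself lets FINE (DEBUG) through; FINEST (TRACE) stays disabled.
    static Logger build(String name, Handler systemLog, Handler debugLog) {
        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false); // keep the sketch's output out of the console
        logger.setLevel(Level.FINE);
        logger.addHandler(systemLog);
        logger.addHandler(debugLog);
        return logger;
    }

    public static void main(String[] args) {
        MemoryLog systemLog = new MemoryLog(Level.INFO);
        MemoryLog debugLog = new MemoryLog(Level.FINE);
        Logger logger = build("sketch.main", systemLog, debugLog);

        logger.info("Repair started");   // reaches both sinks
        logger.fine("Flushing SSTable"); // reaches only the debug sink
        logger.finest("Gossip digest");  // dropped: below the logger's level

        System.out.println("system: " + systemLog.lines);
        System.out.println("debug:  " + debugLog.lines);
    }
}
```

With this arrangement the system sink stays readable for operators, while the debug sink keeps the extra detail needed when something goes wrong.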