Stanislav Kozlovski created KAFKA-7800:
------------------------------------------
Summary: Extend Admin API to support dynamic application log levels
Key: KAFKA-7800
URL: https://issues.apache.org/jira/browse/KAFKA-7800
Project: Kafka
Issue Type: New Feature
Reporter: Stanislav Kozlovski
Assignee: Stanislav Kozlovski
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-412%3A+Extend+Admin+API+to+support+dynamic+application+log+levels]
Logging is a critical part of any system's infrastructure. It is the most
direct way of observing what is happening with a system. In the case of issues,
it helps us diagnose the problem quickly, which in turn helps lower the
[MTTR|http://enterprisedevops.org/article/devops-metric-mean-time-to-recovery-mttr-definition-and-reasoning].
Kafka supports application logging via the log4j library and outputs messages
at various log levels (TRACE, DEBUG, INFO, WARN, ERROR). Log4j is a rich
library that supports fine-grained logging configurations (e.g. use INFO-level
logging in {{kafka.server.ReplicaManager}} and DEBUG-level logging in
{{kafka.server.KafkaApis}}).
This is statically configurable through the
[log4j.properties|https://github.com/apache/kafka/blob/trunk/config/log4j.properties]
file which gets read once at broker start-up.
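For illustration, the per-class example above would look roughly like this in
the properties file. This is a minimal fragment assuming log4j 1.x property
syntax; appender and root logger settings are omitted:

```properties
# Keep ReplicaManager at INFO while raising KafkaApis to DEBUG
log4j.logger.kafka.server.ReplicaManager=INFO
log4j.logger.kafka.server.KafkaApis=DEBUG
```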
A problem with this static configuration is that we cannot alter the log levels
when a problem arises. It is severely impractical to edit a properties file and
restart all brokers in order to gain visibility of a problem taking place in
production.
It would be very useful if Kafka supported dynamically altering the log levels
at runtime without needing to restart the Kafka process.
Log4j itself supports dynamically altering the log levels in a programmatic way
and Kafka exposes a [JMX
API|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/Log4jController.scala]
that lets you alter them. This allows users to change the log levels via a GUI
(jconsole) or a CLI (jmxterm) that uses JMX.
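As a rough sketch of what such a JMX call looks like programmatically, the
Java snippet below invokes a {{setLogLevel}} operation through an
{{MBeanServerConnection}}. The MBean name {{kafka:type=kafka.Log4jController}}
and the {{getLogLevel}}/{{setLogLevel}} operation signatures are assumptions
based on Log4jController.scala and should be verified against the broker
version in use. {{LevelOps}} and {{InMemoryLevels}} are hypothetical stand-ins
so the sketch runs in-process without a broker.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.StandardMBean;

class JmxLogLevelSketch {
    // Assumed name of the broker's log4j controller MBean; verify it against
    // Log4jController.scala (or jconsole) for your Kafka version.
    static final String CONTROLLER = "kafka:type=kafka.Log4jController";

    /** Invokes the assumed setLogLevel(String, String) operation over a JMX connection. */
    static boolean setLogLevel(MBeanServerConnection conn, String logger, String level)
            throws Exception {
        Object result = conn.invoke(
                new ObjectName(CONTROLLER),
                "setLogLevel",
                new Object[] {logger, level},
                new String[] {String.class.getName(), String.class.getName()});
        return Boolean.TRUE.equals(result);
    }

    /** Invokes the assumed getLogLevel(String) operation over a JMX connection. */
    static String getLogLevel(MBeanServerConnection conn, String logger) throws Exception {
        return (String) conn.invoke(
                new ObjectName(CONTROLLER),
                "getLogLevel",
                new Object[] {logger},
                new String[] {String.class.getName()});
    }

    // Hypothetical management interface mirroring the assumed operations.
    public interface LevelOps {
        String getLogLevel(String logger);
        boolean setLogLevel(String logger, String level);
    }

    // Hypothetical stand-in that stores levels in memory instead of touching log4j.
    static final class InMemoryLevels implements LevelOps {
        private final Map<String, String> levels = new ConcurrentHashMap<>();

        public String getLogLevel(String logger) {
            return levels.getOrDefault(logger, "INFO");
        }

        public boolean setLogLevel(String logger, String level) {
            levels.put(logger, level.toUpperCase());
            return true;
        }
    }

    /** Registers the stand-in on the platform MBean server and returns that server. */
    static MBeanServer registerStub() throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(CONTROLLER);
        if (!server.isRegistered(name)) {
            server.registerMBean(new StandardMBean(new InMemoryLevels(), LevelOps.class), name);
        }
        return server;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = registerStub();
        setLogLevel(server, "kafka.server.KafkaApis", "DEBUG");
        System.out.println(getLogLevel(server, "kafka.server.KafkaApis"));
    }
}
```

Against a real broker, the connection would instead come from
{{JMXConnectorFactory.connect(...)}} with a {{JMXServiceURL}} pointing at that
broker's JMX port, and the call would have to be repeated for every broker in
the cluster.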
There is one problem with changing log levels through JMX that we hope to
address and that is *Ease of Use*:
* Establishing a connection - Connecting to a remote process via JMX requires
configuring and exposing multiple JMX ports to the outside world. This is a
burden on users, as many production deployments sit behind layers of
firewalls and have policies against opening ports. Opening the ports and
connections in the middle of an incident is even more burdensome.
* Security - JMX and the tools around it support authentication and
authorization, but it is an additional hassle to set up credentials for another
system.
* Manual process - Changing the whole cluster's log level requires manually
connecting to each broker. In big deployments, this is severely impractical and
forces users to build tooling around it.
h4. Proposition
Ideally, Kafka would support dynamically changing log levels and address all of
the aforementioned concerns out of the box.
We propose extending the IncrementalAlterConfigs/DescribeConfigs Admin API with
functionality for dynamically altering the broker's log level.
This approach would also pave the way for even finer-grained logging logic (e.g.
log DEBUG level only for a certain topic) and would allow us to leverage the
existing *AlterConfigPolicy* for custom user-defined validation of log-level
changes.
These log-level changes will be *temporary* and reverted on broker restart - we
will not persist them anywhere.