[ https://issues.apache.org/jira/browse/KAFKA-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Lu updated KAFKA-7236:
----------------------------
Description:
The "min.insync.replicas" configuration specifies the minimum number of in-sync
replicas required for a partition to accept messages from the producer. If the
in-sync replica count of a partition falls below the specified
"min.insync.replicas", the broker rejects messages from producers using
acks=all. These producers cannot write to the partition and will see a
NotEnoughReplicas or NotEnoughReplicasAfterAppend exception.
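The rejection rule described above can be sketched as a small model. This is only an illustration of the behaviour, not Kafka's broker code; the function name accepts_write and its parameters are hypothetical, and the model ignores the case where a partition has no leader at all.

```python
def accepts_write(isr_count: int, min_insync_replicas: int, acks: str) -> bool:
    """Return True if this simplified model would accept the produce request."""
    # Only acks=all ("-1" is the equivalent numeric form) is subject to the check.
    if acks not in ("all", "-1"):
        return True
    # Reject when the in-sync replica count is below the configured minimum.
    return isr_count >= min_insync_replicas


if __name__ == "__main__":
    print(accepts_write(isr_count=1, min_insync_replicas=2, acks="all"))  # False
    print(accepts_write(isr_count=1, min_insync_replicas=2, acks="1"))    # True
```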
We currently have an UnderMinIsrPartitionCount metric, which is useful for
detecting when partitions fall below "min.insync.replicas", but it is still
difficult to identify which topic partitions are affected and need fixing.
We can extend the describe topics command in TopicCommand with an option
"--under-minisr-partitions" that lists exactly which topic partitions are
below "min.insync.replicas".
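As a rough illustration of what such an option would report, the filter below selects the partitions whose ISR is smaller than the topic's "min.insync.replicas". It is a sketch only, not the TopicCommand implementation; PartitionState, under_min_isr, the topic names and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class PartitionState:
    topic: str
    partition: int
    isr: Tuple[int, ...]  # ids of the brokers currently in sync


def under_min_isr(partitions: List[PartitionState],
                  min_isr_by_topic: Dict[str, int]) -> List[PartitionState]:
    """Return the partitions whose ISR count is below the topic's minimum."""
    return [p for p in partitions
            if len(p.isr) < min_isr_by_topic[p.topic]]


if __name__ == "__main__":
    # Hypothetical cluster snapshot.
    snapshot = [
        PartitionState("orders", 0, (1, 2, 3)),
        PartitionState("orders", 1, (2,)),
        PartitionState("clicks", 0, (1, 3)),
    ]
    for p in under_min_isr(snapshot, {"orders": 2, "clicks": 2}):
        print(f"{p.topic}-{p.partition} isr={list(p.isr)}")
```

Unlike the UnderMinIsrPartitionCount metric, which would only report a count, the result here names each affected topic partition.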
was:
[KIP-351|https://cwiki.apache.org/confluence/display/KAFKA/KIP-351%3A+Add+--critical-partitions+option+to+describe+topics+command]
A topic partition can be in one of four states (assuming a replication factor
of 3):
(ISR = in-sync replica)
3/3 ISRs: OK
2/3 ISRs: WARNING (under-replicated partition)
1/3 ISRs: CRITICAL (under-replicated partition)
0/3 ISRs: FATAL (offline/unavailable partition)
TopicCommand already has the --under-replicated-partitions and
--unavailable-partitions flags, but it would be beneficial to add a
--critical-partitions option that lists only the partitions in CRITICAL state
(exactly one ISR remaining).
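The four states above can be expressed as a small classification function. This is a minimal sketch; the function name classify is hypothetical, and extending the rule beyond a replication factor of 3 (anything between one ISR and the full replica set counts as WARNING) is an assumption.

```python
def classify(isr_count: int, replication_factor: int = 3) -> str:
    """Map an ISR count to one of the four states listed above."""
    if isr_count < 0 or isr_count > replication_factor:
        raise ValueError("isr_count must be between 0 and replication_factor")
    if isr_count == 0:
        return "FATAL"      # offline/unavailable partition
    if isr_count == replication_factor:
        return "OK"         # fully replicated
    if isr_count == 1:
        return "CRITICAL"   # under-replicated, one ISR remaining
    return "WARNING"        # under-replicated, more than one ISR remaining


if __name__ == "__main__":
    for count in (3, 2, 1, 0):
        print(f"{count}/3 ISRs: {classify(count)}")
```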
With this new option, Kafka users can identify the exact topic partitions that
are critical and need immediate repartitioning. They can also set up critical
alerts that trigger whenever the output of this command contains any
partitions.
A couple of cases where identifying this CRITICAL state is useful in alerting:
* Users with a large number of topics in a single cluster, for whom manually
repartitioning every topic that has under-replicated partitions is
impractical, so they only take action when a partition reaches CRITICAL state
* Users with a high replication factor who can tolerate some broker failures
and only take action when a partition reaches CRITICAL state
Summary: Add --under-min-isr option to describe topics command (was:
Add --critical-partitions option to describe topics command)
> Add --under-min-isr option to describe topics command
> -----------------------------------------------------
>
> Key: KAFKA-7236
> URL: https://issues.apache.org/jira/browse/KAFKA-7236
> Project: Kafka
> Issue Type: Improvement
> Components: tools
> Reporter: Kevin Lu
> Assignee: Kevin Lu
> Priority: Minor