[ 
https://issues.apache.org/jira/browse/KAFKA-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Lu updated KAFKA-7236:
----------------------------
    Description: 
The "min.insync.replicas" configuration specifies the minimum number of in-sync 
replicas required for a partition to accept messages from the producer. If the 
in-sync replica count of a partition falls below the specified 
"min.insync.replicas", the broker rejects messages from producers using 
acks=all. These producers then cannot write to the partition and see a 
NotEnoughReplicas or NotEnoughReplicasAfterAppend exception.

We currently have an UnderMinIsrPartitionCount metric, which is useful for 
detecting when partitions fall below "min.insync.replicas". However, it is 
still difficult to identify which topic partitions are affected and need fixing.

We can extend the describe topics command in TopicCommand with an option 
"--under-minisr-partitions" that lists exactly which topic partitions are 
below "min.insync.replicas".
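
As a rough illustration of the check such an option would perform, here is a 
minimal Python sketch. It is not the TopicCommand implementation: the function 
name under_min_isr, the dictionary inputs, and the sample topics are all 
hypothetical. The only Kafka behaviour assumed is that "min.insync.replicas" 
defaults to 1 when a topic does not override it.

```python
from typing import Dict, List, Tuple

# (topic name, partition id); a hypothetical stand-in for Kafka's TopicPartition
TopicPartition = Tuple[str, int]


def under_min_isr(
    isr_by_partition: Dict[TopicPartition, List[int]],
    min_isr_by_topic: Dict[str, int],
) -> List[TopicPartition]:
    """Return the partitions whose ISR count is below min.insync.replicas.

    Topics missing from min_isr_by_topic are assumed to use the default
    value of 1.
    """
    return sorted(
        tp
        for tp, isr in isr_by_partition.items()
        if len(isr) < min_isr_by_topic.get(tp[0], 1)
    )


# Hypothetical cluster state: broker ids currently in sync for each partition.
isr = {
    ("orders", 0): [1, 2, 3],
    ("orders", 1): [2],
    ("logs", 0): [1],
}
min_isr = {"orders": 2}  # "logs" falls back to the assumed default of 1

print(under_min_isr(isr, min_isr))  # [('orders', 1)]
```

Only ("orders", 1) is reported: it has one in-sync replica against a minimum 
of 2, so a producer using acks=all would be rejected on that partition.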

  was:
[KIP-351|https://cwiki.apache.org/confluence/display/KAFKA/KIP-351%3A+Add+--critical-partitions+option+to+describe+topics+command]

 

A topic partition can be in one of four states, assuming a replication factor 
of 3 (ISR = in-sync replica):

3/3 ISRs: OK
2/3 ISRs: WARNING (under-replicated partition)
1/3 ISRs: CRITICAL (under-replicated partition)
0/3 ISRs: FATAL (offline/unavailable partition)
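
The four states above can be sketched as a small Python function. This is 
only an illustration under stated assumptions: the name isr_state is 
hypothetical, and the handling of replication factors other than 3 (for 
example, treating a fully in-sync partition as OK even when it has a single 
replica) is an assumption, since the text only covers a replication factor 
of 3.

```python
def isr_state(isr_count: int, replication_factor: int = 3) -> str:
    """Map an ISR count to the OK / WARNING / CRITICAL / FATAL states."""
    if replication_factor < 1 or not 0 <= isr_count <= replication_factor:
        raise ValueError("isr_count must be between 0 and replication_factor")
    if isr_count == 0:
        return "FATAL"      # offline/unavailable partition
    if isr_count == replication_factor:
        return "OK"         # every replica is in sync (assumed to win over CRITICAL)
    if isr_count == 1:
        return "CRITICAL"   # under-replicated, only one ISR remaining
    return "WARNING"        # under-replicated, more than one ISR remaining


for count in (3, 2, 1, 0):
    print(f"{count}/3 ISRs: {isr_state(count)}")
```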

 

TopicCommand already has the --under-replicated-partitions and 
--unavailable-partitions flags, but it would be beneficial to add a 
--critical-partitions option that specifically lists partitions in CRITICAL 
state (only one ISR remaining).

 

With this new option, Kafka users can identify the exact topic partitions that 
are critical and need immediate repartitioning. They can also set up critical 
alerts that trigger when the output of this command lists any partitions.

 

A couple of cases where identifying this CRITICAL state is useful in alerting:
 * Users with a large number of topics in a single cluster, for whom manually 
repartitioning every topic that has under-replicated partitions is 
impractical, so they only take action when a partition reaches CRITICAL state
 * Users with a high replication factor who can tolerate some broker failures 
and only take action when a partition reaches CRITICAL state

        Summary: Add --under-min-isr option to describe topics command  (was: 
Add --critical-partitions option to describe topics command)

> Add --under-min-isr option to describe topics command
> -----------------------------------------------------
>
>                 Key: KAFKA-7236
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7236
>             Project: Kafka
>          Issue Type: Improvement
>          Components: tools
>            Reporter: Kevin Lu
>            Assignee: Kevin Lu
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
