[ https://issues.apache.org/jira/browse/KAFKA-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kevin Lu updated KAFKA-7236:
----------------------------
Description:
[KIP-351|https://cwiki.apache.org/confluence/display/KAFKA/KIP-351%3A+Add+--critical-partitions+option+to+describe+topics+command]
A topic partition can be in one of four states (assuming a replication factor
of 3):
(ISR = in-sync replica)
3/3 ISRs: OK
2/3 ISRs: WARNING (under-replicated partition)
1/3 ISRs: CRITICAL (under-replicated partition)
0/3 ISRs: FATAL (offline/unavailable partition)
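The four states above can be expressed as a small classification function. This is only an illustrative sketch: the names `PartitionState` and `classify_partition` are hypothetical and not part of Kafka, and treating every ISR count strictly between 1 and the replication factor as WARNING is an assumption that generalizes the replication-factor-3 table.

```python
from enum import Enum


class PartitionState(Enum):
    """Hypothetical labels mirroring the four states in the description."""
    OK = "OK"
    WARNING = "WARNING"
    CRITICAL = "CRITICAL"
    FATAL = "FATAL"


def classify_partition(isr_count: int, replication_factor: int) -> PartitionState:
    """Map an ISR count to a state for the given replication factor."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if not 0 <= isr_count <= replication_factor:
        raise ValueError("isr_count must be between 0 and replication_factor")
    if isr_count == 0:
        # No in-sync replica left: the partition is offline/unavailable.
        return PartitionState.FATAL
    if isr_count == replication_factor:
        # Every replica is in sync (assumption: this also covers RF = 1).
        return PartitionState.OK
    if isr_count == 1:
        # Under-replicated, with a single ISR remaining.
        return PartitionState.CRITICAL
    # Under-replicated, but more than one ISR remains.
    return PartitionState.WARNING


if __name__ == "__main__":
    for isr in (3, 2, 1, 0):
        print(f"{isr}/3 ISRs: {classify_partition(isr, 3).value}")
```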
TopicCommand already has the --under-replicated-partitions and
--unavailable-partitions flags, but it would be beneficial to add a
--critical-partitions option that lists only the partitions in the CRITICAL
state (exactly one ISR remaining).
With this new option, Kafka users can identify the exact topic partitions that
are critical and need immediate reassignment. Kafka users can also set up
critical alerts that trigger when the output of this command lists any
partitions.
A couple of cases where identifying the CRITICAL state is useful for alerting:
* Users with a large number of topics in a single cluster, for whom manually
reassigning every topic with under-replicated partitions is impractical, so
they only take action when a partition reaches the CRITICAL state
* Users with a high replication factor who can tolerate some broker failures
and only take action when a partition reaches the CRITICAL state
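Until such an option exists, an alerting script could approximate it by filtering the output of the existing describe command. The sketch below is hypothetical: `critical_partitions` is an illustrative name, and the parser assumes describe lines of the form `Topic: <name> Partition: <n> Leader: <id> Replicas: <ids> Isr: <ids>`, a layout that may differ between Kafka versions.

```python
import re
from typing import List, Tuple

# Assumed layout of one partition line in the describe output, e.g.
#   "Topic: orders  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1"
_PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*\S+\s+Replicas:\s*(?P<replicas>[\d,]*)\s+Isr:\s*(?P<isr>[\d,]*)"
)


def critical_partitions(describe_output: str) -> List[Tuple[str, int]]:
    """Return (topic, partition) pairs that have exactly one ISR remaining."""
    critical = []
    for line in describe_output.splitlines():
        match = _PARTITION_LINE.search(line)
        if match is None:
            continue  # topic summary lines and blank lines are skipped
        replicas = [b for b in match.group("replicas").split(",") if b]
        isr = [b for b in match.group("isr").split(",") if b]
        # CRITICAL: under-replicated and down to a single in-sync replica.
        if len(isr) == 1 and len(replicas) > 1:
            critical.append((match.group("topic"), int(match.group("partition"))))
    return critical


if __name__ == "__main__":
    sample = (
        "\tTopic: orders\tPartition: 0\tLeader: 1\tReplicas: 1,2,3\tIsr: 1\n"
        "\tTopic: orders\tPartition: 1\tLeader: 2\tReplicas: 2,3,1\tIsr: 2,3\n"
    )
    for topic, partition in critical_partitions(sample):
        print(f"CRITICAL: {topic}-{partition}")
```

An alert would then fire whenever the returned list is non-empty, which matches the "output contains partitions" trigger described above.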
was:
A topic partition can be in one of four states (assuming a replication factor
of 3):
(ISR = in-sync replica)
3/3 ISRs: OK
2/3 ISRs: WARNING (under-replicated partition)
1/3 ISRs: CRITICAL (under-replicated partition)
0/3 ISRs: FATAL (offline/unavailable partition)
TopicCommand already has the --under-replicated-partitions and
--unavailable-partitions flags, but it would be beneficial to add a
--critical-partitions option that lists only the partitions in the CRITICAL
state (exactly one ISR remaining).
With this new option, Kafka users can identify the exact topic partitions that
are critical and need immediate reassignment. Kafka users can also set up
critical alerts that trigger when the output of this command lists any
partitions.
A couple of cases where identifying the CRITICAL state is useful for alerting:
* Users with a large number of topics in a single cluster, for whom manually
reassigning every topic with under-replicated partitions is impractical, so
they only take action when a partition reaches the CRITICAL state
* Users with a high replication factor who can tolerate some broker failures
and only take action when a partition reaches the CRITICAL state
> Add --critical-partitions option to describe topics command
> -----------------------------------------------------------
>
> Key: KAFKA-7236
> URL: https://issues.apache.org/jira/browse/KAFKA-7236
> Project: Kafka
> Issue Type: Improvement
> Components: tools
> Reporter: Kevin Lu
> Assignee: Kevin Lu
> Priority: Minor
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)