[
https://issues.apache.org/jira/browse/KAFKA-20407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18079265#comment-18079265
]
Justine Olshan commented on KAFKA-20407:
----------------------------------------
Seems reasonable. If you want to start a KIP, I think it makes sense to
discuss this further with folks there.
> Consider adding transaction state log append latency metrics
> ------------------------------------------------------------
>
> Key: KAFKA-20407
> URL: https://issues.apache.org/jira/browse/KAFKA-20407
> Project: Kafka
> Issue Type: Improvement
> Reporter: sanghyeok An
> Assignee: sanghyeok An
> Priority: Minor
> Labels: needs-kip, transaction
>
> Slow appends to __transaction_state can affect transaction operations such as
> InitProducerId, AddPartitionsToTxn, and EndTxn.
>
> When transaction latency increases, it is difficult to distinguish whether
> the slowdown comes from the request/network path or from appending
> transaction state transitions to __transaction_state. Existing metrics
> do not isolate the transaction state log append path, which makes diagnosis
> harder.
>
> A dedicated metric for transaction state log append latency would improve
> operability by making it easier to:
> * identify when transaction latency is driven by the transaction state topic
>   write path
> * correlate transaction slowdowns with storage, ISR, or leader movement
>   issues affecting __transaction_state
> * separate transaction state write-path issues from higher-level request
>   latency
> * reduce time to diagnosis when transaction-related latency regresses
>
> There is also a similar precedent in the Share Coordinator, which
> already exposes write-latency metrics for state writes.