[ 
https://issues.apache.org/jira/browse/KAFKA-20418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sanghyeok An updated KAFKA-20418:
---------------------------------
    Labels: needs-kip transaction  (was: transaction)

> Consider adding metrics for pending transaction markers and oldest 
> transaction age
> ----------------------------------------------------------------------------------
>
>                 Key: KAFKA-20418
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20418
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: sanghyeok An
>            Assignee: sanghyeok An
>            Priority: Minor
>              Labels: needs-kip, transaction
>
> When transaction handling becomes slow, it is difficult to tell whether the 
> delay comes from the transaction state log append path, the post-EndTxn 
> marker completion path, or transactions lingering in coordinator state longer 
> than expected.
> The broker already exposes some transaction-related metrics, but it is still 
> hard to answer questions such as:
>  * how many transactions are currently waiting for marker completion
>  * whether pending marker backlog is growing or aging
>  * whether transactions are staying in a given state for unusually long 
> periods
> Adding a small set of metrics in this area could improve operability by 
> making it easier to identify transaction backlog and long-lived transactions 
> in the coordinator.
> Suggested metrics:
>  * pending-marker-count
>  * pending-marker-oldest-age-ms
>  * oldest-transaction-age-ms{state}
> These metrics could be useful in scenarios such as:
>  * transaction completion appears slow even though request handling itself is 
> not obviously delayed
>  * marker propagation is backed up due to inter-broker issues or broker-side 
> delays
>  * some transactions remain in ONGOING or PREPARE_* states for much longer 
> than expected
>  * operators need to distinguish transaction state append issues from marker 
> completion issues or long-lived transaction state
>  
> There is already internal transaction and pending marker state tracking, so 
> exposing related metrics may be feasible and useful for broker operations.
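
As a rough illustration of what the three suggested metrics would measure, the sketch below derives them from an in-memory snapshot of transactions. It is not Kafka code: the class TxnMetricsSketch, the Txn snapshot type, the State subset, and the NOT_PENDING sentinel are all hypothetical names chosen for this example, and the real coordinator's state tracking may look quite different.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class TxnMetricsSketch {
    // Hypothetical subset of coordinator states; the issue mentions ONGOING and PREPARE_*.
    public enum State { ONGOING, PREPARE_COMMIT, PREPARE_ABORT }

    // Hypothetical sentinel meaning "this transaction is not waiting for markers".
    public static final long NOT_PENDING = -1L;

    // Hypothetical snapshot of one transaction as the coordinator might track it.
    public static final class Txn {
        final State state;
        final long stateEnteredMs;        // when the transaction entered its current state
        final long markerPendingSinceMs;  // when markers became pending, or NOT_PENDING

        public Txn(State state, long stateEnteredMs, long markerPendingSinceMs) {
            this.state = state;
            this.stateEnteredMs = stateEnteredMs;
            this.markerPendingSinceMs = markerPendingSinceMs;
        }

        boolean markersPending() {
            return markerPendingSinceMs != NOT_PENDING;
        }
    }

    // pending-marker-count: transactions currently waiting for marker completion.
    public static long pendingMarkerCount(List<Txn> txns) {
        return txns.stream().filter(Txn::markersPending).count();
    }

    // pending-marker-oldest-age-ms: age of the longest-waiting pending marker, 0 if none.
    public static long pendingMarkerOldestAgeMs(List<Txn> txns, long nowMs) {
        return txns.stream()
                .filter(Txn::markersPending)
                .mapToLong(t -> nowMs - t.markerPendingSinceMs)
                .max()
                .orElse(0L);
    }

    // oldest-transaction-age-ms{state}: per state, the longest time any transaction
    // has spent in that state; states with no transactions report 0.
    public static Map<State, Long> oldestTransactionAgeMs(List<Txn> txns, long nowMs) {
        Map<State, Long> oldest = new EnumMap<>(State.class);
        for (State s : State.values()) {
            oldest.put(s, 0L);
        }
        for (Txn t : txns) {
            oldest.merge(t.state, nowMs - t.stateEnteredMs, Long::max);
        }
        return oldest;
    }
}
```

Reporting 0 for an empty backlog or an unused state is a choice made for this sketch so that the gauges always have a value; an actual proposal would need to settle that in the KIP.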



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
