[ 
https://issues.apache.org/jira/browse/KAFKA-14040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sagar Rao updated KAFKA-14040:
------------------------------
    Description: We noticed an issue with KIP-770 where the ordering of the 
metrics for the enforced processing sensor and the total bytes sensor was 
inverted, so the total bytes buffer metric was always 0. A hotfix was filed to 
fix this. We need to check test coverage and add more tests.  (was: In some EOS 
applications with relatively long restoration times we've noticed a series of 
ProducerFencedExceptions occurring during/immediately after restoration. The 
broker logs were able to confirm these were due to transactions timing out.

In Streams, it turns out we automatically begin a new txn when calling {{send}} 
(if there isn’t already one in flight). A {{send}} often occurs outside a 
commit during active processing (e.g. writing to the changelog), leaving the txn 
open until the next commit. And if a StreamThread has been actively processing 
when a rebalance results in a new stateful task without revoking any existing 
tasks, the thread won’t actually commit this open txn before it goes back into 
the restoration phase while it builds up state for the new task. So the 
in-flight transaction is left open during restoration, during which the 
StreamThread only consumes from the changelog without committing, leaving it 
vulnerable to timing out when restoration times exceed the configured 
transaction.timeout.ms for the producer client.)

> Improve test coverage for max buffer bytes metrics
> --------------------------------------------------
>
>                 Key: KAFKA-14040
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14040
>             Project: Kafka
>          Issue Type: Task
>          Components: streams
>            Reporter: Sagar Rao
>            Assignee: Sagar Rao
>            Priority: Major
>
> We noticed an issue with KIP-770 where the ordering of the metrics for the 
> enforced processing sensor and the total bytes sensor was inverted, so the 
> total bytes buffer metric was always 0. A hotfix was filed to fix this. We 
> need to check test coverage and add more tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
