Souquières Adam created KAFKA-20699:
---------------------------------------

             Summary: TopologyTestDriver.getStateStore corrupts the task's 
record context
                 Key: KAFKA-20699
                 URL: https://issues.apache.org/jira/browse/KAFKA-20699
             Project: Kafka
          Issue Type: Bug
          Components: streams-test-utils
    Affects Versions: 4.2.1, 4.3.0, 4.2.0
            Reporter: Souquières Adam


h2. Summary

{{TopologyTestDriver.getStateStore}} mutates the task's live record context as 
a side effect, wiping the in-flight record's metadata.

h2. Affects

Regression introduced by KAFKA-19638 ([#20403|https://github.com/apache/kafka/pull/20403]).

h2. Description

Before KAFKA-19638, the dummy record context was set once at task construction 
and {{getStateStore}} was read-only.

Since that change, {{getStateStore}} unconditionally calls {{setRecordContext}} 
with a dummy context (null topic, offset and partition -1, timestamp 0) and 
never restores the previous one. Every store lookup therefore overwrites the 
task's live record context.

When a store handle is fetched while a record's context is active (for example, 
an interactive query interleaved with processing, or a test seam through which 
production code resolves a store handle via {{TopologyTestDriver}}), the 
in-flight record's {{RecordMetadata}} is wiped:
* {{recordMetadata().topic()}} returns {{null}}
* {{offset()}} and {{partition()}} return {{-1}}

Consequences:
* Code that builds provenance from {{recordMetadata().topic()}} throws a {{NullPointerException}}.
* Direct store writes capture timestamp {{0}} instead of the in-flight record's timestamp.

h2. Why it's not always visible

The normal {{process()}} path masks the bug because {{doProcess}} rebuilds the 
context per record. It surfaces on:
* direct store writes after a lookup, and
* any context read not preceded by a fresh {{process()}}.
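
The sequence above can be illustrated with a toy model. This is a sketch only: {{Ctx}}, {{ModelDriver}}, {{beginRecord}} and the {{"orders"}} values are hypothetical stand-ins invented for illustration, not Kafka Streams classes or the driver's actual code. The model assumes only what this report states: the lookup overwrites the context with dummy values, and per-record processing rebuilds it.

```java
import java.util.HashMap;
import java.util.Map;

final class ClobberDemo {
    // Hypothetical stand-in for a record context; not a Kafka Streams class.
    static final class Ctx {
        final String topic;
        final int partition;
        final long offset;
        final long timestamp;

        Ctx(String topic, int partition, long offset, long timestamp) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
            this.timestamp = timestamp;
        }
    }

    // Toy model of the driver, reduced to the single field this report is about.
    static final class ModelDriver {
        // Dummy values as described in the report: null topic, -1 offset/partition, timestamp 0.
        static final Ctx DUMMY = new Ctx(null, -1, -1L, 0L);

        Ctx current; // the task's live record context
        final Map<String, Long> store = new HashMap<>(); // key -> timestamp captured at write time

        // Models the per-record context rebuild that the report attributes to doProcess.
        void beginRecord(String topic, int partition, long offset, long timestamp) {
            current = new Ctx(topic, partition, offset, timestamp);
        }

        // Models the reported behaviour: overwrite unconditionally, never restore.
        Map<String, Long> getStateStore() {
            current = DUMMY;
            return store;
        }
    }

    public static void main(String[] args) {
        ModelDriver driver = new ModelDriver();
        driver.beginRecord("orders", 3, 42L, 1_700_000_000_000L);

        // Store lookup while the record's context is still active.
        Map<String, Long> store = driver.getStateStore();
        // Direct write: the entry is stamped with the context's timestamp.
        store.put("k", driver.current.timestamp);

        System.out.println(driver.current.topic);  // null
        System.out.println(driver.current.offset); // -1
        System.out.println(store.get("k"));        // 0

        // A fresh record rebuilds the context, which is why process() hides the problem.
        driver.beginRecord("orders", 3, 43L, 1_700_000_000_001L);
        System.out.println(driver.current.topic);  // orders
    }
}
```

In this model the damage is only observable between the lookup and the next {{beginRecord}} call, which matches the two cases listed above.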

h2. Proposed Fix

Only set the dummy context when none exists yet; never overwrite a live one.
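
A minimal sketch of that guard, using a toy model rather than the driver's real code. {{Ctx}}, {{GuardedDriver}} and {{beginRecord}} are hypothetical names invented for illustration, and representing "no context exists yet" as {{null}} is an assumption of the sketch; the actual check in {{TopologyTestDriver}} may need a different condition.

```java
import java.util.HashMap;
import java.util.Map;

final class GuardDemo {
    // Hypothetical stand-in for a record context; not a Kafka Streams class.
    static final class Ctx {
        final String topic;
        final int partition;
        final long offset;
        final long timestamp;

        Ctx(String topic, int partition, long offset, long timestamp) {
            this.topic = topic;
            this.partition = partition;
            this.offset = offset;
            this.timestamp = timestamp;
        }
    }

    // Toy model of a driver whose store lookup never overwrites a live context.
    static final class GuardedDriver {
        static final Ctx DUMMY = new Ctx(null, -1, -1L, 0L);

        Ctx current; // null models "no context exists yet" (assumption of this sketch)
        final Map<String, Long> store = new HashMap<>();

        // Models a record entering processing and installing its own context.
        void beginRecord(String topic, int partition, long offset, long timestamp) {
            current = new Ctx(topic, partition, offset, timestamp);
        }

        // Seed the dummy context only when nothing has been set; leave a live one alone.
        Map<String, Long> getStateStore() {
            if (current == null) {
                current = DUMMY;
            }
            return store;
        }
    }

    public static void main(String[] args) {
        // No record processed yet: the lookup seeds the dummy context.
        GuardedDriver fresh = new GuardedDriver();
        fresh.getStateStore();
        System.out.println(fresh.current == GuardedDriver.DUMMY); // true

        // Record in flight: the lookup leaves its metadata untouched.
        GuardedDriver busy = new GuardedDriver();
        busy.beginRecord("orders", 3, 42L, 1_700_000_000_000L);
        busy.getStateStore();
        System.out.println(busy.current.topic);  // orders
        System.out.println(busy.current.offset); // 42
    }
}
```

The guard keeps the lookup usable before any record has been processed while making it side-effect free once a context is live.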



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
