Hi Tom,

Thank you for having another look.

20. Upon a downgrade to a Kafka version that runs the current
"version == 1" assertion, then yes — a downgrade would not be possible
without first updating (manually) the meta.properties files back
to the previous version.

We could prevent this issue if we consider the new fields optional and
not bump the version, but this takes away from the value of having
a version in the first place. I'm not sure this would be a good trade-off.

IIUC in KIP-866 this issue was addressed by delaying the writing of
meta.properties under the new version until after the full transition to
KRaft mode is finished and relaxing the expectations around versioning.
We could take a similar approach by relaxing the current
expectation that the version must strictly be 1.

Currently, MetaProperties#parse() checks that the version is 1 but only
requires that the cluster.id and node.id properties are present in the file.
We can relax the requirement, ensuring only that the version is at least 1,
and land that change before this KIP is implemented – at least one release
beforehand. Then there will be some previous version for which a downgrade
is possible, even after the version has been bumped to 2 and the two new
properties are introduced. WDYT?

21. I think in this case we can reuse LOG_DIR_NOT_FOUND (code 57) and perhaps
reword its description slightly to be a bit more generic.
I've updated the KIP to mention this.

22. This check should be enforced by the broker, as the controller cannot
know whether the current replica to logdir assignment is correct.
Currently, the controller does not unfence a broker that does not "want"
to be unfenced. The broker should not want to be unfenced while there
are still mismatching (wrong or missing) replica to logdir assignments.
I updated the KIP to clarify this.

So a reason wouldn't be included in the heartbeat response but your point
about diagnosing is interesting. So I've updated the KIP to propose a
new broker metric reporting the number of mismatching replica assignments
- i.e. replica assignments that are either missing or incorrect in the
cluster metadata, as observed by the broker — until this metric is zero,
the broker would not want to be unfenced. Let me know what you think.

23. I don't see why not. We should be able to extend the system tests
proposed in KIP-866 to cover this.

24. Correct. Updated to remove the typo.

25. Good point. Updated to align the names.


Thank you,

--
Igor


Reply via email to