Dong Lin created KAFKA-7235:
-------------------------------
Summary: Use brokerZkNodeVersion to prevent broker from processing
outdated controller request
Key: KAFKA-7235
URL: https://issues.apache.org/jira/browse/KAFKA-7235
Project: Kafka
Issue Type: Improvement
Reporter: Dong Lin
Assignee: Dong Lin
Currently a broker can process controller requests that were generated before
the broker was restarted. This can cause several problems. Here is one example:
Let's assume partitions p1 and p2 exist on broker1.
1) Controller generates LeaderAndIsrRequest with p1 to be sent to broker1.
2) Before controller sends the request, broker1 is quickly restarted.
3) The LeaderAndIsrRequest with p1 is delivered to broker1.
4) After processing the first LeaderAndIsrRequest, broker1 starts to checkpoint
high watermark for all partitions that it owns. Thus it may overwrite high
watermark checkpoint file with only the hw for partition p1. The hw for
partition p2 is now lost, which could be a problem.
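The sequence above can be illustrated with a toy model. This is only a sketch
under simplifying assumptions: the class HwCheckpointSketch, its fields and its
methods are hypothetical and are not Kafka code, the checkpoint file is modeled
as an in-memory map, and the hw values are made up.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these names are real Kafka classes or methods.
class HwCheckpointSketch {
    // Stands in for the on-disk high watermark checkpoint file.
    private Map<String, Long> checkpointFile = new HashMap<>();

    // Partitions the restarted broker has learned about so far, with their hw.
    private final Map<String, Long> knownPartitions = new HashMap<>();

    // Handling a LeaderAndIsrRequest: the broker only learns about the
    // partitions listed in that request and loads their hw from the checkpoint.
    void handleLeaderAndIsr(List<String> partitions) {
        for (String p : partitions) {
            knownPartitions.put(p, checkpointFile.getOrDefault(p, 0L));
        }
    }

    // Checkpointing rewrites the whole file from the partitions known so far.
    void checkpointHighWatermarks() {
        checkpointFile = new HashMap<>(knownPartitions);
    }

    // Replays steps 1) to 4): a stale request that mentions only p1 arrives
    // first after the restart, then the checkpoint file is rewritten.
    static Map<String, Long> run() {
        HwCheckpointSketch broker1 = new HwCheckpointSketch();
        broker1.checkpointFile.put("p1", 100L); // illustrative hw values
        broker1.checkpointFile.put("p2", 200L);

        broker1.handleLeaderAndIsr(List.of("p1")); // stale request with p1 only
        broker1.checkpointHighWatermarks();
        return broker1.checkpointFile;
    }
}
```

In this model, the map returned by HwCheckpointSketch.run() contains an entry
for p1 but none for p2, which mirrors the lost hw described in step 4.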
In general, the correctness of the broker logic currently relies on a few
assumptions, e.g. that the first LeaderAndIsrRequest received by a broker
contains all partitions hosted by that broker. These assumptions can break if
the broker can receive controller requests that were generated before it
restarted.
One reasonable solution to the problem is to include the
expectedBrokeNodeZkVersion in the controller requests. The broker should
remember its broker znode zkVersion after it registers itself in ZooKeeper.
The broker can then reject any controller request whose
expectedBrokeNodeZkVersion differs from its own broker znode zkVersion.
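A minimal sketch of the proposed check follows. The class BrokerZkVersionCheck
and the method shouldProcess are hypothetical names used only for illustration,
and the version numbers in the usage note are arbitrary. The sketch assumes, as
the proposal does, that the zkVersion the broker observes after re-registering
differs from the one the controller saw before the restart.

```java
// Hypothetical illustration of the proposal; not actual Kafka code.
class BrokerZkVersionCheck {
    // zkVersion of the broker znode, remembered after registration.
    private final int brokerZnodeZkVersion;

    BrokerZkVersionCheck(int brokerZnodeZkVersion) {
        this.brokerZnodeZkVersion = brokerZnodeZkVersion;
    }

    // A controller request is processed only if the version it expects
    // matches the version the broker currently holds.
    boolean shouldProcess(int expectedBrokeNodeZkVersion) {
        return expectedBrokeNodeZkVersion == brokerZnodeZkVersion;
    }
}
```

For example, if the controller built a request while the broker znode version
was 7 and the broker holds version 8 after its restart, then
new BrokerZkVersionCheck(8).shouldProcess(7) returns false and the stale
request is rejected, while a request carrying 8 is accepted.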