[ 
https://issues.apache.org/jira/browse/KAFKA-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Lin updated KAFKA-7235:
----------------------------
    Description: 
Currently a broker can process controller requests that are sent before the 
broker is restarted. This could cause a few problems. Here is one example:

Let's assume partitions p1 and p2 exist on broker1.

1) Controller generates LeaderAndIsrRequest with p1 to be sent to broker1.

2) Before controller sends the request, broker1 is quickly restarted.

3) The LeaderAndIsrRequest with p1 is delivered to broker1.

4) After processing the first LeaderAndIsrRequest, broker1 starts to checkpoint 
high watermark for all partitions that it owns. Thus it may overwrite high 
watermark checkpoint file with only the hw for partition p1. The hw for 
partition p2 is now lost, which could be a problem.
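
The four steps above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Kafka's actual code: CheckpointingBroker, handleLeaderAndIsr and the in-memory map standing in for the checkpoint file are hypothetical names, and the model assumes that a restarted broker only knows about the partitions named in the requests it has processed so far.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a broker that has just restarted (not Kafka's real classes).
final class CheckpointingBroker {
    // Stand-in for the on-disk high watermark checkpoint file.
    private final Map<String, Long> checkpointFile;
    // Partitions the restarted broker has learned about from controller requests.
    private final Map<String, Long> knownPartitions = new HashMap<>();

    CheckpointingBroker(Map<String, Long> checkpointFile) {
        this.checkpointFile = new HashMap<>(checkpointFile);
    }

    // Processes a LeaderAndIsrRequest-like message, then checkpoints.
    void handleLeaderAndIsr(List<String> partitions) {
        for (String p : partitions) {
            // Recover the hw of each named partition from the old checkpoint.
            knownPartitions.put(p, checkpointFile.getOrDefault(p, 0L));
        }
        // Checkpointing rewrites the whole file from the in-memory view,
        // so partitions the broker has not heard about yet are dropped.
        checkpointFile.clear();
        checkpointFile.putAll(knownPartitions);
    }

    Map<String, Long> checkpointFile() {
        return checkpointFile;
    }
}

final class CheckpointLossDemo {
    public static void main(String[] args) {
        // Before the restart, the file holds the hw for both p1 and p2.
        CheckpointingBroker broker1 =
            new CheckpointingBroker(Map.of("p1", 100L, "p2", 200L));
        // The outdated request, generated before the restart, names only p1.
        broker1.handleLeaderAndIsr(List.of("p1"));
        System.out.println("p2 still checkpointed: "
            + broker1.checkpointFile().containsKey("p2"));
    }
}
```

In this model the entry for p2 disappears after the first request is processed, which mirrors step 4.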

In general, the correctness of broker logic currently relies on a few 
assumptions, e.g. the first LeaderAndIsrRequest received by broker should 
contain all partitions hosted by the broker, which could break if broker can 
receive controller requests that were generated before it restarts. 

One reasonable solution to the problem is to include the 
expectedBrokeNodeZkVersion in the controller requests. Broker should remember 
the broker znode zkVersion after it registers itself in ZooKeeper. Then 
broker can reject those controller requests whose expectedBrokeNodeZkVersion is 
different from its broker znode zkVersion.
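
The proposed check could look roughly like the sketch below. It is a minimal illustration, not the actual change: ControllerRequest, VersionCheckingBroker, registerInZooKeeper and shouldProcess are hypothetical names, the expectedZkVersion field stands in for the expectedBrokeNodeZkVersion mentioned above, and the sketch assumes that the version the broker observes after re-registering differs from the one the controller saw before the restart.

```java
// Hypothetical controller request carrying the broker znode version
// that the controller observed when it generated the request.
final class ControllerRequest {
    final String type;
    final int expectedZkVersion;

    ControllerRequest(String type, int expectedZkVersion) {
        this.type = type;
        this.expectedZkVersion = expectedZkVersion;
    }
}

// Hypothetical broker-side check (not Kafka's real classes).
final class VersionCheckingBroker {
    // -1 means the broker has not registered yet.
    private int brokerZnodeVersion = -1;

    // Remember the znode version seen at registration time.
    void registerInZooKeeper(int zkVersion) {
        this.brokerZnodeVersion = zkVersion;
    }

    // Only requests generated for the current registration are processed.
    boolean shouldProcess(ControllerRequest request) {
        return request.expectedZkVersion == brokerZnodeVersion;
    }
}

final class StaleRequestDemo {
    public static void main(String[] args) {
        VersionCheckingBroker broker1 = new VersionCheckingBroker();
        broker1.registerInZooKeeper(3);
        // Generated while broker1 was registered with version 3.
        ControllerRequest stale = new ControllerRequest("LeaderAndIsr", 3);

        // broker1 restarts and, by assumption, observes a different version.
        broker1.registerInZooKeeper(4);
        ControllerRequest fresh = new ControllerRequest("LeaderAndIsr", 4);

        System.out.println("stale accepted: " + broker1.shouldProcess(stale));
        System.out.println("fresh accepted: " + broker1.shouldProcess(fresh));
    }
}
```

With such a check, the request from step 3 of the example would be rejected instead of triggering the checkpoint in step 4.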

 

  was:
Currently a broker can process controller requests that are sent before the 
broker is restarted. This could cause a few problems. Here is one example:

Let's assume partitions p1 and p2 exist on broker1.

1) Controller generates LeaderAndIsrRequest with p1 to be sent to broker1.

2) Before controller sends the request, broker1 is quickly restarted.

3) The LeaderAndIsrRequest with p1 is delivered to broker1.

4) After processing the first LeaderAndIsrRequest, broker1 starts to checkpoint 
high watermark for all partitions that it owns. Thus it may overwrite high 
watermark checkpoint file with only the hw for partition p1. The hw for 
partition p2 is now lost, which could be a problem.

In general, the correctness of broker logic currently relies on a few 
assumptions, e.g. the first LeaderAndIsrRequest received by broker should 
contain all partitions hosted by the broker, which could break if broker can 
receive controller requests that were generated before it restarts.

 

One reasonable solution to the problem is to include the 
expectedBrokeNodeZkVersion in the controller requests. Broker should remember 
the broker znode zkVersion after it registers itself in ZooKeeper. Then 
broker can reject those controller requests whose expectedBrokeNodeZkVersion is 
different from its broker znode zkVersion.

 


> Use brokerZkNodeVersion to prevent broker from processing outdated controller 
> request
> -------------------------------------------------------------------------------------
>
>                 Key: KAFKA-7235
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7235
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Dong Lin
>            Assignee: Dong Lin
>            Priority: Major
>
> Currently a broker can process controller requests that are sent before the 
> broker is restarted. This could cause a few problems. Here is one example:
> Let's assume partitions p1 and p2 exist on broker1.
> 1) Controller generates LeaderAndIsrRequest with p1 to be sent to broker1.
> 2) Before controller sends the request, broker1 is quickly restarted.
> 3) The LeaderAndIsrRequest with p1 is delivered to broker1.
> 4) After processing the first LeaderAndIsrRequest, broker1 starts to 
> checkpoint high watermark for all partitions that it owns. Thus it may 
> overwrite high watermark checkpoint file with only the hw for partition p1. 
> The hw for partition p2 is now lost, which could be a problem.
> In general, the correctness of broker logic currently relies on a few 
> assumptions, e.g. the first LeaderAndIsrRequest received by broker should 
> contain all partitions hosted by the broker, which could break if broker can 
> receive controller requests that were generated before it restarts. 
> One reasonable solution to the problem is to include the 
> expectedBrokeNodeZkVersion in the controller requests. Broker should remember 
> the broker znode zkVersion after it registers itself in ZooKeeper. Then 
> broker can reject those controller requests whose expectedBrokeNodeZkVersion 
> is different from its broker znode zkVersion.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
