Thank you for the continuous work, I have some small problems related to the 
implementation details:

1. We have decided to add a new metadata.version, and you have said that  "We 
should also avoid gating on metadata.version to include the 
`OnlineLogDirectories` field in the broker registration, the existing 
versioning in the message schemas should suffice to guarantee compatibility",  
I think this is a big difference from what we had done in the past design, it's 
better to add this point in  "Compatibility, Deprecation, and Migration Plan " 
section.
2. Related to (1),  an important thing about migration is that we should first 
upgrade controllers before upgrade brokers since broker will only send 
`BrokerRegisterationRequest` once and we should use the new format and the 
controller should recoganize the new format, this is difficult in co-resident 
mode, so you should point out that we only support distributed mode similar to 
KIP-866.
3. What will happen on receiving "AssignReplicasToDirsRequest"? this should be 
described in "Controller" section, for example, generating 
`PartitionChangeRecord` or return error code.

Apart from these problems, this KIP LGTM, 
--
Best,
Ziming


> On Sep 28, 2022, at 04:59, Igor Soarez <i...@soarez.me> wrote:
> 
> Hi Ziming,
> 
> Thank you for having another look at this KIP, and please accept my apologies 
> as to my delay in replying.
> 
> The migration introduces JBOD support, so before the migration there should 
> only be one log directory configured per broker. This assumption simplifies 
> how the controller learns about the initial log directory placement for 
> existing partitions. The first broker registration referencing log 
> directories must declare a single log directory - which must be online as 
> brokers cannot start without any online log dirs. If there are no previous 
> log dirs registered for this  broker, the controller assigns all existing 
> partitions in the broker to this single directory.
> 
> Since we assume a single log dir per broker as a migration starting point and 
> brokers cannot start without any online log directories, the 
> `OfflineLogDirectories` field simply cannot be used in the first registration 
> after the upgrade. However, the issue you described applies to 
> `OnlineLogDirectories`. If use of this field is gated by metadata.version 
> then we do require a double roll for brokers to make their first registration 
> with a log directory.
> 
> As you describe, we could move the list of online log directories from the 
> broker registration request to the heartbea request, but I believe we 
> shouldn't do this because:
>  a) The intention is to address a transient problem — the migration into the 
> feature, which should happen only once — by accepting a long term commitment 
> as to how the online log dirs are signaled.
>  b) While the log directory status can change at any point from online to 
> offline, the list of existing log directories is static for broker's 
> lifetime, so it makes little sense to send this information over and over 
> again to the controller.
> 
> We should also avoid gating on metadata.version to include the 
> `OnlineLogDirectories` field in the broker registration. If the controllers 
> are upgraded first, the existing versioning in the message schemas should 
> suffice to guarantee compatibility. After the controllers are upgraded, the 
> brokers can then be upgraded and each of them can register their single 
> online log directory from the first instantiation.
> 
> Please, let me know if this makes sense and any other thoughts you might have.
> 
> Best,
> 
> --
> Igor
> 
> On Thu, Sep 1, 2022, at 2:04 AM, deng ziming wrote:
>> Hi Igor,
>> 
>> I think this KIP can solve the current problems, I have some problems 
>> relating to the migration section.
>> 
>> Since we have bumped broker RPC version and metadata record version, 
>> there will be some problems between brokers/controllers of different 
>> versions. In ZK mode we use IBP as a flag to help solve this, in KRaft 
>> mode we use a feature flag(metadata.version) as a flag for using new 
>> RPC/metadata or not. 
>> 
>> Assuming that we are upgrading from 3.3 to 3.4, firstly the finalized 
>> version of metadata.version is 3.3, brokers will use version 1 of 
>> `BrokerRegistrationRequest` which contains no `OfflineLogDirectories`, 
>> finally the finalized version of metadata.version is 3.4, but brokers 
>> will no longer send `BrokerRegistrationRequest` unless we restart the 
>> broker, so controllers can’t be aware of the `OfflineLogDirectories` of 
>> each broker, so we should reconsider the suggestion of Jason to use 
>> `BrokerHeartbeatRequest` to communicate `OfflineLogDirectories`.
>> 
>> Of course this problem can be solved through a double roll(restart 
>> broker twice when upgrading), but we should trying to avoid it since we 
>> now have feature flag.
>> 
>> One solution is that we include `OfflineLogDirectories` both in 
>> `BrokerRegistrationRequest` and `BrokerHeartbeatRequest` requests, when 
>> upgrading from 3.3 to 3.4 the 
>> `BrokerRegistrationRequest.OfflineLogDirectories` is empty whereas when 
>> upgrading from 3.4 to 3.5 it will not be empty. And maybe we can also 
>> remove `LogDirectoriesOfflineRequest` you proposed in this KIP.
>> 
>> --
>> Best,
>> Ziming
>> 
>> 
>>> On Aug 18, 2022, at 11:24 PM, Igor Soarez <i...@soarez.me> wrote:
>>> 
>>> Hi Ziming,
>>> 
>>> I'm sorry it took me a while to reply.
>>> 
>>> Thank you for having a look at this KIP and providing feedback.
>>> 
>>>> 1. We have a version field in meta.properties, currently it’s 1, and we can
>>>> set it to 2 in this KIP, and we can give an example of server.properties 
>>>> and
>>>> it’s corresponding meta.properties generated by the storage command tool.
>>> 
>>>> 2. When using storage format tool we should specify cluster-id, it will be 
>>>> an
>>>> arduous work if we should manually specify directory-ids for all log dirs,
>>>> I think you can make it more clear about this change that the directory-ids
>>>> are generated automatically.
>>> 
>>> Thank you for these suggestions. I've updated the KIP to:
>>> 
>>> * include an example
>>> * clarify that the version field would be changed to 2
>>> * clarify that the log directory UUIDs are automatically generated
>>> 
>>>> 3. When controller place a replica, it will select a broker as well as a 
>>>> log
>>>> directory, the latter is currently accomplished in the broker side, so this
>>>> will be a big change?
>>> 
>>> I think there can be benefits, as Jason described previously, if we change 
>>> how
>>> log directories are assigned as follow-up work.
>>> 
>>> From a codebase surface area perspective, it is definitely a big change
>>> because there are many models, types and interfaces that assume replicas are
>>> identified solely by a broker id.
>>> That will have to change to include the directory UUID, many lines of code 
>>> will
>>> be affected.
>>> 
>>> But in terms of behavior it shouldn't be a big change at all. Brokers 
>>> currently
>>> select the log directory with the least logs in it. This isn't a very nice
>>> policy, as logs can have wildly different sizes, and log directories can 
>>> have
>>> different capacities. But it is a policy that the controller can keep.
>>> 
>>> If we decide to extend the selection policy and keep it in the broker,
>>> the broker will continue to be able to override the controller's selection
>>> of log directory, using the `AssignReplicasToDirectories` RPC.
>>> 
>>>> 4. When handling log directory failures we will update Leader and ISR using
>>>> the existing replica state machine, what is this state machine referring 
>>>> to,
>>>> do you mean the AlterPartition RPC?
>>> 
>>> No, I mean that it will use the same rules and mechanism
>>> (`PartitionChangeBuilder`) that is used when a broker is fenced, shutdown or
>>> unregistered.
>>> 
>>> I think maybe the term "replica state machine" isn't the very appropriate.
>>> I've updated the KIP to rephrase this section.
>>> 
>>> Thanks,
>>> 
>>> --
>>> Igor

Reply via email to