Hi Alexandre, Thanks for the feedback.
100. You're correct. I updated the KIP to explicitly state that once we introduce a new tagged field in a later version, that version can never be downgraded to the listed versions. 101. I don't think there is a way to differentiate the two but also not sure if there will be a need to. Best, Jeff On Tue, Mar 28, 2023 at 2:13 PM Alexandre Dupriez < alexandre.dupr...@gmail.com> wrote: > Hi Jeff, > > Thank you for the fast answer! > > 100. Got it, I think I am starting to understand based on your example > exposing three Kafka versions. Please let me know if the following > corresponds to the correct interpretation. > > Taking OffsetCommitValue as an example, the idea is to bump the schema > to version 4 to make the record flexible. The schema version 4 will be > back-ported to versions 3.0.3, 3.1.3, 3.2.4, 3.3.3, 3.4.1 and > potentially 3.5. These versions will continue to serialise the record > with version 3 so that downgrades from there to earlier minor and/or > major versions are possible. And, these versions will deserialize the > records with version 4 so that they can support flexible records, > although they do not make use of tagged fields (if any is present). > > Then, in a future version, 3.x, x >= 6, a tag field will be added to > the record - say, topic id. Downgrade from 3.x to 3.0.3, 3.1.3, 3.2.4, > 3.3.3, 3.4.1 and 3.5 will be possible because these versions will > perform deserialisation with the record schema version 4, that is, > supporting tagged fields. > > Hence, this approach is both backward and forward looking and allows > to extend the scope of compatible versions for downgrades. > > [N] 3.x, x >= 6 --> downgradable to [I] 3.0.3, 3.1.3, 3.2.4, 3.3.3, > 3.4.1, 3.5 --> downgradable to e.g. [O] 3.0.0. > > One note though is that a transitive downgrade from 3.x, x >= 6, to > 3.0.0 via the version domain [I], will not be supported. Should it be > explicitly mentioned in the KIP that downgrades from 3.0.3, 3.1.3, > 3.2.4, 3.3.3, 3.4.1, 3.5 to earlier versions may not be possible (if > offsets or group metadata records were generated by a higher version > 3.x, x >= 6)? Or am I still misunderstanding? > > 101. I will try to provide an example to illustrate what I mean by > version. Consider the GroupMetadataValue schema version 4 and the > tagged field topicId introduced in 3.6. Let's say a new optional field > needs to be added in 3.7. We will have two records version 4, one with > topic id, the other with topic id and the new field. The new field is > semantically optional (structurally it is always optional since it is > a tagged field) but we want to make the distinction between a record > generated by 3.6 and one generated by 3.7. How do we resolve the > ambiguity? > > Thanks! > Alexandre > > Le mar. 28 mars 2023 à 16:13, Jeff Kim <jeff....@confluent.io.invalid> a > écrit : > > > > Hi Alexandre, > > > > Thank you for taking a look! > > > > 100. I am not sure I fully understand what you mean by forcefully adding > > tagged fields. Let's say VX does not have a flexible version, > > VY allows deserialization but serializes with a non-flexible version, and > > VZ introduces a new tagged field. > > VX upgrade to VY then downgrade back to VX works because even if group > > metadata changes VY will serialize with > > the highest non-flexible version. VZ to VY back to VZ also works because > > even if VY serializes with a non-flexible field > > VZ will be able to deserialize it as it is a supported version. Does this > > answer your question? > > > > 101. The future versioning scheme needs to be backward compatible with > > older coordinators. Wouldn't segregating into 2 versions > > be incompatible? > > > > Thanks, > > Jeff > > > > On Tue, Mar 28, 2023 at 5:47 AM Alexandre Dupriez < > > alexandre.dupr...@gmail.com> wrote: > > > > > Hi Jeff, Team, > > > > > > Thank you for the KIP. This is a very interesting approach. I feel it > > > is simpler than the described alternative although it comes with > > > tradeoffs, thanks for highlighting those. If I may, I would like to > > > share two naive questions. > > > > > > 100. The KIP mentions that records will be serialised with the highest > > > non-flexible version (e.g. 3 for GroupMetadataValue and > > > OffsetCommitValue) so that records can be deserialized with earlier > > > versions of Kafka. I am not sure I understand correctly: is the idea > > > to forcefully add tagged fields at the end of the records while > > > maintaining the existing version (3 for the two record types just > > > mentioned) so that they can be deserialized by existing Kafka versions > > > for which the version of these record types is not known as flexible, > > > while at the same time preserving the new tagged fields to new Kafka > > > versions abreast of the addition of a new flexible version for these > > > record types? If so, is it "bypassing" the protocol convention which > > > prescribes the use of a flexible version to allow the use of tagged > > > fields? > > > > > > 101. After the bump of the records to a new version indicated as > > > flexible, the record version is expected to never change while the > > > underlying tagged fields could potentially still evolve over time. One > > > potential downside is that we lose the benefits of the versioning > > > scheme enforced by the serde protocol. Could this become a problem in > > > the future if there is ever a need to segregate two distinct > > > "versions" of the appended record structure held by the tagged fields? > > > > > > Thanks, > > > Alexandre > > > > > > Le jeu. 23 mars 2023 à 18:15, Jeff Kim <jeff....@confluent.io.invalid> > a > > > écrit : > > > > > > > > Hi Yi, > > > > > > > > > Does it mean with a flexible version, the future > > > > version of these value types will stay at the version where the > > > flexibility > > > > is first introduced? Will there be any need to bump the version > again in > > > > the future? > > > > > > > > Yes, there will be no need to bump the version since we will only be > > > adding > > > > tagged fields but in the chance the version is bumped, we will > > > deserialize > > > > to the highest known (flexible) version which will ignore unknown > tagged > > > > fields. > > > > > > > > > To enforce the version not bumping, is it possible to have a unit > test > > > to > > > > gate? > > > > > > > > Do you have some tests in mind? I don't think we can tell whether a > > > version > > > > was bumped or not. > > > > > > > > Best, > > > > Jeff > > > > > > > > On Thu, Mar 23, 2023 at 12:07 PM Yi Ding <yd...@confluent.io.invalid > > > > > wrote: > > > > > > > > > Hi Jeff, > > > > > > > > > > Thanks for the update. Does it mean with a flexible version, the > future > > > > > version of these value types will stay at the version where the > > > flexibility > > > > > is first introduced? Will there be any need to bump the version > again > > > in > > > > > the future? > > > > > To enforce the version not bumping, is it possible to have a unit > test > > > to > > > > > gate? > > > > > > > > > > > > > > > On Wed, Mar 22, 2023 at 3:19 PM Jeff Kim > <jeff....@confluent.io.invalid > > > > > > > > > wrote: > > > > > > > > > > > Hi all, > > > > > > > > > > > > After discussing with my colleagues, I have repurposed the KIP > for a > > > > > > general downgrade solution for both transaction and group > > > coordinators. > > > > > The > > > > > > KIP no longer discusses the downgrade path but instead lays out > the > > > > > > foundation for future downgrade solutions. > > > > > > > > > > > > Link: > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-915%3A+Txn+and+Group+Coordinator+Downgrade+Foundation > > > > > > > > > > > > Thanks, > > > > > > Jeff > > > > > > > > > > > > On Mon, Mar 20, 2023 at 7:37 PM Jeff Kim <jeff....@confluent.io> > > > wrote: > > > > > > > > > > > > > Hi David and Justine, > > > > > > > > > > > > > > Thank you both for the detailed feedback. > > > > > > > > > > > > > > David, > > > > > > > > > > > > > > 1. That makes sense. I revised the "Reading new fields" section > > > with > > > > > how > > > > > > > we can downgrade to the highest known version and that this was > > > > > confirmed > > > > > > > via unit testing. I also attempted to dive deeper into using > tagged > > > > > > fields > > > > > > > and the rejected alternative. Please let me know what you > think. > > > > > > > > > > > > > > 2. Under "Restrictions and Guidelines" I updated the paragraph > to > > > > > clearly > > > > > > > state that we want to use tagged fields across all record types > > > > > > introduced > > > > > > > in KIP-848 including OffsetCommitValue. > > > > > > > > > > > > > > 3. Would it be possible to bump the OffsetCommitValue record > > > version to > > > > > > > make it flexible along with the changes to parse with the > highest > > > known > > > > > > > version? I'm not sure I understand why we cannot make both > changes > > > > > > together. > > > > > > > > > > > > > > 4. I completely missed this. Added some notes at the end of > > > > > "Restrictions > > > > > > > and Guidelines". Unfortunately I can't think of a solution at > the > > > > > moment. > > > > > > > Will get back to you. > > > > > > > > > > > > > > 5. I have a section under "Identifying New Record Types" that > > > discusses > > > > > > > this: > > > > > > > > We can automate the cleanup by writing tombstones when the > > > > > coordinator > > > > > > > reads unrecognizable records. This may add duplicate work if > > > tombstones > > > > > > > were already added but not yet pruned by the log cleaner. > > > > > > > This is a sure way to delete any unknown record types even if > the > > > > > > operator > > > > > > > does not follow the steps. > > > > > > > > > > > > > > 6. Thanks, I have expanded on the section on transactional > offset > > > > > > commits. > > > > > > > As for log compaction, my understanding was that we can > control the > > > > > > process > > > > > > > by forcing compaction. Is my understanding incorrect? > > > > > > > > > > > > > > 7. We throw an exception if ConfigResource.type is unknown > which is > > > > > true > > > > > > > in our case because KIP-848 introduces a new GROUP resource > type. > > > We > > > > > will > > > > > > > need to add some sort of safeguard if the dynamic configs are > not > > > > > deleted > > > > > > > before the server downgrade. I'll give you an update once I > update > > > the > > > > > > > section. > > > > > > > > > > > > > > Justine, > > > > > > > > > > > > > > On the transactional offset commits, I have updated the KIP to > > > reflect > > > > > > > your points after our offline discussion (much appreciated). It > > > seems > > > > > > that > > > > > > > the work required to abort from the server side is fairly big > and > > > will > > > > > > > require additional investigation if we are to go down this > path. > > > > > > > > > > > > > > As for downgrade limitations, I missed that part so thanks for > the > > > > > catch > > > > > > > (along with David). Unfortunately, the proposed design won't > allow > > > > > > > downgrades even after the new record types are deleted because > the > > > > > > existing > > > > > > > OffsetCommitValue record is not flexible and will not be able > to > > > parse > > > > > > > the topicId tagged field. I'll think more about this and get > back > > > to > > > > > you. > > > > > > > > > > > > > > Best, > > > > > > > Jeff > > > > > > > > > > > > > > On Mon, Mar 20, 2023 at 2:16 PM Justine Olshan > > > > > > > <jols...@confluent.io.invalid> wrote: > > > > > > > > > > > > > >> Hey Jeff and David, > > > > > > >> > > > > > > >> Thanks for the KIP! I was also looking into this a bit since > I may > > > > > want > > > > > > to > > > > > > >> change the record format for KIP-890 as well (finally > > > implementing the > > > > > > >> record change from KIP-360 to better support the epoch bump. > This > > > will > > > > > > >> potentially be helpful for me to implement that work. > > > > > > >> > > > > > > >> I discussed some aspects with David offline. Namely, the > > > alternative > > > > > > >> solution to rewrite the records in the old format upon > downgrade. > > > One > > > > > > plus > > > > > > >> to this approach is that it should trigger compaction for all > the > > > > > > existing > > > > > > >> (new format) records. > > > > > > >> The main issue we ran into was how to handle ongoing > transactions. > > > > > (Ie, > > > > > > >> could we abort and rewrite with the old format) I think that's > > > one of > > > > > > the > > > > > > >> main downsides. I think it could potentially be possible to do > > > this, > > > > > but > > > > > > >> we'd definitely need a server-side mechanism to abort and > rewrite > > > the > > > > > > >> records. > > > > > > >> > > > > > > >> I think the trouble I was running into with the current > solution > > > is > > > > > that > > > > > > >> we > > > > > > >> can only downgrade to a version slightly before the new group > > > > > > coordinator. > > > > > > >> If someone was running 3.2 and they upgrade to 3.6, but find a > > > bug in > > > > > > 3.4 > > > > > > >> (or any version between 3.5 and the one they upgraded from), > then > > > they > > > > > > can > > > > > > >> only downgrade to 3.5. This puts us in a difficult spot. I > guess > > > in > > > > > this > > > > > > >> scenario, they will need to wait for the new format records > to get > > > > > > cleared > > > > > > >> away before downgrading again. Is that correct? > > > > > > >> > > > > > > >> Thanks, > > > > > > >> Justine > > > > > > >> > > > > > > >> On Thu, Mar 16, 2023 at 10:41 AM David Jacot > > > > > > <dja...@confluent.io.invalid > > > > > > >> > > > > > > > >> wrote: > > > > > > >> > > > > > > >> > Hi Jeff, > > > > > > >> > > > > > > > >> > Thanks for the KIP! I am really glad that we are finally > > > addressing > > > > > > this > > > > > > >> > gap in KIP-848. I have a few general comments. > > > > > > >> > > > > > > > >> > 1. Overall, I feel like the important bits are not bold > enough > > > in > > > > > the > > > > > > >> KIP. > > > > > > >> > I think that it is good to explain the overall > upgrade/downgrade > > > > > > process > > > > > > >> > and to highlight where the issues are but I think that the > core > > > bits > > > > > > >> should > > > > > > >> > give more details. For instance, we should explain why > relying > > > on > > > > > tag > > > > > > >> > fields works to ignore fields added in future releases. My > > > > > > >> understanding is > > > > > > >> > that it works because the buffer for the tagged fields is > > > serialized > > > > > > at > > > > > > >> the > > > > > > >> > end so reading with the old version, which is a prefix of > the > > > new > > > > > one, > > > > > > >> > works. > > > > > > >> > > > > > > > >> > 2. Moreover, it would be great if we could make the > principle > > > more > > > > > > >> general. > > > > > > >> > My hope is that we can keep reusing the principles > introduced > > > in the > > > > > > >> KIP in > > > > > > >> > future releases as well. For instance, let's say that we > need to > > > > > add a > > > > > > >> new > > > > > > >> > field to one of the new records introduced by KIP-848 or > that we > > > > > will > > > > > > >> have > > > > > > >> > to introduce a new record type as well. Would it work for > those > > > > > cases > > > > > > as > > > > > > >> > well? > > > > > > >> > > > > > > > >> > 3. Regarding enabling support for tagged fields for the > > > > > > >> OffsetCommitValue > > > > > > >> > record, it would be great if you could give more details on > the > > > > > steps > > > > > > to > > > > > > >> > get there in the KIP. My understanding is that we would > have to > > > do > > > > > the > > > > > > >> > following: 1) Update the code which reads the records to > fail > > > back > > > > > to > > > > > > >> the > > > > > > >> > highest known version if the version stored in the log is > > > unknown. > > > > > > Let's > > > > > > >> > say that we do this in AK 3.5. 2) We need to turn on tagged > > > fields > > > > > for > > > > > > >> the > > > > > > >> > record. I think that we can only do this in AK 3.6+. > > > > > > >> > > > > > > > >> > 4. I may have missed this part but we should clearly > explain the > > > > > > >> drawback > > > > > > >> > of the proposed approach as well. Say that we enable tagged > > > fields > > > > > for > > > > > > >> > OffsetCommitValue in AK 3.6. This means that it won't be > > > possible to > > > > > > >> > downgrade a cluster from 3.6 to a version earlier than 3.5. > > > This is > > > > > a > > > > > > >> > significant limitation in my opinion because, I think, users > > > don't > > > > > > >> > necessarily upgrade to all versions. > > > > > > >> > > > > > > > >> > 5. In the proposal, it is not clear about whether the old > > > software > > > > > > will > > > > > > >> > delete unknown records or not. It is true that new records > will > > > be > > > > > > >> deleted > > > > > > >> > when the group is downgraded but this only works if the > operator > > > > > > >> respects > > > > > > >> > the process. > > > > > > >> > > > > > > > >> > 6. It would be great if we could extend the rejected > > > alternative. > > > > > The > > > > > > >> > alternative sounds clearly better when you read it so we > should > > > > > really > > > > > > >> > explain the reason to reject it. 1) One issue that you > mention > > > is > > > > > that > > > > > > >> the > > > > > > >> > log must be compacted before downgrading and we don't really > > > control > > > > > > >> this > > > > > > >> > process. 2) Transactions may be difficult to handle. I > suppose > > > that > > > > > it > > > > > > >> is > > > > > > >> > possible to handle them though. Have you thought about this? > > > > > > >> > > > > > > > >> > 7. For the new dynamic configs, what happens if they are > kept > > > and > > > > > the > > > > > > >> > quorum controller is downgraded? > > > > > > >> > > > > > > > >> > Best, > > > > > > >> > David > > > > > > >> > > > > > > > >> > On Thu, Mar 16, 2023 at 12:56 AM Jeff Kim > > > > > > <jeff....@confluent.io.invalid > > > > > > >> > > > > > > > >> > wrote: > > > > > > >> > > > > > > > >> > > Hi folks, > > > > > > >> > > > > > > > > >> > > I would like to start a discussion thread for KIP-915: > Next > > > Gen > > > > > > Group > > > > > > >> > > Coordinator Downgrade Path which proposes the downgrade > > > design for > > > > > > the > > > > > > >> > new > > > > > > >> > > group coordinator introduced in KIP-848 > > > > > > >> > > < > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol > > > > > > >> > > > > > > > > > >> > > . > > > > > > >> > > > > > > > > >> > > KIP: > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-915%3A+Next+Gen+Group+Coordinator+Downgrade+Path > > > > > > >> > > > > > > > > >> > > Thanks, > > > > > > >> > > Jeff > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >