On Tue, May 27, 2025, at 09:28, José Armando García Sancio wrote:
> Hi Colin,
>
> On Fri, May 2, 2025 at 6:52 PM Colin McCabe <cmcc...@apache.org> wrote:
>>
>> On Fri, May 2, 2025, at 09:54, José Armando García Sancio wrote:
>> That seems pretty clear? It's also already implemented (although not
>> used, since we don't have any inter-dependent features yet).
>
> I see. This is mentioned in the "Compatibility, Deprecation, and
> Migration Plan" section and not in the "Proposed Changes" section.
> This is probably why I couldn't find any definition of feature
> dependencies.
>
>> I agree that running the 3.6.x software version on some nodes, and
>> 4.0.x on others, would be very odd. 3.6 isn't even one of our
>> supported software versions any more. Right now we're supporting 4.0,
>> 3.9, and 3.8. I was just trying to give an example. Perhaps we can
>> simply agree to move on, since the bad case here relies on using
>> unsupported software versions to do an unsupported operation.
>
> I don't understand the argument. Software version 4.0 is supported by
> the Apache Kafka community. The issue is not the software version but
> the metadata version.
Hi José,

The issue we were discussing earlier was actually the software version,
not the metadata version. Specifically, we were discussing a corner case
where MV downgrade would proceed even though the software version didn't
support it. This can happen if:

1. the software version on some controller nodes is older than 4.1 (or
   whatever version in which we release KIP-1155), and
2. the metadata version is older than 3.7-IV3, the MV in which we added
   controller registration, and
3. the active controller has a post-4.1 (or whatever) software version.

My argument was that it would be unlikely that all three of these things
would be true. One reason I think this is that 3.7-IV3 is a pretty old MV
at this point anyway.

> If the user gets into this case, what is the workaround? Do they need
> to upgrade all of their controllers to a version that supports
> downgrade? Is that sufficient?

In the scenario described above, restarting the process is all that is
needed.

>
>> > Snapshot at end offset 100. All records between 0 and 99, inclusive,
>> > have been included in the snapshot. The MV in the snapshot is X.
>> > offset 100 -> metadata.version = Y  -- MV was previously upgraded to version Y
>> > ...        -> ...                   -- All of these records are serialized using version Y.
>> > offset 110 -> metadata.version = X  -- MV was downgraded to X.
>> >
>> > Before a snapshot that includes offset 110 (MV 3.9) could be
>> > generated, the node restarts. How would the code identify that the
>> > records between 100 and 110 need to be snapshotted using metadata
>> > version 3.9? Note that the metadata loader can batch all of the
>> > records between 100 and 110 into one delta.
>>
>> You loaded a snapshot with metadata version Y.
>
> That's what I am trying to highlight. There is no snapshot for MV Y at
> offset 100. There is only a snapshot for MV X at offset 99.
>
>> You replayed a record changing the metadata version to X. We already
>> specified that this will cause us to generate a new snapshot. The
>> snapshot will presumably be generated with offset 110, since that's
>> the offset that changed the MV. I suppose it could also have a
>> slightly larger offset if the loader batched more.
>
> Yes, but the important part in my example above is that the snapshot at
> offset 100 has a metadata version of X and the metadata delta at offset
> 110 after that snapshot has the MV at X. Yet the MV changed to Y (e.g.
> at offset 105) and back to X (at offset 110) in between 100 and 110.

Hmm, I'm having trouble understanding your question in this case. The
initial question was something like "what happens if the broker exits
before generating the snapshot?" And the answer to that is "the broker
generates the snapshot once you start it up again." After all, it is
reading the same log, and encountering the same FeatureLevelRecord.

Does that answer the question?

best,
Colin

>
> Thanks,
> --
> -José
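
P.S. In case it helps to see the replay argument spelled out, here is a
tiny, self-contained sketch. To be clear, the class, record, and field
names below are all made up for illustration; this is not the actual
metadata loader or snapshot generation code. It only models the idea that
replaying the same log after a restart hits the same FeatureLevelRecord
that downgraded the MV, and therefore schedules the same snapshot again.

// Hypothetical sketch only; names do not match the real Kafka code.
import java.util.List;

public class MvDowngradeReplaySketch {
    // A metadata log entry that sets a feature level (here: metadata.version).
    record FeatureLevelEntry(long offset, String featureName, short level) {}

    private short currentMv;                   // MV loaded from the latest snapshot
    private long snapshotNeededAtOffset = -1;  // offset at which a snapshot must be emitted

    MvDowngradeReplaySketch(short mvFromLatestSnapshot) {
        this.currentMv = mvFromLatestSnapshot;
    }

    void replay(FeatureLevelEntry entry) {
        if (!entry.featureName().equals("metadata.version")) return;
        boolean downgrade = entry.level() < currentMv;
        currentMv = entry.level();
        if (downgrade) {
            // A downgrade record forces a snapshot at (roughly) this offset.
            snapshotNeededAtOffset = entry.offset();
        }
    }

    public static void main(String[] args) {
        // The example from the thread: the latest snapshot covers offsets 0-99
        // with MV X; the log then upgrades to Y at offset 100 and downgrades
        // back to X at offset 110. The levels below are arbitrary placeholders.
        short mvX = 21, mvY = 25;
        List<FeatureLevelEntry> log = List.of(
            new FeatureLevelEntry(100, "metadata.version", mvY),
            new FeatureLevelEntry(110, "metadata.version", mvX));

        // Run 1: pretend the process dies before the snapshot is written.
        // Run 2: after the restart, the same snapshot (MV X) is reloaded and
        // the same records are replayed, so the same snapshot is scheduled.
        for (int run = 1; run <= 2; run++) {
            MvDowngradeReplaySketch loader = new MvDowngradeReplaySketch(mvX);
            log.forEach(loader::replay);
            System.out.println("run " + run + ": snapshot needed at offset "
                + loader.snapshotNeededAtOffset + " with MV " + loader.currentMv);
        }
    }
}

Both runs report that a snapshot is needed at offset 110 with the MV back
at X, which is the restart behavior I was describing above.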