Hi, Thanks Colin for the feedback. Edo and I have updated the KIP accordingly. Can you take another look?
On Tue, Oct 22, 2019 at 12:20 AM Colin McCabe <cmcc...@apache.org> wrote: > > Hi Mickael, > > We don't have any official way for brokers to join the cluster other than > showing up and registering themselves in ZK. Similarly, we don't have any > way of removing brokers from the cluster other than simply removing them and > removing their znodes from ZooKeeper. > > If we wanted to change this, it seems like it would be a really big step. We > would need public, stable APIs for both of these things. Or at least for the > removal thing, which is currently automatic and doesn't require any action on > the part of the administrator. Administrators would have to be retrained to > do this whenever shrinking the cluster. > We cannot tell people to modify ZK directly for this. > > To be honest, I don't think reworking broker registration is worth it for > this change. I think we could pretty easily have placeholder values for the > missing replicas like -1, -2, -3, etc. and just fill them in whenever a new > broker comes online. This may be slightly more complex to implement, but it > greatly simplifies what users have to do. > > It is true that filling in -1, -2, -3, etc. will not preserve rack placement > information. But this is kind of a more general problem that we should > probably solve separately. After placement, a lot of placement information > disappears and is not accessible to reassignment. Since reassignment is > becoming more and more important, we should make an effort to preserve this > information. Since that would be a big change, it's probably best to do > separately, however. > > The "rejected alternatives" section says that adding an option to > CreateTopicsRequest to allow users to opt-in to the new behavior "felt too > complex." But I think this could use a little clarification. Adding a new > boolean to the createTopics command is actually fairly simple from the > perspective of a developer. But it adds another thing for end-users to think > about when using the software. It's also not clear how many users would take > advantage of this. I think that's the reason people were not in favor of it, > not a general feeling of complexity. Adding more configuration options is > often simple to implement, and making things "just work" is often a little > more complex. But we should prefer the latter, most of the time at least. I > think this is what you meant here, but it would be good to clarify. > > "Rejected alternatives" also talks about an error code and an error message > when the replication is not up to full strength. But this was removed, > right? We should clarify that no error code is returned in this case, and > the CreateTopicsResponse returns the true number of replicas that was > created, in case the client is interested in this information. Returning an > error code would certainly cause problems for a lot of users, who use > all().get() to verify that all the topics have been successfully created. > > best, > Colin > > > On Mon, Oct 21, 2019, at 09:50, Mickael Maison wrote: > > Thanks Stanislav and Colin for the feedback. > > > > I've updated the KIP to make it simpler. > > It's not updating the CreateTopics/CreatePartitions RPCs anymore. I've > > kept the broker setting so admins can keep the current behaviour but > > simplified it to be either enabled or disabled. > > > > I've also kept the observed_brokers nodes in Zookeeper. I can't think > > of a better alternative to keep track of the expected brokers. The > > other option would be to perform the extra replica creation > > asynchronously (driven by the controller when a broker joins the > > cluster) but that feels a lot more complicated for this specific use > > case. > > > > I've also made it explicit that at least "min.insync.replicas" brokers > > have to be online to allow topic/partition creation. > > > > Thanks > > > > On Mon, Mar 25, 2019 at 1:17 PM Mickael Maison <mickael.mai...@gmail.com> > > wrote: > > > > > > Thanks Colin for the feedback. > > > > > > The idea was to allow both users and administrator to decide if they > > > wanted to opt-in and if so under what conditions. > > > > > > Maybe we could do something simpler and just allow the creation if at > > > least min-in-sync replicas are available? That should not require > > > changes to the protocol and while this might not cover all possible > > > use cases, that would still cover the use cases we've listed in the > > > KIP. That would also tie in with existing semantics/guarantees > > > (min-in-sync). > > > > > > Thanks > > > > > > On Tue, Feb 26, 2019 at 5:40 PM Colin McCabe <cmcc...@apache.org> wrote: > > > > > > > > Hi Mickael, > > > > > > > > I don't think adding CREATED_UNDER_REPLICATED as an error code makes > > > > sense. It is not an error condition, as described here. > > > > > > > > > Updates to the Decommissioning brokers section in the documentation > > > > > will mention that if a broker id is never to be reused then its > > > > > corresponding node in zookeeper > > > > > /brokers/observed_ids will need to be removed manually > > > > > > > > I don't think it's acceptable to ask admins to manually modify > > > > ZooKeeper here. In general the ZK changes seem kind of like a hack -- > > > > perhaps we should drop it from the proposal for now. > > > > > > > > Perhaps we could even somehow do all of this in a custom > > > > CreateTopicPolicy? That would avoid the need for RPC changes, new > > > > configuration knobs, etc. > > > > > > > > best, > > > > Colin > > > > > > > > > > > > On Tue, Dec 18, 2018, at 08:43, Mickael Maison wrote: > > > > > Hi, > > > > > > > > > > We have submitted a KIP to handle topics and partitions creation when > > > > > a cluster is not fully available: > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-409%3A+Allow+creating+under-replicated+topics+and+partitions > > > > > > > > > > As always, we welcome feedback and suggestions. > > > > > > > > > > Thanks > > > > > Mickael and Edoardo > > > > > > >