Hi,

Thanks Colin for the feedback. Edo and I have updated the KIP
accordingly. Can you take another look?


On Tue, Oct 22, 2019 at 12:20 AM Colin McCabe <cmcc...@apache.org> wrote:
>
> Hi Mickael,
>
> We don't have any official way for brokers to join the cluster other than 
> showing up and registering themselves in ZK.  Similarly, we don't have any 
> way of removing brokers from the cluster other than simply removing them and 
> removing their znodes from ZooKeeper.
>
> If we wanted to change this, it seems like it would be a really big step.  We 
> would need public, stable APIs for both of these things.  Or at least for the 
> removal thing, which is currently automatic and doesn't require any action on 
> the part of the administrator. Administrators would have to be retrained to 
> do this whenever shrinking the cluster.
>  We cannot tell people to modify ZK directly for this.
>
> To be honest, I don't think reworking broker registration is worth it for 
> this change.  I think we could pretty easily have placeholder values for the 
> missing replicas like -1, -2, -3, etc. and just fill them in whenever a new 
> broker comes online.  This may be slightly more complex to implement, but it 
> greatly simplifies what users have to do.
>
> It is true that filling in -1, -2, -3, etc. will not preserve rack placement 
> information.  But this is kind of a more general problem that we should 
> probably solve separately.  After placement, a lot of placement information 
> disappears and is not accessible to reassignment.  Since reassignment is 
> becoming more and more important, we should make an effort to preserve this 
> information.  Since that would be a big change, it's probably best to do 
> separately, however.
>
> The "rejected alternatives" section says that adding an option to 
> CreateTopicsRequest to allow users to opt-in to the new behavior "felt too 
> complex."  But I think this could use a little clarification.  Adding a new 
> boolean to the createTopics command is actually fairly simple from the 
> perspective of a developer.  But it adds another thing for end-users to think 
> about when using the software.  It's also not clear how many users would take 
> advantage of this.  I think that's the reason people were not in favor of it, 
> not a general feeling of complexity.  Adding more configuration options is 
> often simple to implement, and making things "just work" is often a little 
> more complex.  But we should prefer the latter, most of the time at least.  I 
> think this is what you meant here, but it would be good to clarify.
>
> "Rejected alternatives" also talks about an error code and an error message 
> when the replication is not up to full strength.  But this was removed, 
> right?  We should clarify that no error code is returned in this case, and 
> the CreateTopicsResponse returns the true number of replicas that was 
> created, in case the client is interested in this information.  Returning an 
> error code would certainly cause problems for a lot of users, who use 
> all().get() to verify that all the topics have been successfully created.
>
> best,
> Colin
>
>
> On Mon, Oct 21, 2019, at 09:50, Mickael Maison wrote:
> > Thanks Stanislav and Colin for the feedback.
> >
> > I've updated the KIP to make it simpler.
> > It's not updating the CreateTopics/CreatePartitions RPCs anymore. I've
> > kept the broker setting so admins can keep the current behaviour but
> > simplified it to be either enabled or disabled.
> >
> > I've also kept the observed_brokers nodes in Zookeeper. I can't think
> > of a better alternative to keep track of the expected brokers. The
> > other option would be to perform the extra replica creation
> > asynchronously (driven by the controller when a broker joins the
> > cluster) but that feels a lot more complicated for this specific use
> > case.
> >
> > I've also made it explicit that at least "min.insync.replicas" brokers
> > have to be online to allow topic/partition creation.
> >
> > Thanks
> >
> > On Mon, Mar 25, 2019 at 1:17 PM Mickael Maison <mickael.mai...@gmail.com> 
> > wrote:
> > >
> > > Thanks Colin for the feedback.
> > >
> > > The idea was to allow both users and administrator to decide if they
> > > wanted to opt-in and if so under what conditions.
> > >
> > > Maybe we could do something simpler and just allow the creation if at
> > > least min-in-sync replicas are available? That should not require
> > > changes to the protocol and while this might not cover all possible
> > > use cases, that would still cover the use cases we've listed in the
> > > KIP. That would also tie in with existing semantics/guarantees
> > > (min-in-sync).
> > >
> > > Thanks
> > >
> > > On Tue, Feb 26, 2019 at 5:40 PM Colin McCabe <cmcc...@apache.org> wrote:
> > > >
> > > > Hi Mickael,
> > > >
> > > > I don't think adding CREATED_UNDER_REPLICATED as an error code makes 
> > > > sense.  It is not an error condition, as described here.
> > > >
> > > > > Updates to the Decommissioning brokers section in the documentation
> > > > > will mention that if a broker id is never to be reused then its 
> > > > > corresponding node in zookeeper
> > > > > /brokers/observed_ids will need to be removed manually
> > > >
> > > > I don't think it's acceptable to ask admins to manually modify 
> > > > ZooKeeper here.  In general the ZK changes seem kind of like a hack -- 
> > > > perhaps we should drop it from the proposal for now.
> > > >
> > > > Perhaps we could even somehow do all of this in a custom 
> > > > CreateTopicPolicy?  That would avoid the need for RPC changes, new 
> > > > configuration knobs, etc.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Tue, Dec 18, 2018, at 08:43, Mickael Maison wrote:
> > > > > Hi,
> > > > >
> > > > > We have submitted a KIP to handle topics and partitions creation when
> > > > > a cluster is not fully available:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-409%3A+Allow+creating+under-replicated+topics+and+partitions
> > > > >
> > > > > As always, we welcome feedback and suggestions.
> > > > >
> > > > > Thanks
> > > > > Mickael and Edoardo
> > > > >
> >

Reply via email to