On Tue, Jul 4, 2023, at 05:19, Luke Chen wrote:
> 4. Does all the command-line tools with `--bootstrap-controllers` support
> all the options in the tool?
> For example, kafka-configs.sh, In addition to the `--alter` option you
> mentioned in the example, do we also support `--describe` or `--delete`
> option?

Hi Luke,

Yes, we do.

> If so, do we also support setting "quota" for users/clients/topics... via
> `--bootstrap-controllers`? (not intuitive, but maybe we just directly
> commit the change into the metadata from controller?)

Yes. This would be supported via the alterClientQuotas API.

>
> 5. Do we have any plan for this feature to be completed? v3.6.0?
>

Yes, I would like to target 3.6. If time gets tight one or two command 
conversions might slip. The most critical command line tools to support are the 
ones for getting the quorum status itself.

best,
Colin

>
> Thank you.
> Luke
>
>
> On Fri, Apr 28, 2023 at 1:42 AM Colin McCabe <cmcc...@apache.org> wrote:
>
>> On Wed, Apr 26, 2023, at 22:08, Luke Chen wrote:
>> > Hi Colin,
>> >
>> > Some comments:
>> > 1. I agree we should set "top-level" errors for metadata response
>> >
>> > 2. In the "brokers" field of metadata response from controller, it'll
>> > respond with "Controller endpoint information as given in
>> > controller.quorum.voters", instead of the "alive" controllers(voters).
>> That
>> > will break the existing admin client because in admin client, we'll rely
>> on
>> > the metadata response to build the "current alive brokers" list, and
>> choose
>> > one from them to connect (either least load or other criteria). That
>> means,
>> > if now, we return the value in `controller.quorum.voters`, but one of
>> them
>> > is down. We might choose it to connect and get connection errors. Should
>> we
>> > return the "alive" controllers(voters) to client?
>>
>> Hi Luke,
>>
>> Good question. When talking to the controllers directly, the AdminClient
>> needs to always send its RPCs to the active controller. There is one
>> exception: configuring ephemeral log4j settings with
>> incrementalAlterConfigs must be done by sending them to the specified
>> controller node.
>>
>> I will add this to a section called "AdminClient Implementation Notes" so
>> that it's captured in the KIP.
>>
>> >
>> > 3. In the KIP, we list the command-line tools will get a new
>> > --bootstrap-controllers argument, but without explaining why these tools
>> > need to talk to controller directly. Could we add some explanation about
>> > them? I tried but cannot know why some tools are listed here:
>> >     - kafka-acls.sh -> Allow clients to update ACLs via controller before
>> > brokers up
>> >
>> >     - kafka-cluster.sh -> Reasonable to get/update cluster info via
>> > controller
>> >
>> >     - kafka-configs.sh -> Allow clients to dynamically update
>> > configs/describe configs from controller. But in this script, client can
>> > still set quota for users/clients/topics... is client also able to update
>> > via controllers? Or we only allow partial actions in the script to talk
>> to
>> > controllers?
>> >
>> >     - kafka-delegation-tokens.sh -> Reasonable to update
>> delegation-tokens
>> > via controllers
>> >
>> >     - kafka-features.sh -> Reasonable
>> >     - kafka-metadata-quorum.sh -> Reasonable
>> >     - kafka-metadata-shell.sh -> Reasonable
>> >
>> >     - kafka-reassign-partitions.sh -> Why should we allow clients to move
>> > metadata log partitions in controller nodes? What's the use-case?
>> >
>>
>> Yes, the common thread here is that all of these shell commands perform
>> operations can be done without the broker. So it's reasonable to allow them
>> to be done without going through the broker. I don't know if we need a
>> separate note for each since the rationale is really the same for all (is
>> it reasonable? if so allow it.)
>>
>> kafka-reassign-partitions.sh cannot be used to move the __cluster_metadata
>> topic. However, it can be used to move partitions that reside on the
>> brokers, even when using --bootstrap-controllers to talk directly to the
>> quorum.
>>
>> Colin
>>
>> >
>> > Thank you.
>> > Luke
>> >
>> > On Thu, Apr 27, 2023 at 8:04 AM Colin McCabe <cmcc...@apache.org> wrote:
>> >
>> >> On Tue, Apr 25, 2023, at 04:59, Divij Vaidya wrote:
>> >> > Thank you for the KIP Colin.
>> >> >
>> >> > In general, I like the idea of having the ability to interact directly
>> >> with
>> >> > the controllers. I agree with your observation that it helps in
>> >> situations
>> >> > where you would want to get data directly from the controller instead
>> of
>> >> > going via a broker. I have some general comments but the main concern
>> I
>> >> > have is with the piggy-backing of error code with response of
>> >> > __cluster_metadata topic.
>> >> >
>> >> > 1. With this change, how are we guarding against the possibility of
>> >> > misbehaving client traffic from disrupting the controller (that you
>> >> > mentioned as a motivation of earlier behaviour)? One solution could
>> be to
>> >> > have default values set for request throttling on the controller.
>> >>
>> >> Hi Divij,
>> >>
>> >> Thanks for the comments.
>> >>
>> >> Our guards against client misbehavior remain the same:
>> >> 1. our recommendation to put the clients on a separate network
>> >> 2. the fact that producers and consumers can't interact directly with
>> the
>> >> controller
>> >> 3. the authorizer.
>> >>
>> >> Re: #3, I will spell out in the KIP that clients must have DESCRIBE on
>> the
>> >> CLUSTER resource to send a METADATA request to the controller.
>> >>
>> >> > 2. This KIP also increases the network attack surface area. Prior to
>> this
>> >> > KIP, it was possible to have firewall rules setup for controllers such
>> >> that
>> >> > only the brokers can talk to it. But now, the controller would require
>> >> > access from other endpoints other than brokers as well. Can we add a
>> >> > suggestion to the upgrade documentation and inform users
>> >>
>> >> There is no requirement for access from other endpoints. It is still
>> >> possible to set up firewall rules such that only the brokers can talk to
>> >> the controller. In fact I would even say this is desirable. Since this
>> >> faculty is intended for infrequent manual administrative operations,
>> >> needing to log into the broker to use it seems perfectly fine.
>> >>
>> >> > 3. In section KRaft Controller MetadataResponse, row 3, "There is no
>> >> > top-level error code in MetadataResponse, so we use the
>> >> __cluster_metadata
>> >> > topic to send back our error.". This will definitely confuse the
>> users.
>> >> Can
>> >> > we introduce a top level error field instead?
>> >>
>> >> Let me check how we're handling this in other places. I recall some
>> other
>> >> cases where we used the dummy topic approach, but I can't find them just
>> >> now.
>> >>
>> >> > 4. As part of the KIP, could we please add some documentation for
>> users
>> >> > with the suggestion of when to get information directly from the
>> >> controller
>> >> > and when not to (and associated tradeoffs)?
>> >>
>> >> I think most users will not face this tradeoff because they simply won't
>> >> have network access to the controller servers.
>> >>
>> >> For those who do want more information, we'll have command-line
>> >> documentation for --boostrap-controllers and the bootstrap.controllers
>> >> configuration key.
>> >>
>> >> > 5. Why do we need the `FromKRaftController` field in the response?
>> What
>> >> do
>> >> > we expect the users to do with this information?
>> >>
>> >> The field is present so that we can throw an exception in the client if
>> we
>> >> have received a response that is not from the controller.
>> >>
>> >> > 6. Can we drop `KRaft` from the fields `FromKRaftController` and
>> >> > `DirectToKRaftControllerQuorum`? My suggestion will be to rename it as
>> >> > `DirectToController`.
>> >>
>> >> I like the idea, but wouldn't that create confusion in the ZK cluster
>> case?
>> >>
>> >> > 7.  "kafka-metadata-shell.sh  is at an "evolving" level of interface
>> >> > stability" -> I thought that with KRaft being production ready, the
>> >> > evolving mode for kraft-related tools has also moved to production.
>> Do we
>> >> > have a timeline when this would move to production?
>> >>
>> >> That's a good question, but I think we should tackle it separately from
>> >> this KIP. The metadata shell is pretty different from other parts of
>> kafka
>> >> since it interacts so closely with internals.
>> >>
>> >> best,
>> >> Colin
>> >>
>> >>
>> >> >
>> >> > --
>> >> > Divij Vaidya
>> >> >
>> >> >
>> >> >
>> >> > On Tue, Apr 25, 2023 at 1:38 AM Colin McCabe <cmcc...@apache.org>
>> wrote:
>> >> >
>> >> >> On Fri, Apr 21, 2023, at 14:17, Jason Gustafson wrote:
>> >> >> > Hey Colin,
>> >> >> >
>> >> >> > The KIP makes sense overall. Nice to clarify the contract between
>> >> clients
>> >> >> > and the controllers. The use of `DirectToKRaftControllerQuorum`
>> will
>> >> help
>> >> >> > prevent misconfiguration. In fact, I wonder if we can return a
>> fatal
>> >> >> error
>> >> >> > instead of NOT_CONTROLLER so that the client would immediately
>> fail.
>> >> For
>> >> >> > example, could we use INVALID_REQUEST or something like that?
>> Either
>> >> that
>> >> >> > or we need to make sure clients treat NOT_CONTROLLER as a fatal
>> error.
>> >> >> > Without that, it would probably get picked up with default retry
>> logic
>> >> >> and
>> >> >> > the user might not see the problem.
>> >> >>
>> >> >> Hi Jason,
>> >> >>
>> >> >> Yes, this is a good point. It should return INVALID_REQUEST since
>> that
>> >> is
>> >> >> not retryable. I'll change it.
>> >> >>
>> >> >> >
>> >> >> > I also wonder if we can relax the requirements on the Metadata
>> >> request so
>> >> >> > that we can use it to list topics and partition state (e.g.
>> URPs).  It
>> >> >> > would be useful to query the controllers as the metadata source of
>> >> truth
>> >> >> > when we suspect that the broker metadata may have diverged.
>> >> >> >
>> >> >>
>> >> >> Fair enough. I will document that we can return topics.
>> >> >>
>> >> >> I also added a "future work" section about maybe using the
>> controllers
>> >> as
>> >> >> bootstrap servers for the broker cluster. To be clear, that is NOT in
>> >> scope
>> >> >> here, but it's interesting to think about potentially doing in the
>> >> future.
>> >> >> The major problem is what to do when the broker endpoints we're
>> >> returning
>> >> >> have different security settings from the controller endpoint the
>> client
>> >> >> initially talked to.
>> >> >>
>> >> >> best,
>> >> >> Colin
>> >> >>
>> >> >>
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Jason
>> >> >> >
>> >> >> > On Thu, Apr 20, 2023 at 5:53 PM Colin McCabe <cmcc...@apache.org>
>> >> wrote:
>> >> >> >
>> >> >> >> On Wed, Apr 19, 2023, at 20:56, Philip Nee wrote:
>> >> >> >> > Hey Colin,
>> >> >> >> >
>> >> >> >> > I still need to finish reading and understanding the KIP, but I
>> >> have a
>> >> >> >> > couple of comments despite being ignorant of most of the KRaft
>> >> stuff.
>> >> >> >> > (Sorry!)
>> >> >> >> >
>> >> >> >> > Firstly, does it make sense to create an extension of the
>> current
>> >> >> >> > AdminClient only to handle these specific KRaft use cases? It
>> seems
>> >> >> >> > cumbersome to have two sets of bootstrap configurations to make
>> the
>> >> >> >> > AdminClient generic enough to handle these specific cases,
>> instead,
>> >> >> maybe
>> >> >> >> > it is more obvious (to me) to just extend the AdminClient. What
>> I'm
>> >> >> >> > thinking is KraftAdminClient which continuously uses
>> >> >> *bootstrap.servers*,
>> >> >> >> > but make this class only serves the Kraft controllers APIs.
>> >> >> >>
>> >> >> >> Hi Philip,
>> >> >> >>
>> >> >> >> Thanks for taking a look.
>> >> >> >>
>> >> >> >> We would not want to create a new Admin client class in order to
>> >> >> >> communicate directly with the controllers. The RPCs accepted by
>> the
>> >> >> >> controllers have the same format as the those accepted by the
>> >> brokers.
>> >> >> >> There is no difference in what is sent over the wire or the data
>> >> >> structures
>> >> >> >> that are used in the client.
>> >> >> >>
>> >> >> >> I know you mentioned you haven't had time to read all the KRaft
>> stuff
>> >> >> (and
>> >> >> >> there is a lot, I understand). But this is one area that would
>> >> probably
>> >> >> be
>> >> >> >> clarified if you were able to read at least KIP-500 and KIP-590.
>> RPCs
>> >> >> sent
>> >> >> >> to the broker are forwarded to the controller (mostly) without
>> >> >> modification.
>> >> >> >>
>> >> >> >> >
>> >> >> >> > Secondly, if we want to continue with the design, I'm not yet
>> sure
>> >> >> why we
>> >> >> >> > can't continue using the *bootstrap.servers*? I assume when the
>> >> client
>> >> >> >> gets
>> >> >> >> > the metadata, it should know who it is talking to. I'm just
>> >> >> reconsidering
>> >> >> >> > your alternative again.
>> >> >> >> >
>> >> >> >> > A bad idea: Why don't we continue using *bootstrap.servers* but
>> >> have a
>> >> >> >> > separated config like *kraft.controller* = true/false. I feel
>> like
>> >> >> most
>> >> >> >> > users might not know what is a controller and causes some
>> mistakes
>> >> >> down
>> >> >> >> the
>> >> >> >> > road.
>> >> >> >> >
>> >> >> >>
>> >> >> >> Well, you called it a bad idea, and I can't help but agree! :)
>> >> >> >>
>> >> >> >> I think "the user might not know what a controller is" is a good
>> sign
>> >> >> that
>> >> >> >> they shouldn't be interacting with the controller. Like many
>> >> AdminClient
>> >> >> >> APIs, this one is intended for use by administrators only. Most
>> users
>> >> >> >> indeed should not need to know what a controller is or interact
>> with
>> >> it
>> >> >> >> directly. If they do interact it should be very clear that they
>> are
>> >> >> doing
>> >> >> >> so. The --controller-bootstrap flag makes it very clear, as does
>> the
>> >> >> >> separate configuration.
>> >> >> >>
>> >> >> >> Let me give an example of the kind of problems that arise if you
>> >> want to
>> >> >> >> reuse bootstrap.servers for this purpose.
>> >> >> >>
>> >> >> >> If the user grasb a 3.4 Kafka AdminClient and set
>> bootstrap.servers
>> >> to a
>> >> >> >> set of controller servers, and set direct.to.controller to true,
>> the
>> >> >> >> unknown (in 3.4) configuration will be ignored, and a normal
>> metadata
>> >> >> >> request will be sent without the direct to controller flag. In
>> that
>> >> >> case it
>> >> >> >> will give back an error. Confusing, right?
>> >> >> >>
>> >> >> >> Using controller.servers clarifies this situation because the 3.4
>> >> client
>> >> >> >> will not support that config, and will complain about the lack of
>> >> >> >> bootstrap.servers.
>> >> >> >>
>> >> >> >> Actually, the situation could get even more confusing than what I
>> >> >> >> described since some older preproduction versions of the KRaft
>> >> >> controller
>> >> >> >> did implement the METADATA RPC. So if you send them a METADATA
>> >> request
>> >> >> >> without any special information, it's not clear what you'll get.
>> >> >> Indeed,
>> >> >> >> it would be dependent on the client version and the controller
>> >> version.
>> >> >> >>
>> >> >> >> The bottom line is that reusing the bootstrap.servers
>> configuration
>> >> here
>> >> >> >> is not a good idea.
>> >> >> >>
>> >> >> >> best,
>> >> >> >> Colin
>> >> >> >>
>> >> >> >> > Thanks,
>> >> >> >> > P
>> >> >> >> >
>> >> >> >> > On Wed, Apr 19, 2023 at 2:18 PM Colin McCabe <
>> cmcc...@apache.org>
>> >> >> wrote:
>> >> >> >> >
>> >> >> >> >> Hi all,
>> >> >> >> >>
>> >> >> >> >> I wrote a short KIP about allowing AdminClient to talk directly
>> >> with
>> >> >> the
>> >> >> >> >> KRaft controller quorum. Check it out here:
>> >> >> >> >>
>> >> >> >> >> https://cwiki.apache.org/confluence/x/Owo0Dw
>> >> >> >> >>
>> >> >> >> >> best,
>> >> >> >> >> Colin
>> >> >> >> >>
>> >> >> >>
>> >> >>
>> >>
>>

Reply via email to