Re: 回复： [DISCUSS] KIP-1134: Virtual Clusters in Kafka

Greg Harris Wed, 26 Mar 2025 17:06:18 -0700

Hi all,

I'm coming around to this design now.


If we require users in some cluster (the root cluster) to be able to refer
to resources in virtual clusters for administration, and are not willing to
add protocol-specific support for this, it needs to be done through name
generation of some kind. I think this name generation is the single biggest
sticking point in this KIP for me.

1. Is there a way to disable this name generation, such that users in the
root cluster are not permitted to even know of the VC resource existence?
This could be a privacy oriented feature, or benefit root operators that
treat topics like "cattle" rather than "pets".
2. Is there a way to assign ACLs in the root cluster to all resources
within a specific VC, or all VCs? This could be used to allow or prevent
writes to topics from outside the VC. With the UUID-first generation,
PREFIX ACLs appear unusable.
3. Topic names have a maximum length. Do the UUIDs in physical topic names
count against that length?
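
To make points 2 and 3 concrete, here is a rough Python sketch. It is only
an illustration under assumptions that the thread does not confirm: that the
physical name is a per-topic UUID followed by a separator and the logical
name, that the UUID is rendered in its 36-character hyphenated form, and that
the UUID counts against Kafka's 249-character topic name limit. The helper
names (physical_name, logical_name_budget, prefixed_acl_matches) and the "."
separator are made up for this example.

```python
import uuid

MAX_TOPIC_NAME_LENGTH = 249  # Kafka's limit on topic name length
UUID_TEXT_LENGTH = 36        # canonical hyphenated form (assumed encoding)


def physical_name(topic_uuid: uuid.UUID, logical_name: str, sep: str = ".") -> str:
    """Hypothetical UUID-first physical name: <uuid><sep><logical name>."""
    return f"{topic_uuid}{sep}{logical_name}"


def logical_name_budget(sep: str = ".") -> int:
    """Characters left for the logical name if UUID and separator count against the limit."""
    return MAX_TOPIC_NAME_LENGTH - UUID_TEXT_LENGTH - len(sep)


def prefixed_acl_matches(prefix: str, topic: str) -> bool:
    """A PREFIXED ACL pattern matches resource names that start with the pattern."""
    return topic.startswith(prefix)


orders_eu = physical_name(uuid.uuid4(), "orders-eu")
orders_us = physical_name(uuid.uuid4(), "orders-us")

# A prefix chosen from the logical names does not match the physical names,
# and two random per-topic UUIDs give the topics no useful shared prefix.
print(prefixed_acl_matches("orders-", orders_eu))
print(prefixed_acl_matches("orders-", orders_us))
print(logical_name_budget())
```

Under these assumptions both ACL checks print False and the logical name is
left with 212 characters. If the leading UUID identified the VC rather than
the topic, a PREFIXED ACL on that UUID would select the whole VC, which is
part of why I am asking how the names are generated.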

And, a little unrelated to the above, one more thing I thought about:

4. For metrics or observability, is there any expectation that VC users
would have observability over the contents of their VC, or is that only
provided to the root VC holder? For example, as a user assigned a VC by my
operators, can I see the produce throughput for some topic-a-0 within my
cluster to verify my application is working? Is it the responsibility of my
operator to build some sort of dashboarding that is VC-aware?

Thanks!
Greg

On Thu, Mar 20, 2025 at 9:48 AM Viktor Somogyi-Vass
<viktor.somo...@cloudera.com.invalid> wrote:

> Hey Stan,
>
> 1. You're right, most of the broker-level configs shouldn't be modified,
> such as thread pools, loggers, etc. In all honesty, I was mainly thinking
> of group and topic resources here. Overall I think cluster admins should be
> able to restrict VC admins through ACLs. I think we should be able to give
> best practice recommendations. Another option is to physically prevent
> users who are assigned to any VC from modifying broker configuration. I think
> both can be viable. ACLs put more work on admins but are also more
> flexible, while physical prevention could be a bit easier from the user side
> but is also less flexible. I can't think of any use cases where VC admins would
> need to change broker-level configs, so I may put my vote on the overall
> prevention instead of passing it down via ACLs. Let me know what you think.
>
> 2. Yes, your understanding is correct: the drawback of using generated names
> is that users who are outside of any VC will see the physical names and
> not the mapping. I think it's a bit better than just having the UUID for
> topic names because, by appending the original name, a topic that is unlinked
> from all VCs may still maintain some context, so a cluster admin looking
> through the topics isn't completely clueless.
> Our original thought was that VCs should be transparent to users inside
> them, so they see it as a normal cluster. The reason for this was simple:
> if we don't leak VC information everywhere, then we could make it very
> backward compatible and self-contained. If we start leaking VC information
> in existing protocols, then it hurts the transparency principle and may
> tempt people in the future to add it to more protocols which may increase
> the chances of incompatibility. Having this transparency of course comes
> with these side effects of weird topic names for outside users. We thought
> that this is fine most of the time as outside clients shouldn't have
> anything to do with a topic in a VC and therefore it isn't a problem for
> them. For admins there is the describeVirtualCluster admin API that would
> reveal these associations if they're needed for debug reasons. If you think
> we should still have VC information in topic-describe, then I'm open to it,
> however as I said, this kind of hurts the transparency of virtual clusters.
>
> 3. Yes, Luke already asked for metrics and I'm planning to update the KIP
> :). I promised last week but unfortunately I was out sick, so I'm a little
> bit late on that. I'm planning to add topic-link byte in-out metrics, VC,
> link and group counts and quota status. I didn't plan this to be a
> "KIP-500" style proposal as I think this feature is much smaller than that
> and also it's good to present people with the whole idea up-front. Of
> course there can be subsequent KIPs for further feature additions but I
> thought the core can be done in a single proposal.
>
> PS: well I avoided the "tenant" word because it has a slippery definition.
> I think this VC design is similar to namespaces in Pulsar, but then
> "tenant" is also one level higher there as it can span across clusters.
> Then in Redpanda I couldn't exactly find a good definition of it. In
> Confluent Cloud it seems like a "tenant" is a user principal. If I take the
> key aspects of tenants (data isolation, resource isolation, access control,
> customization), then I can fit those on both users and VCs. I think it
> makes more sense to use "tenant" as a synonym for VCs just because
> "multi-tenancy" makes more sense that way. But again, I don't really like
> the "tenant" word itself. Listening to your advice though, I renamed the
> KIP to "Multi-tenancy in Kafka: Virtual Clusters" :). Not sure if I should
> rename the discussion too because then all this conversation will be
> "lost".
>
> Best,
> Viktor
>
> On Sun, Mar 16, 2025 at 12:57 PM Stanislav Kozlovski <
> stanislavkozlov...@apache.org> wrote:
>
> > This is a super cool proposal. Thank you for it!
> >
> > 1) The KIP mentions VCs are allowed ALTER_CONFIG, but doesn't mention what
> > configs will be allowed to get modified and what not. There are a few
> > broker-level configs that shouldn't be modifiable by any single VC imo -
> > https://github.com/apache/kafka/blob/766caaa551a3e7561cf0434e457ac4230435cb54/core/src/main/scala/kafka/server/DynamicBrokerConfig.scala#L86-L93
> > . Kafka also supports changing log levels dynamically, would be weird if one
> > VC could change it for all.
> >
> > 2) re: topic names - the UUID+name combo simply reads weird and is not
> > very user friendly. My understanding is that tenants will see the topic
> > name without the UUID, as the mapping will happen in the broker through the
> > user (principal). If you don't use a principal (i.e no auth), you'd just
> > access the general cluster. On the other hand, I agree with the reasoning
> > of not using KIP-37 style hierarchy.
> >
> > Perhaps we could extend the topic describe command (and metadata responses
> > that carry the info) to contain the tenant info? I just want to make sure
> > we will offer first-class support in managing such clusters.
> >
> > 3) metrics. Would be nice to have a few metrics that aggregate usage per
> > VC. I imagine operators will need a way to monitor these things. That being
> > said, I'm not sure how deep into detail you envision going in this single
> > KIP - is the idea here to present it all from a high level, get consensus,
> > and then perhaps split in a few more detailed KIPs? (a-la KIP-500)
> >
> > PS: This is a nit but it's worthwhile to get the terms right. Do we define
> > a "tenant" as a "VC", or is a "tenant" a Principal that is connected to one
> > VC (or more in the future)?
> >
> > Another nit I would suggest for better marketing/interest in the feature
> > is to incorporate the word "multi-tenant" in the proposal. It's
> > industry-standard, very SEO friendly and in general just explains things
> > better. As a simple anecdote, I skipped on reading this proposal at first
> > because I didn't realize it was a multi-tenant solution!
> >
> > In any case, I'm looking forward to seeing this proposal get in! I don't
> > see downsides from having this feature, it's table-stakes for modern data
> > infra in 2025 and as you mentioned - the high overhead cost of Kafka (3
> > nodes, 3 controllers) is overkill for a majority of use cases. If we want
> > to see greater adoption of Kafka, we need to cater to the smaller fish too
> > (and hope they grow to use more of the platform). As another point - vendors
> > have already offered this in the market for many years (Confluent, Redpanda
> > Serverless).
> >
> > Best,
> > Stan
> >
> > On 2025/03/11 04:10:32 萨尔卡 wrote:
> > > Agreed on the 3-node case. And we can do more: both virtual clusters
> > > and hard clusters.
> > > Virtual separation is like MySQL: different users can use different
> > > databases, and different Kafka users use different VCs.
> > > How about 10000 Kafka instances? But Kafka should do HARD isolation
> > > for users so they avoid influencing each other, and keep risks under
> > > control within small hard clusters.
> > >
> > > Let's just call it VC-Plus, with hard isolation.
> > >
> > > 1. Few resources: use VC, but it cannot prevent users from influencing each other
> > > 2. Few resources: use VC-Plus, no need to do
> > > 3. Lots of resources: can use VC, but worse than VC-Plus
> > > VC-Plus can avoid users influencing each other, and keeps risks under
> > > control within small hard clusters.
> > >
> > > 1. Few resources: VC is not suggested
> > >
> > >
> > >
> > > > Administrators would be able to put a cap on the quota of a virtual
> > > > cluster, so they can prevent virtual clusters from monopolizing resources.
> > > > This can be thought of as a certain kind of isolation but I'm not sure if
> > > > you meant more than that? I'll add a revised version of the answer to David
> > > > to the KIP, but the bottom line for why we didn't choose the hard kind of
> > > > virtualization, where we separate off whole brokers into a cluster of a
> > > > cluster, is that it's less efficient on resources too. As I imagine, in a
> > > > hard virtualization case if a 3 node Kafka cluster is split into 2 virtual
> > > > clusters, we would need to add 3 more brokers if we wanted to satisfy the
> > > > requirement of topics with a replication factor of 3.
> > > > Also with quotas one would be able to control how much produce-, consume-
> > > > and replication-traffic goes through every topic. Adding VC quotas on top
> > > > ensures that resources aren't monopolized.
> > > >
> > > > Best,
> > > > Viktor
> > > >
> > > > On Wed, Mar 5, 2025 at 3:20 AM 萨尔卡 <tigerwe...@qq.com.invalid> wrote:
> > >
> > >
> > > > > thanks for your answer. @vk
> > > > > in my experience, isolation-in-brokers to avoid influencing each
> > > > > other would be a point users care about in use.
> >
>
