In the last KIP hangout, the following questions were raised.

   1. *Whether or not to support a move command? If yes, how do we support it?*
   I think a *move* command will be essential once we start supporting
   directories. However, the implementation might be a bit convoluted. A few
   things it will require are the ability to mark a topic unavailable during
   the move, and an update of the brokers’ metadata cache to reflect the move.
   2. *How will ACL/config inheritance work?*
   Say we have /dc/ns/topic.
   dc has dc_acl and dc_config. Similarly for ns and topic.
   To perform an action on /dc/ns/topic, the user must have the required
   permissions on dc, ns and topic for that operation. For example, User1
   will need DESCRIBE permissions on dc, ns and topic to be able to describe
   /dc/ns/topic.
   For configs, the configs for /dc/ns/topic will be topic_config + ns_config +
   dc_config, in that order of precedence. So, if a config is specified for
   the topic then that will be used; otherwise its parent (ns) will be checked
   for that config, and so on up the hierarchy.
   3. *Will supporting an n-deep hierarchy be a concern?*
   This can be a performance concern; however, it sounds more like misuse
   of the functionality or bad organization of topics. We can have a depth
   limit, but I am not sure if it is required.
   4. *Will we continue to support multiple directories on disk, as proposed
   in KAFKA-188?*
   Yes, we should be able to support that. Namespaces will be created
   within those directories. The heuristics for choosing the least loaded
   disk/dir will remain the same.
   5. *Will it be required to move existing topics from the default directory/
   namespace to a particular directory/namespace to enable mirror-maker to
   replicate topics in that directory/namespace?*
   I do not think it will be required, as one can simply add /*/* to
   mirror-maker’s blacklist and this will only capture topics that exist in
   the default namespace. @Joel, does this answer your question?


On Fri, Oct 16, 2015 at 6:33 PM, Ashish Singh <asi...@cloudera.com> wrote:

> On Thu, Oct 15, 2015 at 1:30 PM, Jiangjie Qin <j...@linkedin.com.invalid>
> wrote:
>
>> Hey Jay,
>>
>> If we allow consumer to subscribe to /*/my-event, does that mean we allow
>> consumer to consume cross namespaces?
>
> That is the idea. If a user has permissions then yes, he should be able to
> consume from as many namespaces as he wants.
>
>
>> In that case it seems not
>> "hierarchical" but more like a name field filtering. i.e. user can choose
>> to consume from topic where datacenter={x,y},
>> topic_name={my-topic1,mytopic2}. Am I understanding right?
>>
> I think it is still hierarchical, however with possible filtering (as you
> said).
>
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>> On Wed, Oct 14, 2015 at 12:49 PM, Jay Kreps <j...@confluent.io> wrote:
>>
>> > Hey Jason,
>> >
>> > I actually think this is one of the advantages. The problem we have
>> today
>> > is that you can't really do bidirectional replication between clusters
>> > because it would actually be a feedback loop.
>> >
>> > So the intended use would be that you would have a structure where the
>> > top-level directory was DIFFERENT but the topic names were the same, so
>> if
>> > you maintain
>> >   /chicago-datacenter/actual-topics
>> >   /oregon-datacenter/actual topics
>> >   etc.
>> > Then you replicate
>> >   /chicago-datacenter/* => /oregon-datacenter
>> > and
>> >   /oregon-datacenter/* => /chicago-datacenter
>> >
>> > People who want the aggregate feed subscribe to /*/my-event.
>> >
>> > The nice thing about this is it gives a unified namespace across all
>> > locations.
>> >
>> > Basically exactly what we do now but you no longer need to add new
>> clusters
>> > to get the namespacing.
>> >
>> > -Jay
>> >
>> >
>> > On Wed, Oct 14, 2015 at 11:24 AM, Jason Gustafson <ja...@confluent.io>
>> > wrote:
>> >
>> > > Hey Ashish, thanks for the write-up. I think having a namespace
>> > capability
>> > > is a useful feature for Kafka, in particular with the addition of the
>> > > authorization layer. I probably prefer Jay's hierarchical approach if
>> > we're
>> > > going to embed the namespace in the topic name since it seems more
>> > general.
>> > > That said, one advantage of having a namespace independent of the
>> topic
>> > > name is that it simplifies replication between namespaces a bit since
>> you
>> > > don't have to parse and rewrite topic names. Assuming that
>> hierarchical
>> > > topics will happen eventually anyway, I imagine a common pattern
>> would be
>> > > to preserve the same directory structure in multiple namespaces, so
>> > having
>> > > an easy mechanism for applications to switch between them would be
>> nice.
>> > > The namespace is kind of analogous to a chroot in this case. Of course
>> > you
>> > > can achieve the same thing by having a configurable topic prefix, just
>> > you
>> > > have to do all the topic rewriting, which I'm guessing will be a
>> little
>> > > annoying to implement in all of the clients and tools. However, the
>> > > tradeoff (as you mention in the KIP) is that all request schemas have
>> to
>> > be
>> > > updated, which is also annoying.
>> > >
>> > > -Jason
>> > >
>> > > On Wed, Oct 14, 2015 at 12:03 AM, Ashish Singh <asi...@cloudera.com>
>> > > wrote:
>> > >
>> > > > On Mon, Oct 12, 2015 at 7:37 PM, Gwen Shapira <g...@confluent.io>
>> > wrote:
>> > > >
>> > > > > This works really nicely from the consumer side, but what about
>> the
>> > > > > producer? If there are no more topics,do we allow producing to a
>> > > > directory
>> > > > > and have the Partitioner hash-partition messages between all
>> > partitions
>> > > > in
>> > > > > the multiple levels in a directory?
>> > > > >
>> > > > Good point.
>> > > >
>> > > > I am personally in favor of maintaining current behavior for
>> producer,
>> > > > i.e., letting users to only produce to a topic. This is different
>> for
>> > > > consumers, the suggested behavior is inline with current behavior.
>> One
>> > > can
>> > > > use regex subscription to achieve the same even today.
>> > > >
>> > > > >
>> > > > > Also, I think we want to preserve the consumer terminology of
>> > > "subscribe"
>> > > > > to topics / directories, but "assign" partitions - since the
>> consumer
>> > > > > behavior is different in those cases.
>> > > > >
>> > > > > On Mon, Oct 12, 2015 at 7:16 PM, Jay Kreps <j...@confluent.io>
>> wrote:
>> > > > >
>> > > > > > Okay this is similar to what I think we have talked about
>> before.
>> > Let
>> > > > me
>> > > > > > elaborate on the idea that I think has been floating
>> around--it's
>> > > > pretty
>> > > > > > similar with a few differences.
>> > > > > >
>> > > > > > I think what you are calling the "default namespace" is
>> basically
>> > > what
>> > > > I
>> > > > > > would call the "current working directory" with paths not
>> beginning
>> > > > with
>> > > > > > '/' being interpreted relative to this directory as in the fs.
>> > > > > >
>> > > > > > One thing you have to work out is what levels in this hierarchy
>> you
>> > > can
>> > > > > > actually subscribe to. I think you are assuming only what we
>> > > currently
>> > > > > > consider a "topic", i.e. the first level of directories but not
>> the
>> > > > > > partitions or parent dirs, would be subscribable. If you think
>> > about
>> > > > it,
>> > > > > > though, that constraint is a bit arbitrary.
>> > > > > >
>> > > > > > I'd propose instead the semantics that:
>> > > > > > - Subscribing to /a/b/c/0 means subscribing to the 0th
>> partition of
>> > > > topic
>> > > > > > "c" in directory /a/b
>> > > > > > - Subscribing to /a/b/c means subscribing to all partitions in
>> > > > > > topic/directory "c"
>> > > > > > - Subscribing to /a/b means subscribing to all partitions in all
>> > > > > > topics/subdirectories under a/b recursively
>> > > > > >
>> > > > > > Effectively the concept of topics goes away entirely--you just
>> have
>> > > > > > partitions/logs and directories. In this respect rather than
>> adding
>> > > new
>> > > > > > concepts this new feature would actually just generalizes what
>> we
>> > > have
>> > > > > > (which I think is a good thing).
>> > > > > >
>> > > > > > -Jay
>> > > > > >
>> > > > > > On Mon, Oct 12, 2015 at 6:24 PM, Ashish Singh <
>> asi...@cloudera.com
>> > >
>> > > > > wrote:
>> > > > > >
>> > > > > > > On Mon, Oct 12, 2015 at 5:42 PM, Jay Kreps <j...@confluent.io>
>> > > wrote:
>> > > > > > >
>> > > > > > > > Great. I definitely would strongly favor carrying over
>> user's
>> > > > > intuition
>> > > > > > > > from FS unless we think we need a very different model. The
>> > minor
>> > > > > > details
>> > > > > > > > like the seperator and namespace term will help with that.
>> > > > > > > >
>> > > > > > > > Follow-up question, say I have a layout like
>> > > > > > > >    /chicago-datacenter/user-events/pageviews
>> > > > > > > > Can I subscribe to
>> > > > > > > >    /chicago-datacenter/user-events
>> > > > > > > >
>> > > > > > > Yes, however they will have need a regex like
>> > > > > > > /chicago-datacenter/user-events/*
>> > > > > > >
>> > > > > > > > to get the full firehose of user events from chicago? Can I
>> > > > subscribe
>> > > > > > to
>> > > > > > > >    /*/user-events
>> > > > > > > > to get user events originating from all datacenters?
>> > > > > > > >
>> > > > > > > Yes, however they will have need a regex like
>> > > > > > > /chicago-datacenter/user-events/*
>> > > > > > > Yes
>> > > > > > >
>> > > > > > > >
>> > > > > > > > (Assuming, for now, that these are all in the same
>> cluster...)
>> > > > > > > >
>> > > > > > > > Also, just to confirm, it sounds from the proposal like
>> config
>> > > > > > overrides
>> > > > > > > > would become fully hierarchical so you can override config
>> at
>> > any
>> > > > > > > directory
>> > > > > > > > point. This will add complexity in implementation but I
>> think
>> > > will
>> > > > > > likely
>> > > > > > > > be much more operator friendly.
>> > > > > > > >
>> > > > > > > Yes, that is the idea.
>> > > > > > >
>> > > > > > > >
>> > > > > > > > There are about a thousand details to discuss in terms of
>> how
>> > > this
>> > > > > > would
>> > > > > > > > impact the metadata request, various zk entries, and various
>> > > other
>> > > > > > > aspects,
>> > > > > > > > but probably it makes sense to first agree on how we would
>> want
>> > > it
>> > > > to
>> > > > > > > work
>> > > > > > > > and then start to dive into how to implement that.
>> > > > > > > >
>> > > > > > > Agreed.
>> > > > > > >
>> > > > > > > >
>> > > > > > > > -Jay
>> > > > > > > >
>> > > > > > > > On Mon, Oct 12, 2015 at 5:28 PM, Ashish Singh <
>> > > asi...@cloudera.com
>> > > > >
>> > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Hey Jay, thanks for reviewing the proposal. Answers
>> inline.
>> > > > > > > > >
>> > > > > > > > > On Mon, Oct 12, 2015 at 10:53 AM, Jay Kreps <
>> > j...@confluent.io>
>> > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Hey guys,
>> > > > > > > > > >
>> > > > > > > > > > I think this is an important feature and one we've
>> talked
>> > > about
>> > > > > > for a
>> > > > > > > > > > while. I really think trying to invent a new
>> nomenclature
>> > is
>> > > > > going
>> > > > > > to
>> > > > > > > > > make
>> > > > > > > > > > it hard for people to understand, though. As such I
>> > recommend
>> > > > we
>> > > > > > call
>> > > > > > > > > > namespaces "directories" and denote them with '/'--this
>> > will
>> > > > make
>> > > > > > the
>> > > > > > > > > > feature 1000x more understandable to people.
>> > > > > > > > >
>> > > > > > > > > Essentially you are suggesting two things here.
>> > > > > > > > > 1. Use "Directory" instead of "Namespace" as it is more
>> > > > intuitive.
>> > > > > I
>> > > > > > > > agree.
>> > > > > > > > > 2. Make '/' as delimiter instead of ':'. Fine with me and
>> I
>> > > agree
>> > > > > if
>> > > > > > we
>> > > > > > > > > call these directories, '/' is the way to go.
>> > > > > > > > >
>> > > > > > > > > I think we should inheret the
>> > > > > > > > > > semantics of normal unix fs in so far as it makes sense.
>> > > > > > > > > >
>> > > > > > > > > > In this approach we get rid of topics entirely, instead
>> we
>> > > > really
>> > > > > > > just
>> > > > > > > > > have
>> > > > > > > > > > partitions which are the equivalent of a file and retain
>> > > their
>> > > > > > > numeric
>> > > > > > > > > > names, and the existing topic concept is just the first
>> > > > directory
>> > > > > > > level
>> > > > > > > > > but
>> > > > > > > > > > we generalize to allow arbitrarily many more levels of
>> > > nesting.
>> > > > > > This
>> > > > > > > > > allows
>> > > > > > > > > > categorization of data, such as
>> > > > > > /datacenter1/user-events/page-views/3
>> > > > > > > > and
>> > > > > > > > > > you can subscribe, apply configs or permissions at any
>> > level
>> > > of
>> > > > > the
>> > > > > > > > > > hierarchy.
>> > > > > > > > > >
>> > > > > > > > > +1. This actually requires just a minor change to existing
>> > > > > proposal,
>> > > > > > > > i.e.,
>> > > > > > > > > "some:namespace:topic" becomes "some/namespace/topic".
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > I'm actually not 100% such what the semantics of
>> accessing
>> > > data
>> > > > > in
>> > > > > > > > > > differing namespaces is in the current proposal, maybe
>> you
>> > > can
>> > > > > > > clarify
>> > > > > > > > > > Ashish?
>> > > > > > > > >
>> > > > > > > > > I will add more info to KIP on this, however I think a
>> client
>> > > > > should
>> > > > > > be
>> > > > > > > > > able to access data in any namespace as long as following
>> > > > > conditions
>> > > > > > > are
>> > > > > > > > > satisfied.
>> > > > > > > > >
>> > > > > > > > > 1. Namespace, the client is trying to access, exists.
>> > > > > > > > > 2. The client has sufficient permissions on the namespace
>> for
>> > > > type
>> > > > > of
>> > > > > > > > > operation the client is trying to perform on a topic
>> within
>> > > that
>> > > > > > > > namespace.
>> > > > > > > > > 3. The client has sufficient permissions on the topic for
>> > type
>> > > of
>> > > > > > > > operation
>> > > > > > > > > the client is trying to perform on that topic.
>> > > > > > > > >
>> > > > > > > > > If we choose to go with what you suggested earlier that
>> just
>> > > have
>> > > > > > > > hierarchy
>> > > > > > > > > of directories, then step 3 will actually be covered in
>> step
>> > 2.
>> > > > > > > > >
>> > > > > > > > > In the current proposal, consumers will subscribe to a
>> topic
>> > > in a
>> > > > > > > > namespace
>> > > > > > > > > by specifying <namespace>:<topic> as the topic name. They
>> can
>> > > > > > subscribe
>> > > > > > > > to
>> > > > > > > > > topics from multiple namespaces.
>> > > > > > > > >
>> > > > > > > > > Let me know if I totally missed your question.
>> > > > > > > > >
>> > > > > > > > > Since the point of Kafka is sharing data I think it is
>> really
>> > > > > > > > > > important that the grouping be just for
>> > > > > > > > > convenience/permissions/config/etc
>> > > > > > > > > > and that it remain possible to access multiple
>> > > > > > directories/namespaces
>> > > > > > > > > from
>> > > > > > > > > > the same client.
>> > > > > > > > > >
>> > > > > > > > > Totally agree with you.
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > -Jay
>> > > > > > > > > >
>> > > > > > > > > > On Fri, Oct 9, 2015 at 6:32 PM, Ashish Singh <
>> > > > > asi...@cloudera.com>
>> > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hey Guys,
>> > > > > > > > > > >
>> > > > > > > > > > > I just created KIP-37 for adding namespaces to Kafka.
>> > > > > > > > > > >
>> > > > > > > > > > > KIP-37
>> > > > > > > > > > > <
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-37+-+Add+Namespaces+to+Kafka
>> > > > > > > > > > > >
>> > > > > > > > > > > tracks the proposal.
>> > > > > > > > > > >
>> > > > > > > > > > > The idea is to make Kafka support multi-tenancy via
>> > > > namespaces.
>> > > > > > > > > > >
>> > > > > > > > > > > Feedback and comments are welcome.
>> > > > > > > > > > > ​
>> > > > > > > > > > > --
>> > > > > > > > > > >
>> > > > > > > > > > > Regards,
>> > > > > > > > > > > Ashish
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > --
>> > > > > > > > >
>> > > > > > > > > Regards,
>> > > > > > > > > Ashish
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > --
>> > > > > > >
>> > > > > > > Regards,
>> > > > > > > Ashish
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > >
>> > > > Regards,
>> > > > Ashish
>> > > >
>> > >
>> >
>>
>
>
>
> --
>
> Regards,
> Ashish
>



-- 

Regards,
Ashish
