Closing this KIP in favor of adding filtering support to the Metadata API and KIP-142. Will open a new KIP when ready. Thanks for your reviews.

On Mon, Jul 16, 2018 at 8:38 AM Colin McCabe <cmcc...@apache.org> wrote:

> Thanks, Manikumar. I've been meaning to bring up KIP-142 again. It would
> definitely be a nice improvement.
>
> best,
> Colin
>
> On Sat, Jul 14, 2018, at 08:51, Manikumar wrote:
> > Hi Jason and Colin,
> >
> > Thanks for the feedback. I agree that adding filtering support to the
> > Metadata API would be useful and solves the scalability issues.
> >
> > But regex support won't help with the specific use case of "describe
> > all topics". In any case, the user needs to call listTopics() to get
> > the topic list, and then make describeTopics() calls with subsets of
> > that topic set. This leads to improving the performance of the existing
> > listTopics() API. Colin already raised a KIP for this: KIP-142
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-142%3A+Add+ListTopicsRequest+to+efficiently+list+all+the+topics+in+a+cluster>.
> > Maybe we should consider implementing KIP-142.
> >
> > Since we already support wildcard ACLs, I can initially explore
> > prefixed/wildcard pattern support in the Metadata API. We can later
> > extend it to support regular expressions.
> >
> > Thanks
> >
> > On Sat, Jul 14, 2018 at 2:42 PM Ted Yu <yuzhih...@gmail.com> wrote:
> > > What if the broker crashes before all the pages can be returned?
> > >
> > > Cheers
> > >
> > > On Sat, Jul 14, 2018 at 1:07 AM Stephane Maarek
> > > <steph...@simplemachines.com.au> wrote:
> > > > Why not paginate? Then one can retrieve as many topics as desired.
> > > >
> > > > On Sat., 14 Jul. 2018, 4:15 pm Colin McCabe, <cmcc...@apache.org> wrote:
> > > > > Good point. We should probably have a maximum number of results,
> > > > > like 1000 or something. That can go in the request RPC as well...
> > > > >
> > > > > Cheers,
> > > > > Colin
> > > > >
> > > > > On Fri, Jul 13, 2018, at 18:15, Ted Yu wrote:
> > > > > > bq. describe topics by a regular expression on the server side
> > > > > >
> > > > > > Should caution be taken if the regex doesn't filter ("*")?
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Fri, Jul 13, 2018 at 6:02 PM Colin McCabe <cmcc...@apache.org> wrote:
> > > > > > > As Jason wrote, this won't scale as the number of partitions
> > > > > > > increases. We already have users who have tens of thousands of
> > > > > > > topics, or more. If you multiply that by 100x over the next few
> > > > > > > years, you end up with this API returning full information
> > > > > > > about millions of topics, which clearly doesn't work.
> > > > > > >
> > > > > > > We discussed this a lot in the original KIP-117 DISCUSS thread
> > > > > > > which added the Java AdminClient. ListTopics and DescribeTopics
> > > > > > > were deliberately kept separate because we understood that
> > > > > > > eventually a single RPC would not be able to return information
> > > > > > > about all the topics in the cluster. So I have to vote -1 for
> > > > > > > this proposal as it stands.
> > > > > > >
> > > > > > > I do agree that adding a way to describe topics by a regular
> > > > > > > expression on the server side would be very useful. This would
> > > > > > > also fix a major scalability problem we have now, which is that
> > > > > > > when subscribing via a regular expression, clients need to
> > > > > > > fetch the full list of all topics in the cluster and filter
> > > > > > > locally.
> > > > > > >
> > > > > > > I think a regular expression library like re2 would be ideal
> > > > > > > for this purpose. re2 is standardized and language-agnostic
> > > > > > > (it's not tied only to Java). In contrast, Java regular
> > > > > > > expressions change with different releases of the JDK (there
> > > > > > > were some changes in Java 8, for example). Also, re2 regular
> > > > > > > expressions are linear time, never exponential time. See
> > > > > > > https://github.com/google/re2j
> > > > > > >
> > > > > > > regards,
> > > > > > > Colin
> > > > > > >
> > > > > > > On Fri, Jul 13, 2018, at 05:00, Andras Beni wrote:
> > > > > > > > The KIP looks good to me.
> > > > > > > > However, if there is willingness in the community to work on
> > > > > > > > metadata requests with patterns, the feature proposed here
> > > > > > > > and filtering by '*' or '.*' would be redundant.
> > > > > > > >
> > > > > > > > Andras
> > > > > > > >
> > > > > > > > On Fri, Jul 13, 2018 at 12:38 AM Jason Gustafson
> > > > > > > > <ja...@confluent.io> wrote:
> > > > > > > > > Hey Manikumar,
> > > > > > > > >
> > > > > > > > > As Kafka begins to scale to larger and larger numbers of
> > > > > > > > > topics/partitions, I'm a little concerned about the
> > > > > > > > > scalability of APIs such as this. The API looks benign, but
> > > > > > > > > imagine you have a few million partitions. We already
> > > > > > > > > expose similar APIs in the producer and consumer, so
> > > > > > > > > probably not much additional harm to expose it in the
> > > > > > > > > AdminClient, but it would be nice to put a little thought
> > > > > > > > > into some longer term options. We should be giving users an
> > > > > > > > > efficient way to select a smaller set of the topics they
> > > > > > > > > are interested in. We have always discussed adding some
> > > > > > > > > filtering support to the Metadata API. Perhaps now is a
> > > > > > > > > good time to reconsider this? We now have a convention for
> > > > > > > > > wildcard ACLs, so perhaps we can do something similar. Full
> > > > > > > > > regex support might be ideal given the consumer's
> > > > > > > > > subscription API, but that is more challenging. What do you
> > > > > > > > > think?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Jason
> > > > > > > > >
> > > > > > > > > On Thu, Jul 12, 2018 at 2:35 PM, Harsha <ka...@harsha.io> wrote:
> > > > > > > > > > Very useful. LGTM.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Harsha
> > > > > > > > > >
> > > > > > > > > > On Thu, Jul 12, 2018, at 9:56 AM, Manikumar wrote:
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > I have created a KIP to add describe all topics API to
> > > > > > > > > > > AdminClient.
> > > > > > > > > > >
> > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-327%3A+Add+describe+all+topics+API+to+AdminClient
> > > > > > > > > > >
> > > > > > > > > > > Please take a look.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
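
For reference, the client-side flow discussed in this thread (list the topic names, filter them locally, then describe them in bounded batches) can be sketched roughly as below. This is only an illustration under stated assumptions: TopicBatcher and selectInBatches are hypothetical names, not Kafka APIs, and the code deliberately has no Kafka dependency. In a real client the input names would come from AdminClient.listTopics(), and each returned batch would be passed to AdminClient.describeTopics().

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Kafka: it only illustrates the
// "list, filter locally, then describe in batches" flow from the thread.
final class TopicBatcher {

    private TopicBatcher() {}

    // Keeps the topic names that fully match the filter and splits them
    // into batches of at most batchSize names. Names are sorted first so
    // the result is deterministic.
    static List<List<String>> selectInBatches(Collection<String> allTopics,
                                              Pattern filter,
                                              int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String name : new TreeSet<>(allTopics)) {
            if (!filter.matcher(name).matches()) {
                continue; // filtered out on the client side
            }
            current.add(name);
            if (current.size() == batchSize) {
                batches.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            batches.add(current); // final, possibly smaller batch
        }
        return batches;
    }
}
```

For example, selectInBatches(names, Pattern.compile("orders-.*"), 1000) would return lists of at most 1000 matching names each, where 1000 simply echoes the "maximum number of results" figure floated above. Note that java.util.regex is used here only for convenience; it does not give the linear-time guarantee mentioned for re2, and the full topic list still has to be fetched first, which is exactly the cost that server-side filtering would avoid.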