I'm a little concerned about routing requests among brokers. Produce and fetch requests/responses typically make up the dominant share of traffic, and routing them from one broker to another seems undesirable. Also, I think we generally have two types of requests/responses: data-related and admin-related. It is typically good practice to separate the data plane from the control plane. That suggests we should have a separate admin port to serve the admin requests, probably with different authentication/authorization from the data port.
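To make the data-plane/control-plane split concrete, here is a minimal sketch. Everything in it is hypothetical: the class name, the request kinds (named after operations discussed in this thread), and the port numbers are placeholders, not an existing Kafka API or configuration.

```java
import java.util.EnumMap;
import java.util.Map;

/** Illustrative only: maps each request kind to a plane, and each plane to its own port. */
public class PlaneSeparationSketch {
    public enum Plane { DATA, ADMIN }

    // Hypothetical request kinds; produce/fetch are data, topic commands are admin.
    public enum RequestKind {
        PRODUCE(Plane.DATA),
        FETCH(Plane.DATA),
        CREATE_TOPIC(Plane.ADMIN),
        DELETE_TOPIC(Plane.ADMIN),
        DESCRIBE_TOPIC(Plane.ADMIN);

        public final Plane plane;

        RequestKind(Plane plane) { this.plane = plane; }
    }

    private final Map<Plane, Integer> portByPlane = new EnumMap<>(Plane.class);

    public PlaneSeparationSketch(int dataPort, int adminPort) {
        portByPlane.put(Plane.DATA, dataPort);
        portByPlane.put(Plane.ADMIN, adminPort);
    }

    /** Port a client should use for the given request kind. */
    public int portFor(RequestKind kind) {
        return portByPlane.get(kind.plane);
    }

    /** A listener accepts only the requests that belong to its own plane. */
    public boolean accepts(int listenerPort, RequestKind kind) {
        return portFor(kind) == listenerPort;
    }

    public static void main(String[] args) {
        // Placeholder ports, chosen only for the example.
        PlaneSeparationSketch sketch = new PlaneSeparationSketch(9092, 9093);
        for (RequestKind kind : RequestKind.values()) {
            System.out.println(kind + " -> port " + sketch.portFor(kind));
        }
    }
}
```

With this kind of split, each listener could carry its own authentication/authorization policy, and heavy produce/fetch traffic would never share a socket with admin commands.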
Jiangjie (Becket) Qin

On 2/6/15, 11:18 AM, "Joe Stein" <joe.st...@stealth.ly> wrote:

>I updated the installation and sample usage for the existing patches on the
>KIP site
>https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
>
>There are still a few pending items here.
>
>1) There was already some discussion about using the Broker that is the
>Controller here https://issues.apache.org/jira/browse/KAFKA-1772 and we
>should elaborate on that more in the thread or agree we are ok with admin
>asking for the controller to talk to and then just sending that broker the
>admin tasks.
>
>2) I like this idea https://issues.apache.org/jira/browse/KAFKA-1912 but we
>can refactor after KAFKA-1694 is committed, no? I know folks just want to talk
>to the broker that is the controller. It may even become useful to have the
>controller run on a broker that isn't even a topic broker anymore (small
>can of worms I am opening here but it elaborates on Guozhang's hot spot
>point).
>
>3) Any more feedback?
>
>- Joe Stein
>
>On Fri, Jan 23, 2015 at 3:15 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
>> A centralized admin operation protocol would be very useful.
>>
>> One more general comment here is that the controller was originally designed
>> to only talk to other brokers through ControllerChannel, while the broker
>> instance which carries the current controller is agnostic of its existence,
>> and uses KafkaApis to handle general Kafka requests. Having all admin
>> requests redirected to the controller instance will force the broker to be
>> aware of its carried controller, and access its internal data for handling
>> these requests. Plus with the number of clients out of Kafka's control,
>> this may easily cause the controller to be a hot spot in terms of request
>> load.
>>
>>
>> On Thu, Jan 22, 2015 at 10:09 PM, Joe Stein <joe.st...@stealth.ly> wrote:
>>
>> > inline
>> >
>> > On Thu, Jan 22, 2015 at 11:59 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>> >
>> > > Hey Joe,
>> > >
>> > > This is great. A few comments on KIP-4
>> > >
>> > > 1. This is much needed functionality, but there are a lot of them so let's
>> > > really think these protocols through. We really want to end up with a set
>> > > of well thought-out, orthogonal apis. For this reason I think it is really
>> > > important to think through the end state even if that includes APIs we
>> > > won't implement in the first phase.
>> > >
>> >
>> > ok
>> >
>> > >
>> > > 2. Let's please please please wait until we have switched the server over
>> > > to the new java protocol definitions. If we add umpteen more ad hoc scala
>> > > objects that is just generating more work for the conversion we know we
>> > > have to do.
>> > >
>> >
>> > ok :)
>> >
>> > >
>> > > 3. This proposal introduces a new type of optional parameter. This is
>> > > inconsistent with everything else in the protocol where we use -1 or some
>> > > other marker value. You could argue either way but let's stick with that
>> > > for consistency. For clients that implemented the protocol in a better way
>> > > than our scala code these basic primitives are hard to change.
>> > >
>> >
>> > yes, less confusing, ok.
>> >
>> > >
>> > > 4. ClusterMetadata: This seems to duplicate TopicMetadataRequest which has
>> > > brokers, topics, and partitions. I think we should rename that request
>> > > ClusterMetadataRequest (or just MetadataRequest) and include the id of the
>> > > controller. Or are there other things we could add here?
>> > >
>> >
>> > We could add broker version to it.
>> >
>> > >
>> > > 5. We have a tendency to try to make a lot of requests that can only go to
>> > > particular nodes.
>> > > This adds a lot of burden for client implementations (it
>> > > sounds easy but each discovery can fail in many parts so it ends up being a
>> > > full state machine to do right). I think we should consider making admin
>> > > commands and ideally as many of the other apis as possible available on all
>> > > brokers and just redirect to the controller on the broker side. Perhaps
>> > > there would be a general way to encapsulate this re-routing behavior.
>> > >
>> >
>> > If we do that then we should also preserve what we have and do both. The
>> > client can then decide "do I want to go to any broker and proxy" or just
>> > "go to controller and run admin task". Lots of folks have seen controllers
>> > come under distress because of their producers/consumers. There is a ticket
>> > too for controller elect and re-elect
>> > https://issues.apache.org/jira/browse/KAFKA-1778 so you can force it to a
>> > broker that has 0 load.
>> >
>> > >
>> > > 6. We should probably normalize the key value pairs used for configs rather
>> > > than embedding a new formatting. So two strings rather than one with an
>> > > internal equals sign.
>> > >
>> >
>> > ok
>> >
>> > >
>> > > 7. Is the postcondition of these APIs that the command has begun or that
>> > > the command has been completed? It is a lot more usable if the command has
>> > > been completed so you know that if you create a topic and then publish to
>> > > it you won't get an exception about there being no such topic.
>> > >
>> >
>> > We should define that more. There needs to be some more state there, yes.
>> >
>> > We should try to cover https://issues.apache.org/jira/browse/KAFKA-1125
>> > within what we come up with.
>> >
>> > >
>> > > 8. Describe topic and list topics duplicate a lot of stuff in the metadata
>> > > request. Is there a reason to give back topics marked for deletion?
>> > > I feel like if we just make the post-condition of the delete command be
>> > > that the topic is deleted that will get rid of the need for this right?
>> > > And it will be much more intuitive.
>> > >
>> >
>> > I will go back and look through it.
>> >
>> > >
>> > > 9. Should we consider batching these requests? We have generally tried to
>> > > allow multiple operations to be batched. My suspicion is that without this
>> > > we will get a lot of code that does something like
>> > >    for(topic: adminClient.listTopics())
>> > >       adminClient.describeTopic(topic)
>> > > this code will work great when you test on 5 topics but not do as well if
>> > > you have 50k.
>> > >
>> >
>> > So => Input is a list of topics (or none for all) and a batch response from
>> > the controller (which could be routed through another broker) of the entire
>> > response? We could introduce a Batch keyword to explicitly show the usage
>> > of it.
>> >
>> > >
>> > > 10. I think we should also discuss how we want to expose a programmatic JVM
>> > > client api for these operations. Currently people rely on AdminUtils which
>> > > is totally sketchy. I think we probably need another client under clients/
>> > > that exposes administrative functionality. We will need this just to
>> > > properly test the new apis, I suspect. We should figure out that API.
>> > >
>> >
>> > We were talking about that here
>> > https://issues.apache.org/jira/browse/KAFKA-1774 and wrote it in java
>> > https://reviews.apache.org/r/29301/diff/7/?page=4#75 so we could do
>> > something like that, sure.
>> >
>> > >
>> > > 11. The other information that would be really useful to get would be
>> > > information about partitions--how much data is in the partition, what are
>> > > the segment offsets, what is the log-end offset (i.e. last offset), what is
>> > > the compaction point, etc.
>> > > I think that done right this would be the
>> > > successor to the very awkward OffsetRequest we have today.
>> > >
>> >
>> > yes!
>> >
>> > >
>> > > -Jay
>> > >
>> > > On Wed, Jan 21, 2015 at 10:27 PM, Joe Stein <joe.st...@stealth.ly> wrote:
>> > >
>> > > > Hi, created a KIP
>> > > >
>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations
>> > > >
>> > > > JIRA https://issues.apache.org/jira/browse/KAFKA-1694
>> > > >
>> > > > /*******************************************
>> > > > Joe Stein
>> > > > Founder, Principal Consultant
>> > > > Big Data Open Source Security LLC
>> > > > http://www.stealth.ly
>> > > > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
>> > > > ********************************************/
>> > > >
>>
>>
>> --
>> -- Guozhang
>>
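
The batching concern raised in point 9 of the quoted thread can be illustrated with a small in-memory sketch. `FakeBroker` and all of its methods are hypothetical stand-ins invented for this example; they are not the KIP-4 wire protocol or any real Kafka client API. The broker simply counts how many requests it serves, so the per-topic loop can be compared with a single batched call that follows the "list of topics, or none for all" shape suggested in the reply.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: compares per-topic describe calls with one batched call. */
public class BatchedDescribeSketch {

    /** Hypothetical stand-in for a broker; counts every request it serves. */
    public static class FakeBroker {
        private final Map<String, Integer> partitionsByTopic = new LinkedHashMap<>();
        public int requestsServed = 0;

        public void addTopic(String name, int partitions) {
            partitionsByTopic.put(name, partitions);
        }

        public List<String> listTopics() {
            requestsServed++;
            return new ArrayList<>(partitionsByTopic.keySet());
        }

        /** One request per topic: the pattern point 9 warns about. */
        public Integer describeTopic(String topic) {
            requestsServed++;
            return partitionsByTopic.get(topic);
        }

        /** One request for many topics; an empty list means "all topics". */
        public Map<String, Integer> describeTopics(List<String> topics) {
            requestsServed++;
            if (topics.isEmpty()) {
                return new LinkedHashMap<>(partitionsByTopic);
            }
            Map<String, Integer> result = new LinkedHashMap<>();
            for (String topic : topics) {
                if (partitionsByTopic.containsKey(topic)) {
                    result.put(topic, partitionsByTopic.get(topic));
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        FakeBroker broker = new FakeBroker();
        for (int i = 0; i < 1000; i++) {
            broker.addTopic("topic-" + i, 3);
        }

        // Unbatched: one list call plus one describe call per topic.
        for (String topic : broker.listTopics()) {
            broker.describeTopic(topic);
        }
        System.out.println("unbatched requests: " + broker.requestsServed);

        // Batched: a single call regardless of topic count.
        int before = broker.requestsServed;
        broker.describeTopics(new ArrayList<String>());
        System.out.println("batched requests: " + (broker.requestsServed - before));
    }
}
```

The unbatched loop costs one request per topic plus the initial list call, so it grows linearly with the number of topics, while the batched call stays at one request.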