I think a first step would be to do a detailed comparison of Kafka semantics and the AMQP model and see how good the fit is. If no one else is game I would be willing to do this, though I probably wouldn't start for about a month. I think this would be more meaningful, since unless we know the explicit drawbacks it is hard to think concretely about the pros and cons.
-Jay

On Tue, Oct 2, 2012 at 5:27 PM, William Henry <whe...@redhat.com> wrote:

> ----- Original Message -----
> > I'm not exactly sure about why talking the same at the wire level is an explicit goal; if raw blazing speed is also equally important.
> >
> > Removing the wire format from Kafka could be done through abstraction; but that would incur reinterpretation costs (talking native Kafka) and take a performance hit.
> >
> > I could make a similar argument over the second goal as well. It is not apparent that solving ALL problems through abstraction and then universally accepting a performance hit is that ideal. It may make purchasing/acquiring easier; but by adhering to the lowest common denominator.
> >
> > Wouldn't it be better to make explicit tradeoffs for one-offs based on specific needs? i.e. if your architecture doesn't require ZooKeeper in Kafka for coordination, reduce the complexity. Don't force complexity in all cases.
>
> Sure but if AMQP solves all the goals then why do so-called best-of-breed "proprietary"?
>
> Who said lowest common-denominator? If AMQP is lowest common-denominator then it has failed.
>
> Not sure how this is complexity either. Surely reusing an existing tried and tested protocol instead of developing and maintaining a new one is less complex.
>
> > In other words, Kafka is different enough from other messaging systems that to enforce a common contract (aka AMQP), without incurring a significant performance hit, would be very challenging.
>
> There are no metrics to say this would be a performance hit at all.
>
> Again I'm not saying it is the right fit but I don't think we can conclude that without investigating. And I haven't seen anything that would suggest it can't fit.
>
> William
>
> > On Oct 2, 2012 3:22 PM, "William Henry" <whe...@redhat.com> wrote:
> >
> > > ----- Original Message -----
> > > > I looked into AMQP when I was first starting Kafka work. I see the crux of the issue as this: if you have a bunch of systems that essentially expose the same functionality there is value in standardizing the protocol by which they are accessed to help decouple interface from implementation. Of course I think it is better still to end up with a single good implementation (e.g. Linux rather than POSIX). But invariably the protocol dictates the feature set, which dictates the implementation, and so this only really works if the systems have the same feature set and similar enough implementations. This becomes true in a domain over time as people learn the best way to build that kind of system, and all the systems converge to that.
> > >
> > > +1
> > >
> > > > The reason we have not been pursuing this is that I think the set of functionality we are aiming for is a little different than what most message brokers have. Basically the idea we have is to attempt to re-imagine "messaging" or asynchronous processing infrastructure as a distributed, replicated, partitioned "commit log". This is different enough from what other systems do that attempting to support a standardized protocol is unlikely to work out well. For example, the consumer balancing we do is not modeled in AMQP, and there are many AMQP features that Kafka doesn't have.
> > >
> > > I need to understand your consumer balancing a bit more but AMQP is designed not to be another MOM like traditional broker based messaging systems, though it does support that model.
> > >
> > > I like to explain the goals of AMQP to be threefold (some may argue differently):
> > >
> > > 1) A standard wire protocol for interoperability, i.e. have all messaging systems speak the same on the wire.
> > > 2) Handle all messaging use cases well, i.e. not just asynch, not just fanout, not just pub/sub but instead do it all so that AMQP is applicable to all use cases. Let's not have a "we do AMQP everywhere except X because it doesn't do X very well."
> > > 3) Must be fast. Even if it does 1 and 2 very well it will not be adopted by a wide range of applications.
> > >
> > > So if by consumer balancing you mean multiple consumers feeding off a particular address/source/publisher/producer etc. then AMQP does manage that model.
> > >
> > > > Basically I don't really see other messaging systems as being fully formed distributed systems that act as a *cluster* (rather than an ensemble of brokers).
> > >
> > > This is exactly what we in the Qpid community are working towards right now. I think AMQP as a protocol under Kafka and exploiting Kafka's framework is a great idea.
> > >
> > > Please look at the new Qpid/Proton work and some of Ted Ross's (cc-ed) router work.
> > >
> > > > Conceptually when people program to, say, HDFS, you largely forget that under the covers it is a collection of data nodes and you think about it as a single entity. There are a number of points in the design that make this possible (and a number of areas where HDFS falls short).
> > > > I think there is a lot to be gained by bringing to bear this modern style of distributed systems design in this space. Needless to say people who work on these other systems totally disagree with this assessment, so it is a bit of an experiment.
> > >
> > > This is very interesting to me and some of the customers (at least 2) I work with.
> > >
> > > > I think an interesting analogy is to databases. Relational databases took this path to some extent. They started out with a very diverse feature set, and eventually converged to a fairly standard set of functionality with reasonable compatibility protocols (ODBC, JDBC). Distributed databases, though, are much more constrained and virtually always fail when they attempt to be compatible with centralized RDBMSs because they just can't do all the same stuff (but can do other things). I think as the distributed database space settles down it will become clear how to provide some kind of general protocol to standardize access, but trying to do that too soon wouldn't really help.
> > > >
> > > > Another option, instead of making Kafka an AMQP system, would be to try to make Kafka a multi-protocol system that supported many protocols natively, sharing basic socket infrastructure. I have been down this path and it is a very hard road. I would not like to do that again.
> > >
> > > I understand that.
> > >
> > > > That said it would be very interesting to see how well AMQP could be mapped to Kafka semantics, and there is nothing that prevents this experiment from happening outside the main codebase.
> > > > It is totally possible to just call new KafkaServer(), access all the business logic from there, and wrap that in AMQP, REST, or any other protocol. That might be a good way to conduct the experiment if anyone is interested in trying it.
> > >
> > > I would love to take a look at this. Any pointers on where an integration point might be would be welcome. There is so much work in the AMQP and Qpid communities that Kafka could benefit from. You could concentrate on the "cluster" model and let Qpid/Proton handle the payload distribution on the wire.
> > >
> > > I'm willing to take the risk that I might be wrong but right now I don't see where AMQP would fall down in this case.
> > >
> > > Best regards,
> > > William
> > >
> > > > Cheers,
> > > >
> > > > -Jay
> > > >
> > > > On Mon, Oct 1, 2012 at 12:07 PM, William Henry <whe...@redhat.com> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Has anyone looked at this email? Anyone care to express an opinion?
> > > > >
> > > > > It seems like Apache has ActiveMQ and Qpid, which are already working on integrating, and now Kafka. Kafka might benefit by using Qpid/Proton just as ActiveMQ is trying to integrate with Qpid/Proton.
> > > > >
> > > > > If folks are interested I'd be willing to take a look at the integration and help out.
> > > > >
> > > > > Best regards,
> > > > > William
> > > > >
> > > > > ----- Original Message -----
> > > > > > Hi,
> > > > > >
> > > > > > Has anyone looked at integrating Kafka with Apache Qpid to get AMQP support?
> > > > > >
> > > > > > Best,
> > > > > > William