+1. I can help you in the same timeframe.

William

On Oct 2, 2012, at 8:32 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> I think a first step would be to do a detailed comparison of Kafka
> semantics and the AMQP model and see how good the fit is. If no one else
> is game I would be willing to do this, though I probably wouldn't start
> for about a month. I think this would be more meaningful since unless we
> know the explicit drawbacks it is hard to really think concretely about
> the pros and cons.
>
> -Jay
>
> On Tue, Oct 2, 2012 at 5:27 PM, William Henry <whe...@redhat.com> wrote:
>
>> ----- Original Message -----
>>> I'm not exactly sure about why talking the same at the wire level is
>>> an explicit goal if raw blazing speed is also equally important.
>>>
>>> Removing the wire format from Kafka could be done through abstraction,
>>> but that would incur reinterpretation costs (talking native Kafka) and
>>> take a performance hit.
>>>
>>> I could make a similar argument over the second goal as well. It is
>>> not apparent that solving ALL problems through abstraction and then
>>> universally accepting a performance hit is that ideal. It may make
>>> purchasing/acquiring easier, but by adhering to the lowest common
>>> denominator.
>>>
>>> Wouldn't it be better to make explicit tradeoffs for one-offs based on
>>> specific needs? i.e. if your architecture doesn't require ZooKeeper in
>>> Kafka for coordination, reduce the complexity. Don't force complexity
>>> in all cases.
>>
>> Sure but if AMQP solves all the goals then why do so-called
>> best-of-breed "proprietary"?
>>
>> Who said lowest common-denominator? If AMQP is lowest common-denominator
>> then it has failed.
>>
>> Not sure how this is complexity either. Surely reusing an existing tried
>> and tested protocol instead of developing and maintaining a new one is
>> less complex.
>>
>>> In other words, Kafka is different enough from other messaging systems
>>> that to enforce a common contract (aka AMQP), without incurring a
>>> significant performance hit, would be very challenging.
>>
>> There are no metrics to say this would be a performance hit at all.
>>
>> Again I'm not saying it is the right fit but I don't think we can
>> conclude that without investigating. And I haven't seen anything that
>> would suggest it can't fit.
>>
>> William
>>
>>> On Oct 2, 2012 3:22 PM, "William Henry" <whe...@redhat.com> wrote:
>>>
>>>> ----- Original Message -----
>>>>> I looked into AMQP when I was first starting Kafka work. I see the
>>>>> crux of the issue as this: if you have a bunch of systems that
>>>>> essentially expose the same functionality there is value in
>>>>> standardizing the protocol by which they are accessed to help
>>>>> decouple interface from implementation. Of course I think it is
>>>>> better still to end up with a single good implementation (e.g. Linux
>>>>> rather than Posix). But invariably the protocol dictates the feature
>>>>> set, which dictates the implementation, and so this only really
>>>>> works if the systems have the same feature set and similar enough
>>>>> implementations. This becomes true in a domain over time as people
>>>>> learn the best way to build that kind of system, and all the systems
>>>>> converge to that.
>>>>
>>>> +1
>>>>
>>>>> The reason we have not been pursuing this is that I think the set of
>>>>> functionality we are aiming for is a little different than what most
>>>>> message brokers have. Basically the idea we have is to attempt to
>>>>> re-imagine "messaging" or asynchronous processing infrastructure as
>>>>> a distributed, replicated, partitioned "commit log".
>>>>> This is different enough from what other systems do that attempting
>>>>> to support a standardized protocol is unlikely to work out well. For
>>>>> example, the consumer balancing we do is not modeled in AMQP, and
>>>>> there are many AMQP features that Kafka doesn't have.
>>>>
>>>> I need to understand your consumer balancing a bit more but AMQP is
>>>> designed not to be another MOM like traditional broker based
>>>> messaging systems, though it does support that model.
>>>>
>>>> I like to explain the goals of AMQP to be threefold (some may argue
>>>> differently):
>>>>
>>>> 1) A standard wire protocol for interoperability, i.e. have all
>>>>    messaging systems speak the same on the wire.
>>>> 2) Handle all messaging use cases well, i.e. not just asynch, not
>>>>    just fanout, not just pub/sub but instead do it all so that AMQP
>>>>    is applicable to all use cases. Let's not have a "we do AMQP
>>>>    everywhere except X because it doesn't do X very well."
>>>> 3) Must be fast. Even if it does 1 and 2 very well, it will not be
>>>>    adopted by a wide range of applications otherwise.
>>>>
>>>> So if by consumer balancing you mean multiple consumers feeding off a
>>>> particular address/source/publisher/producer etc. then AMQP does
>>>> manage that model.
>>>>
>>>>> Basically I don't really see other messaging systems as being fully
>>>>> formed distributed systems that act as a *cluster* (rather than an
>>>>> ensemble of brokers).
>>>>
>>>> This is exactly what we in the Qpid community are working towards
>>>> right now. I think AMQP as a protocol under Kafka and exploiting
>>>> Kafka's framework is a great idea.
>>>>
>>>> Please look at the new Qpid/Proton work and some of Ted Ross's
>>>> (cc-ed) router work.
>>>>
>>>>> Conceptually when people program to, say, HDFS, you largely forget
>>>>> that under the covers it is a collection of data nodes and you think
>>>>> about it as a single entity. There are a number of points in the
>>>>> design that make this possible (and a number of areas where HDFS
>>>>> falls short). I think there is a lot to be gained by bringing to
>>>>> bear this modern style of distributed systems design in this space.
>>>>> Needless to say people who work on these other systems totally
>>>>> disagree with this assessment, so it is a bit of an experiment.
>>>>
>>>> This is very interesting to me and some of the customers (at least 2)
>>>> I work with.
>>>>
>>>>> I think an interesting analogy is to databases. Relational databases
>>>>> took this path to some extent. They started out with a very diverse
>>>>> feature set, and eventually converged to a fairly standard set of
>>>>> functionality with reasonable compatibility protocols (ODBC, JDBC).
>>>>> Distributed databases, though, are much more constrained and
>>>>> virtually always fail when they attempt to be compatible with
>>>>> centralized RDBMSs because they just can't do all the same stuff
>>>>> (but can do other things). I think as the distributed database space
>>>>> settles down it will become clear how to provide some kind of
>>>>> general protocol to standardize access, but trying to do that too
>>>>> soon wouldn't really help.
>>>>>
>>>>> Another option, instead of making Kafka an AMQP system, would be to
>>>>> try to make Kafka a multi-protocol system that supported many
>>>>> protocols natively, sharing basic socket infrastructure. I have been
>>>>> down this path and it is a very hard road. I would not like to do
>>>>> that again.
>>>>
>>>> I understand that.
>>>>
>>>>> That said it would be very interesting to see how well AMQP could be
>>>>> mapped to Kafka semantics, and there is nothing that prevents this
>>>>> experiment from happening outside the main codebase. It is totally
>>>>> possible to just call new KafkaServer(), access all the business
>>>>> logic from there, and wrap that in AMQP, REST, or any other
>>>>> protocol. That might be a good way to conduct the experiment if
>>>>> anyone is interested in trying it.
>>>>
>>>> I would love to take a look at this. Any pointers on where an
>>>> integration point might be would be welcome. There is so much work in
>>>> the AMQP and Qpid communities that Kafka could benefit from. You
>>>> could concentrate on the "cluster" model and let Qpid/Proton handle
>>>> the payload distribution on the wire.
>>>>
>>>> I'm willing to take the risk that I might be wrong but right now I
>>>> don't see where AMQP would fall down in this case.
>>>>
>>>> Best regards,
>>>> William
>>>>
>>>>> Cheers,
>>>>>
>>>>> -Jay
>>>>>
>>>>> On Mon, Oct 1, 2012 at 12:07 PM, William Henry <whe...@redhat.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Has anyone looked at this email? Anyone care to express an opinion?
>>>>>>
>>>>>> It seems like Apache has ActiveMQ and Qpid, which are already
>>>>>> working on integrating, and now Kafka. Kafka might benefit by using
>>>>>> Qpid/Proton just as ActiveMQ is trying to integrate with
>>>>>> Qpid/Proton.
>>>>>>
>>>>>> If folks are interested I'd be willing to take a look at the
>>>>>> integration and help out.
>>>>>>
>>>>>> Best regards,
>>>>>> William
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> Hi,
>>>>>>>
>>>>>>> Has anyone looked at integrating Kafka with Apache Qpid to get
>>>>>>> AMQP support?
>>>>>>>
>>>>>>> Best,
>>>>>>> William