I looked into AMQP when I was first starting Kafka work. I see the crux of
the issue as this: if you have a bunch of systems that essentially expose
the same functionality there is value in standardizing the protocol by
which they are accessed to help decouple interface from implementation. Of
course I think it is better still to end up with a single good
implementation (e.g. Linux rather than POSIX). But invariably the protocol
dictates the feature set, which dictates the implementation, and so this
only really works if the systems have the same feature set and similar
enough implementations. This becomes true in a domain over time as people
learn the best way to build that kind of system, and all the systems
converge to that.

The reason we have not been pursuing this is that I think the set of
functionality we are aiming for is a little different from what most
message brokers have. Basically the idea we have is to attempt to
re-imagine "messaging" or asynchronous processing infrastructure as a
distributed, replicated, partitioned "commit log". This is different enough
from what other systems do that attempting to support a standardized
protocol is unlikely to work out well. For example, the consumer balancing
we do is not modeled in AMQP, and there are many AMQP features that Kafka
doesn't have.
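To make the "partitioned commit log" idea concrete, here is a toy in-memory sketch of the abstraction. Every name here is illustrative, not Kafka's actual API: the point is only that messages live at stable offsets in an append-only partition, and a consumer's state is just the offset it has read up to.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A toy, in-memory sketch of a partitioned commit log. Each partition is
// an append-only sequence of messages; a message's identity is its offset.
public class PartitionedLog {
    private final Map<Integer, List<String>> partitions = new HashMap<>();

    // Append a message to a partition; the returned offset identifies it.
    public long append(int partition, String message) {
        List<String> log = partitions.computeIfAbsent(partition, p -> new ArrayList<>());
        log.add(message);
        return log.size() - 1;
    }

    // Read everything from a given offset onward. The consumer, not the
    // broker, tracks how far it has read -- which is what makes consumer
    // position so cheap to manage in this model.
    public List<String> read(int partition, long fromOffset) {
        List<String> log = partitions.getOrDefault(partition, new ArrayList<>());
        return log.subList((int) fromOffset, log.size());
    }

    public static void main(String[] args) {
        PartitionedLog log = new PartitionedLog();
        log.append(0, "a");
        long off = log.append(0, "b");
        System.out.println(off);            // 1
        System.out.println(log.read(0, 1)); // [b]
    }
}
```

Note how little of AMQP's model (exchanges, bindings, per-message acknowledgement) appears here; that gap is exactly why mapping one protocol onto the other is awkward.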

Basically I don't really see other messaging systems as being fully formed
distributed systems that act as a *cluster* (rather than an ensemble of
brokers). Conceptually when people program to, say, HDFS, you largely
forget that under the covers it is a collection of data nodes and you think
about it as a single entity. There are a number of points in the design
that make this possible (and a number of areas where HDFS falls short). I
think there is a lot to be gained by bringing to bear this modern style of
distributed systems design in this space. Needless to say people who work
on these other systems totally disagree with this assessment, so it is a
bit of an experiment.

I think an interesting analogy is to databases. Relational databases took
this path to some extent. They started out with a very diverse feature set,
and eventually converged to a fairly standard set of functionality with
reasonable compatibility protocols (ODBC, JDBC). Distributed databases,
though, are much more constrained and virtually always fail when they
attempt to be compatible with centralized RDBMSs because they just can't
do all the same stuff (but can do other things). I think as the distributed
database space settles down it will become clear how to provide some kind
of general protocol to standardize access, but trying to do that too soon
wouldn't really help.

Another option, instead of making Kafka an AMQP system, would be to try to
make Kafka a multi-protocol system that supported many protocols natively,
sharing basic socket infrastructure. I have been down this path and it is a
very hard road. I would not like to do that again.

That said, it would be very interesting to see how well AMQP could be mapped
to Kafka semantics, and there is nothing that prevents this experiment from
happening outside the main codebase. It is totally possible to just call
new KafkaServer(), access all the business logic from there, and wrap that
in AMQP, REST, or any other protocol. That might be a good way to conduct
the experiment if anyone is interested in trying it.
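The embedding pattern described above can be sketched as follows. `KafkaServerStub` stands in for the real `kafka.server.KafkaServer` (whose actual constructor and methods differ, so treat every name in this sketch as hypothetical); the front end just translates its own wire format, whatever protocol that is, into calls on the embedded server:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the real KafkaServer: all the broker's business logic
// lives behind a couple of method calls.
class KafkaServerStub {
    private final List<String> log = new ArrayList<>();
    long produce(String msg) { log.add(msg); return log.size() - 1; }
    String fetch(long offset) { return log.get((int) offset); }
}

// A front end speaking some other protocol (AMQP, REST, ...) wraps the
// embedded server and does nothing but translate requests into calls.
public class RestFrontEnd {
    private final KafkaServerStub server = new KafkaServerStub();

    // e.g. handling "POST /topics/t" would reduce to:
    public long handlePost(String body) { return server.produce(body); }

    // and "GET /topics/t?offset=n" to:
    public String handleGet(long offset) { return server.fetch(offset); }

    public static void main(String[] args) {
        RestFrontEnd fe = new RestFrontEnd();
        long off = fe.handlePost("hello");
        System.out.println(fe.handleGet(off)); // hello
    }
}
```

Because the wrapper only calls into the server's public entry points, an experiment like this can live entirely outside the main codebase.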

Cheers,

-Jay

On Mon, Oct 1, 2012 at 12:07 PM, William Henry <whe...@redhat.com> wrote:

> Hi,
>
> Has anyone looked at this email?  Anyone care to express an opinion?
>
> It seems like Apache has ActiveMQ and Qpid, which are already working on
> integrating, and now Kafka. Kafka might benefit by using Qpid/Proton just
> as ActiveMQ is trying to integrate with Qpid/Proton.
>
> If folks are interested I'd be willing to take a look at the integration
> and help out.
>
> Best regards,
> William
>
> ----- Original Message -----
> > Hi,
> >
> >
> > Has anyone looked at integrating kafka with Apache Qpid to get AMQP
> > support?
> >
> >
> > Best,
> > William
>
