Re: Notifications between Aeolus components

Matt Wagner Thu, 24 Jan 2013 11:00:14 -0800

On Thu, Jan 24, 2013 at 10:10:47AM +0100, Martin Povolny wrote:
> 
> I was not here when these decisions were made and did not hear any
> reasonable argument why the MQ was removed from the project.


I'm trying to find the mailing list thread, but I think it predates the
aeolus-devel list.

But suffice it to say that a lot of smart people thought about it and
the community decided it was the right path forward at the time. That's
not to say that it means we shouldn't consider re-adding it, just that
your not seeing "any reasonable argument" doesn't mean that the decision
at the time was wrong.

> So just my IMHO
> 
> Conductor has 2 components that are sort of "daemons"
> 
> * dbomatic
> * delayed job
> 
> Then there is the web ui part.
> 
> The communication layer for the 3 is the database.

I wouldn't characterize it as "communication" through the database,
though it might be technically correct.

dbomatic is a script that polls Deltacloud and updates the database. It
predates our use of delayed_job. I think dbomatic is universally
accepted as something that needs major overhaul at this point, and does
a lot of things in weird ways. But the whole concept is doing what you
seem to propose later in this message -- moving the long-running stuff
away from the core UI/API code.

Similarly, delayed_job runs jobs enqueued in the database. It's widely
used in the Rails community, so it's not as if it's some unholy thing
writing to the database, the way dbomatic started out. It's true that it reads
things written to the database by Conductor, so in that sense it's
'communicating' through the database, but I see nothing wrong with this
approach, and presumably all of the other people using and contributing
to delayed_job would agree.
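
To make the "jobs enqueued in the database" idea concrete, here is a
minimal sketch of the pattern in Python with SQLite. It is not
delayed_job's actual schema or API; the table layout, the handler name,
and the function names are all assumptions made up for illustration.

```python
import json
import sqlite3

# Hypothetical job handlers; the names are illustrative, not Conductor's.
HANDLERS = {
    "launch_instance": lambda args: f"launched {args['name']}",
}


def connect(path=":memory:"):
    """Open the database and make sure the (assumed) jobs table exists."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY,"
        " handler TEXT NOT NULL,"
        " args TEXT NOT NULL,"
        " status TEXT NOT NULL DEFAULT 'pending',"
        " result TEXT)"
    )
    return db


def enqueue(db, handler, **args):
    """Web/API side: record the work to be done and return at once."""
    cur = db.execute(
        "INSERT INTO jobs (handler, args) VALUES (?, ?)",
        (handler, json.dumps(args)),
    )
    db.commit()
    return cur.lastrowid


def work_off(db):
    """Worker side: run every pending job, oldest first."""
    rows = db.execute(
        "SELECT id, handler, args FROM jobs"
        " WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for job_id, handler, args in rows:
        try:
            result = HANDLERS[handler](json.loads(args))
            status = "done"
        except Exception as exc:  # keep the failure visible in the table
            result, status = repr(exc), "failed"
        db.execute(
            "UPDATE jobs SET status = ?, result = ? WHERE id = ?",
            (status, result, job_id),
        )
    db.commit()
    return len(rows)
```

The only thing the two sides share is the jobs table: the request
handler calls enqueue() and returns, and a separate worker process
calls work_off() in a loop.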

> IMHO it would make much more sense to have a "backend conductor" that
> would do the stuff that really matters including servicing some sort of
> job-queue (now dbomatic) and
>   >> communicating with the other components <<
> including polling where necessary (now delayed job).
> 
> Then the web part would be a thin layer to "display the results" -- the
> web ui part.
> 
> As opposed to current situation where we have conductor the web part
> that:
> 
> 1) serves the web users
> 2) serves the API

This was a separate thread, but this is a common pattern in Rails.
Controllers can service web and API requests from the same methods,
reducing code duplication or inadvertent divergence between API and web
functionality.
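
Rails does this with respond_to inside a controller action. As a
language-neutral illustration, here is a rough Python sketch of the same
idea; the data and the function name are invented for the example and
are not Conductor code.

```python
import json

# Hypothetical in-memory data standing in for a model lookup.
IMAGES = [{"id": 1, "name": "fedora-17"}, {"id": 2, "name": "rhel-6"}]


def list_images(accept):
    """One action: load the data once, then render it per client type."""
    images = IMAGES  # shared logic (queries, permissions) lives here
    if "application/json" in accept:
        # API clients get the raw data.
        return "application/json", json.dumps(images)
    # Browsers get the same data wrapped in markup.
    items = "".join(f"<li>{img['name']}</li>" for img in images)
    return "text/html", f"<ul>{items}</ul>"
```

Because both representations come out of the same function, a change to
the lookup shows up in the web UI and the API at the same time.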

> 3) accepts the "RESTful" callbacks from ImageFactory

Isn't accepting callbacks just an extension of the API?

> 4) also does some communication with DC and maybe other parts while
> answering a client's request

I don't see how we could separate this out -- it's often integral to
servicing the user's requests (web or API). Where it can happen in the
background, we should certainly be throwing those tasks into
delayed_job, but I think we're already doing that today.

> IMHO this is a mess from the architecture point of view.

I disagree. There's room for improvement, for sure, but I don't think
breaking Conductor apart in ways that aren't commonly practiced in the
Rails community is going to do anything but make the thing more of a
confusing mess.

> From my point of view it is much more important to solve this mess and
> have the communication
>     >> happen in the right place <<
> than deciding whether the communication is HTTP based (REST, Message,
> RPC, whatever) or if we use MQ.

I think I agree with this.

> The question of MQ then comes secondary.
> 
> I don't know why you guys failed with the MQ in the past but I see the
> MQs today at a similar level as SQL databases. It's an industry-proven
> way of doing communication in situations such as ours.

I agree that it's a stable and successful way of doing things. I'm just
not convinced it's the right choice for us.

> We cannot directly link because:
> 
> a) we use a REST proxy for REST provider APIs (the deltacloud)
> although it's written in the same language as Conductor and it's
> stateless, it's not a library and it's to stay that way I understood.

I think the library thing might be a point of minor controversy. I know
some have expressed interest in using it this way, but it sounds like
it's not something Deltacloud is interested in implementing.

In any case, though, they present a REST API which we use with great
results. I believe we could mount it as a Rack app if we wanted, but I'm
not sure that would really help us any.

> b) we use components written in other languages (the ImageFactory)

Sure. This can easily be solved using an HTTP API (as we do today) or
AMQP for exchanging data. We previously tried AMQP and fairly recently
switched to HTTP.

> c) we might want to support a scenario where the individual components
> run on different machines.

Yes, and I worry my previous reply may have come across as dismissing
this. This is definitely a possibility, and something we should support.

> We have more than a pair of communicating parties.
> 
> We have more than one party communicating with other parties.

And indeed, AMQP makes this problem a little bit easier, though I'll
note that we've already got something in place here, so it'd be
reimplementing something that we've already solved for the sake of
swapping in something slightly cleaner for this use case.

> We have or want to have optional components.

Interestingly, this is one of my arguments _against_ using AMQP. HTTP is
sort of a 'lowest common denominator' that's pretty easy to implement or
interface with. For purpose-built applications that have clearly-defined
ways of connecting, AMQP is probably a big win. But for products that
can work together or be used independently, I'm not sure everyone would
want to write an AMQP interface to their application.
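
To give a sense of how small the HTTP side can be, here is a sketch of a
callback receiver and sender using only the Python standard library. The
URL path and the event fields are assumptions for the example; this is
not Imagefactory's or Conductor's actual callback format.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # callbacks seen by the receiver, in arrival order


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and decode the JSON body, remember it, acknowledge it.
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet


def start_receiver():
    """Listen on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_callback(host, port, path, event):
    """POST one event as JSON and return the HTTP status code."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(
            "POST",
            path,
            body=json.dumps(event).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        return conn.getresponse().status
    finally:
        conn.close()
```

Any component that can make an HTTP POST can take part, which is the
"lowest common denominator" property I mean. Note that this sketch does
no retrying or authentication.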

> There are reasons for using MQ over ad-hoc communication between
> components and I would think that it's not necessary to write that, but:
> 
> * as "Steve Loranz" pointed out: there's just one point to communicate with
> in case of MQ

This is a bit easier, admittedly, but I haven't seen the current system
as being too complicated. I think the most complicated is Conductor,
which has to talk to Imagefactory and Deltacloud. (And I'm not sure that
Deltacloud will ever use AMQP.)

> * reliability -- not easy to get this with ad-hoc solution

I'll grant that AMQP would be more reliable. My question, really, is:
are we having such a reliability problem that it's worth rewriting the
entire thing?

> * then of course you have the bindings for all the necessary languages

I think both protocols are equal here. Is there any language out there
that has AMQP bindings but no support for HTTP? Either way, it's a moot
point for us: Factory (Python) and Conductor (Ruby) have both used AMQP
previously and now use REST calls, so both will clearly work either way.
I imagine the same could be said if we ever implement C or Java or
whatever apps -- either way will work.

> * that someone else is using and testing for you, you have known and
>   documented ways of "doing it" for various types of message exchange
>   scenarios

Well, the use and testing occur at the protocol level, not in our
implementation. I could make the same claim of HTTP.

AMQP likely does have best practices around this that are more advanced
than with webhooks, though there are plenty of projects (GitHub,
WordPress, etc.) using them successfully.

But I don't think we're having a problem right now with figuring out how
to implement webhooks -- they're already there -- so I'm not sure this
one matters to us.

> * and then you have the MQ implemented, debugged, tested and working

We already have this more or less for our REST-based system, do we not?
Switching would _require_ implementation, debugging, and testing. It's
surely doable, but I don't think this is a reason to switch.

> * message format, error handling etc.
> 
>   see the problems we have when trying to handle the various error
>   conditions -- how many times do you get a reasonable error message in
>   conductor when IMF or DC fails? this is also easier to do with a MQ

We do a bad job with error-handling, it's true. It's especially bad
between components. But the problem is that we just don't do a good job
of spelling out what the possible errors are or how we should present
them. This isn't going to change with a message bus -- it's going to
change when we actually fix our handling of errors.
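
As a sketch of what "spelling out the possible errors" could look like,
independent of transport, here is a small Python example. The error
codes, field names, and function names are hypothetical; the real list
would have to be agreed between the components.

```python
import json

# Hypothetical error codes shared by all components.
KNOWN_CODES = {"provider_unreachable", "build_failed", "invalid_template"}


def make_error(component, code, detail):
    """Sender side: wrap a failure in the agreed-upon structure."""
    if code not in KNOWN_CODES:
        raise ValueError(f"unknown error code: {code}")
    return json.dumps({"component": component, "code": code, "detail": detail})


def describe_error(payload):
    """Receiver side: turn any payload into a message fit for the user."""
    try:
        err = json.loads(payload)
        code = err["code"]
        if code not in KNOWN_CODES:
            raise KeyError(code)
        return f"{err['component']}: {code} ({err['detail']})"
    except (ValueError, KeyError, TypeError):
        # Anything that does not match the contract gets a generic message.
        return "unknown error from a backend component"
```

The same payload could travel in an HTTP response body or in a message
on a bus; the work is in agreeing on the codes, not in the transport.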

> * we can go on -- google knows

But this is exactly my point here -- I don't think the overall
advantages of AMQP matter here. What matters is how it will make _our
project_ better.

> To get job done, it's good to write less code. It might not be the
> scientific approach but it works (IMHO).

I agree with that, but I don't think switching our communication
protocols will help this.

> Somewhere in the thread someone said that it seems we decided to use MQ
> because it is "cool" and seek the reasons why to do so. I don't see it
> this way.
> 
> What I see much more is the decision NOT to use the MQ and ignoring the
> good reasons to use one.

There's admittedly some lingering animosity towards AMQP here from the
last time we used it -- we put a bunch of effort into making it work,
had all sorts of problems with it, and then decided to rip it out and
reimplement REST callbacks. Those that were involved are probably going
to be pretty reluctant to now rip out the REST bits and implement AMQP.

What I'm interested in aren't the "good reasons" that AMQP is a superior
messaging protocol, but the reasons it will make _our project_ better.
The only one I've heard so far is that message delivery will be more
robust. That would be an improvement but I'm not sure it's worth the
overhead of implementing this. Maybe I'm wrong, though.

> Then I see (well no longer since the new year) use of "cool" noSQL thing
> with no reason in a project that already had enough complexity.

I'm not sure what this refers to, to be honest. Image Warehouse used
Mongo, if that's what you're referring to. That has been painful and
we're ripping the whole thing out.

> Then I see a big effort on using "cool and in" RESTful API in a situation
> that really is about events.

Deltacloud and Conductor have been using REST (or at least something
like it) for the past several years. Before I even joined the company a
couple of years ago I was reading the Deltacloud documentation about its
HATEOAS API.

> But after the short time being on the team I am already tired of this
> topic and as I said the architecture of Conductor seems to be a much
> bigger problem for me than just the messaging between the components.
> 
> So that's all of my IMHO. Shoot me if you please.

Heh, in fairness, +1 to this -- I, too, am tired of this discussion.

-- Matt
