On Thu, Jan 24, 2013 at 10:10:47AM +0100, Martin Povolny wrote:
>
> I was not here when these decisions were made and did not hear any
> reasonable argument why the MQ was removed from the project.

I'm trying to find the mailing list thread, but I think it predates the
aeolus-devel list. But suffice it to say that a lot of smart people
thought about it and the community decided it was the right path forward
at the time. That's not to say that we shouldn't consider re-adding it,
just that your not seeing "any reasonable argument" doesn't mean that
the decision at the time was wrong.

> So just my IMHO
>
> Conductor has 2 components that are sort of "daemons"
>
> * dbomatic
> * delayed job
>
> Then there is the web ui part.
>
> The communication layer for the 3 is the database.

I wouldn't characterize it as "communication" through the database,
though it might be technically correct.

dbomatic is a script that polls Deltacloud and updates the database. It
predates our use of delayed_job. I think dbomatic is universally
accepted as something that needs a major overhaul at this point, and it
does a lot of things in weird ways. But the whole concept is doing what
you seem to propose later in this message -- moving the long-running
stuff away from the core UI/API code.

Similarly, delayed_job runs jobs enqueued in the database. It's widely
used in the Rails community, so it's not as if it's some unholy thing
writing to the database like dbomatic started out as. It's true that it
reads things written to the database by Conductor, so in that sense it's
'communicating' through the database, but I see nothing wrong with this
approach, and presumably all of the other people using and contributing
to delayed_job would agree.

> IMHO it would make much more sense to have a "backend conductor" that
> would do the stuff that really matters including servicing some sort of
> job-queue (now dbomatic) and
> >> communicating with the other components <<
> including polling where necessary (now delayed job).
>
> Then the web part would be a thin layer to "display the results" -- the
> web ui part.
>
> As opposed to current situation where we have conductor the web part
> that:
>
> 1) serves the web users
> 2) serves the API

This was a separate thread, but this is a common pattern in Rails.
Controllers can service web and API requests from the same methods,
reducing code duplication or inadvertent divergence between API and web
functionality.

> 3) accepts the "RESTful" callbacks from ImageFactory

Isn't accepting callbacks just an extension of the API?

> 4) also does some communication with DC and maybe other parts while
> answering a client's request

I don't see how we could separate this out -- it's often integral to
servicing the user's requests (web or API). Where it can happen in the
background, we should certainly be throwing those tasks into
delayed_job, but I think we're already doing that today.

> IMHO this is a mess from the architecture point of view.

I disagree. There's room for improvement, for sure, but I don't think
breaking Conductor apart in ways that aren't commonly practiced in the
Rails community is going to do anything but make the thing more of a
confusing mess.

> From my point of view it is much more important to solve this mess and
> have the communication
> >> happen in the right place <<
> than deciding whether the communication is HTTP based (REST, Message,
> RPC, whatever) or if we use MQ.

I think I agree with this.

> The question of MQ then comes secondary.
>
> I don't know why you guys failed with the MQ in the past but I see the
> MQs today at a similar level as SQL databases. It's an industrial
> verified way of doing communication in situations such as ours.

I agree that it's a stable and successful way of doing things. I'm just
not convinced it's the right choice for us.

> We cannot directly link because:
>
> a) we use a REST proxy for REST provider APIs (the deltacloud)
> although it's written in the same language as Conductor and it's
> stateless, it's not a library and it's to stay that way I understood.

I think the library thing might be a point of minor controversy. I know
some have expressed interest in using it this way, but it sounds like
it's not something Deltacloud is interested in implementing. In any
case, though, they present a REST API which we use with great results. I
believe we could mount it as a Rack app if we wanted, but I'm not sure
that would really help us any.

> b) we use components written in other languages (the ImageFactory)

Sure. This can easily be solved using an HTTP API (as we do today) or
AMQP for exchanging data. We previously tried AMQP and fairly recently
switched to HTTP.

> c) we might want to support a scenario where the individual components
> run on different machines.

Yes, and I worry my previous reply may have come across as dismissing
this. This is definitely a possibility, and something we should support.

> We have more than a pair of communicating parties.
>
> We have more than one party communicating with other parties.

And indeed, AMQP makes this problem a little bit easier, though I'll
note that we've already got something in place here, so it'd be
reimplementing something that we've already solved for the sake of
swapping in something slightly cleaner for this use case.

> We have or want to have optional components.

Interestingly, this is one of my arguments _against_ using AMQP. HTTP is
sort of a 'lowest common denominator' that's pretty easy to implement or
interface with. For purpose-built applications that have clearly-defined
ways of connecting, AMQP is probably a big win. But for products that
can work together or be used independently, I'm not sure everyone would
want to write an AMQP interface to their application.

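To make the 'lowest common denominator' point a bit more concrete, here
is a rough sketch of everything a component needs in order to accept and
send an HTTP callback, using nothing but the Python standard library.
This is not Imagefactory's or Conductor's actual interface: the path
/callbacks/builds, the payload fields, and all function names are
made up for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical callback path, chosen for illustration only.
CALLBACK_PATH = "/callbacks/builds"
received = []  # payloads accepted so far


class CallbackHandler(BaseHTTPRequestHandler):
    """Accepts JSON POSTs on CALLBACK_PATH and records them."""

    def do_POST(self):
        if self.path != CALLBACK_PATH:
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except ValueError:
            self.send_error(400, "body must be JSON")
            return
        received.append(payload)
        self.send_response(204)  # accepted, nothing to return
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def post_callback(url, payload):
    """Sender side: POST a JSON payload and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Ignore proxy environment variables so the local demo goes direct.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return response.status


def demo():
    """Start a receiver on a free local port and send it one callback."""
    server = HTTPServer(("127.0.0.1", 0), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        url = f"http://127.0.0.1:{port}{CALLBACK_PATH}"
        # Hypothetical payload fields.
        return post_callback(url, {"build_id": "example", "status": "COMPLETED"})
    finally:
        server.shutdown()
        server.server_close()
```

Both sides fit in one short file with no broker and no extra
dependencies, which is roughly what I mean by easy to interface with.
It says nothing about delivery guarantees, which is where AMQP would
have the edge.
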
> There are reasons for using MQ over ad-hoc communication between
> components and I would think that it's not necessary to write that, but:
>
> * as "Steve Loranz" pointed out: there's just one point to communicate with
> in case of MQ

This is a bit easier, admittedly, but I haven't seen the current system
as being too complicated. I think the most complicated is Conductor,
which has to talk to Imagefactory and Deltacloud. (And I'm not sure that
Deltacloud will ever use AMQP.)

> * reliability -- not easy to get this with ad-hoc solution

I'll grant that AMQP would be more reliable. My question, really, is:
are we having such a reliability problem that it's worth rewriting the
entire thing?

> * then of course you have the bindings for all the necessary language

I think both protocols are equal here. Is there any language out there
that has AMQP bindings but no support for HTTP? Either way, it's a moot
point for us: Factory (Python) and Conductor (Ruby) have both used AMQP
previously and now use REST calls, so both will clearly work either way.
I imagine the same could be said if we ever implement C or Java or
whatever apps -- either way will work.

> * that someone else is using and testing for you, you have known and
> documented ways of "doing it" for various types of message exchange
> scenarios

Well, the use and testing occur at the protocol level, not in our
implementation. I could make the same claim of HTTP. AMQP likely does
have best practices around this that are more advanced than with
webhooks, though there are plenty of projects (GitHub, WordPress, etc.)
using them successfully. But I don't think we're having a problem right
now with figuring out how to implement webhooks -- they're already there
-- so I'm not sure this one matters to us.

> * and then you have the MQ implemented, debugged, tested and working

We already have this more or less for our REST-based system, do we not?
Switching would _require_ implementation, debugging, and testing.
It's surely doable, but I don't think this is a reason to switch.

> * message format, error handling etc.
>
> see the problems we have when trying to handle the various error
> conditions -- how many times do you get a reasonable error message in
> conductor when IMF or DC fails? this is also easier to do with a MQ

We do a bad job with error-handling, it's true. It's especially bad
between components. But the problem is that we just don't do a good job
of spelling out what the possible errors are or how we should present
them. This isn't going to change with a message bus -- it's going to
change when we actually fix our handling of errors.

> * we can go on -- google knows

But this is exactly my point here -- I don't think the overall
advantages of AMQP matter here. What matters is how it will make _our
project_ better.

> To get job done, it's good to write less code. It might not be the
> scientific approach but it works (IMHO).

I agree with that, but I don't think switching our communication
protocols will help this.

> Somewhere in the thread someone said that it seems we decided to use MQ
> because it is "cool" and seek the reasons why to do so. I don't see it
> this way.
>
> What I see much more is the decision NOT to use the MQ and ignoring the
> good reasons to use one.

There's admittedly some lingering animosity towards AMQP here from the
last time we used it -- we put a bunch of effort into making it work,
had all sorts of problems with it, and then decided to rip it out and
reimplement REST callbacks. Those that were involved are probably going
to be pretty reluctant to now rip out the REST bits and implement AMQP.

What I'm interested in aren't the "good reasons" that AMQP is a superior
messaging protocol, but the reasons it will make _our project_ better.
The only one I've heard so far is that message delivery will be more
robust. That would be an improvement but I'm not sure it's worth the
overhead of implementing this. Maybe I'm wrong, though.

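Coming back to the error-handling point above: as a rough sketch of what
"spelling out what the possible errors are" could look like, here is a
small error format that would read the same whether it travels in an
HTTP response body or in a message on a bus. None of this exists in
Conductor, Imagefactory, or Deltacloud today; the error codes, field
names, and user-facing messages are invented for illustration.

```python
import json
from dataclasses import dataclass
from enum import Enum


class ErrorCode(Enum):
    """Hypothetical closed set of errors a component may report."""
    PROVIDER_UNREACHABLE = "provider_unreachable"
    BUILD_FAILED = "build_failed"
    INVALID_REQUEST = "invalid_request"


@dataclass(frozen=True)
class ComponentError:
    component: str   # which component reported the error
    code: ErrorCode  # machine-readable error code
    detail: str      # free-form detail for logs and support

    def to_json(self):
        """Serialize for the wire; the transport does not matter."""
        return json.dumps(
            {"component": self.component,
             "code": self.code.value,
             "detail": self.detail},
            sort_keys=True,
        )

    @classmethod
    def from_json(cls, text):
        """Parse an error received from another component."""
        data = json.loads(text)
        return cls(data["component"], ErrorCode(data["code"]), data["detail"])


# How each known error should be presented to the user (made-up wording).
USER_MESSAGES = {
    ErrorCode.PROVIDER_UNREACHABLE: "The cloud provider could not be reached.",
    ErrorCode.BUILD_FAILED: "The image build failed.",
    ErrorCode.INVALID_REQUEST: "The request was rejected as invalid.",
}


def user_message(error):
    """Turn a component error into something reasonable to display."""
    return f"{USER_MESSAGES[error.code]} (reported by {error.component}: {error.detail})"
```

The work here is agreeing on the list of codes and what to show for each
one, and that work is the same under REST callbacks or AMQP.
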
> Then I see (well no longer since the new year) use of "cool" noSQL thing
> with no reason in a project that already had enough complexity.

I'm not sure what this refers to, to be honest. Image Warehouse used
Mongo, if that's what you're referring to. That has been painful and
we're ripping the whole thing out.

> Then see big effort on using "cool and in" RESTful API in a situation
> that really is about events.

Deltacloud and Conductor have been using REST (or at least something
like it) for the past several years. Before I even joined the company a
couple of years ago I was reading the Deltacloud documentation about its
HATEOAS API.

> But after the short time being on the team I am already tired of this
> topic and as I said the architecture of Conductor seems to be a much
> bigger problem for me than just the messaging between the components.
>
> So that's all of my IMHO. Shoot me if you please.

Heh, in fairness, +1 to this -- I, too, am tired of this discussion.

-- Matt
