Hi Gordon,

While I agree with you that the flow-to-disk queues have a lot of problems, I do not think they are totally useless. If you remove them without offering any real alternative, you may block the upgrade path for many people who use them. Speaking for myself at least, it would probably be a problem for some of our brokers.
Regards,
Jakub

On Thu, Jul 19, 2012 at 7:56 PM, Gordon Sim <[email protected]> wrote:
> I have been looking at what would be required to get AMQP 1.0 support alongside AMQP 0-10 support in the c++ broker, i.e. qpidd.
>
> As part of that it became clear some refactoring of the broker codebase would be required [1]. That in turn led me to believe that we should consider dropping certain features. These would be dropped *after* the pending 0.18 release; i.e. they would still be present in 0.18, but that would be the last release in which they were present if my proposal were accepted.
>
> The purpose of this mail is to list the features I would propose to drop and my reasons for doing so. For those who find it overly long, I apologise and offer a very short summary at the end!
>
> In each case the basic argument is that I believe the features are not very well implemented, and keeping them working as part of my refactoring would take extra time that I would rather spend on achieving 1.0 support and making real improvements.
>
> The first feature I propose we drop is the 'legacy' versions of LVQ behaviour. These forced a choice in the behaviour of the queue when browsers (i.e. non-destructive subscribers) received messages from it. The choice was to either have browsers miss updates, or to suppress the replacing of one message by another with a matching key. This choice was really driven by a technical problem with the first implementation. We have since moved to an improved implementation where the distinction is not relevant. I see no good reason to keep the old behaviour any longer.
>
> The second feature is the old async queue replication mechanism. This is very fragile, and I believe it is no longer necessary given the new and improved HA solution that first appeared in 0.16 and has been improved significantly for 0.18.
>
> The third feature is the 'last man standing' or 'cluster durable' option.
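[For readers who have not used LVQ: the replace-by-key behaviour Gordon describes can be sketched roughly as below. This is a hypothetical, simplified Python illustration of the concept only, not the broker's actual C++ implementation, and the class and method names are made up for the example.]

```python
from collections import OrderedDict

class LastValueQueue:
    """Toy last-value queue: a newly enqueued message replaces any
    queued message carrying the same key, so consumers only ever see
    the most recent value per key (illustration only)."""

    def __init__(self):
        self._messages = OrderedDict()  # key -> latest message body

    def enqueue(self, key, body):
        # Replace-in-place: drop any older message with the same key.
        # Deleting first means the replacement takes a fresh position
        # at the tail of the queue.
        if key in self._messages:
            del self._messages[key]
        self._messages[key] = body

    def dequeue(self):
        # Destructive consume: return and remove the oldest entry.
        key, body = next(iter(self._messages.items()))
        del self._messages[key]
        return key, body

    def browse(self):
        # Non-destructive browse: the behaviour the legacy modes
        # tweaked (miss updates vs. suppress replacement).
        return list(self._messages.items())

q = LastValueQueue()
q.enqueue("ticker.MSFT", "30.10")
q.enqueue("ticker.RHT", "55.40")
q.enqueue("ticker.MSFT", "30.15")  # replaces the earlier MSFT message
print(q.browse())  # only the latest value per key remains queued
```

The legacy modes existed because the first implementation could not both replace a keyed message in place and keep browsers consistent; the improved implementation makes that trade-off unnecessary, which is the basis of Gordon's point above.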
> The biggest reason for dropping this comes later(!), but considered on its own, my concern is that there are no system-level tests for it, so it is very hard to guarantee it still works without writing all those tests. I am entirely unconvinced by this solution, and think that again the new HA mechanism would be a better way to achieve this (you could start up a backup node that forced all the replicated messages to disk). I am therefore keen to avoid wasting time and effort.
>
> The fourth feature is - wait for it - the clustered broker capability as enabled by the cluster.so plugin. I believe this is nearing the end of its life anyway. It is currently only available on Linux, with no real prospect of being ported to Windows. The design, as it turns out, was very fragile to changes in the codebase, and there are still some difficult-to-solve bugs within it. A new HA mechanism has been developed (as alluded to above) and I believe that will replace the old cluster. The work needed to keep the cluster working through my refactor is sizeable. It would in any case have the potential to destabilise the cluster (the aforementioned issue with fragility). This seems to me to argue strongly for dropping it in releases after 0.18; for anyone affected, that would give them some time to try out the new HA and give feedback as well.
>
> The fifth and final feature I propose we drop is the confusingly named 'flow to disk' feature. For this one I have no alternative to offer yet. The problem is supporting large queues whose aggregate size far exceeds a bounded amount of memory. I believe the current implementation is next to useless for the majority of cases, as it keeps the headers of all messages in memory. It is useless unless your messages are large enough that the overhead of keeping these headers in memory is outweighed by the size of the body (this overhead is significantly larger than the transfer size of the headers).
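[To make the header-overhead argument concrete, here is a rough back-of-envelope model. The per-message overhead figure is an illustrative assumption, not a measured qpidd number; the point is only that resident memory scales with message count, not body size.]

```python
# Rough model of flow-to-disk memory use: message bodies are written
# to disk, but a per-message header/metadata record stays resident in
# broker memory. The byte figure below is an assumption for illustration.
HEADER_OVERHEAD_BYTES = 1024  # assumed in-memory cost per queued message

def resident_memory_mb(num_messages, header_overhead=HEADER_OVERHEAD_BYTES):
    """Memory still held by the broker even after bodies flow to disk."""
    return num_messages * header_overhead / (1024 * 1024)

# A queue of 10 million small messages keeps nearly 10 GB resident no
# matter how small the bodies are -- which is why flow-to-disk only
# helps when bodies are much larger than the per-message overhead.
for n in (100_000, 1_000_000, 10_000_000):
    print(f"{n:>10,} messages -> {resident_memory_mb(n):>7.0f} MB of headers in memory")
```

Under this model, flowing bodies to disk bounds memory only when body size dominates header overhead, which matches Gordon's claim that the feature is useless for small-message workloads.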
> Further, since a common cause of large queues is a short-lived disparity between the rate of inflow and outflow, the current solution can compound the problem by radically slowing down consumers even more. I believe there is a better solution, and I'm not convinced the current solution is worth the effort of maintaining any further. (I know Kim has been working on a new store interface, and removing flow to disk would clean that up nicely as well!)
>
> I hope this makes sense. I'm keen to get any thoughts or feedback on these points. The purpose is not to deprive anyone of features they are using, but rather to spend time on more important work.
>
> Summary:
>
> The features to drop are:
>
> (i) legacy LVQ modes; LVQ support would still remain, only the two old and peculiar modes would go. I really doubt anyone actually depends on these anyway; they were more a limitation than a feature.
>
> (ii) asynchronous queue replication; the solution is not mature enough for real-world use anyway, due to fragility and the inability to resync. The new HA mechanism, introduced in 0.16 and improved in 0.18, should address the need.
>
> (iii) clustering, including last-man-standing mode; the design is brittle and currently ties it to the Linux platform. The new HA is the long-term solution here anyway.
>
> (iv) flow to disk; the current solution really doesn't solve the problem anyway.
>
> --Gordon
>
> [1] If you are interested at all, you can find my latest patch and some notes on the internal changes on reviewboard: https://reviews.apache.org/r/5833/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> ---------------------------------------------------------------------
