That was really useful. Thanks for writing, Rob.

On Wed, Jan 4, 2012 at 5:08 PM, Rob Godfrey <[email protected]> wrote:
> Robbie beat me to replying, and said mostly what I was going to say... but
> anyway...
>
> This design decision is from before even my time on the project, but my
> opinion is that if you have "millions of messages" then a message broker is
> the wrong solution to your problem - you want a database. That being said,
> I think there is a reasonable argument to be made that a broker should be
> able to run in a low-memory environment, in which case you may want to be
> able to swap out sections of the list structure.
>
> Fundamentally this would be quite a major change. The internal queue design
> is predicated on the list structure being in memory and on being able to
> apply lockless atomic operations to it. From a performance point of view it
> would also be potentially very tricky. There are a number of processes
> within the broker which periodically scan the entire queue (looking for
> expired messages and such), as well as the fact (as Robbie pointed out)
> that many use cases are not strict FIFO (priority queues, LVQs, selectors,
> etc.). And ultimately you are still going to be limited by a finite
> resource (albeit a larger one). That being said, we should definitely look
> at trying to reduce the per-message overhead.
>
> At some point soon I hope to be looking at implementing a flow-to-disk
> policy for messages within queues, to protect the broker from out-of-memory
> situations with transient messages (as well as improving on the rather
> drastic SoftReference-based method currently employed). The main issue,
> though, is that the reason you need to flow to disk is that your producers
> are producing faster than your consumers can consume... and pushing
> messages (or, worse, queue structure too) to disk is only going to slow
> your consumers down even more - likely making the problem worse.
>
> Cheers,
> Rob
>
> On 4 January 2012 22:09, Praveen M <[email protected]> wrote:
>
> > Thanks for the explanation Robbie.
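As a rough illustration of the design Rob describes - queue entries held in an in-memory linked list manipulated with lockless atomic operations - here is a minimal sketch in Java. This is not Qpid's actual code; the class and field names are invented, and it shows only a CAS-based append plus the kind of full-list walk the broker's periodic scans (e.g. expiry checks) must perform:

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal sketch (not the broker's real implementation) of a lockless,
// CAS-based linked list: each entry holds an AtomicReference to the next
// entry, and enqueue appends via compareAndSet, retrying on contention.
public class LocklessQueueSketch {
    static final class Entry {
        final Object message;                        // payload reference
        final AtomicReference<Entry> next = new AtomicReference<>();
        Entry(Object message) { this.message = message; }
    }

    private final Entry head = new Entry(null);      // sentinel node
    private volatile Entry tail = head;

    void enqueue(Object message) {
        Entry entry = new Entry(message);
        while (true) {
            Entry t = tail;
            Entry n = t.next.get();
            if (n != null) {
                tail = n;                            // help advance a stale tail
            } else if (t.next.compareAndSet(null, entry)) {
                tail = entry;                        // publish the new tail
                return;
            }
        }
    }

    // A periodic sweep (expiry and the like) has to touch every entry.
    int size() {
        int count = 0;
        for (Entry e = head.next.get(); e != null; e = e.next.get()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        LocklessQueueSketch q = new LocklessQueueSketch();
        for (int i = 0; i < 1000; i++) {
            q.enqueue("message-" + i);
        }
        System.out.println(q.size());                // prints 1000
    }
}
```

Note that every enqueued message costs a live Entry object on the heap, which is exactly the per-message overhead discussed in this thread, and that swapping sections of such a list to disk would undermine the lock-free CAS scheme the whole design rests on.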
> >
> > On Wed, Jan 4, 2012 at 1:12 PM, Robbie Gemmell <[email protected]> wrote:
> >
> > > Hi Praveen,
> > >
> > > I can only really guess at any design decision on that front, as it
> > > would have been before my time with the project, but I'd say it's
> > > likely just that way because there's never been a strong need / use
> > > case that actually required doing anything else. For example, for most
> > > of the users I liaise with, the data they are using has at least some
> > > degree of time sensitivity to it, and having anywhere near that volume
> > > of persistent data in the broker would represent some sort of ongoing
> > > period of catastrophic failure in their application. I can only really
> > > think of one group who get into multi-million-message backlogs at all,
> > > and that usually includes having knowingly published things which no
> > > one will ever consume.
> > >
> > > For a FIFO queue you are correct: it would 'just' need to load in more
> > > as required. Things get trickier when dealing with some of the other
> > > queue types, however, such as LVQ/conflation and the recently added
> > > sorted queue types. Making the broker able to hold partial segments of
> > > the queue in memory is something we have discussed in the past for
> > > other reasons, but message volume hasn't really been a significant
> > > factor in those considerations until now. I will take note of it for
> > > any future work we do in that area, though.
> > >
> > > Robbie
> > >
> > > On 1 January 2012 17:46, Praveen M <[email protected]> wrote:
> > > > Hi,
> > > >
> > > > I was digging in the code base and trying to understand how the
> > > > broker is implemented.
> > > > I see that for each message enqueued there are certain objects kept
> > > > in memory, one for each message.
> > > >
> > > > Example: MessageTransferReference, SimpleQueueEntryImpl, etc.
> > > >
> > > > I tried computing the memory footprint of each individual message,
> > > > and it amounts to about 320 bytes/message.
> > > > I see that because of the footprint of each message, if I'm limited
> > > > to 4GB of memory, then I am limited to only about 13 million
> > > > messages in the system at one point.
> > > >
> > > > Since I'm using a persistent store, I'd have expected to go over 13
> > > > million messages and be limited by the disk store rather than by
> > > > physical memory, but I realized this isn't the case.
> > > >
> > > > I am curious what the driving points were for the design decision to
> > > > keep a reference to every message in memory. I'd have expected that
> > > > in a FIFO queue you just need a subset of messages in memory and can
> > > > pull in messages on demand, rather than maintaining a reference to
> > > > every message in memory.
> > > >
> > > > Can someone please explain the reasons for this design? Also, was it
> > > > assumed that we'd never flood the queues over 13 million messages at
> > > > one time? Was a bound decided upon?
> > > >
> > > > Thank you,
> > > > Praveen
> > > >
> > > > --
> > > > -Praveen
> > >
> > > ---------------------------------------------------------------------
> > > Apache Qpid - AMQP Messaging Implementation
> > > Project: http://qpid.apache.org
> > > Use/Interact: mailto:[email protected]
> >
> > --
> > -Praveen

--
-Praveen
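For reference, the "about 13 million" figure in Praveen's question falls straight out of the numbers he quotes: 320 bytes/entry is his own measurement and 4GB his stated heap limit (the class name here is invented for illustration):

```java
// Back-of-the-envelope check of the figures from the thread:
// a 4 GB heap divided by ~320 bytes of per-message bookkeeping
// caps the broker at roughly 13 million in-memory queue entries.
public class FootprintMath {
    public static void main(String[] args) {
        long heapBytes = 4L * 1024 * 1024 * 1024;    // 4 GB heap limit
        long bytesPerMessage = 320;                  // measured per-entry overhead
        long maxMessages = heapBytes / bytesPerMessage;
        System.out.println(maxMessages);             // prints 13421772
    }
}
```

The ceiling scales linearly with heap size, so even a much larger heap only moves the bound; it does not remove it, which is Rob's "finite resource (albeit a larger one)" point.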
