Just to jump in on this thread.

Re:

"but my opinion is that if you have 'millions of messages' then a
Message Broker is the wrong solution to your problem - you want a
Database."
I can't say I agree with Rob's assertion here!!

Well, maybe that's a reasonable comment if the *intention* is to have millions of messages hanging around, but what if it's due to an unfortunate circumstance?

One classic scenario is connecting over a WAN: when the WAN goes down, messages build up. It's not what I want, but it's what will happen.

In my scenario I'm actually federating between C++ brokers, using queue routes and boxes with loads of memory so I can have big queues. I'm also using circular queues, because I don't want things to die if I eventually use up all of my capacity.
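To illustrate the circular-queue behaviour described above (the broker's actual implementation will differ; this is just a minimal in-memory sketch of the semantics): when capacity is reached, the oldest message is discarded so producers never block or fail.

```java
import java.util.ArrayDeque;

// Minimal sketch of ring/circular queue semantics: once capacity is
// reached, the oldest message is evicted rather than rejecting the new one.
class RingQueue<T> {
    private final ArrayDeque<T> entries = new ArrayDeque<>();
    private final int capacity;

    RingQueue(int capacity) { this.capacity = capacity; }

    void enqueue(T message) {
        if (entries.size() == capacity) {
            entries.pollFirst(); // drop the oldest message silently
        }
        entries.addLast(message);
    }

    T dequeue() { return entries.pollFirst(); }
    int size()  { return entries.size(); }
}
```

The trade-off Frase mentions falls straight out of this: producers stay alive when capacity runs out, at the cost of (bounded) loss of the oldest data.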

In the C++ broker, flow-to-disk works OK, sort of, but you can't have it bigger than your available memory and also have things circular (TBH it's all a little untidy, as Gordon Sim will I'm sure tell you).

For my core use case performance is more important than (modest) message loss, so I'm not using persistence (my view is that the C++ broker is more reliable than the disk :-), and if I see any issues I'm likely to federate to multiple load-balanced brokers on different power supplies - or even in different locations). To be fair, I'm not overly keen on the eventual data loss I'll get if the WAN dies and I hit the limit of my circular queue, but my cunning plan is to make use of the QMF queueThresholdExceeded event (I don't know if that exists for the Java broker).

I've already written a Java QMF application that intercepts queueThresholdExceeded and uses it to trigger a QMF purge method on a queue, so the QMF client basically acts as a "fuse" preventing slow consumers from taking out message producers. It should be pretty simple to extend this idea so that, rather than purging the queue, I redirect the messages to a queue that has a persisting consumer; in essence I'd only trigger a flow to disk when I have a slow consumer (or, in my case, a dead WAN).
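The "fuse" idea can be sketched roughly as follows. Note this is only an illustration of the redirect-on-threshold logic: the real application receives queueThresholdExceeded via the QMF console API and invokes the broker's purge method, whereas the event wiring and queue types here are simulated.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of the threshold "fuse": instead of purging
// (discarding) the backlog, drain it to an overflow queue whose consumer
// persists messages to disk. Not QMF code - the handler would be invoked
// by a QMF event listener in the real application.
class ThresholdFuse {
    static void onThresholdExceeded(Deque<String> fastQueue,
                                    Deque<String> overflowQueue) {
        // Drain in FIFO order so the persisting consumer sees messages
        // in their original sequence.
        while (!fastQueue.isEmpty()) {
            overflowQueue.addLast(fastQueue.pollFirst());
        }
    }
}
```

The effect is that flow-to-disk cost is only paid when a consumer is actually slow (or the WAN is down), not on the hot path.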

Frase

On 05/01/12 17:51, Praveen M wrote:
That was really useful. Thanks for writing, Rob.

On Wed, Jan 4, 2012 at 5:08 PM, Rob Godfrey<[email protected]>  wrote:

Robbie beat me to replying, and said mostly what I was going to say... but
anyway ...

This design decision is before even my time on the project, but my opinion
is that if you have "millions of messages" then a Message Broker is the
wrong solution to your problem - you want a Database...  That being said I
think there is a reasonable argument to be made that a broker should be
able to run in a low memory environment in which case you may want to be
able to swap out sections of the list structure.

Fundamentally this would be quite a major change.  The internal queue
design is predicated on the list structure being in memory and being able
to apply lockless atomic operations on it.  From a performance point of
view it would also be potentially very tricky. There are a number of
processes within the broker which periodically scan the entire queue
(looking for expired messages and such), as well as the fact that (as
Robbie pointed out) many use cases are not strict FIFO (priority queues,
LVQs, selectors, etc.). And ultimately you are still going to be limited
by a finite resource (albeit a larger one). That being said, we should
definitely look at trying to reduce the per-message overhead.
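The two constraints Rob names - lockless atomic operations on an in-memory list, and periodic whole-queue scans - can be sketched together. The class names are illustrative, not the broker's actual internals:

```java
import java.util.concurrent.atomic.AtomicReference;

// Sketch of a lockless queue structure of the kind Rob describes: entries
// form a singly-linked list and producers append with compare-and-set
// rather than locks. Illustrative only - not Qpid broker code.
class QueueEntry {
    final Object message;
    final AtomicReference<QueueEntry> next = new AtomicReference<>(null);
    QueueEntry(Object message) { this.message = message; }
}

class LocklessQueue {
    private final QueueEntry head = new QueueEntry(null); // dummy head node
    private volatile QueueEntry tail = head;

    void enqueue(Object message) {
        QueueEntry node = new QueueEntry(message);
        QueueEntry t = tail;
        while (true) {
            QueueEntry next = t.next.get();
            if (next != null) {
                t = next;                  // tail was stale; walk forward
            } else if (t.next.compareAndSet(null, node)) {
                tail = node;               // best-effort tail update
                return;
            }
        }
    }

    // A periodic whole-queue scan of the kind the broker performs (e.g.
    // looking for expired messages) - this is what paging segments of the
    // list out to disk would make expensive.
    int scan() {
        int count = 0;
        for (QueueEntry e = head.next.get(); e != null; e = e.next.get()) {
            count++;
        }
        return count;
    }
}
```

Swapping part of that linked structure out to disk would mean every CAS append and every scan could fault in pages, which is why Rob calls the change major.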

At some point soon I hope to be looking at implementing a flow-to-disk
policy for messages within queues to protect the broker from out of memory
situations with transient messages (as well as improving on the rather
drastic SoftReference based method currently employed).  The main issue
though is that the reason you need to flow to disk is that your
producers are producing faster than your consumers can consume... and
pushing messages (or worse, queue structure too) to disk is only going
to slow your consumers down even more - likely making the problem worse.

Cheers,
Rob

On 4 January 2012 22:09, Praveen M<[email protected]>  wrote:

Thanks for the explanation Robbie.

On Wed, Jan 4, 2012 at 1:12 PM, Robbie Gemmell <[email protected]> wrote:
Hi Praveen,

I can only really guess at any design decision on that front, as it
would have been before my time with the project, but I'd say it's
likely just that way because there's never been a strong need / use
case that actually required doing anything else. For example, with
most of the users I liaise with, the data they are using has at least
some degree of time sensitivity to it, and having anywhere near that
volume of persistent data in the broker would represent some sort of
ongoing period of catastrophic failure in their application. I can
only really think of one group who make it into multi-million message
backlogs at all, and that usually includes having knowingly published
things which no one will ever consume.

For a FIFO queue you are correct, it would 'just' need to load in more
as required. Things get trickier when dealing with some of the other
queue types, however, such as LVQ/conflation and the recently added
sorted queue types. Making the broker able to hold partial segments of
the queue in memory is something we have discussed doing in the past
for other reasons, but message volume hasn't really been a significant
factor in those considerations until now. I will take note of it for
any future work we do in that area though.
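To see why LVQ/conflation makes paging harder than FIFO, consider a minimal sketch of last-value semantics (illustrative only, not the broker's implementation): a new message can replace an arbitrary earlier entry, so the broker must be able to find that entry wherever it sits - including any segment that had been paged out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of last-value-queue (conflation) semantics: only the most recent
// message per key is retained, so enqueue may overwrite an entry anywhere
// in the queue rather than simply appending at the tail.
class LastValueQueue {
    private final Map<String, String> latestByKey = new LinkedHashMap<>();

    void enqueue(String key, String value) {
        latestByKey.put(key, value); // replaces any earlier value for key
    }

    String take(String key) { return latestByKey.remove(key); }
    int depth()             { return latestByKey.size(); }
}
```

With strict FIFO, only the head segment needs to be resident; with conflation, any enqueue can touch any part of the queue, which is what Robbie means by "trickier".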

Robbie

On 1 January 2012 17:46, Praveen M<[email protected]>  wrote:
Hi,

I was digging in the code base and was trying to understand how the broker is implemented. I see that for each message enqueued there are certain objects kept in memory, one per message.

Example: MessageTransferReference, SimpleQueueEntryImpl, etc.

I tried computing the memory footprint of each individual message and it amounts to about 320 bytes/message. Because of this per-message footprint, if I'm limited to 4GB of memory, then I am limited to only about 13 million messages in the system at one point.
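Praveen's back-of-envelope calculation checks out: with ~320 bytes of broker-side bookkeeping per message, a 4GB heap caps the queue depth at roughly 13 million messages, regardless of how large the disk store is.

```java
// Per-message heap overhead bounds queue depth independently of the
// persistent store: maxMessages = heapBytes / bytesPerMessage.
class FootprintMath {
    static long maxMessages(long heapBytes, long bytesPerMessage) {
        return heapBytes / bytesPerMessage;
    }
}
```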

Since I'm using a persistent store, I'd have expected to go over 13 million messages and be limited by the disk store rather than physical memory, but I realized this isn't the case.

I am curious as to what the driving points were for this design decision to keep a reference to every message in memory. I'd have expected that in a FIFO queue you just need a subset of messages in memory and can pull in messages on demand, rather than maintaining a reference to every message in memory.

Can someone please explain the reasons for this design? Also, was it assumed that we'd never flood the queues with over 13 million messages at one time? Was there a bound decided upon?

Thank you,
Praveen



--
-Praveen
---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]



--
-Praveen





