Re: c++ broker startup problem with large journal containing lots of data

Kim van der Riet Fri, 12 Mar 2010 04:52:05 -0800

On Thu, 2010-03-11 at 13:13 -0800, Charles Woerner wrote:
> Thanks Kim,
> 
> 1) I'm using qpid from RedHat MRG 0.5.752581-34 for i386 and the store  
> module is RHM 0.5.3206-27, also for i386.


Thanks, this is the most recent set.

> 
> 2) Yes, my messages are persistent.  I am aware that transient  
> messages are discarded on recovery.  Your comment that durable  
> messages "have their message content discarded when the policy limit  
> is reached" is surprising.  I have an 80 GB store and a 1 GB queue  
> with flow-to-disk.  My expectation was that I would be able to  
> continue to enqueue durable messages past the 1GB mark (up to 80% of  
> 80 GB) and only the first (or latest, not sure) 1GB of queue data  
> would be held in memory.  I expected the dequeues would sort of  
> "backfill" the 1GB memory buffer until the "flowed-to-disk" messages  
> were consumed from the store.  I recognize that this is sort of  
> dual-purposing the backing store to do both journaling and spillover,  
> but that doesn't seem too far fetched.  I'm beginning to think this is  
> incorrect.  Is there some kind of identity relating available memory  
> to store geometry to queue size (ie. is the following a good rule of  
> thumb "available memory" = "store size" = "sum(max-queue-size of all  
> queues)")?

Perhaps there is a terminology gap... By "discarded", I mean the process
by which the broker discards the message content from memory while
retaining the message metadata in the queue - this is the process known
as "flow-to-disk". In order to consume a message which has had its
content discarded, the store must be called upon to read the content
from disk to restore the content. So you are correct: when the policy
threshold is reached, messages may continue to be enqueued, but with
their message content saved to disk and discarded (from memory).
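
To make the terminology concrete, here is a rough sketch of the idea in
Python. It is a toy model only, not the broker code: the names ToyQueue,
max_bytes_in_memory and the dict standing in for the store are all
invented for illustration, and it only writes to the "store" when the
policy is exceeded.

```python
from collections import deque


class Message:
    """Metadata always stays on the queue; content may be released."""

    def __init__(self, msg_id, content):
        self.msg_id = msg_id
        self.size = len(content)
        self.content = content  # set to None once flowed to disk


class ToyQueue:
    """Hypothetical queue with a per-queue size policy (illustration only)."""

    def __init__(self, max_bytes_in_memory):
        self.max_bytes = max_bytes_in_memory
        self.bytes_in_memory = 0
        self.messages = deque()
        self.store = {}  # stand-in for the on-disk journal

    def enqueue(self, msg_id, content):
        msg = Message(msg_id, content)
        if self.bytes_in_memory + msg.size > self.max_bytes:
            # Policy exceeded: save content to the store, drop it from memory.
            self.store[msg_id] = content
            msg.content = None
        else:
            self.bytes_in_memory += msg.size
        self.messages.append(msg)  # metadata is kept in either case

    def dequeue(self):
        msg = self.messages.popleft()
        if msg.content is None:
            # Content was discarded from memory: read it back from the store.
            return self.store.pop(msg.msg_id)
        self.bytes_in_memory -= msg.size
        return msg.content
```

Enqueues keep succeeding past the memory limit; only where the content
lives changes, and consuming a flowed message costs a read from the
store.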

You are also correct concerning the store space limitations - being
essentially a circular disk store, the size of the journal files imposes
a fundamental limit on the total size of the content that may be stored.
When an approx. 80% full condition is reached, the store will deny
further storage for enqueue operations; dequeues may continue as normal.
The dequeues (if they consume the messages in approximately the same
order as they were enqueued) will free up space in the journals,
allowing enqueuing to continue as normal. The store does not distinguish
between persistent and transient messages placed there (except in
recovery in which the transient messages are discarded); as messages are
consumed, the disk space is freed up in the journal.
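
The behaviour can be pictured with a small Python sketch. This is a
simplified model under my own assumptions, not the store's actual
implementation: ToyJournal and its parameters are invented names, sizes
are abstract units, and space is reclaimed only from the oldest end of
the journal to mimic the circular layout.

```python
from collections import OrderedDict


class ToyJournal:
    """Hypothetical circular journal with an enqueue threshold."""

    def __init__(self, capacity, threshold_percent=80):
        self.capacity = capacity
        # Enqueues are refused once usage would pass this limit.
        self.limit = capacity * threshold_percent // 100
        self.records = OrderedDict()  # record id -> [size, dequeued?]
        self.used = 0

    def enqueue(self, rid, size):
        if self.used + size > self.limit:
            return False  # store denies further enqueues
        self.records[rid] = [size, False]
        self.used += size
        return True

    def dequeue(self, rid):
        # Dequeues are always accepted.
        self.records[rid][1] = True
        # Space is only reclaimed from the oldest end of the journal.
        while self.records:
            oldest = next(iter(self.records))
            size, dequeued = self.records[oldest]
            if not dequeued:
                break
            del self.records[oldest]
            self.used -= size
```

In this model, dequeuing a recent record while an older one is still
enqueued frees nothing, which is why consuming in roughly enqueue order
matters.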

There is a distinct performance hole that is entered when flow-to-disk
is triggered. This process requires that the store perform both read and
write operations simultaneously, which causes the disk head to jump
around repeatedly within the journal files. Normal store operation is
sequential write-only.

I should point out that the concept of "flow-to-disk" as implemented is
not well thought through and has several noticeable limitations:

a) All policies are per-queue only; there is no means to cleanly set
an overall limit. There is no limit on creating additional queues when
memory is nearly fully used;

b) There are several complicated inter-queue interactions. If a message
is routed to several queues (bearing in mind that there is just one copy
of the message content in memory, and that the store is created on a
per-queue basis), and a size policy is violated on just one of those
queues, then it is not possible to discard the message content in this
case as the other queues will not have knowledge of the missing message
content in their stores.
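
The interaction in (b) can be sketched roughly as follows. Again this is
an illustration under assumptions of mine, not the broker's logic:
SharedMessage, route, mark_stored and release are made-up names, and the
rule that every queue's store must hold the content before release is
just my reading of the constraint described above.

```python
class SharedMessage:
    """One in-memory copy of content, referenced by several queues."""

    def __init__(self, content):
        self.content = content
        self.routed_to = set()  # names of queues holding this message
        self.stored_by = set()  # queues whose store has the content

    def route(self, queue_name):
        self.routed_to.add(queue_name)

    def mark_stored(self, queue_name):
        self.stored_by.add(queue_name)

    def release(self):
        """Discard content from memory only if every queue can reload it."""
        if not self.routed_to <= self.stored_by:
            return False  # some queue's store lacks the content
        self.content = None
        return True
```

If only one of the queues has exceeded its policy and saved the content,
release() returns False and the single shared copy stays in memory.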

This is one area of the implementation that needs a re-think. What we
currently have is somewhat useful if you can live with its limitations.
Gordon may care to comment further on this.



---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:[email protected]
