Chuck Rolke created QPID-4715:
---------------------------------

             Summary: C++ Broker Replicating Event Listener needs a limit on 
messages enqueued for replication
                 Key: QPID-4715
                 URL: https://issues.apache.org/jira/browse/QPID-4715
             Project: Qpid
          Issue Type: Improvement
          Components: C++ Broker
    Affects Versions: 0.22
            Reporter: Chuck Rolke
            Assignee: Chuck Rolke


A system has replication turned on (replicating_listener.so is loaded) and 
events are being queued for replication. When the peer system stops receiving 
messages, the replication queue grows without bound and the broker eventually 
runs out of memory.

The proposed feature would have two parts:

# An optional upper limit may be placed on the replication queue size via the 
CLI. When that limit is exceeded, replication is stopped and the replication 
queue is purged to reclaim its memory.
# A management method allows an administrator to stop replication at any time, 
disabling the replication queue and reclaiming its memory.

Impact assessment:

||Design consideration||Proposed feature||
|Threading model|queues provide adequate locking|
|Memory management|n/a|
|Automated testing approach|easy to test|
|Impact on public API|Adds management method and Replication CLI switch|
|- Interoperability with implementations in other languages|n/a|
|- Backwards compatibility|n/a|
|Performance implications|Insignificant|
|Security implications|New method already protected by ACL|
|Platform support|n/a|
|Logging|Logs to be added|
|Monitoring|Event to be added|
|Management|New broker method|

This feature is a response to a panic situation. The theory is that it is 
better to abandon replication than to drive the broker out of memory and 
crash.

An administrator may want to trigger this feature based on several conditions:

# The qpidd process is using too much memory.
# The host system is running low on virtual memory.
# The queue in question is using too much memory inside qpidd.

The proposal is to support part #1 of the feature by specifying a _maximum 
message count_ for the replication queue. Queues already track their message 
count, so it is a quick statistic to compare for limit checking.
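
As an illustration only, the two parts of the proposal might be sketched as below. This is not the broker's code: the class BoundedReplicationQueue, its method names, and the convention that a limit of 0 means "no limit" are all assumptions made for this example. It shows a message-count check on enqueue (part #1) and an explicit stop call standing in for the management method (part #2).

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the replication queue; not a qpidd class.
class BoundedReplicationQueue {
public:
    // Assumption: maxMessages == 0 means no limit is configured.
    explicit BoundedReplicationQueue(std::size_t maxMessages = 0)
        : maxMessages_(maxMessages), active_(true) {}

    // Part #1: queue an event for replication. If accepting it would push
    // the message count past the limit, stop replication and purge instead.
    // Returns false when the event was not queued.
    bool enqueue(std::string event) {
        std::lock_guard<std::mutex> guard(lock_);
        if (!active_) return false;
        if (maxMessages_ != 0 && events_.size() >= maxMessages_) {
            stopLocked();
            return false;
        }
        events_.push_back(std::move(event));
        return true;
    }

    // Part #2: what a management method could call to stop replication
    // at any time, regardless of the current depth.
    void stopReplication() {
        std::lock_guard<std::mutex> guard(lock_);
        stopLocked();
    }

    bool active() const {
        std::lock_guard<std::mutex> guard(lock_);
        return active_;
    }

    std::size_t depth() const {
        std::lock_guard<std::mutex> guard(lock_);
        return events_.size();
    }

private:
    // Caller must hold lock_. Swapping with an empty deque releases the
    // storage, whereas clear() alone may keep it allocated.
    void stopLocked() {
        active_ = false;
        std::deque<std::string>().swap(events_);
    }

    mutable std::mutex lock_;
    std::deque<std::string> events_;
    std::size_t maxMessages_;
    bool active_;
};
```

In this sketch the limit check costs one size comparison per enqueue, and once replication has stopped every later enqueue is rejected, so the queue cannot start growing again.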

Queue sizes in bytes are not directly known. The customer is free to monitor 
system and process virtual memory and still trigger the cleanup by using 
part #2 of the feature.


