[
https://issues.apache.org/jira/browse/QPID-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16577503#comment-16577503
]
dan clark commented on QPID-8134:
---------------------------------
[...] deal with normal conditions, and have your application wait to send
messages when the sender is at capacity.
The use case for qpid is the ability to restart a receiver (or sender)
without losing messages during the short downtime associated with the
service. Say you have two daemons running at a high transaction volume of
50k messages/second, and one is restarted but takes 5 sec for the system
to do all the work of bringing the failed component back into action;
ideally the sending buffer would then need 5*50k messages of elasticity,
so a large send-queue default would be a sensible policy. Designing a
reliable buffering mechanism inside the sender, on top of a reliable
messaging system, is not ideal when simply tuning the sender capacity, as
was done, is a simple configuration change.
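For concreteness, a sketch of that tuning with the qpid::messaging C++
API (an illustration only: the broker URL and address are placeholders,
and 250000 is the 5*50k elasticity figure above):

    #include <qpid/messaging/Connection.h>
    #include <qpid/messaging/Session.h>
    #include <qpid/messaging/Sender.h>
    #include <qpid/messaging/Message.h>

    using namespace qpid::messaging;

    int main() {
        Connection connection("localhost:5672"); // placeholder broker URL
        connection.open();
        Session session = connection.createSession();
        Sender sender = session.createSender("amq.topic");
        // Size the send queue for a 5 sec restart window at 50k msg/sec,
        // instead of buffering inside the application.
        sender.setCapacity(250000);
        sender.send(Message("example payload"));
        session.sync();
        connection.close();
        return 0;
    }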
On the other hand, adding a caching policy for memory on the sender that
reserves qpid library memory for each message, based on the size of the
topic string and the size of the buffer string used by the application,
plus all the message-structure accounting, creates a policy in which the
qpid library consumes valuable node memory in an arbitrary pattern and
does not let the memory allocator (malloc) free resources for other parts
of the system. In this case, to avoid delays, no swapping is allowed on
the node, so once the library consumes the memory it is forever lost to
all other components.
Can we set up the qpid library to NOT do any caching of message memory
based on the number of messages, the buffer sizes used by the application,
and the size of topics? This would allow the library to be used for a
wider set of reliable messaging scenarios and would decouple message
memory caching policies from capacity handling when components restart.
Using an 'unreliable' message option does not provide the characteristics
desired from the messaging library. A good example of how this is done is
the implementation of the OpenAMQ C library interface.
On Tue, Aug 7, 2018 at 3:22 PM, Alan Conway (JIRA) <[email protected]> wrote:
> qpid::client::Message::send multiple memory leaks
> -------------------------------------------------
>
> Key: QPID-8134
> URL: https://issues.apache.org/jira/browse/QPID-8134
> Project: Qpid
> Issue Type: Bug
> Components: C++ Client
> Affects Versions: qpid-cpp-1.37.0, qpid-cpp-1.38.0
> Environment: CentOS Linux release 7.4.1708 (Core)
> Linux localhost.novalocal 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57
> UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> qpid-qmf-1.37.0-1.el7.x86_64
> qpid-dispatch-debuginfo-1.0.0-1.el7.x86_64
> python-qpid-1.37.0-1.el7.noarch
> qpid-proton-c-0.18.1-1.el7.x86_64
> python-qpid-qmf-1.37.0-1.el7.x86_64
> qpid-proton-debuginfo-0.18.1-1.el7.x86_64
> qpid-cpp-debuginfo-1.37.0-1.el7.x86_64
> qpid-cpp-client-devel-1.37.0-1.el7.x86_64
> qpid-cpp-server-1.37.0-1.el7.x86_64
> qpid-cpp-client-1.37.0-1.el7.x86_64
>
> Reporter: dan clark
> Assignee: Alan Conway
> Priority: Blocker
> Labels: leak, maven
> Fix For: qpid-cpp-1.39.0
>
> Attachments: drain.cpp, godrain.sh, gospout.sh, qpid-8134.tgz,
> qpid-stat.out, spout.cpp, spout.log
>
> Original Estimate: 40h
> Remaining Estimate: 40h
>
> There may be multiple leaks of the outgoing message structure and associated
> fields when using the qpid::client::amqp0_10::SenderImpl::send function to
> publish messages under certain setups. I will concede that there may be
> options beyond my ken to ameliorate the leak of message structures,
> especially since there is an indication that under prolonged runs (a
> daemonized version of an application like spout) the statistics for qpidd
> show increased acquires with zero releases.
> The basic notion is illustrated with the test applications spout (and drain).
> Consider a long-running daemon that reduces the overhead of open/send/close
> by keeping the message connection open for long periods of time. The logic
> would be: start the application and open the connection, then send data in a
> loop (never reaching a close). The spout application thus demonstrates the
> leak under valgrind by sending the data and then calling exit(0).
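> A minimal sketch of that pattern (an illustration assuming the
> qpid::messaging API used by spout; the broker URL, address, and message
> count are placeholders), exiting before connection.close() so valgrind
> reports the still-reachable send-side structures:
>
>     #include <qpid/messaging/Connection.h>
>     #include <qpid/messaging/Session.h>
>     #include <qpid/messaging/Sender.h>
>     #include <qpid/messaging/Message.h>
>     #include <cstdlib>
>
>     using namespace qpid::messaging;
>
>     int main() {
>         Connection connection("localhost:5672"); // placeholder broker URL
>         connection.open();
>         Session session = connection.createSession();
>         Sender sender = session.createSender("amq.topic");
>         for (int i = 0; i < 10; ++i) {      // stands in for the daemon loop
>             Message message("payload");
>             message.setSubject("example");  // subject/payload strings are
>                                             // where the leaks were seen
>             sender.send(message);
>         }
>         session.sync();
>         std::exit(0); // exit before connection.close(), as in the repro
>     }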
> Note also the lack of 'releases' associated with the 'acquires' in the stats
> output.
> Capturing the leaks with the test applications spout/drain required adding
> an 'exit()' prior to the close, because during normal operation of a daemon
> the connection remains open for a sustained period of time. The leaked
> structures within the C++ client library thus show up as structures still
> tracked by the library and cleaned up on 'connection.close()', but they
> should be cleaned up when the send/receive ack completes or when the
> message's TTL expires, whichever comes first. I have witnessed the leaked
> structures grow into the millions of messages over more than 24 hours, with
> a short (300 sec) TTL on the messages, in the scenarios attached using
> spout/drain as the test vehicle.
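> (For reference, the 300 sec TTL in that scenario is set per message; a
> small sketch, again assuming the qpid::messaging API:)
>
>     #include <string>
>     #include <qpid/messaging/Message.h>
>     #include <qpid/messaging/Duration.h>
>
>     // Build a message with the 300 sec TTL used in the attached scenario;
>     // sender-side structures should be reclaimed at the latest when the
>     // ack completes or this TTL expires.
>     qpid::messaging::Message makeExpiringMessage(const std::string& payload) {
>         qpid::messaging::Message message(payload);
>         message.setTtl(qpid::messaging::Duration(300 * 1000)); // milliseconds
>         return message;
>     }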
> The attached spout.log uses a short 10-message test and contains 5 sets of
> different leaked structures (found via the 'bytes in 10 blocks are still
> reachable' lines) that are in line with the much more sustained leaks seen
> when running the application for multiple days with millions of messages.
> The leaks seem to be associated with structures allocating std::strings to
> save the "subject" and the "payload" for string-based messages using send
> for amq.topic output.
> Suggested workarounds are welcome, based on application-level changes to
> spout/drain (if they are missing key components) or changes to the
> address/setup of the queues for amq.topic messages (see the gospout.sh and
> godrain.sh test drivers for the specific address structures being used).
> For example, the following is one of the 5 different categories of leaked
> data from 'spout.log', from a valgrind analysis taken after the send and
> session.sync() but prior to connection.close():
>
> ==3388== 3,680 bytes in 10 blocks are still reachable in loss record 233 of 234
> ==3388==    at 0x4C2A203: operator new(unsigned long) (vg_replace_malloc.c:334)
> ==3388==    by 0x4EB046C: qpid::client::Message::Message(std::string const&, std::string const&) (Message.cpp:31)
> ==3388==    by 0x51742C1: qpid::client::amqp0_10::OutgoingMessage::OutgoingMessage() (OutgoingMessage.cpp:167)
> ==3388==    by 0x5186200: qpid::client::amqp0_10::SenderImpl::sendImpl(qpid::messaging::Message const&) (SenderImpl.cpp:140)
> ==3388==    by 0x5186485: operator() (SenderImpl.h:114)
> ==3388==    by 0x5186485: execute<qpid::client::amqp0_10::SenderImpl::Send> (SessionImpl.h:102)
> ==3388==    by 0x5186485: qpid::client::amqp0_10::SenderImpl::send(qpid::messaging::Message const&, bool) (SenderImpl.cpp:49)
> ==3388==    by 0x40438D: main (spout.cpp:185)
>
>