[ https://issues.apache.org/activemq/browse/AMQCPP-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_39480 ]
Gerald Kaas commented on AMQCPP-93:
-----------------------------------

I've spent the last day or so running a profiler against a test harness of ours that sends and receives, within a single process, 2000 BytesMessages of approximately 500k each, with integers, doubles, strings, etc. embedded in the body of the message. Normally this task takes 5-7 seconds on a couple of other middleware products we use, but with the ActiveMQ C++ interface it was taking 26 seconds or so.

I made two rather small changes which cut the time down to 15 seconds, a reduction of roughly 40% in run time. Both changes are in DataInputStream.cpp.

First, the majority of the unmarshalling code was calling readFully for 1 byte. For single-byte extraction, I changed these readFully calls to inputStream->read(), which returns only a single byte. There is also a bug in the code: it assumes that inputStream->read(buffer, size) will return -1 when inputStream is at its end. This is incorrect. As far as I can tell, read(buffer, size) always throws an exception at the end rather than returning -1.

Second, readString now buffers up to 256 characters before appending them to the string. Appending one character at a time is very inefficient, since each append has to check whether the string object needs to grow, reallocate, copy the characters, etc.

The readString operation is still a major hitter in my profiling. It calls inputStream->read(), a virtual function, millions of times. Most strings are tens, hundreds, or thousands of characters long, and readString makes one call per character. Because read() is virtual, it cannot be inlined, and the calls are CPU intensive when made repeatedly. I'm sure CPU usage could be dropped another 30-50% if we moved the readString operation from the DataInputStream class into each inputStream class. That way the virtual function is called only once per string unmarshalled, and more inlining can happen.
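
For illustration, the 256-character buffering in readString could look roughly like the sketch below. It is not the attached patch. The InputStream and MemoryInputStream classes and the readStringBuffered function are hypothetical stand-ins, not the real ActiveMQ C++ classes, and the sketch assumes strings are null-terminated on the wire and that read() throws at the end of the stream.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the stream interface (not the real ActiveMQ-CPP API).
class InputStream {
public:
    virtual ~InputStream() {}
    // Returns the next byte; assumed to throw when the stream is exhausted.
    virtual unsigned char read() = 0;
};

// Simple in-memory stream, used only to exercise the sketch.
class MemoryInputStream : public InputStream {
public:
    explicit MemoryInputStream(const std::string& data) : data_(data), pos_(0) {}

    virtual unsigned char read() {
        if (pos_ >= data_.size()) {
            throw std::out_of_range("end of stream");
        }
        return static_cast<unsigned char>(data_[pos_++]);
    }

private:
    std::string data_;
    std::size_t pos_;
};

// Reads a null-terminated string. Characters are collected in a fixed
// 256-byte stack buffer and appended to the result in chunks, so the
// std::string grows at most once per 256 characters instead of once per
// character.
std::string readStringBuffered(InputStream& in) {
    const std::size_t kChunk = 256;
    char buffer[kChunk];
    std::size_t used = 0;
    std::string result;

    for (;;) {
        const char c = static_cast<char>(in.read());
        if (c == '\0') {
            break;  // terminator is consumed but not stored
        }
        buffer[used++] = c;
        if (used == kChunk) {
            result.append(buffer, used);  // flush a full chunk
            used = 0;
        }
    }
    result.append(buffer, used);  // flush whatever remains
    return result;
}
```

Note that this still makes one virtual read() call per character. Only the string growth is batched, which is why moving readString into the concrete stream classes would be a separate, further gain.
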

I'll leave it up to the experts in this group to decide how they want to proceed. Everything else seems fairly efficient, at least in the test case I was dealing with.

The only other item that showed up on the radar is the ByteArrayOutputStream::write(unsigned char c) method. Again, in this case it is constantly reallocating and copying whenever the vector needs to resize. There is a setBuffer method in place, but I need to evaluate it a little further to see whether we can preallocate our own buffer based on the size we already know we need up front.

Attached are my diffs. Let me know what you think.

> Performance analysis
> --------------------
>
> Key: AMQCPP-93
> URL: https://issues.apache.org/activemq/browse/AMQCPP-93
> Project: ActiveMQ C++ Client
> Issue Type: Task
> Affects Versions: 2.0
> Reporter: Nathan Mittler
> Assignee: Nathan Mittler
> Fix For: 2.2
>
> Attachments: amqcpp-perf1.patch, amqcpp-perf1v2.patch, bench1.cpp
>
>
> Do a performance analysis on openwire vs stomp.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.