[ 
https://issues.apache.org/activemq/browse/AMQCPP-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_39480
 ] 

Gerald Kaas commented on AMQCPP-93:
-----------------------------------

I've been spending the last day or so using a profiler against a test harness 
we have which sends and receives within the same process 2000 BytesMessages of 
approximately 500k in size with integers, doubles, strings, etc embedded within 
the body of the message. Normally this task takes 5-7 seconds on a couple of 
other middleware products we use, but with the ActiveMQ C++ interface it was 
taking 26 seconds or so. I made two rather small changes which cut the time 
down to 15 seconds, which is roughly a 40% reduction in run time. 
Both of these changes were made within the DataInputStream.cpp 
class. First, the majority of the unmarshalling code was calling readFully for 
1 byte. For single-byte extraction, I changed these readFully operations to 
inputStream->read(), which returns only a single byte. The existing code 
incorrectly assumes that the inputStream->read(buffer, size) operation will 
return -1 when inputStream is at its end. As far as I can tell, 
read(buffer, size) always throws an exception at the end rather than 
returning -1. Second, I buffered up 256 characters before appending to the 
string in readString. Appending one character at a time is very inefficient, 
since the string object must constantly check whether it needs to grow, 
reallocate, copy the characters, etc.
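
To make the second change concrete, here is a minimal sketch of the chunked 
append. It is not the attached patch. ByteSource, MemorySource and 
readStringChunked are hypothetical names, and the sketch assumes 
null-terminated strings and a read() that returns -1 at end of data, which may 
not match the real stream classes.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the input stream interface (not the ActiveMQ-CPP API).
class ByteSource {
public:
    virtual ~ByteSource() {}
    // Returns the next byte (0-255), or -1 when no more data is available (assumed).
    virtual int read() = 0;
};

// Simple in-memory source used only to exercise the sketch.
class MemorySource : public ByteSource {
public:
    explicit MemorySource(const std::string& data) : data_(data), pos_(0) {}
    virtual int read() {
        if (pos_ >= data_.size()) return -1;
        return static_cast<unsigned char>(data_[pos_++]);
    }
private:
    std::string data_;
    std::size_t pos_;
};

// Reads a null-terminated string, appending in chunks of up to 256
// characters instead of one character at a time.
std::string readStringChunked(ByteSource& in) {
    const std::size_t kChunk = 256;
    char chunk[kChunk];
    std::size_t used = 0;
    std::string result;
    for (;;) {
        int c = in.read();
        if (c <= 0) break;             // 0 is the terminator, -1 is end of data
        chunk[used++] = static_cast<char>(c);
        if (used == kChunk) {          // chunk full: flush it with one append
            result.append(chunk, used);
            used = 0;
        }
    }
    result.append(chunk, used);        // flush the remaining partial chunk
    return result;
}
```

The string now grows at most once per 256 characters rather than potentially 
on every character; the per-character virtual read() call is still there.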

The readString operation is still a major hitter in my profiling. It calls 
inputStream->read(), a virtual function, millions of times. Most strings 
are tens, hundreds, or thousands of characters in size. Because read() is 
virtual, it cannot be inlined, and the calls are CPU intensive when repeated. 
I'm sure 
CPU usage could be dropped another 30-50% if we could move the readString 
operation from the DataInputStream class to each inputStream class. That way 
the virtual function is called only once for each string unmarshalling and more 
inlining can happen. I'll leave it up to the experts in this group on how they 
want to proceed.
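
As a rough illustration of that idea, the sketch below gives the stream 
interface its own virtual readString, with a generic fallback and a 
byte-array override that scans its buffer directly. All class names are 
hypothetical and the null-terminated string format is an assumption; this is 
not how the current classes are laid out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical interface; names are illustrative, not the ActiveMQ-CPP API.
class InputStreamSketch {
public:
    virtual ~InputStreamSketch() {}
    virtual int read() = 0;  // next byte, or -1 at end of data (assumed)

    // Generic fallback: one virtual read() call per character.
    virtual std::string readString() {
        std::string result;
        for (int c = read(); c > 0; c = read()) {
            result.push_back(static_cast<char>(c));
        }
        return result;
    }
};

class ByteArrayInputSketch : public InputStreamSketch {
public:
    explicit ByteArrayInputSketch(const std::vector<unsigned char>& data)
        : data_(data), pos_(0) {}

    virtual int read() {
        return pos_ < data_.size() ? data_[pos_++] : -1;
    }

    // Override: one virtual call per string. The loop works on the
    // concrete buffer, so the compiler is free to inline it.
    virtual std::string readString() {
        std::size_t end = pos_;
        while (end < data_.size() && data_[end] != 0) ++end;
        std::string result(data_.begin() + pos_, data_.begin() + end);
        pos_ = (end < data_.size()) ? end + 1 : end;  // skip the terminator
        return result;
    }

private:
    std::vector<unsigned char> data_;
    std::size_t pos_;
};
```

A caller holding only an InputStreamSketch reference gets the fast path 
automatically, while streams without a specialised override keep working 
through the fallback.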

Everything else seems fairly efficient, at least in the test case I was dealing 
with. The only other item that showed up on the radar is the 
ByteArrayOutputStream::write(unsigned char c) method. Again, in this case it is 
constantly reallocating, copying, etc. whenever the vector needs to resize. 
There is a setBuffer method in place, but I need to evaluate it a little 
further to see if we can preallocate our own buffer, since we already know up 
front what size we need.
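
For reference, the preallocation idea can be sketched with a plain 
std::vector and reserve(). ByteSink is a hypothetical class for illustration 
only; it says nothing about how setBuffer actually behaves.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical output buffer; illustrative only, not ByteArrayOutputStream.
class ByteSink {
public:
    // Reserve capacity once, when the final size is known in advance.
    explicit ByteSink(std::size_t expectedSize) { bytes_.reserve(expectedSize); }

    // Appends one byte; no reallocation occurs while size() stays within
    // the reserved capacity.
    void write(unsigned char c) { bytes_.push_back(c); }

    std::size_t size() const { return bytes_.size(); }
    std::size_t capacity() const { return bytes_.capacity(); }
    const unsigned char* data() const { return bytes_.empty() ? 0 : &bytes_[0]; }

private:
    std::vector<unsigned char> bytes_;
};
```

With the capacity reserved up front, writing a 500k body one byte at a time 
never triggers a reallocation or copy of the underlying vector.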

Attached are my diffs. Let me know what you think.

> Performance analysis
> --------------------
>
>                 Key: AMQCPP-93
>                 URL: https://issues.apache.org/activemq/browse/AMQCPP-93
>             Project: ActiveMQ C++ Client
>          Issue Type: Task
>    Affects Versions: 2.0
>            Reporter: Nathan Mittler
>            Assignee: Nathan Mittler
>             Fix For: 2.2
>
>         Attachments: amqcpp-perf1.patch, amqcpp-perf1v2.patch, bench1.cpp
>
>
> Do a performance analysis on openwire vs stomp.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
