I have gotten callgrind call-graph pictures of my proton/engine sender and receiver when the test is running fast and when it slows down.
The difference is in the sender. When running fast, it spends most of its time, about 71%, in the subtree of pn_connector_process(). When it slows down, it instead spends 47% in pn_delivery_writeable() and only 17% in pn_connector_process().

Since it is still not obvious to me what has happened, I thought I would share with you all. Please see the pictures at:

http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_fast.svg
http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_slow.svg

To recap: I can trigger this condition by getting the box busy, for example by doing a build, while my proton/engine test is running. Even though I stop the build and all 6 other processors on the box go back to being idle, the test never recovers. The receiver goes down to 50% CPU or worse, but these pictures show that the behavior change is in the sender.
