On 28. 10. 14 20:18, Michael Goulish wrote:
> I have gotten callgrind call-graph pictures of my proton/engine sender
> and receiver when the test is running fast and when it slows down.
>
> The difference is in the sender -- when running fast, it is spending
> most of its time in the subtree of pn_connector_process() .. like 71%.
>
> When it slows down it is instead spending 47% in pn_delivery_writeable(),
> and only 17% in pn_connector_process().
>
> Since it is still not instantly obvious to me what has happened,
> I thought I would share with you-all.
>
> Please see cool pictures at:
>
> http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_fast.svg
>
> http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_slow.svg
>
> To recap -- I can trigger this condition by getting the box busy while
> my proton/engine test is running. I.e. by doing a build.
> Even though I stop the build, and all 6 other processors on
> the box go back to being idle -- the test never recovers.
>
> The receiver goes down to 50% CPU or worse -- but these pictures
> show that the behavior change is in the sender.
Look at the call counts for pn_connector_process() and pn_delivery_writable():

    fast: ratio 1 : 5
    slow: ratio 1 : 244.5 (!)

So in the slow case there are about 244.5 calls to pn_delivery_writable() for every call to pn_connector_process(), compared with 5 in the fast case. The iteration over the connection work list gets really expensive, which means the connection thinks it has to work on things other than what psend.c wants to work on.

I still think that the call to pn_delivery() in psend.c is in a really unfortunate spot.

Btw, why do you iterate over the connection work list at all? You could just remember the delivery when calling pn_delivery().

Bozzo
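To illustrate the difference between scanning the work list and remembering the delivery, here is a toy model in plain C. It does not use the Proton API: delivery_t, delivery_writable(), scan_work_list() and the other names are hypothetical stand-ins, and the assumption that the delivery we want sits at the end of the work list is made up for the example. It only shows how the number of writability checks grows with the length of the list in one case and stays at one in the other.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WORK 1024

/* Hypothetical stand-in for a delivery; NOT Proton's pn_delivery_t. */
typedef struct delivery {
    int writable;               /* 1 if data can be written to it now */
    struct delivery *work_next; /* next entry on the mock work list   */
} delivery_t;

/* Counts writability checks, in the spirit of a callgrind call count. */
long writable_checks = 0;

int delivery_writable(const delivery_t *d) {
    writable_checks++;
    return d->writable;
}

/* Approach A: walk the work list until a writable delivery turns up. */
delivery_t *scan_work_list(delivery_t *head) {
    for (delivery_t *d = head; d != NULL; d = d->work_next) {
        if (delivery_writable(d)) {
            return d;
        }
    }
    return NULL;
}

/* Approach B: check only the delivery we remembered when creating it. */
delivery_t *check_remembered(delivery_t *mine) {
    return delivery_writable(mine) ? mine : NULL;
}

/* Link n entries of pool into a list; only the last one is writable. */
delivery_t *build_work_list(delivery_t *pool, size_t n) {
    assert(n >= 1 && n <= MAX_WORK);
    for (size_t i = 0; i < n; i++) {
        pool[i].writable = (i == n - 1);
        pool[i].work_next = (i + 1 < n) ? &pool[i + 1] : NULL;
    }
    return &pool[0];
}

/* Number of checks needed to reach the writable delivery by scanning. */
long checks_when_scanning(size_t n) {
    static delivery_t pool[MAX_WORK];
    delivery_t *head = build_work_list(pool, n);
    writable_checks = 0;
    delivery_t *found = scan_work_list(head);
    assert(found == &pool[n - 1]);
    return writable_checks;
}

/* Number of checks needed when the delivery pointer was kept around. */
long checks_when_remembered(size_t n) {
    static delivery_t pool[MAX_WORK];
    build_work_list(pool, n);
    writable_checks = 0;
    delivery_t *found = check_remembered(&pool[n - 1]);
    assert(found == &pool[n - 1]);
    return writable_checks;
}
```

In this model, scanning a list of n entries costs n checks per pass, while the remembered pointer costs one check no matter how long the list has become. Whether the real work list in psend.c grows for the same reason is not something this sketch can show.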
