proton slowdown - I'm not doing it right
Bozo -- I see you're right. The size of my delivery list -- i.e. the list at connection->work_head -- is slowly increasing, and that's the problem. There is something I'm not handling, which gets a lot worse under heavy system load, and my sender never digs itself out.

But -- what I want here is a canonical example of how to get high throughput with proton-c at the engine level -- and what I am hearing is that I should be using the event-collector interface which Dispatch uses. That's what we want to steer new users toward. So far the simple fixes I've tried have all resulted in zero speedup...

So! Rather than spending more time fixing these examples, I will switch to the event-collector model and see if I can get a nice little example working that way. (And from what I hear, it will be much nicer and much littler.)

At least I know better now what I am trying to do: get a send/receive example at the lowest level, that we want to direct new users toward, that has throughput as high as possible, and that can run flat-out on a time scale of days or weeks without crashing, without growing, and without slowing down.

----- Original Message -----
On 28. 10. 14 20:18, Michael Goulish wrote:
> I have gotten callgrind call-graph pictures of my proton/engine sender
> and receiver when the test is running fast and when it slows down.
>
> The difference is in the sender -- when running fast, it is spending
> most of its time in the subtree of pn_connector_process() .. like 71%.
>
> When it slows down it is instead spending 47% in pn_delivery_writable(),
> and only 17% in pn_connector_process().
>
> Since it is still not instantly obvious to me what has happened,
> I thought I would share with you-all.
> Please see cool pictures at:
>
> http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_fast.svg
> http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_slow.svg
>
> To recap -- I can trigger this condition by getting the box busy while
> my proton/engine test is running. I.e. by doing a build.
> Even though I stop the build, and all 6 other processors on
> the box go back to being idle -- the test never recovers.
>
> The receiver goes down to 50% CPU or worse -- but these pictures
> show that the behavior change is in the sender.

Look at the call counts for pn_connector_process() and pn_delivery_writable():

  fast: ratio 1 : 5
  slow: ratio 1 : 244.5 (!)

The iteration over the connection work list gets really expensive, which means the connection thinks it has to work on other stuff than what psend.c wants to work on. I still think that the call to pn_delivery() in psend.c is in a really unfortunate spot.

Btw, why do you iterate over the connection work list at all? You could just remember the delivery when calling pn_delivery().

Bozzo
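For readers landing here, a minimal sketch of the event-collector style the message above says it will switch to, using the 0.8-era proton-c engine API (pn_collector, pn_connection_collect, pn_collector_peek/pop are the real engine calls; the credit handling, tag, and message body here are illustrative assumptions, not the actual psend.c code):

```c
/* Hedged sketch: driving the engine from a pn_collector_t, as Dispatch
 * does.  Transport I/O, link setup, and error handling are omitted;
 * the delivery tag and payload are illustrative only. */
#include <proton/engine.h>
#include <proton/event.h>

void send_loop(pn_connection_t *conn, pn_link_t *sender)
{
  pn_collector_t *collector = pn_collector();
  pn_connection_collect(conn, collector);

  pn_event_t *event;
  while ((event = pn_collector_peek(collector))) {
    switch (pn_event_type(event)) {
    case PN_LINK_FLOW:
      /* Credit arrived: send while we have it, settling each delivery
       * immediately so the connection work list never grows. */
      while (pn_link_credit(sender) > 0) {
        pn_delivery_t *d = pn_delivery(sender, pn_dtag("tag", 3));
        pn_link_send(sender, "hello", 5);
        pn_link_advance(sender);
        pn_delivery_settle(d);   /* presettled: drop it right away */
      }
      break;
    default:
      break;
    }
    pn_collector_pop(collector);
  }
}
```

Settling as soon as each delivery is sent is what keeps the work list at connection->work_head bounded, which is the growth problem described above.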
Re: proton slowdown - major clue
On 28. 10. 14 20:18, Michael Goulish wrote:
> I have gotten callgrind call-graph pictures of my proton/engine sender
> and receiver when the test is running fast and when it slows down.
>
> The difference is in the sender -- when running fast, it is spending
> most of its time in the subtree of pn_connector_process() .. like 71%.
>
> When it slows down it is instead spending 47% in pn_delivery_writable(),
> and only 17% in pn_connector_process().
>
> Since it is still not instantly obvious to me what has happened,
> I thought I would share with you-all.
>
> Please see cool pictures at:
>
> http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_fast.svg
> http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_slow.svg
>
> To recap -- I can trigger this condition by getting the box busy while
> my proton/engine test is running. I.e. by doing a build.
> Even though I stop the build, and all 6 other processors on
> the box go back to being idle -- the test never recovers.
>
> The receiver goes down to 50% CPU or worse -- but these pictures
> show that the behavior change is in the sender.

Look at the call counts for pn_connector_process() and pn_delivery_writable():

  fast: ratio 1 : 5
  slow: ratio 1 : 244.5 (!)

The iteration over the connection work list gets really expensive, which means the connection thinks it has to work on other stuff than what psend.c wants to work on. I still think that the call to pn_delivery() in psend.c is in a really unfortunate spot.

Btw, why do you iterate over the connection work list at all? You could just remember the delivery when calling pn_delivery().

Bozzo
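Bozzo's suggestion -- remember the pointer pn_delivery() returns instead of re-walking the work list -- can be sketched roughly as below. This is an assumption about how psend.c might be restructured (the function name, state variable, and payload are illustrative), not an actual patch:

```c
/* Sketch of the suggestion: instead of re-scanning the connection
 * work list (pn_work_head / pn_work_next) on every pass to rediscover
 * the current delivery, keep the pointer pn_delivery() returned and
 * use it directly.  Names here are illustrative. */
#include <proton/engine.h>

static pn_delivery_t *current = NULL;  /* the one in-flight delivery */

void sender_tick(pn_link_t *sender)
{
  if (!current && pn_link_credit(sender) > 0) {
    /* Create the delivery once and remember it ... */
    current = pn_delivery(sender, pn_dtag("t", 1));
  }
  if (current && pn_delivery_writable(current)) {
    /* ... rather than iterating pn_work_head()/pn_work_next()
     * to find it again on every loop iteration. */
    pn_link_send(sender, "hello", 5);
    pn_link_advance(sender);
    pn_delivery_settle(current);
    current = NULL;
  }
}
```

This removes the O(work-list-length) scan per send, which is exactly the cost the 1 : 244.5 call-count ratio above points at.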
Proton release planning
Hi Everyone,

I'd like to try to get the proton releases to be a bit more frequent, and I'm also trying to get a bit more up-front planning into them. To that end, I've put together a quick description of what I'd propose for the timeline and scope of the next release here:

http://qpid.apache.org/proton/development.html

Please have a look and let me know what you think.

--Rafael
proton slowdown - major clue
I have gotten callgrind call-graph pictures of my proton/engine sender and receiver when the test is running fast and when it slows down.

The difference is in the sender -- when running fast, it is spending most of its time in the subtree of pn_connector_process() .. like 71%.

When it slows down, it is instead spending 47% in pn_delivery_writable(), and only 17% in pn_connector_process().

Since it is still not instantly obvious to me what has happened, I thought I would share with you-all.

Please see cool pictures at:

http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_fast.svg
http://people.apache.org/~mgoulish/protonics/performance/slowdown/2014_10_28/svg/psend_slow.svg

To recap -- I can trigger this condition by getting the box busy while my proton/engine test is running, i.e. by doing a build. Even though I stop the build, and all 6 other processors on the box go back to being idle -- the test never recovers.

The receiver goes down to 50% CPU or worse -- but these pictures show that the behavior change is in the sender.
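For context on where pn_connector_process() sits, a rough sketch of the 0.8-era pn_driver send loop that a test like psend.c (not shown here) presumably resembles -- the loop shape and the "application work" placement are assumptions, only the driver calls themselves are the real API:

```c
/* Hedged sketch of the old proton-c driver loop.  pn_connector_process()
 * is where the profile above says the fast run spends ~71% of its time. */
#include <proton/driver.h>

void drive(pn_driver_t *driver)
{
  for (;;) {
    pn_driver_wait(driver, -1);            /* block for socket activity */
    pn_connector_t *c;
    while ((c = pn_driver_connector(driver))) {
      pn_connector_process(c);             /* pump engine I/O inward */
      /* ... application work: create deliveries, send, settle ... */
      pn_connector_process(c);             /* push out what we produced */
    }
  }
}
```

The slow-run profile shows time shifting out of this subtree and into the application's own per-delivery bookkeeping.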
[jira] [Updated] (PROTON-334) SASL Plug-in for Proton
[ https://issues.apache.org/jira/browse/PROTON-334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Ross updated PROTON-334:
-------------------------------
    Assignee: Andrew Stitcher

> SASL Plug-in for Proton
> -----------------------
>
>          Key: PROTON-334
>          URL: https://issues.apache.org/jira/browse/PROTON-334
>      Project: Qpid Proton
>   Issue Type: Wish
>   Components: proton-c
>     Reporter: Ted Ross
>     Assignee: Andrew Stitcher
>
> It would be desirable to have the ability to use a plug-in module for SASL in Proton. The following implementations could then be developed:
> 1) A portable stand-alone plugin that does ANONYMOUS, PLAIN, and EXTERNAL
> 2) A Cyrus-SASL based plugin for Linux
> 3) A Windows plugin

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
Re: VOTE: Release Proton 0.8 RC5 as 0.8 final
[X] Yes, release Proton 0.8 RC5 as 0.8 final

Proton-J side still works for AMQ.

On 10/27/2014 09:51 PM, Rafael Schloming wrote:
> Hi Everyone,
>
> Sorry for the delay, there seemed to be some kind of Nexus outage today,
> so I was unable to generate the java binaries until just now.
>
> I've posted RC5 in the usual places. The only difference from RC4 is a
> one-line delta that replaces the assertion failure when we receive
> out-of-sequence ids with a connection shutdown error. Please have a look
> and register your vote.
>
> Source code can be found here:
>
> http://people.apache.org/~rhs/qpid-proton-0.8rc5/
>
> Java binaries are here:
>
> https://repository.apache.org/content/repositories/orgapacheqpid-1021
>
> [ ] Yes, release Proton 0.8 RC5 as 0.8 final
> [ ] No, because ...
>
> --Rafael

--
Tim Bish
Sr Software Engineer | RedHat Inc.
tim.b...@redhat.com | www.redhat.com
skype: tabish121 | twitter: @tabish121
blog: http://timbish.blogspot.com/
Re: VOTE: Release Proton 0.8 RC5 as 0.8 final
[X] Yes, release Proton 0.8 RC5 as 0.8 final

Testing: Proton-C build, unit tests, and install on Fedora 20 and CentOS 7 x86_64.

----- Original Message -----
> From: "Rafael Schloming"
> To: proton@qpid.apache.org
> Sent: Monday, October 27, 2014 9:51:00 PM
> Subject: VOTE: Release Proton 0.8 RC5 as 0.8 final
>
> Hi Everyone,
>
> Sorry for the delay, there seemed to be some kind of Nexus outage today,
> so I was unable to generate the java binaries until just now.
>
> I've posted RC5 in the usual places. The only difference from RC4 is a
> one-line delta that replaces the assertion failure when we receive
> out-of-sequence ids with a connection shutdown error. Please have a look
> and register your vote.
>
> Source code can be found here:
>
> http://people.apache.org/~rhs/qpid-proton-0.8rc5/
>
> Java binaries are here:
>
> https://repository.apache.org/content/repositories/orgapacheqpid-1021
>
> [ ] Yes, release Proton 0.8 RC5 as 0.8 final
> [ ] No, because ...
>
> --Rafael

--
-K
Re: VOTE: Release Proton 0.8 RC5 as 0.8 final
[X] Yes, release Proton 0.8 RC5 as 0.8 final.

I ran the C and Java build+tests, and tried out the published Java binaries using the JMS client build+tests.

Aside: doing a binary diff of the archive contents shows a second small change since RC4, in the python bindings: http://svn.apache.org/r1634078

Robbie

On 28 October 2014 01:51, Rafael Schloming wrote:
> Hi Everyone,
>
> Sorry for the delay, there seemed to be some kind of Nexus outage today,
> so I was unable to generate the java binaries until just now.
>
> I've posted RC5 in the usual places. The only difference from RC4 is a
> one-line delta that replaces the assertion failure when we receive
> out-of-sequence ids with a connection shutdown error. Please have a look
> and register your vote.
>
> Source code can be found here:
>
> http://people.apache.org/~rhs/qpid-proton-0.8rc5/
>
> Java binaries are here:
>
> https://repository.apache.org/content/repositories/orgapacheqpid-1021
>
> [ ] Yes, release Proton 0.8 RC5 as 0.8 final
> [ ] No, because ...
>
> --Rafael