Re: Optimizations on Proton-j
Hi, comments inline...

On Mon, May 5, 2014 at 5:51 PM, Clebert Suconic wrote:

> I have some ideas as well:
>
> - Calculating size prior to sending: we could write zeroes, write to the
> buffer... come back to the previous position... write the size instead of
> calculating it.

Yeah, this is what I've done before. The only tricky thing here is figuring out how much space to reserve for the size. To minimize the size of the encoded data, it's better to use an encoding with a 1-byte size when you can, but of course you don't know in advance whether the size of the encoded data will fit within a single byte.

> I have read this code a lot... and I wouldn't rewrite the code... just
> optimize these cases... I wouldn't optimize it for every possible case,
> TBH... just on message Delivery and Settling, unless you want to also
> optimize other cases for use cases that I'm not aware of at the moment.

I think the ability to access key data without doing a full decode is likely to be useful at some point. I should also say that I think the actual codec interface is not terribly useful/friendly right now for end users. I don't particularly mind whether we iterate on or replace the current implementation, but I do think we need a solid idea of the end goal. To that end I've put together a mock-up of a few different strategies that I've posted in another thread.

> Other things that could boost performance, based on the micro benchmark I
> wrote:
>
> - Using Integer, Long, etc. inside of UnsignedInt, UnsignedLong would give
> you a good boost in performance. The JDK is already optimized to box these
> types, while UnsignedInt, UnsignedLong, etc. are not well optimized.

I haven't noticed much of a difference between Integer and UnsignedInteger in any of the profiling I've done, but using the unboxed variants would definitely make a difference.

> - Reusing buffers... maybe adding a framework where we could reuse
> buffers... or delegate to other frameworks (e.g. Netty).

Yeah, we should look at this in the context of copying as well.

--Rafael
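The "write zeroes, come back, write the size" idea discussed above can be sketched with a plain `java.nio.ByteBuffer`. This is a minimal illustration, not proton's actual codec; the `encodeWithSizePrefix` helper and the fixed 4-byte width are assumptions for the sketch:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class SizePrefix {
    // Writes a length-prefixed body without computing its size up front:
    // reserve a placeholder, write the body, then back-patch the size.
    // A fixed 4-byte width is assumed here; choosing the 1-byte form when
    // the body turns out small is exactly the tricky part noted above.
    static void encodeWithSizePrefix(ByteBuffer buf, byte[] body) {
        int sizePos = buf.position();            // remember where the size goes
        buf.putInt(0);                           // write zeroes as a placeholder
        buf.put(body);                           // write the encoded representation
        int end = buf.position();
        buf.putInt(sizePos, end - sizePos - 4);  // come back and patch the size
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        encodeWithSizePrefix(buf, "hello".getBytes(StandardCharsets.UTF_8));
        buf.flip();
        System.out.println(buf.getInt());        // the back-patched size: 5
    }
}
```

The absolute `putInt(index, value)` overload leaves the buffer's position untouched, which is what makes the back-patch cheap: no data is shifted, only the placeholder is overwritten.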
Re: Optimizations on Proton-j
I have some ideas as well:

- Calculating size prior to sending: we could write zeroes, write to the buffer... come back to the previous position... write the size instead of calculating it.

I have read this code a lot... and I wouldn't rewrite the code... just optimize these cases... I wouldn't optimize it for every possible case, TBH... just on message Delivery and Settling, unless you want to also optimize other cases for use cases that I'm not aware of at the moment.

Other things that could boost performance, based on the micro benchmark I wrote:

- Using Integer, Long, etc. inside of UnsignedInt, UnsignedLong would give you a good boost in performance. The JDK is already optimized to box these types, while UnsignedInt, UnsignedLong, etc. are not well optimized.

- Reusing buffers... maybe adding a framework where we could reuse buffers... or delegate to other frameworks (e.g. Netty).

On May 5, 2014, at 5:40 PM, Clebert Suconic wrote:

> On May 5, 2014, at 5:02 PM, Rafael Schloming wrote:
>
>> Hi Clebert,
>>
>> [...]
Re: Optimizations on Proton-j
On May 5, 2014, at 5:02 PM, Rafael Schloming wrote:

> Hi Clebert,
>
> [...]
>
> Regarding your patch, I'm happy to apply it, but I suspect that much of the
> current codec layer would need to be modified and/or replaced to address
> the above findings. Let me know how you would like to proceed.
It's all consistent with what I have seen... I had also noticed the cost of calculating the encoded size prior to sending.

I would prefer if we could evolve on top of what I did. I think the patch I made is one step further on avoiding intermediate objects... I have already seen other cases where we could avoid them... It's all an evolution... and I'm still working on other cases... if you take this patch now we would move a step further.

The patch is not only addressing the codec optimizations, but it's also avoiding multiple instances of the codec itself, which makes it lighter.

I'm now working on a benchmark based on yours that will be closer to what I will be using.
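As an aside on the boxing discussion in this thread: Rafael's profiling flagged org.apache.qpid.proton.amqp.UnsignedInteger as a top garbage source, while Integer did not stand out, and one reason is that the JDK guarantees a cache of boxed Integer values for -128 to 127. A quick demonstration (JDK behavior only; whether a comparable cache would help UnsignedInteger is this thread's open question):

```java
public class BoxingCache {
    public static void main(String[] args) {
        // Integer.valueOf is required by the JLS to return canonical,
        // cached instances for -128..127, so boxing small values
        // allocates nothing.
        System.out.println(Integer.valueOf(127) == Integer.valueOf(127)); // true
        // Outside the cached range each call may allocate a fresh object
        // (false under default JVM settings).
        System.out.println(Integer.valueOf(128) == Integer.valueOf(128));
    }
}
```

The cache's upper bound can be raised with -XX:AutoBoxCacheMax, but the default range is what autoboxing in hot codec paths would typically hit.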
Re: Optimizations on Proton-j
Hi Clebert,

Sorry for taking so long to follow up with the benchmark. I've been tweaking it and using it to do some memory and CPU profiling.

The good news is I was able to significantly reduce memory utilization by making some simple changes to CollectorImpl. The benchmark triggers over 42 million events. The way CollectorImpl was coded initially, this would create two throwaway objects for each event: one being the event itself, and the other being the linked list node. This was ending up somewhere north of 84 million throwaway objects over the course of the benchmark. I changed CollectorImpl to use a simple chain instead of java.util's linked list and also to pool/reuse popped events. The same benchmark now only results in about 250 actual event objects being created in total in order to handle the same 42 million events.

While this reduces memory pressure a lot, surprisingly enough, the event-related objects were not the biggest source of garbage. At the top is java.nio.HeapByteBuffer (I haven't looked into this, but I suspect we're throwing away some of these that we could be reusing), and the second biggest source of garbage is org.apache.qpid.proton.amqp.UnsignedInteger. The latter I find a bit more concerning, as its use is fairly intertwined with the design of the current codec, and as a result it is likely less straightforward to address.

On the CPU profiling front I've noticed a couple of interesting things. First off, it does appear that much of the processing time is spent in codec, roughly 63%, but the cost is not symmetrically divided between encode and decode. Encoding accounts for about 46% of the total, and decoding about 17%. I think this may be why it's hard to measure the effect of your patch on this benchmark: the decoding cost just isn't all that high compared to encoding. I did the same profiling again with your patch applied, and decoding dropped to about 10% of the total while encoding increased to about 50% of the total.
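[The CollectorImpl change described above, a simple chain plus pooling of popped events, can be sketched roughly as follows. Class and method names are illustrative only, not proton's actual code:]

```java
// Sketch of an event queue as an intrusive singly linked chain whose
// popped nodes are recycled into a free list instead of becoming garbage:
// steady-state operation allocates no new objects.
public class EventPool {
    static final class Event {
        Object context;
        Event next;
    }

    private Event head, tail;   // pending events, FIFO
    private Event free;         // recycled nodes available for reuse

    public void put(Object context) {
        Event ev = free;
        if (ev != null) { free = ev.next; } else { ev = new Event(); }
        ev.context = context;
        ev.next = null;
        if (tail != null) { tail.next = ev; } else { head = ev; }
        tail = ev;
    }

    public Object pop() {
        Event ev = head;
        if (ev == null) return null;
        head = ev.next;
        if (head == null) tail = null;
        Object ctx = ev.context;
        ev.context = null;      // drop the reference so it can be collected
        ev.next = free;         // recycle the node onto the free list
        free = ev;
        return ctx;
    }
}
```

Because the chain links live inside the Event objects themselves, there is no separate linked-list node per entry, which is the second of the two per-event allocations the profiling turned up.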
Digging into the encoding side a bit, it appears that a significant amount of time is being spent calculating the encoded size of a value prior to writing its encoded representation to the wire. One of the optimizations I've had success with in the past (both on previous Java codecs and in the C codec) is to avoid calculating the size up front and instead simply reserve the necessary space for it and fill it in after the encoded representation has been written. In the past this has close to doubled the performance of encode, since calculating the encoded size is often as expensive as simply doing the encoding. Unfortunately I'm guessing this kind of thing would probably require a major rework of the codec.

To summarize, I think there are really two design-related issues we will need to address in order to achieve optimum performance. On the memory front, I think the fact that every described type is rendered into a tree of generic objects on both decode/encode is going to be problematic. The strategy you've taken in your patch, to special-case certain frames and eliminate the intermediate list objects, helps with this, but I think we could do a whole lot better if we were to adopt a design that did not require any intermediate objects at all. On the CPU front, I think we'll get the biggest bang for our buck if we look into a design that doesn't require calculating the size up front.

I have some ideas in mind for a new design that I hope will address both of these issues. I'm going to write them up in a separate post.

Regarding your patch, I'm happy to apply it, but I suspect that much of the current codec layer would need to be modified and/or replaced to address the above findings. Let me know how you would like to proceed.

--Rafael

On Fri, May 2, 2014 at 10:45 AM, Clebert Suconic wrote:

> These should all be cleared now...
>
> My github branch and PR are up to date now:
> https://github.com/apache/qpid-proton/pull/1
>
> [...]
Re: Optimizations on Proton-j
These should all be cleared now. My github branch and PR are up to date now:

https://github.com/apache/qpid-proton/pull/1

And isn't git beautiful. It's already rebased with Rafi's last commit!

On May 1, 2014, at 5:32 PM, Clebert Suconic wrote:

> I will do some cleanup on this... I already fixed the headers here on my
> copy and I will do some cleanup on imports that weren't supposed to be done.
>
> [...]
Re: Optimizations on Proton-j
I will do some cleanup on this... I already fixed the headers here on my copy, and I will do some cleanup on imports that weren't supposed to be done.

On May 1, 2014, at 5:11 PM, Robbie Gemmell wrote:

> As no mail arrived here or qpid-dev, and none seems to have arrived at what
> used to be the default location (infra-dev) either, I had a quick look and
> it seems like they might have changed the process slightly and we will need
> to ask for the mails to be enabled at all:
> https://blogs.apache.org/infra/entry/improved_integration_between_apache_and
>
> [...]
Re: Optimizations on Proton-j
As no mail arrived here or qpid-dev, and none seems to have arrived at what used to be the default location (infra-dev) either, I had a quick look and it seems like they might have changed the process slightly and we will need to ask for the mails to be enabled at all:

https://blogs.apache.org/infra/entry/improved_integration_between_apache_and

I particularly like the mention of the new comment syncing between our mailing list and the Pull Requests.

Regarding closing the pull requests, it seems like something along the lines of "This closes # at GitHub" added to the end of the svn commit message should do the trick:

https://help.github.com/articles/closing-issues-via-commit-messages

I haven't had a chance to really look at the actual code change, but when I was quickly scrolling down the PR, in addition to the licence headers on the new files that Rafi already mentioned (which I spotted due to the Copyright notices we wouldn't typically have), I noticed Encoder.java having its existing licence header corrupted a little by some wayward code.

Robbie

> I just submitted it as a git PR:
>
> https://github.com/apache/qpid-proton/pull/1
>
> On Apr 30, 2014, at 10:47 AM, Robbie Gemmell wrote:
>
>> I think anyone can sign up for ReviewBoard themselves. It certainly didn't
>> used to be linked to the ASF LDAP in the past, presumably for that reason.
>>
>> It's probably also worth noting you can initiate pull requests against the
>> github mirrors. If it hasn't already been done for the proton mirror, we
>> can have the emails that would generate be directed to this list (e.g.
>> http://mail-archives.apache.org/mod_mbox/qpid-dev/201401.mbox/%3c20140130180355.3cf9e916...@tyr.zones.apache.org%3E ).
>> We obviously can't merge the pull request via github, but you can use
>> the reviewing tools etc. and the resultant patch can be downloaded or
>> attached to a JIRA and then applied in the usual fashion (I believe there
>> is a commit message syntax that can be used to trigger closing the pull
>> request).
>>
>> Robbie
>>
>> On 30 April 2014 15:22, Rafael Schloming wrote:
>>
>>> On Wed, Apr 30, 2014 at 8:35 AM, Clebert Suconic wrote:
>>>
>>>> @Rafi: I see there is a patch review process within Apache (based on
>>>> your other thread on Java8)
>>>>
>>>> Should we make this through the patch process at some point?
>>>
>>> I'm fine looking at it on your git branch, but if you'd like to play with
>>> the review tool then feel free. Just let me know if you need an account
>>> and I will try to remember how to set one up (or who to bug to get you
>>> one). ;-)
>>>
>>> --Rafael
Re: Optimizations on Proton-j
Ah... I will fix the license headers shortly.

On May 1, 2014, at 4:51 PM, Clebert Suconic wrote:

> For now I have pretty much optimized the Transfer type only, no other types.
>
> [...]
Re: Optimizations on Proton-j
For now I have pretty much optimized the Transfer type only, no other types. For instance, I see that the Disposition type needs optimization as well...

For us, though, the biggest advantage is in the patch I'm making.

If you send a 1K message... you won't have much of the optimization on the codec being exercised. We could do 10 million Transfers in 3 seconds before... against 1.5 on my laptop. If transferring 10 million * 10K is taking 40 seconds, the optimization of the 1.5 would be spread among the delivery and you wouldn't be able to see a difference.

Why don't you try sending empty messages? Meaning, a message is received with an empty body.

On May 1, 2014, at 4:44 PM, Rafael Schloming wrote:

> Hi Clebert,
>
> I've been (amongst other things) doing a little bit of investigation on
> this topic over the past couple of days. I wrote a microbenchmark that
> takes two engines and directly wires their transports together.
>
> [...]
Re: Optimizations on Proton-j
Hi Clebert, I've been (amongst other things) doing a little bit of investigation on this topic over the past couple of days. I wrote a microbenchmark that takes two engines and directly wires their transports together. It then pumps about 10 million 1K messages from one engine to the other. I ran this benchmark under jprofiler and codec definitely came up as a hot spot, but when I apply your patch, I don't see any measurable difference in results. Either way it's taking about 40 seconds to pump all the messages through. I'm not quite sure what is going on, but I'm guessing either the code path you've optimized isn't coming up enough to make much of a difference, or I've somehow messed up the measurements. I will post the benchmark shortly, so hopefully you can check up on my measurements yourself. On a more mundane note, Andrew pointed out that the new files you've added in your patch use an outdated license header. You can take a look at some existing files in the repo to get a current license header. --Rafael On Wed, Apr 30, 2014 at 2:15 PM, Clebert Suconic wrote: > I just submitted it as a git PR: > > https://github.com/apache/qpid-proton/pull/1 > > > > On Apr 30, 2014, at 10:47 AM, Robbie Gemmell > wrote: > > > I think anyone can sign up for ReviewBoard themselves. It certainly > didn't > > used to be linked to the ASF LDAP in the past, presumably for that > reason. > > > > Its probably also worth noting you can initiate pull requests against the > > github mirrors. If it hasn't already been done for the proton mirror, we > > can have the emails that would generate be directed to this list (e.g. > > > http://mail-archives.apache.org/mod_mbox/qpid-dev/201401.mbox/%3c20140130180355.3cf9e916...@tyr.zones.apache.org%3E > ). 
> > We obviously can't merge the pull request via github, but you can use > > the reviewing tools etc and the resultant patch can be downloaded or > > attached to a JIRA and then applied in the usual fashion (I believe there > > is a commit message syntax that can be used to trigger closing the pull > > request). > > > > Robbie > > > > On 30 April 2014 15:22, Rafael Schloming wrote: > > > >> On Wed, Apr 30, 2014 at 8:35 AM, Clebert Suconic >>> wrote: > >> > >>> @Rafi: I see there is a patch review process within Apache (based on > >> your > >>> other thread on Java8) > >>> > >>> Should we make this through the patch process at some point? > >>> > >> > >> I'm fine looking at it on your git branch, but if you'd like to play > with > >> the review tool then feel free. Just let me know if you need an account > >> and I will try to remember how to set one up (or who to bug to get you > >> one). ;-) > >> > >> --Rafael > >> > >
Re: Optimizations on Proton-j
I just submitted it as a git PR: https://github.com/apache/qpid-proton/pull/1 On Apr 30, 2014, at 10:47 AM, Robbie Gemmell wrote: > I think anyone can sign up for ReviewBoard themselves. It certainly didn't > used to be linked to the ASF LDAP in the past, presumably for that reason. > > Its probably also worth noting you can initiate pull requests against the > github mirrors. If it hasn't already been done for the proton mirror, we > can have the emails that would generate be directed to this list (e.g. > http://mail-archives.apache.org/mod_mbox/qpid-dev/201401.mbox/%3c20140130180355.3cf9e916...@tyr.zones.apache.org%3E). > We obviously can't merge the pull request via github, but you can use > the reviewing tools etc and the resultant patch can be downloaded or > attached to a JIRA and then applied in the usual fashion (I believe there > is a commit message syntax that can be used to trigger closing the pull > request). > > Robbie > > On 30 April 2014 15:22, Rafael Schloming wrote: > >> On Wed, Apr 30, 2014 at 8:35 AM, Clebert Suconic >> wrote: >> >>> @Rafi: I see there is a patch review process within Apache (based on >> your >>> other thread on Java8) >>> >>> Should we make this through the patch process at some point? >>> >> >> I'm fine looking at it on your git branch, but if you'd like to play with >> the review tool then feel free. Just let me know if you need an account >> and I will try to remember how to set one up (or who to bug to get you >> one). ;-) >> >> --Rafael >>
Re: Optimizations on Proton-j
I think anyone can sign up for ReviewBoard themselves. It certainly didn't use to be linked to the ASF LDAP in the past, presumably for that reason. It's probably also worth noting you can initiate pull requests against the github mirrors. If it hasn't already been done for the proton mirror, we can have the emails that it would generate be directed to this list (e.g. http://mail-archives.apache.org/mod_mbox/qpid-dev/201401.mbox/%3c20140130180355.3cf9e916...@tyr.zones.apache.org%3E). We obviously can't merge the pull request via github, but you can use the reviewing tools etc. and the resultant patch can be downloaded or attached to a JIRA and then applied in the usual fashion (I believe there is a commit message syntax that can be used to trigger closing the pull request). Robbie On 30 April 2014 15:22, Rafael Schloming wrote: > On Wed, Apr 30, 2014 at 8:35 AM, Clebert Suconic >wrote: > > > @Rafi: I see there is a patch review process within Apache (based on > your > > other thread on Java8) > > > > Should we make this through the patch process at some point? > > > > I'm fine looking at it on your git branch, but if you'd like to play with > the review tool then feel free. Just let me know if you need an account > and I will try to remember how to set one up (or who to bug to get you > one). ;-) > > --Rafael >
Re: Optimizations on Proton-j
On Wed, Apr 30, 2014 at 8:35 AM, Clebert Suconic wrote: > @Rafi: I see there is a patch review process within Apache (based on your > other thread on Java8) > > Should we make this through the patch process at some point? > I'm fine looking at it on your git branch, but if you'd like to play with the review tool then feel free. Just let me know if you need an account and I will try to remember how to set one up (or who to bug to get you one). ;-) --Rafael
Re: Optimizations on Proton-j
@Rafi: I see there is a patch review process within Apache (based on your other thread on Java8) Should we make this through the patch process at some point? On Apr 29, 2014, at 4:04 PM, Clebert Suconic wrote: > > On Apr 29, 2014, at 4:00 PM, Rafael Schloming wrote: >>> >>> Notice that this will also minimize the footprint of the codec but I'm not >>> measuring that here. >>> >> >> Just out of curiosity, when you say minimize the footprint, are you >> referring to the in memory overhead, or do you mean the encoded size on the >> wire? > > I didn't change any encoding size on these refactoring. The code should > produce the same bytes as it was producing before. > > > I meant memory overhead.. Before my changes you had to instantiate a Decoder > per Connection, and a Decoder per Message (that was actually recently changed > to per thread). The decoder has the DefinedTypes and registered objects... > with this change now you can have a single instance of the Decoder and > Encoder per JVM since it's stateless now. >
Re: Optimizations on Proton-j
On Apr 29, 2014, at 4:00 PM, Rafael Schloming wrote: >> >> Notice that this will also minimize the footprint of the codec but I'm not >> measuring that here. >> > > Just out of curiosity, when you say minimize the footprint, are you > referring to the in memory overhead, or do you mean the encoded size on the > wire? I didn't change any encoding size in this refactoring. The code should produce the same bytes as it was producing before. I meant memory overhead. Before my changes you had to instantiate a Decoder per Connection, and a Decoder per Message (that was actually recently changed to per thread). The decoder has the DefinedTypes and registered objects... With this change you can now have a single instance of the Decoder and Encoder per JVM, since it's stateless.
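[Editorial note: a minimal sketch of the statelessness point discussed above. The class and method names here are hypothetical simplifications, not the actual proton-j `DecoderImpl` API; the point is only that once the decoder holds no buffer reference between calls, one JVM-wide instance replaces the per-connection (or ThreadLocal) instances.]

```java
import java.nio.ByteBuffer;

// A stateless decoder: all state needed for a read is passed in per call,
// so there is no setBuffer(...) step for two threads to race on.
final class StatelessDecoder {
    int readInt(ByteBuffer buf) {
        return buf.getInt();
    }

    long readLong(ByteBuffer buf) {
        return buf.getLong();
    }
}

public class DecoderSketch {
    // One shared instance for the whole JVM is safe because the decoder
    // object itself is immutable; only the caller's buffer carries state.
    static final StatelessDecoder DECODER = new StatelessDecoder();

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(12);
        buf.putInt(42).putLong(7L).flip();
        System.out.println(DECODER.readInt(buf));  // prints 42
        System.out.println(DECODER.readLong(buf)); // prints 7
    }
}
```

The stateful alternative (a `setBuffer` method followed by reads) is what forces an instance per connection or per thread in the first place.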
Re: Optimizations on Proton-j
On Tue, Apr 29, 2014 at 9:27 AM, Clebert Suconic wrote: > I have done some work last week on optimizing the Codec, and I think I've > gotten some interesting results. > > > - The Decoder is now stateless, meaning the same instance can be used over > and over (no more need for one instance per connection). Bozo Dragojevic > has actually seen how heavy it is to create a Decoder and has recently > optimized MessageImpl to always take the same instance through > ThreadLocals. This optimization goes a step further. > - I have changed the ListDecoders so that you won't need intermediate > objects to parse types. For now I have only done that for Transfer, > but I could do it for all the other types at some point. > - There were a few hotspots that I found on the test and I have refactored > accordingly, meaning no semantic changes. > > As a result of these optimizations, DecoderImpl won't have a setBuffer > method any longer. Instead, each read method will take a > ReadableBuffer parameter (rather than the old setBuffer-based signature). > > > And talking about ReadableBuffer, I have introduced the interface > ReadableBuffer. When integrating on the broker, I had a situation where I > won't have a ByteBuffer, and this interface will allow me to further > optimize the parser later as I could make use of the Netty buffer (aka > ByteBuf). > > > You will find these optimizations on my branch on github: > https://github.com/clebertsuconic/qpid-proton/tree/optimizations > > > Where I will have two commits: > > I - a micro benchmark where I added a testcase doing a direct read on the > buffer without any framework. I've actually written a simple parser that > will work for the byte array I have, but that's very close to reading > directly from the bytes. > I used that to compare raw reading and interpreting the buffer to the > current framework we had. > I was actually concerned about the number of intermediate objects, so I > used that to map these differences. 
> > > https://github.com/clebertsuconic/qpid-proton/commit/7b2b02649e5bdd35aa2e4cc487ffb91c01e75685 > > > II - a commit with the actual optimizations: > > > > https://github.com/clebertsuconic/qpid-proton/commit/305ecc6aaa5192fc0a1ae42b90cb4eb8ddfe046e > > > Without these optimizations my microbenchmark, parsing 1000L instances > of Transfer without reallocating any buffers, could complete on my laptop > in: > > - 3480 milliseconds, against 750 milliseconds with raw reading > > > After these optimizations: > - 1927 milliseconds, against 750 milliseconds with raw reading > This sounds very promising and an excellent excuse for me to dust off some of my somewhat rusty git skills. ;-) > > Notice that this will also minimize the footprint of the codec but I'm not > measuring that here. > Just out of curiosity, when you say minimize the footprint, are you referring to the in memory overhead, or do you mean the encoded size on the wire? > I'm looking forward to working with this group; I actually had a meeting with > Rafi and Ted last week, and I plan to work more closely with you guys on this > Excellent! I'm also looking forward to digging in a bit more on the Java side of things. --Rafael
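[Editorial note: a sketch of the ReadableBuffer idea described in this email. The interface and adapter names below are simplified illustrations, not the exact proton-j types; the point is that decode methods written against an abstraction can later be backed by a Netty ByteBuf (or anything else) without touching the decoder.]

```java
import java.nio.ByteBuffer;

// Abstraction the decoder reads through, instead of depending on ByteBuffer.
interface ReadableBuffer {
    byte get();
    int getInt();
    boolean hasRemaining();
}

// Adapter over java.nio.ByteBuffer; a Netty adapter would wrap ByteBuf
// the same way and delegate to readByte()/readInt()/isReadable().
final class ByteBufferReadable implements ReadableBuffer {
    private final ByteBuffer buf;
    ByteBufferReadable(ByteBuffer buf) { this.buf = buf; }
    public byte get() { return buf.get(); }
    public int getInt() { return buf.getInt(); }
    public boolean hasRemaining() { return buf.hasRemaining(); }
}

public class ReadableBufferSketch {
    // A decode method written against the interface: the backing storage
    // is invisible to it, so swapping implementations is free.
    static int readInt(ReadableBuffer in) {
        return in.getInt();
    }

    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.putInt(123);
        bb.flip();
        System.out.println(readInt(new ByteBufferReadable(bb))); // prints 123
    }
}
```

This also explains why `setBuffer` can disappear: once every read method takes a `ReadableBuffer` argument, the decoder no longer needs to hold a buffer between calls.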