Re: Optimizations on Proton-j
On May 5, 2014, at 5:02 PM, Rafael Schloming r...@alum.mit.edu wrote:

Hi Clebert,

Sorry for taking so long to follow up with the benchmark. I've been tweaking it and using it to do some memory and CPU profiling. The good news is I was able to significantly reduce memory utilization by making some simple changes to CollectorImpl. The benchmark triggers over 42 million events. The way CollectorImpl was initially coded, this would create two throwaway objects for each event: the event itself and the linked list node. That ended up somewhere north of 84 million throwaway objects over the course of the benchmark. I changed CollectorImpl to use a simple chain instead of java.util's linked list, and also to pool/reuse popped events. The same benchmark now creates only about 250 actual event objects in total to handle the same 42 million events.

While this reduces memory pressure a lot, surprisingly enough the event-related objects were not the biggest source of garbage. At the top is java.nio.HeapByteBuffer (I haven't looked into this, but I suspect we're throwing away some of these that we could be reusing), and the second biggest source of garbage is org.apache.qpid.proton.amqp.UnsignedInteger. The latter I find a bit more concerning, as its use is fairly intertwined with the design of the current codec, so it is likely less straightforward to address.

On the CPU profiling front I've noticed a couple of interesting things. First off, much of the processing time is spent in codec, roughly 63%, but the cost is not symmetrically divided between encode and decode: encoding accounts for about 46% of the total, and decoding about 17%. I think this may be why it's hard to measure the effect of your patch on this benchmark; the decoding cost just isn't all that high compared to encoding.
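As an aside on the UnsignedInteger garbage noted above: the JDK mitigates the same problem for java.lang.Integer with a valueOf cache of small values. A similar cache for an unsigned wrapper might look roughly like this; UInt is a hypothetical simplified stand-in, not the actual org.apache.qpid.proton.amqp class:

```java
// Hypothetical sketch: cache small unsigned values the way
// java.lang.Integer.valueOf does, so hot values (small ids, counters)
// stop allocating on every decode.
public final class UInt {
    private final int bits;
    private static final UInt[] CACHE = new UInt[256];
    static {
        for (int i = 0; i < CACHE.length; i++) CACHE[i] = new UInt(i);
    }

    private UInt(int bits) { this.bits = bits; }

    public static UInt valueOf(int bits) {
        // small unsigned values come from the cache; others allocate
        if (bits >= 0 && bits < CACHE.length) return CACHE[bits];
        return new UInt(bits);
    }

    public long longValue() { return bits & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        if (UInt.valueOf(42) != UInt.valueOf(42)) throw new AssertionError();
        System.out.println("small values cached: " + (UInt.valueOf(42) == UInt.valueOf(42)));
    }
}
```

This only helps when the hot values are small, but flow-control fields in a tight benchmark loop tend to be exactly that.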
I did the same profiling again with your patch applied, and decoding dropped to about 10% of the total while encoding increased to about 50% of the total. Digging into the encoding side a bit, it appears that a significant amount of time is spent calculating the encoded size of a value prior to writing its encoded representation to the wire. One of the optimizations I've had success with in the past (both in previous Java codecs and in the C codec) is to avoid calculating the size up front and instead simply reserve the necessary space and fill it in after the encoded representation has been written. In the past this has close to doubled the performance of encode, since calculating the encoded size is often as expensive as doing the encoding itself. Unfortunately I'm guessing this kind of thing would probably require a major rework of the codec.

To summarize, I think there are really two design-related issues we will need to address in order to achieve optimum performance. On the memory front, I think the fact that every described type is rendered into a tree of generic objects on both decode and encode is going to be problematic. The strategy you've taken in your patch, special-casing certain frames to eliminate the intermediate list objects, helps with this, but I think we could do a whole lot better if we were to adopt a design that did not require any intermediate objects at all. On the CPU front, I think we'll get the biggest bang for our buck if we look into a design that doesn't require calculating the size up front.

I have some ideas in mind for a new design that I hope will address both of these issues; I'm going to write them up in a separate post. Regarding your patch, I'm happy to apply it, but I suspect that much of the current codec layer would need to be modified and/or replaced to address the above findings. Let me know how you would like to proceed.

It's all consistent with what I have seen...
I have also noticed the cost of calculating the encoded size prior to sending. I would prefer if we could evolve on top of what I did. I think the patch I did is one step further toward avoiding intermediate objects, and I have already seen other cases where we could avoid them. It's all an evolution, and I'm still working on other cases, so if you take this patch now we would move a step further. The patch is not only addressing the codec optimizations; it also avoids multiple instances of the codec itself, which makes it lighter. I'm now working on a benchmark based on yours that will be closer to what I will be using.
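The CollectorImpl change described above (an intrusive event chain plus reuse of popped events) can be sketched roughly like this; EventPool and PooledEvent are illustrative names for this sketch, not the actual proton-j classes:

```java
// Sketch of the pooling approach: events form an intrusive singly linked
// chain (no separate list node), and popped events go onto a free list
// for reuse instead of becoming garbage.
public final class EventPool {
    static final class PooledEvent {
        Object context;     // payload carried by the event
        PooledEvent next;   // intrusive link: no java.util node object
    }

    private PooledEvent head, tail;  // live event chain
    private PooledEvent free;        // reuse list for popped events
    int created;                     // counts actual allocations

    public void put(Object context) {
        PooledEvent e = free;
        if (e != null) { free = e.next; } else { e = new PooledEvent(); created++; }
        e.context = context;
        e.next = null;
        if (tail == null) { head = tail = e; } else { tail.next = e; tail = e; }
    }

    public Object peek() { return head == null ? null : head.context; }

    public void pop() {
        PooledEvent e = head;
        if (e == null) return;
        head = e.next;
        if (head == null) tail = null;
        e.context = null;
        e.next = free;   // recycle instead of discarding
        free = e;
    }

    public static void main(String[] args) {
        EventPool c = new EventPool();
        for (int i = 0; i < 1_000_000; i++) { c.put(i); c.pop(); }
        if (c.created != 1) throw new AssertionError("expected 1, got " + c.created);
        System.out.println("allocations for 1M events: " + c.created);
    }
}
```

With a put/pop workload like the benchmark's, the number of live events at any moment bounds the total allocations, which matches the drop from ~84 million throwaway objects to a few hundred.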
Re: Optimizations on Proton-j
I have some ideas as well:

- Calculating size prior to sending: we could write zeroes, write the body to the buffer, come back to the previous position, and write the size instead of calculating it. I have read this code a lot and I wouldn't rewrite it, just optimize these cases. TBH I wouldn't optimize it for every possible case, just on message delivery and settling, unless you want to also optimize other cases for use cases I'm not aware of at the moment.

Other things that could boost performance, based on the micro-benchmark I wrote:

- Using Integer, Long, etc. inside of UnsignedInt, UnsignedLong would give you a good boost in performance. The JDK is already optimized to box these types, while UnsignedInt, UnsignedLong, etc. are not well optimized.
- Reusing buffers: maybe adding a framework where we could reuse buffers, or delegating to other frameworks (e.g. Netty).
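The write-zeroes-then-backfill idea discussed in this thread can be sketched like this; BackfillEncoder and the list layout here are illustrative, not the actual proton-j codec:

```java
import java.nio.ByteBuffer;

// Sketch of "reserve and backfill": instead of computing the encoded size
// up front, leave a 4-byte hole, encode the body, then patch the hole with
// the actual size. One pass over the data instead of two.
public final class BackfillEncoder {
    public static ByteBuffer encodeList(ByteBuffer buf, long[] values) {
        int sizePos = buf.position();   // remember where the size field goes
        buf.putInt(0);                  // reserve the 4-byte size slot (zeroes)
        int bodyStart = buf.position();
        buf.putInt(values.length);      // count field
        for (long v : values) {
            buf.putLong(v);             // encode each element
        }
        int size = buf.position() - bodyStart;
        buf.putInt(sizePos, size);      // backfill: absolute put, position unchanged
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        encodeList(buf, new long[]{1, 2, 3});
        // size field = 4 (count) + 3 * 8 (elements) = 28
        if (buf.getInt(0) != 28) throw new AssertionError();
        System.out.println("backfilled size field: " + buf.getInt(0));
    }
}
```

The trade-off is that the size slot must be fixed-width (you can't pick the narrowest encoding after the fact), which is why it fits best on frames like Transfer and Disposition where a 4-byte size is acceptable.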
Re: Optimizations on Proton-j
These should all be cleared now. My github branch and PR are up to date now: https://github.com/apache/qpid-proton/pull/1 And isn't git beautiful: it's already rebased on Rafi's last commit!
Re: Optimizations on Proton-j
Hi Clebert,

I've been (amongst other things) doing a little bit of investigation on this topic over the past couple of days. I wrote a microbenchmark that takes two engines and directly wires their transports together. It then pumps about 10 million 1K messages from one engine to the other. I ran this benchmark under jprofiler, and codec definitely came up as a hot spot, but when I apply your patch I don't see any measurable difference in results: either way it takes about 40 seconds to pump all the messages through. I'm not quite sure what is going on, but I'm guessing either the code path you've optimized isn't coming up enough to make much of a difference, or I've somehow messed up the measurements. I will post the benchmark shortly, so hopefully you can check up on my measurements yourself.

On a more mundane note, Andrew pointed out that the new files you've added in your patch use an outdated license header. You can take a look at some existing files in the repo to get a current license header.

--Rafael

On Wed, Apr 30, 2014 at 2:15 PM, Clebert Suconic csuco...@redhat.com wrote: I just submitted it as a git PR: https://github.com/apache/qpid-proton/pull/1
Re: Optimizations on Proton-j
For now I have pretty much optimized the Transfer type only, no other types; for instance I see that the Disposition type needs optimization as well. For us, though, that is the biggest advantage of the patch I'm making. If you send a 1K message, you won't have much of the optimization on the codec being exercised. We could do 10 million Transfers in 3 seconds before, against 1.5 seconds with the patch, on my laptop. If transferring 10 million 1K messages is taking 40 seconds, that 1.5-second improvement would be spread across the delivery work and you wouldn't be able to see a difference. Why don't you try sending empty messages? Meaning, a message received with an empty body.

On May 1, 2014, at 4:44 PM, Rafael Schloming r...@alum.mit.edu wrote: [quoted above]
Re: Optimizations on Proton-j
Ah... I will fix the license headers shortly.

On May 1, 2014, at 4:51 PM, Clebert Suconic csuco...@redhat.com wrote: [quoted above]
Re: Optimizations on Proton-j
As no mail arrived here or on qpid-dev, and none seems to have arrived at what used to be the default location (infra-dev) either, I had a quick look, and it seems they might have changed the process slightly and we will need to ask for the mails to be enabled at all: https://blogs.apache.org/infra/entry/improved_integration_between_apache_and

I particularly like the mention of the new comment syncing between our mailing list and the Pull Requests. Regarding closing the pull requests, it seems something along the lines of "This closes #<request number> at GitHub" added to the end of the svn commit message should do the trick: https://help.github.com/articles/closing-issues-via-commit-messages

I haven't had a chance to really look at the actual code change, but when I was quickly scrolling down the PR, in addition to the licence headers on the new files that Rafi already mentioned (which I spotted due to the Copyright notices we wouldn't typically have), I noticed Encoder.java having its existing licence header corrupted a little by some wayward code.

Robbie
Re: Optimizations on Proton-j
I will do some cleanup on this. I already fixed the headers here on my copy, and I will do some cleanup on imports that weren't supposed to be there.

On May 1, 2014, at 5:11 PM, Robbie Gemmell robbie.gemm...@gmail.com wrote: [quoted above]
Re: Optimizations on Proton-j
@Rafi: I see there is a patch review process within Apache (based on your other thread on Java 8). Should we take this through the patch process at some point?
Re: Optimizations on Proton-j
On Wed, Apr 30, 2014 at 8:35 AM, Clebert Suconic csuco...@redhat.com wrote: @Rafi: I see there is a patch review process within Apache (based on your other thread on Java 8). Should we take this through the patch process at some point?

I'm fine looking at it on your git branch, but if you'd like to play with the review tool then feel free. Just let me know if you need an account and I will try to remember how to set one up (or who to bug to get you one). ;-)

--Rafael
Re: Optimizations on Proton-j
I think anyone can sign up for ReviewBoard themselves; it certainly didn't used to be linked to the ASF LDAP in the past, presumably for that reason. It's probably also worth noting you can initiate pull requests against the github mirrors. If it hasn't already been done for the proton mirror, we can have the emails that would generate be directed to this list (e.g. http://mail-archives.apache.org/mod_mbox/qpid-dev/201401.mbox/%3c20140130180355.3cf9e916...@tyr.zones.apache.org%3E). We obviously can't merge the pull request via github, but you can use the reviewing tools etc., and the resultant patch can be downloaded or attached to a JIRA and then applied in the usual fashion (I believe there is a commit message syntax that can be used to trigger closing the pull request).

Robbie
Re: Optimizations on Proton-j
On Tue, Apr 29, 2014 at 9:27 AM, Clebert Suconic csuco...@redhat.com wrote:

I have done some work last week on optimizing the codec, and I think I've gotten some interesting results:

- The Decoder is now stateless, meaning the same instance can be used over and over (no more need for one instance per connection). Bozo Dragojevic has actually seen how heavy it is to create a Decoder and recently optimized MessageImpl to always take the same instance through ThreadLocals. This optimization goes a step further.
- I have changed the ListDecoders so you won't need intermediate objects to parse types. For now I have only done that for the Transfer type, but I could do it for all the other types at some point.
- There were a few hotspots that I found on the test, and I have refactored accordingly, with no semantic changes.

As a result of these optimizations, DecoderImpl won't have a setBuffer method any longer. Instead, each read method will take a ReadableBuffer (in addition to the old signature). And talking about ReadableBuffer: I have introduced the ReadableBuffer interface. When integrating on the broker, I had a situation where I won't have a ByteBuffer, and this interface will allow me to further optimize the parser later, as I could use a Netty buffer (aka ByteBuf).

You will find these optimizations on my branch on github: https://github.com/clebertsuconic/qpid-proton/tree/optimizations

There I have two commits:

I - a micro benchmark, where I added a testcase doing a direct read on the buffer without any framework. I've actually written a simple parser that will work for the byte array I have, but that's very close to reading directly from the bytes. I used that to compare raw reading and interpreting the buffer against the current framework. I was actually concerned about the number of intermediate objects, so I used that to map these differences.
https://github.com/clebertsuconic/qpid-proton/commit/7b2b02649e5bdd35aa2e4cc487ffb91c01e75685

II - a commit with the actual optimizations: https://github.com/clebertsuconic/qpid-proton/commit/305ecc6aaa5192fc0a1ae42b90cb4eb8ddfe046e

Without these optimizations, my micro-benchmark, parsing 1000L instances of Transfer without reallocating any buffers, completed on my laptop in 3480 milliseconds, against 750 milliseconds with raw reading. After these optimizations: 1927 milliseconds, against 750 milliseconds with raw reading.

This sounds very promising, and an excellent excuse for me to dust off some of my somewhat rusty git skills. ;-)

Notice that this will also minimize the footprint of the codec, but I'm not measuring that here.

Just out of curiosity, when you say minimize the footprint, are you referring to the in-memory overhead, or do you mean the encoded size on the wire?

I'm looking forward to working with this group. I actually had a meeting with Rafi and Ted last week, and I plan to work more closely with you guys on this.

Excellent! I'm also looking forward to digging in a bit more on the Java side of things.

--Rafael
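The ReadableBuffer abstraction mentioned above might look roughly like this; the method names are illustrative, since the actual interface in the patch may differ:

```java
import java.nio.ByteBuffer;

// Minimal sketch of the ReadableBuffer idea: the decoder reads through
// this interface instead of a concrete ByteBuffer, so a Netty ByteBuf
// (or any other buffer) can be adapted behind it later.
public interface ReadableBuffer {
    byte get();
    int getInt();
    boolean hasRemaining();

    // Adapter for the common NIO case.
    static ReadableBuffer wrap(ByteBuffer buf) {
        return new ReadableBuffer() {
            public byte get() { return buf.get(); }
            public int getInt() { return buf.getInt(); }
            public boolean hasRemaining() { return buf.hasRemaining(); }
        };
    }

    static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.allocate(8);
        bb.putInt(42);
        bb.flip();
        ReadableBuffer rb = wrap(bb);
        System.out.println("read through the interface: " + rb.getInt());
    }
}
```

A second adapter over ByteBuf would be the Netty integration point; the decoder itself never needs to know which implementation it is reading from.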
Re: Optimizations on Proton-j
On Apr 29, 2014, at 4:00 PM, Rafael Schloming r...@alum.mit.edu wrote:

Notice that this will also minimize the footprint of the codec, but I'm not measuring that here.

Just out of curiosity, when you say minimize the footprint, are you referring to the in-memory overhead, or do you mean the encoded size on the wire?

I didn't change any encoding sizes in this refactoring; the code should produce the same bytes as before. I meant memory overhead. Before my changes you had to instantiate a Decoder per Connection, and a Decoder per Message (that was actually recently changed to per thread). The decoder holds the DefinedTypes and registered objects; with this change you can have a single instance of the Decoder and Encoder per JVM, since it's stateless now.
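The stateless-decoder point can be sketched as follows; the names are illustrative, not the real DecoderImpl API. The essence is that no buffer is stored on the instance, so one shared instance is safe everywhere:

```java
import java.nio.ByteBuffer;

// Sketch: instead of setBuffer(...) plus per-connection instances, every
// read method takes the buffer as a parameter. The instance holds no
// mutable state, so a single shared instance per JVM is enough.
public final class StatelessDecoder {
    public static final StatelessDecoder INSTANCE = new StatelessDecoder();

    // old pattern (stateful):  decoder.setBuffer(buf); decoder.readInt();
    // new pattern (stateless): decoder.readInt(buf);
    public int readInt(ByteBuffer buf) { return buf.getInt(); }
    public long readLong(ByteBuffer buf) { return buf.getLong(); }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(12);
        buf.putInt(7);
        buf.putLong(9L);
        buf.flip();
        System.out.println(INSTANCE.readInt(buf) + ", " + INSTANCE.readLong(buf));
    }
}
```

In the real codec the instance would also carry the (immutable) type tables, which is exactly why sharing it across threads becomes safe once the buffer stops being a field.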