Re: [infinispan-dev] Store as binary
On Feb 4, 2014, at 7:14 AM, Galder Zamarreño gal...@redhat.com wrote:
On 21 Jan 2014, at 17:45, Mircea Markus mmar...@redhat.com wrote:
On Jan 21, 2014, at 2:13 PM, Sanne Grinovero sa...@infinispan.org wrote:
On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:
On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:

What's the point of these tests?

+1

To validate whether storing the data in binary format yields better performance than storing it as a POJO.

That will highly depend on the scenarios you want to test for. AFAIK this started after Paul described how session replication works in WildFly, and we already know that both strategies are suboptimal with the current options available: in his case the active node will always write to the POJO, while the backup node essentially only needs to store the buffer in case it has to take over.

Indeed, as it is today, it doesn't make sense for WildFly's session replication.

Sure, one will be slower, but if you want to make a suggestion to him about which configuration he should be using, we should measure his use case, not a different one. Even then, as discussed in Palma, an in-memory String representation might be way more compact because of pooling of strings and a very high likelihood of repeated headers (as common in web frameworks),

Pooling like in String.intern()? Even so, if most of your access to the String is to serialize it and send it remotely, then you have a serialization cost (CPU) to pay for the reduced size.

Serialization has a cost, but nothing compared with the transport itself, and you don’t have to go very far to see the impact of transport. Just recently we were chasing a performance regression and even though there were some changes in serialization, the impact of my improvements was minimal, max 2-3%. Optimal network and transport configuration is more important IMO, and once again, misconfiguration in that layer is what was causing us to be ~20% slower.

Yes, I didn't expect huge improvements from storeAsBinary, but at least some improvement, given that a lot of serialization shouldn't happen in the tested scenario. A 2-3% improvement wouldn't hurt, though :-)

Cheers,
-- Mircea Markus Infinispan lead (www.infinispan.org)
___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Store as binary
20 % writes, 80 % reads

Radim

On 01/29/2014 03:20 PM, Paul Ferraro wrote:

What was the read/write ratio used for this test?

On Fri, 2014-01-17 at 14:06 +0100, Radim Vansa wrote:

Hi Mircea, I've run a simple stress test [1] in dist mode with store as binary (not enabled, enabled for keys only, enabled for values only, enabled for both). The difference is 2 % (with storeAsBinary fully enabled being slower).

Radim

[1] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/1/artifact/report/All_report.html

-- Radim Vansa rva...@redhat.com JBoss DataGrid QA
Re: [infinispan-dev] Store as binary
On Jan 21, 2014, at 1:36 PM, Sanne Grinovero sa...@infinispan.org wrote:

What's the point of these tests?

+1

On 20 Jan 2014 15:48, Radim Vansa rva...@redhat.com wrote:

OK, I have results for dist-udp-no-tx and local-no-tx modes on 8 nodes (in local mode the nodes don't communicate, naturally):

Dist mode: 3 % down for reads, 1 % for writes
Local mode: 19 % down for reads, 16 % for writes

Details in [1]; the figures above are for both keys and values stored as binary.

Radim

[1] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/4/artifact/report/All_report.html

On 01/20/2014 11:14 AM, Pedro Ruivo wrote:
On 01/20/2014 10:07 AM, Mircea Markus wrote:

Would be interesting to see as well, though the performance figures would not include network latency, so they would not tell us much about the benefit of using this on a real-life system.

That's my point. I'm interested in seeing the worst-case scenario, since all other cluster modes will have a lower (or no) impact on performance. Of course, the best scenario would be one where each node only accesses remote keys...

Pedro

On Jan 20, 2014, at 9:48 AM, Pedro Ruivo pe...@infinispan.org wrote:

IMO, we should try the worst-case scenario: Local Mode + Single thread. This will show us the highest impact on performance. Cheers,

-- Radim Vansa rva...@redhat.com JBoss DataGrid QA

-- Galder Zamarreño gal...@redhat.com twitter.com/galderz Project Lead, Escalante http://escalante.io Engineer, Infinispan http://infinispan.org
Re: [infinispan-dev] Store as binary
On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:
On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:

What's the point of these tests?

+1

To validate whether storing the data in binary format yields better performance than storing it as a POJO.

That will highly depend on the scenarios you want to test for. AFAIK this started after Paul described how session replication works in WildFly, and we already know that both strategies are suboptimal with the current options available: in his case the active node will always write to the POJO, while the backup node essentially only needs to store the buffer in case it has to take over. Sure, one will be slower, but if you want to make a suggestion to him about which configuration he should be using, we should measure his use case, not a different one. Even then, as discussed in Palma, an in-memory String representation might be way more compact because of pooling of strings and a very high likelihood of repeated headers (as common in web frameworks), so you might want to measure the CPU vs storage cost on the receiving side. But then again, your results will definitely depend on the input data and on assumptions about the likelihood of failover, how often data is written on the owner node vs on the other node (since he uses locality), etc. These are many factors I'm not seeing considered here and which could make a significant difference.

As of now, it doesn't so I need to check why.

You could play with the test parameters until it produces an output you like better, but I still see no point? This is not a realistic scenario. At best it could help us document suggestions about which scenarios you'd want to keep the option enabled vs disabled for, but then again I think we're wasting time, as we could implement a better strategy for Paul's use case: one which never deserializes a value received from a remote node until it's been requested as a POJO, but keeps the POJO as-is when it's stored locally. I believe that would also make sense for OGM and probably most other users of Embedded. Basically, that would re-implement something similar to the previous design but simplify it a bit, so that it doesn't allow a back-and-forth conversion between storage types but rather dynamically favors a specific storage strategy.

Cheers, Sanne

Cheers,
-- Mircea Markus Infinispan lead (www.infinispan.org)
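The strategy Sanne describes (keep a locally stored POJO as-is, keep a remotely received value as a buffer until someone asks for the POJO, and never convert back) can be sketched roughly as below. This is only an illustration under stated assumptions: `LazyValue`, `fromLocal`, `fromRemote` and `isDeserialized` are hypothetical names, plain Java serialization stands in for whatever marshaller Infinispan would use, and nothing here reflects actual Infinispan internals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Objects;

/** Hypothetical holder: stays binary until first read, then stays a POJO. */
final class LazyValue<T extends Serializable> {
    private T pojo;        // non-null once stored locally or read at least once
    private byte[] bytes;  // non-null only while the value is still in wire form

    private LazyValue(T pojo, byte[] bytes) {
        this.pojo = pojo;
        this.bytes = bytes;
    }

    /** Value written on this node: keep the POJO as-is, never serialize eagerly. */
    static <T extends Serializable> LazyValue<T> fromLocal(T pojo) {
        return new LazyValue<>(Objects.requireNonNull(pojo), null);
    }

    /** Value received from another node: keep the buffer, do not deserialize yet. */
    static <T extends Serializable> LazyValue<T> fromRemote(byte[] wire) {
        return new LazyValue<>(null, wire.clone());
    }

    /** The first read of a remote value pays the deserialization cost, once. */
    @SuppressWarnings("unchecked")
    synchronized T get() {
        if (pojo == null) {
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                pojo = (T) in.readObject();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
            bytes = null; // one-way switch: no conversion back to binary storage
        }
        return pojo;
    }

    /** Wire form for replication; serialized on demand when only the POJO is held. */
    synchronized byte[] toBytes() {
        if (bytes != null) {
            return bytes.clone();
        }
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(pojo);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    synchronized boolean isDeserialized() {
        return pojo != null;
    }

    public static void main(String[] args) {
        byte[] wire = LazyValue.fromLocal("session-42").toBytes(); // active node
        LazyValue<String> backup = LazyValue.fromRemote(wire);      // backup node
        System.out.println(backup.isDeserialized()); // false: still a buffer
        System.out.println(backup.get());            // session-42
        System.out.println(backup.isDeserialized()); // true: stays a POJO from now on
    }
}
```

In this sketch a backup node that never takes over never pays for deserialization, while the active node never pays for serialization except when it actually replicates.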
Re: [infinispan-dev] Store as binary
On Jan 21, 2014, at 2:13 PM, Sanne Grinovero sa...@infinispan.org wrote:
On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:
On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:

What's the point of these tests?

+1

To validate whether storing the data in binary format yields better performance than storing it as a POJO.

That will highly depend on the scenarios you want to test for. AFAIK this started after Paul described how session replication works in WildFly, and we already know that both strategies are suboptimal with the current options available: in his case the active node will always write to the POJO, while the backup node essentially only needs to store the buffer in case it has to take over.

Indeed, as it is today, it doesn't make sense for WildFly's session replication.

Sure, one will be slower, but if you want to make a suggestion to him about which configuration he should be using, we should measure his use case, not a different one. Even then, as discussed in Palma, an in-memory String representation might be way more compact because of pooling of strings and a very high likelihood of repeated headers (as common in web frameworks),

Pooling like in String.intern()? Even so, if most of your access to the String is to serialize it and send it remotely, then you have a serialization cost (CPU) to pay for the reduced size.

so you might want to measure the CPU vs storage cost on the receiving side. But then again, your results will definitely depend on the input data and on assumptions about the likelihood of failover, how often data is written on the owner node vs on the other node (since he uses locality), etc. These are many factors I'm not seeing considered here and which could make a significant difference.

I'm looking for the default setting of storeAsBinary in the configurations we ship. I think the default configs should be optimized for distribution with random key access (any read or write for any key is equally likely to execute on any node of the cluster), for both reads and writes.

As of now, it doesn't so I need to check why.

You could play with the test parameters until it produces an output you like better, but I still see no point?

The point is to provide the best default parameters for the default config, and to see how useful storeAsBinary is.

This is not a realistic scenario. At best it could help us document suggestions about which scenarios you'd want to keep the option enabled vs disabled for, but then again I think we're wasting time, as we could implement a better strategy for Paul's use case: one which never deserializes a value received from a remote node until it's been requested as a POJO, but keeps the POJO as-is when it's stored locally.

I disagree: Paul's scenario, whilst very important, is quite specific. For what I consider the general case (random key access, see above), your approach is suboptimal.

I believe that would also make sense for OGM and probably most other users of Embedded. Basically, that would re-implement something similar to the previous design but simplify it a bit, so that it doesn't allow a back-and-forth conversion between storage types but rather dynamically favors a specific storage strategy.

It all boils down to what we want to optimize for: random key access or some degree of affinity. I think the former is the default. One way or the other, in the test Radim ran with random key access, storeAsBinary doesn't bring any benefit, and it should: http://lists.jboss.org/pipermail/infinispan-dev/2009-October/004299.html

Cheers, Sanne

Cheers,
-- Mircea Markus Infinispan lead (www.infinispan.org)
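The pooling argument from this exchange can be illustrated with a small, self-contained sketch. It assumes the repeated header values arrive as fresh String instances (as if parsed from separate requests) and compares how many distinct objects remain with and without `String.intern()`. The class and method names are hypothetical, and `binaryBytes` is a deliberately crude lower bound for binary storage: it counts only the UTF-8 payload and ignores any marshaller framing.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of pooled vs unpooled repeated header values. */
final class HeaderPooling {

    /** Simulates 'count' requests that each carry the same header value. */
    static List<String> parsedCopies(String header, int count, boolean pooled) {
        List<String> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            String fresh = new String(header.toCharArray()); // a new instance every time
            values.add(pooled ? fresh.intern() : fresh);
        }
        return values;
    }

    /** Counts distinct objects by identity, not by equals(). */
    static int distinctInstances(List<String> values) {
        Set<String> identities =
                Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        identities.addAll(values);
        return identities.size();
    }

    /** Payload bytes if every entry stores its own binary copy of the header. */
    static long binaryBytes(String header, int count) {
        return (long) count * header.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String header = "text/html; charset=UTF-8";
        System.out.println(distinctInstances(parsedCopies(header, 1000, false))); // 1000
        System.out.println(distinctInstances(parsedCopies(header, 1000, true)));  // 1
        System.out.println(binaryBytes(header, 1000)); // 24000
    }
}
```

Mircea's counterpoint still applies to this sketch: the single pooled instance has to be serialized again every time it is sent to another node, so the memory saving is paid for in CPU.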
Re: [infinispan-dev] Store as binary
Hi Radim,

I think 4 nodes with numOwners=2 is too small a cluster. My calculation here [1] suggests that for numOwners=1, the performance benefit is only visible for clusters having more than two nodes. Following similar logic for numOwners=2, the benefit would only be visible for clusters having more than 4 nodes. Would it be possible to run the test on a larger cluster, 8+ nodes?

[1] http://lists.jboss.org/pipermail/infinispan-dev/2009-October/004299.html

On Jan 17, 2014, at 1:06 PM, Radim Vansa rva...@redhat.com wrote:

Hi Mircea, I've run a simple stress test [1] in dist mode with store as binary (not enabled, enabled for keys only, enabled for values only, enabled for both). The difference is 2 % (with storeAsBinary fully enabled being slower).

Radim

[1] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/1/artifact/report/All_report.html

-- Radim Vansa rva...@redhat.com JBoss DataGrid QA

Cheers,
-- Mircea Markus Infinispan lead (www.infinispan.org)
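One way to read the cluster-size argument, assuming uniformly random key access and evenly spread ownership: a request reaches an owner of its key with probability numOwners / clusterSize, and everything else goes over the network. The thresholds quoted above (more than 2 nodes for numOwners=1, more than 4 for numOwners=2) match the point where remote requests become the majority. The sketch below only does that arithmetic. It is not the derivation from the linked 2009 post, and `OwnershipOdds` and its methods are hypothetical names.

```java
/** Hypothetical helper: share of local vs remote requests under random key access. */
final class OwnershipOdds {

    /** Probability that the node handling a request is one of the key's owners. */
    static double localFraction(int clusterSize, int numOwners) {
        if (clusterSize < 1 || numOwners < 1) {
            throw new IllegalArgumentException("clusterSize and numOwners must be >= 1");
        }
        return Math.min(1.0, (double) numOwners / clusterSize);
    }

    /** Probability that the request has to go to another node. */
    static double remoteFraction(int clusterSize, int numOwners) {
        return 1.0 - localFraction(clusterSize, numOwners);
    }

    public static void main(String[] args) {
        int[][] cases = { {2, 1}, {4, 2}, {8, 2} }; // {clusterSize, numOwners}
        for (int[] c : cases) {
            System.out.printf("nodes=%d numOwners=%d local=%.2f remote=%.2f%n",
                    c[0], c[1], localFraction(c[0], c[1]), remoteFraction(c[0], c[1]));
        }
    }
}
```

Under these assumptions the 4-node, numOwners=2 run sits exactly at the break-even point (half of all requests are local), while an 8-node cluster with numOwners=2 sends three quarters of its requests to a remote owner.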
Re: [infinispan-dev] Store as binary
Hi,

IMO, we should try the worst-case scenario: Local Mode + Single thread. This will show us the highest impact on performance.

Cheers, Pedro

On 01/20/2014 09:41 AM, Mircea Markus wrote:

Hi Radim, I think 4 nodes with numOwners=2 is too small a cluster. My calculation here [1] suggests that for numOwners=1, the performance benefit is only visible for clusters having more than two nodes. Following similar logic for numOwners=2, the benefit would only be visible for clusters having more than 4 nodes. Would it be possible to run the test on a larger cluster, 8+ nodes?

[1] http://lists.jboss.org/pipermail/infinispan-dev/2009-October/004299.html

On Jan 17, 2014, at 1:06 PM, Radim Vansa rva...@redhat.com wrote:

Hi Mircea, I've run a simple stress test [1] in dist mode with store as binary (not enabled, enabled for keys only, enabled for values only, enabled for both). The difference is 2 % (with storeAsBinary fully enabled being slower).

Radim

[1] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/1/artifact/report/All_report.html

-- Radim Vansa rva...@redhat.com JBoss DataGrid QA
Re: [infinispan-dev] Store as binary
Would be interesting to see as well, though the performance figures would not include network latency, so they would not tell us much about the benefit of using this on a real-life system.

On Jan 20, 2014, at 9:48 AM, Pedro Ruivo pe...@infinispan.org wrote:

IMO, we should try the worst-case scenario: Local Mode + Single thread. This will show us the highest impact on performance. Cheers,

-- Mircea Markus Infinispan lead (www.infinispan.org)
Re: [infinispan-dev] Store as binary
OK, I have results for dist-udp-no-tx and local-no-tx modes on 8 nodes (in local mode the nodes don't communicate, naturally):

Dist mode: 3 % down for reads, 1 % for writes
Local mode: 19 % down for reads, 16 % for writes

Details in [1]; the figures above are for both keys and values stored as binary.

Radim

[1] https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/4/artifact/report/All_report.html

On 01/20/2014 11:14 AM, Pedro Ruivo wrote:
On 01/20/2014 10:07 AM, Mircea Markus wrote:

Would be interesting to see as well, though the performance figures would not include network latency, so they would not tell us much about the benefit of using this on a real-life system.

That's my point. I'm interested in seeing the worst-case scenario, since all other cluster modes will have a lower (or no) impact on performance. Of course, the best scenario would be one where each node only accesses remote keys...

Pedro

On Jan 20, 2014, at 9:48 AM, Pedro Ruivo pe...@infinispan.org wrote:

IMO, we should try the worst-case scenario: Local Mode + Single thread. This will show us the highest impact on performance. Cheers,

-- Radim Vansa rva...@redhat.com JBoss DataGrid QA