Re: [infinispan-dev] Store as binary

2014-02-05 Thread Mircea Markus

On Feb 4, 2014, at 7:14 AM, Galder Zamarreño gal...@redhat.com wrote:

 On 21 Jan 2014, at 17:45, Mircea Markus mmar...@redhat.com wrote:
 
 
 On Jan 21, 2014, at 2:13 PM, Sanne Grinovero sa...@infinispan.org wrote:
 
 On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:
 
 On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:
 
 What's the point for these tests?
 
 +1
 
 To validate whether storing the data in binary format yields better performance
 than storing it as a POJO.
 
 That will highly depend on the scenarios you want to test for. AFAIK
 this started after Paul described how session replication works in
 WildFly, and we already know that both strategies are suboptimal with
 the current options available: in his case the active node will always
 write on the POJO, while the backup node will essentially only need to
 store the buffer just in case he might need to take over.
 
 Indeed as it is today, it doesn't make sense for WildFly's session 
 replication.
 
 
 Sure, one will be slower, but if you want to make a suggestion to him
 about which configuration he should be using, we should measure his
 use case, not a different one.
 
 Even then as discussed in Palma, an in memory String representation
 might be way more compact because of pooling of strings and a very
 high likelihood for repeated headers (as common in web frameworks),
 
 pooling like in String.intern()? 
 Even so, if most of your access to the String is to serialize it and send it
 remotely, then you have a serialization cost (CPU) to pay for the reduced size.
 
 Serialization has a cost, but nothing compared with the transport itself, and 
 you don’t have to go very far to see the impact of transport. Just recently 
 we were chasing some performance regression and even though there were some 
 changes in serialization, the impact of my improvements was minimal, max 
 2-3%. Optimal network and transport configuration is more important IMO, and 
 once again, misconfiguration in that layer is what was causing us to be ~20% 
 slower.

Yes, I didn't expect huge improvements from storeAsBinary, but at least some
improvement, since a lot of serialization shouldn't happen in
the tested scenario. A 2-3% improvement wouldn't hurt, though :-)

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)





___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Store as binary

2014-01-29 Thread Radim Vansa
20 % writes, 80 % reads

Radim

On 01/29/2014 03:20 PM, Paul Ferraro wrote:
 What was the read/write ratio used for this test?

 On Fri, 2014-01-17 at 14:06 +0100, Radim Vansa wrote:
 Hi Mircea,

 I've run a simple stress test [1] in dist mode with store as binary (not
 enabled, enabled keys only, enabled values only, enabled both).
 The difference is  2 % (with storeAsBinary enabled fully being slower).

 Radim

 [1]
 https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/1/artifact/report/All_report.html




-- 
Radim Vansa rva...@redhat.com
JBoss DataGrid QA



Re: [infinispan-dev] Store as binary

2014-01-21 Thread Galder Zamarreño

On Jan 21, 2014, at 1:36 PM, Sanne Grinovero sa...@infinispan.org wrote:

 What's the point for these tests? 

+1

 On 20 Jan 2014 15:48, Radim Vansa rva...@redhat.com wrote:
 OK, I have results for dist-udp-no-tx or local-no-tx modes on 8 nodes
 (in local mode the nodes don't communicate, naturally):
 Dist mode: 3 % down for reads, 1 % for writes
 Local mode: 19 % down for reads, 16 % for writes
 
 Details in [1], ^ is for both keys and values stored as binary.
 
 Radim
 
 [1]
 https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/4/artifact/report/All_report.html
 
 On 01/20/2014 11:14 AM, Pedro Ruivo wrote:
 
  On 01/20/2014 10:07 AM, Mircea Markus wrote:
  Would be interesting to see as well, though the performance figures would not
  include network latency, so they would not tell us much about the
  benefit of using this on a real-life system.
  That's my point. I'm interested in seeing the worst-case scenario, since all
  other cluster modes will have a lower (or no) impact on performance.
 
  Of course, the best scenario would be one where each node only accesses
  remote keys...
 
  Pedro
 
  On Jan 20, 2014, at 9:48 AM, Pedro Ruivo pe...@infinispan.org wrote:
 
  IMO, we should try the worst scenario: Local Mode + Single thread.
 
  this will show us the highest impact in performance.
  Cheers,
 
 
 
 --
 Radim Vansa rva...@redhat.com
 JBoss DataGrid QA
 


--
Galder Zamarreño
gal...@redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org




Re: [infinispan-dev] Store as binary

2014-01-21 Thread Sanne Grinovero
On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:

 On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:

 What's the point for these tests?

 +1

 To validate whether storing the data in binary format yields better performance
 than storing it as a POJO.

That will highly depend on the scenarios you want to test for. AFAIK
this started after Paul described how session replication works in
WildFly, and we already know that both strategies are suboptimal with
the current options available: in his case the active node will always
write on the POJO, while the backup node will essentially only need to
store the buffer just in case he might need to take over.

Sure, one will be slower, but if you want to make a suggestion to him
about which configuration he should be using, we should measure his
use case, not a different one.

Even then as discussed in Palma, an in memory String representation
might be way more compact because of pooling of strings and a very
high likelihood for repeated headers (as common in web frameworks), so
you might want to measure the CPU vs storage cost on the receiving
side.. but then again your results will definitely depend on the input
data and assumptions about the likelihood of failover, how often it is being
written on the owner node vs. on the other node (since he uses
locality), etc.. many factors I'm not seeing being considered here and
which could make a significant difference.

 As of now, it doesn't so I need to check why.

You could play with the test parameters until it produces an output
you like better, but I still see no point? This is not a realistic
scenario, at best it could help us document suggestions about which
scenarios you'd want to keep the option enabled vs disabled, but then
again I think we're wasting time as we could implement a better
strategy for Paul's use case: one which never deserializes a value
received from a remote node until it's been requested as a POJO, but
keeps the POJO as-is when it's stored locally. I believe that would
make sense also for OGM and probably most other users of Embedded.
Basically, that would re-implement something similar to the previous
design but simplifying it a bit so that it doesn't allow for a
back-and-forth conversion between storage types but rather dynamically
favors a specific storage strategy.
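
For illustration only, the strategy described above could be sketched roughly as follows in plain Java. LazyValue and its methods are hypothetical names, not Infinispan classes, and java.io serialization merely stands in for whatever marshaller would really be used. A value received from a remote node stays a byte buffer until someone asks for the POJO, a value stored locally stays a POJO, and the conversion only ever goes one way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Objects;

// Hypothetical sketch, not an Infinispan class. Null values are not supported.
final class LazyValue<T extends Serializable> {
    private T pojo;        // present for local writes, or after the first get()
    private byte[] bytes;  // present for remote writes until the first get()

    private LazyValue(T pojo, byte[] bytes) {
        this.pojo = pojo;
        this.bytes = bytes;
    }

    /** Value written on this node: keep the POJO as-is, no serialization. */
    static <T extends Serializable> LazyValue<T> fromLocal(T pojo) {
        return new LazyValue<>(Objects.requireNonNull(pojo), null);
    }

    /** Value received from another node: keep the buffer, do not deserialize. */
    static <T extends Serializable> LazyValue<T> fromRemote(byte[] bytes) {
        return new LazyValue<>(null, Objects.requireNonNull(bytes).clone());
    }

    synchronized boolean isDeserialized() {
        return pojo != null;
    }

    /** Deserializes at most once, then drops the buffer (no back-and-forth). */
    @SuppressWarnings("unchecked")
    synchronized T get() {
        if (pojo == null) {
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                pojo = (T) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException("cannot deserialize value", e);
            }
            bytes = null;
        }
        return pojo;
    }

    /** Stand-in for the marshaller used when sending a value to a backup. */
    static byte[] serialize(Serializable value) {
        Objects.requireNonNull(value);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException("cannot serialize value", e);
        }
        return buffer.toByteArray();
    }
}
```

In this sketch a backup node would call fromRemote() and pay no deserialization cost unless it takes over and someone calls get(), while the active node would call fromLocal() and never touch the byte form for local reads.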

Cheers,
Sanne


 Cheers,
 --
 Mircea Markus
 Infinispan lead (www.infinispan.org)







Re: [infinispan-dev] Store as binary

2014-01-21 Thread Mircea Markus

On Jan 21, 2014, at 2:13 PM, Sanne Grinovero sa...@infinispan.org wrote:

 On 21 January 2014 13:37, Mircea Markus mmar...@redhat.com wrote:
 
 On Jan 21, 2014, at 1:21 PM, Galder Zamarreño gal...@redhat.com wrote:
 
 What's the point for these tests?
 
 +1
 
 To validate whether storing the data in binary format yields better performance
 than storing it as a POJO.
 
 That will highly depend on the scenarios you want to test for. AFAIK
 this started after Paul described how session replication works in
 WildFly, and we already know that both strategies are suboptimal with
 the current options available: in his case the active node will always
 write on the POJO, while the backup node will essentially only need to
 store the buffer just in case he might need to take over.

Indeed as it is today, it doesn't make sense for WildFly's session replication.

 
 Sure, one will be slower, but if you want to make a suggestion to him
 about which configuration he should be using, we should measure his
 use case, not a different one.
 
 Even then as discussed in Palma, an in memory String representation
 might be way more compact because of pooling of strings and a very
 high likelihood for repeated headers (as common in web frameworks),

pooling like in String.intern()? 
Even so, if most of your access to the String is to serialize it and send it
remotely, then you have a serialization cost (CPU) to pay for the reduced size.
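
A small, hedged illustration of the pooling question in plain Java (HeaderPooling and its methods are made-up names for this sketch): equal strings created independently, for example by deserialization, are distinct objects on the heap, while String.intern() maps them to one canonical instance. It shows only the object count, not the CPU cost of interning or serializing.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical helper for illustration only.
final class HeaderPooling {
    /** Counts distinct String objects by identity, not by equals(). */
    static int distinctInstances(String[] values) {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        Collections.addAll(identities, values);
        return identities.size();
    }

    /** Returns a copy in which every element is the canonical interned instance. */
    static String[] interned(String[] values) {
        String[] result = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            result[i] = values[i].intern();
        }
        return result;
    }

    public static void main(String[] args) {
        // Three header names as they might arrive in separate requests.
        String[] headers = {
            new String("Content-Type"), new String("Content-Type"), new String("Accept")
        };
        System.out.println(distinctInstances(headers));           // separate objects
        System.out.println(distinctInstances(interned(headers))); // pooled objects
    }
}
```

Running main prints 3 and then 2: the two "Content-Type" copies collapse into one instance once interned.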

 so
 you might want to measure the CPU vs storage cost on the receiving
 side.. but then again your results will definitely depend on the input
 data and assumptions about the likelihood of failover, how often it is being
 written on the owner node vs. on the other node (since he uses
 locality), etc.. many factors I'm not seeing being considered here and
 which could make a significant difference.

I'm trying to determine the default setting of storeAsBinary for the configurations we
ship. I think the default configs should be optimized for distribution with random
key access (a read or write for any key is equally likely to execute on any node of
the cluster), for both reads and writes.

 
 As of now, it doesn't so I need to check why.
 
 You could play with the test parameters until it produces an output
 you like better, but I still see no point?

The point is to provide the best default parameters for the default config, and
to see how useful storeAsBinary actually is.

 This is not a realistic
 scenario, at best it could help us document suggestions about which
 scenarios you'd want to keep the option enabled vs disabled, but then
 again I think we're wasting time as we could implement a better
 strategy for Paul's use case: one which never deserializes a value
 received from a remote node until it's been requested as a POJO, but
 keeps the POJO as-is when it's stored locally.

I disagree: Paul's scenario, whilst very important, is quite specific. For what 
I consider the general case (random key access, see above), your approach is 
suboptimal.  


 I believe that would
 make sense also for OGM and probably most other users of Embedded.
 Basically, that would re-implement something similar to the previous
 design but simplifying it a bit so that it doesn't allow for a
 back-and-forth conversion between storage types but rather dynamically
 favors a specific storage strategy.

It all boils down to what we want to optimize for: random key access or some 
degree of affinity. I think the former is the default.
One way or the other, from the test Radim ran with random key access, the 
storeAsBinary doesn't bring any benefit and it should: 
http://lists.jboss.org/pipermail/infinispan-dev/2009-October/004299.html

 
 Cheers,
 Sanne
 
 
 Cheers,
 --
 Mircea Markus
 Infinispan lead (www.infinispan.org)
 
 
 
 
 

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)







Re: [infinispan-dev] Store as binary

2014-01-20 Thread Mircea Markus
Hi Radim,

I think 4 nodes with numOwners=2 is too small a cluster. My calculation here [1]
suggests that for numOwners=1, the performance benefit is only visible for
clusters with more than two nodes. Following similar logic for numOwners=2,
the benefit would only be visible for clusters with more than 4 nodes. Would
it be possible to run the test on a larger cluster, 8+ nodes?

[1] http://lists.jboss.org/pipermail/infinispan-dev/2009-October/004299.html
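
One possible reading of that rule of thumb, as a small Java sketch. The class and method names are hypothetical, and the break-even condition (storeAsBinary pays off once more than half of the operations are served remotely) is an assumption made for this sketch, not something taken from [1] or measured here. With uniformly random key access, the chance that the requesting node owns the key is numOwners / clusterSize.

```java
// Hypothetical helper for illustration, not part of Infinispan.
final class RemoteAccessEstimate {
    /** Fraction of uniformly random key accesses that land on a non-owner node. */
    static double remoteFraction(int clusterSize, int numOwners) {
        if (clusterSize < 1 || numOwners < 1) {
            throw new IllegalArgumentException("clusterSize and numOwners must be >= 1");
        }
        // A key cannot have more owners than there are nodes.
        int owners = Math.min(numOwners, clusterSize);
        return 1.0 - (double) owners / clusterSize;
    }

    /** Assumed break-even: more than half of the accesses are remote. */
    static boolean mostlyRemote(int clusterSize, int numOwners) {
        int owners = Math.min(numOwners, clusterSize);
        return clusterSize > 2 * owners;
    }

    public static void main(String[] args) {
        for (int nodes : new int[] {2, 4, 8}) {
            System.out.printf("nodes=%d numOwners=2 remote=%.2f mostlyRemote=%b%n",
                    nodes, remoteFraction(nodes, 2), mostlyRemote(nodes, 2));
        }
    }
}
```

Under that assumption, 4 nodes with numOwners=2 sits exactly at the break-even point (half of the accesses are local), whereas 8 nodes gives 75% remote accesses.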

On Jan 17, 2014, at 1:06 PM, Radim Vansa rva...@redhat.com wrote:

 Hi Mircea,
 
 I've run a simple stress test [1] in dist mode with store as binary (not 
 enabled, enabled keys only, enabled values only, enabled both).
 The difference is  2 % (with storeAsBinary enabled fully being slower).
 
 Radim
 
 [1] 
 https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/1/artifact/report/All_report.html
 
 -- 
 Radim Vansa rva...@redhat.com
 JBoss DataGrid QA
 

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)







Re: [infinispan-dev] Store as binary

2014-01-20 Thread Pedro Ruivo
Hi,

IMO, we should try the worst-case scenario: Local Mode + Single thread.

This will show us the highest impact on performance.

Cheers,
Pedro

On 01/20/2014 09:41 AM, Mircea Markus wrote:
 Hi Radim,

 I think 4 nodes with numOwners=2 is too small a cluster. My calculation
 here [1] suggests that for numOwners=1, the performance benefit is only
 visible for clusters with more than two nodes. Following similar logic
 for numOwners=2, the benefit would only be visible for clusters with more
 than 4 nodes. Would it be possible to run the test on a larger cluster, 8+
 nodes?

 [1] http://lists.jboss.org/pipermail/infinispan-dev/2009-October/004299.html

 On Jan 17, 2014, at 1:06 PM, Radim Vansa rva...@redhat.com wrote:

 Hi Mircea,

 I've run a simple stress test [1] in dist mode with store as binary (not
 enabled, enabled keys only, enabled values only, enabled both).
 The difference is  2 % (with storeAsBinary enabled fully being slower).

 Radim

 [1]
 https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/1/artifact/report/All_report.html

 --
 Radim Vansa rva...@redhat.com
 JBoss DataGrid QA


 Cheers,



Re: [infinispan-dev] Store as binary

2014-01-20 Thread Mircea Markus
Would be interesting to see as well, though the performance figures would not
include network latency, so they would not tell us much about the benefit of
using this on a real-life system.

On Jan 20, 2014, at 9:48 AM, Pedro Ruivo pe...@infinispan.org wrote:

 IMO, we should try the worst scenario: Local Mode + Single thread.
 
 this will show us the highest impact in performance.

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)







Re: [infinispan-dev] Store as binary

2014-01-20 Thread Radim Vansa
OK, I have results for dist-udp-no-tx or local-no-tx modes on 8 nodes 
(in local mode the nodes don't communicate, naturally):
Dist mode: 3 % down for reads, 1 % for writes
Local mode: 19 % down for reads, 16 % for writes

Details in [1], ^ is for both keys and values stored as binary.

Radim

[1] 
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/job/jdg-radargun-perf-store-as-binary/4/artifact/report/All_report.html

On 01/20/2014 11:14 AM, Pedro Ruivo wrote:

 On 01/20/2014 10:07 AM, Mircea Markus wrote:
 Would be interesting to see as well, though the performance figures would not
 include network latency, so they would not tell us much about the benefit
 of using this on a real-life system.
 That's my point. I'm interested in seeing the worst-case scenario, since all
 other cluster modes will have a lower (or no) impact on performance.

 Of course, the best scenario would be one where each node only accesses
 remote keys...

 Pedro

 On Jan 20, 2014, at 9:48 AM, Pedro Ruivo pe...@infinispan.org wrote:

 IMO, we should try the worst scenario: Local Mode + Single thread.

 this will show us the highest impact in performance.
 Cheers,



-- 
Radim Vansa rva...@redhat.com
JBoss DataGrid QA
