Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 14 Jun 2011, at 16:49, Bela Ban wrote:

> Just copy the damn buffer and give it to me $@$#^%$#^%^$

:-)

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
Just copy the damn buffer and give it to me $@$#^%$#^%^$

Simple. Performant. Reliable. :-)

On 6/14/11 5:42 PM, Sanne Grinovero wrote:
> 2011/6/14 Galder Zamarreño:
>>
>> On Jun 14, 2011, at 1:24 PM, Manik Surtani wrote:
>>
>>> On 14 Jun 2011, at 12:15, Bela Ban wrote:
>>>
>>>> +1. There is also something else I wanted to bring to your attention.
>>>> When you pass reference byte[] BUF to JGroups, JGroups will store BUF
>>>> in the org.jgroups.Message MSG. MSG is subsequently stored in the
>>>> retransmission table of NAKACK. If you now modify the contents of
>>>> BUF, you will modify a subsequent potential retransmission of MSG as
>>>> well! I don't think this is done currently (Infinispan uses a new
>>>> buffer every time), but just make sure buffer reuse doesn't get
>>>> rolled into a new design...
>>>
>>> Good point.
>>
>> Uuups, that'd be rather nasty. So, once we pass it to you we can forget
>> about it altogether.
>
> Right, so unless JGroups can notify us when it's not needing the
> buffer anymore we can forget any kind of pooling for reuse.
>
> Sanne

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
2011/6/14 Galder Zamarreño:
>
> On Jun 14, 2011, at 1:24 PM, Manik Surtani wrote:
>
>> On 14 Jun 2011, at 12:15, Bela Ban wrote:
>>
>>> +1.
>>>
>>> There is also something else I wanted to bring to your attention. When
>>> you pass reference byte[] BUF to JGroups, JGroups will store BUF in
>>> the org.jgroups.Message MSG.
>>>
>>> MSG is subsequently stored in the retransmission table of NAKACK.
>>>
>>> If you now modify the contents of BUF, you will modify a subsequent
>>> potential retransmission of MSG as well!
>>>
>>> I don't think this is done currently (Infinispan uses a new buffer
>>> every time), but just make sure buffer reuse doesn't get rolled into a
>>> new design...
>>
>> Good point.
>
> Uuups, that'd be rather nasty. So, once we pass it to you we can forget
> about it altogether.

Right, so unless JGroups can notify us when it's not needing the
buffer anymore we can forget any kind of pooling for reuse.

Sanne
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On Jun 14, 2011, at 1:24 PM, Manik Surtani wrote:
>
> On 14 Jun 2011, at 12:15, Bela Ban wrote:
>
>> +1.
>>
>> There is also something else I wanted to bring to your attention. When
>> you pass reference byte[] BUF to JGroups, JGroups will store BUF in the
>> org.jgroups.Message MSG.
>>
>> MSG is subsequently stored in the retransmission table of NAKACK.
>>
>> If you now modify the contents of BUF, you will modify a subsequent
>> potential retransmission of MSG as well!
>>
>> I don't think this is done currently (Infinispan uses a new buffer
>> every time), but just make sure buffer reuse doesn't get rolled into a
>> new design...
>
> Good point.

Uuups, that'd be rather nasty. So, once we pass it to you we can forget
about it altogether.

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 14 Jun 2011, at 12:15, Bela Ban wrote:

> +1.
>
> There is also something else I wanted to bring to your attention. When
> you pass reference byte[] BUF to JGroups, JGroups will store BUF in the
> org.jgroups.Message MSG.
>
> MSG is subsequently stored in the retransmission table of NAKACK.
>
> If you now modify the contents of BUF, you will modify a subsequent
> potential retransmission of MSG as well!
>
> I don't think this is done currently (Infinispan uses a new buffer every
> time), but just make sure buffer reuse doesn't get rolled into a new
> design...

Good point.

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
+1.

There is also something else I wanted to bring to your attention. When you
pass reference byte[] BUF to JGroups, JGroups will store BUF in the
org.jgroups.Message MSG.

MSG is subsequently stored in the retransmission table of NAKACK.

If you now modify the contents of BUF, you will modify a subsequent
potential retransmission of MSG as well!

I don't think this is done currently (Infinispan uses a new buffer every
time), but just make sure buffer reuse doesn't get rolled into a new
design...

On 6/14/11 12:54 PM, Sanne Grinovero wrote:
> 2011/6/14 Galder Zamarreño:
>> I like the idea but as Manik hinted I wonder how many people are gonna
>> go and configure this unless Infinispan is blatant enough in telling
>> users that their configuration is not optimal.
>>
>> We also need to consider the importance of the problem, which is that
>> STABLE keeps the whole buffer ref around.
>>
>> Before doing anything further, we should repeat the tests and see what
>> GC looks like on the sender side with the current adaptive buffer
>> sizing.
>
> My understanding is that at the point the buffer reaches JGroups, and
> so gets into the STABLE "long lifecycle", we already know the exact
> size so we can do a last resize.
>
> The issue still looks to me about minimizing the amount of resizes
> needed until we reach that point.

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
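The aliasing hazard Bela describes can be sketched in a few lines. This is purely illustrative code, not the actual JGroups/Infinispan API: the queue stands in for the NAKACK retransmission table holding a ref to the payload, and `send` shows how a defensive copy severs the alias while a plain handoff does not.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

public class RetransmitAliasing {
    // Stands in for the NAKACK retransmission table keeping a ref to the payload.
    static final Queue<byte[]> retransmitTable = new ArrayDeque<>();

    static void send(byte[] buf, boolean copyBeforeHandoff) {
        // Copying severs the alias; handing over the live ref does not.
        retransmitTable.add(copyBeforeHandoff ? Arrays.copyOf(buf, buf.length) : buf);
    }

    public static void main(String[] args) {
        byte[] reused = {1, 2, 3};
        send(reused, false);   // hand over the live ref
        reused[0] = 99;        // "reuse" the buffer for the next message
        // A later retransmission would now carry the mutated byte (99).
        System.out.println(retransmitTable.poll()[0]);

        byte[] reused2 = {1, 2, 3};
        send(reused2, true);   // defensive copy before handoff
        reused2[0] = 99;
        // The copy still holds the original byte (3).
        System.out.println(retransmitTable.poll()[0]);
    }
}
```

This is exactly why the thread concludes that buffer pooling is off the table unless JGroups can signal when it no longer needs the buffer.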
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
2011/6/14 Galder Zamarreño:
> I like the idea but as Manik hinted I wonder how many people are gonna
> go and configure this unless Infinispan is blatant enough in telling
> users that their configuration is not optimal.
>
> We also need to consider the importance of the problem, which is that
> STABLE keeps the whole buffer ref around.
>
> Before doing anything further, we should repeat the tests and see what
> GC looks like on the sender side with the current adaptive buffer
> sizing.

My understanding is that at the point the buffer reaches JGroups, and so
gets into the STABLE "long lifecycle", we already know the exact size so
we can do a last resize.

The issue still looks to me about minimizing the amount of resizes needed
until we reach that point.

Cheers,
Sanne

>
> On Jun 9, 2011, at 9:06 PM, Manik Surtani wrote:
>
>> Hi guys
>>
>> This is an excellent and fun discussion - very entertaining read for
>> me. :-) So a quick summary based on everyone's ideas:
>>
>> I think we can't have a one-size-fits-all solution here. I think simple
>> array copies work well as long as the serialized forms are generally
>> small. And while I agree with Bela that in some cases (HTTP session
>> replication) it can be hard to determine the size of payloads, in
>> others (Hibernate 2LC) this can be determined with a fair degree of
>> certainty.
>>
>> Either way, I think this can be a bottleneck (both in terms of memory
>> and CPU performance) if the serialized forms are large (over 100K?
>> That's a guess...) and buffers are sub-optimally sized.
>>
>> I think this should be pluggable - I haven't looked at the code paths
>> in detail to see where the impact is, but perhaps different marshaller
>> implementations (maybe all extending a generic JBoss Marshalling based
>> marshaller) with different buffer/arraycopy logic? So here are the
>> options I see:
>>
>> 1) Simple arraycopy (could be the default)
>> 2) Static buffer size like we have now - but should be configurable in
>> XML
>> 3) Adaptive buffer (the current Netty-like policy Galder has
>> implemented, maybe a separate one with reservoir sampling)
>> 4) Per-Externalizer static buffer size - Externalizer to either provide
>> a deterministic buffer size or a starting buffer size and growth
>> factor.
>>
>> Option (4) would clearly be an "advanced option", reserved for use by
>> very experienced developers who want to squeeze every drop of
>> performance, and have intimate knowledge of their object graphs and
>> know what this demands of their system in terms of serialization.
>>
>> But further, we should also have some logging in the marshaller -
>> probably TRACE level, maybe JMX, disabled by default - to monitor
>> samples and gather statistics on inefficiently configured buffer sizes
>> and policies, perhaps even log marshalled types and resulting sizes.
>> This could be run during a stress test on a staging environment to help
>> determine how to tune marshalling based on the policies above.
>>
>> WDYT? I think the benefit of making this pluggable is that (a) it can
>> be done piecemeal - one policy at a time and (b) each one is easier to
>> unit test, so fewer bugs in, say, a reservoir sampling impl.
>>
>> Cheers
>> Manik
>>
>> On 25 May 2011, at 08:45, Galder Zamarreño wrote:
>>
>>> On May 24, 2011, at 1:08 PM, Dan Berindei wrote:
>>>
>>>> On Tue, May 24, 2011 at 11:57 AM, Sanne Grinovero wrote:
>>>>> 2011/5/24 Galder Zamarreño:
>>>>>> Guys,
>>>>>>
>>>>>> Some interesting discussions here, keep them coming! Let me
>>>>>> summarise what I submitted yesterday as pull req for
>>>>>> https://issues.jboss.org/browse/ISPN-1102
>>>>>>
>>>>>> - I don't think users can really provide such accurate predictions
>>>>>> of the objects' sizes, because first Java does not give you an easy
>>>>>> way of figuring out how much your object takes up, and most people
>>>>>> don't have such knowledge. What I think could be more interesting
>>>>>> is potentially having a buffer predictor that predicts sizes per
>>>>>> type, so rather than calculate the next buffer size taking all
>>>>>> objects into account, do that per object type. To enable doing this
>>>>>> in the future, I'm gonna add the object to be marshalled as
>>>>>> parameter to
>>>>>> https://github.com/infinispan/infinispan/pull/338/files#diff-2 -
>>>>>> This enhancement allows for your suggestions on externalizers
>>>>>> providing an estimated size to be implemented, but I'm not keen on
>>>>>> that.
>>>>>>
>>>>>> - For a solution to ISPN-1102, I've gone for the simpler adaptive
>>>>>> buffer size algorithm that Netty uses for determining the receiver
>>>>>> buffer size. The use cases are different but I liked the simplicity
>>>>>> of the algorithm, since calculating the next buffer size is an O(1)
>>>>>> op and can grow both ways very easily. I agree that it might not be
>>>>>> as exact as reservoir sampling + percentile, but at least it's
>>>>>> cheaper to compute and it resolves the immediate problem of senders
>>>>>> keeping too much memory for sent buffers before STABLE comes
>>>>>> around.
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
I like the idea but as Manik hinted I wonder how many people are gonna go
and configure this unless Infinispan is blatant enough in telling users
that their configuration is not optimal.

We also need to consider the importance of the problem, which is that
STABLE keeps the whole buffer ref around.

Before doing anything further, we should repeat the tests and see what GC
looks like on the sender side with the current adaptive buffer sizing.

On Jun 9, 2011, at 9:06 PM, Manik Surtani wrote:

> Hi guys
>
> This is an excellent and fun discussion - very entertaining read for me.
> :-) So a quick summary based on everyone's ideas:
>
> I think we can't have a one-size-fits-all solution here. I think simple
> array copies work well as long as the serialized forms are generally
> small. And while I agree with Bela that in some cases (HTTP session
> replication) it can be hard to determine the size of payloads, in others
> (Hibernate 2LC) this can be determined with a fair degree of certainty.
>
> Either way, I think this can be a bottleneck (both in terms of memory
> and CPU performance) if the serialized forms are large (over 100K?
> That's a guess...) and buffers are sub-optimally sized.
>
> I think this should be pluggable - I haven't looked at the code paths in
> detail to see where the impact is, but perhaps different marshaller
> implementations (maybe all extending a generic JBoss Marshalling based
> marshaller) with different buffer/arraycopy logic? So here are the
> options I see:
>
> 1) Simple arraycopy (could be the default)
> 2) Static buffer size like we have now - but should be configurable in
> XML
> 3) Adaptive buffer (the current Netty-like policy Galder has
> implemented, maybe a separate one with reservoir sampling)
> 4) Per-Externalizer static buffer size - Externalizer to either provide
> a deterministic buffer size or a starting buffer size and growth factor.
>
> Option (4) would clearly be an "advanced option", reserved for use by
> very experienced developers who want to squeeze every drop of
> performance, and have intimate knowledge of their object graphs and know
> what this demands of their system in terms of serialization.
>
> But further, we should also have some logging in the marshaller -
> probably TRACE level, maybe JMX, disabled by default - to monitor
> samples and gather statistics on inefficiently configured buffer sizes
> and policies, perhaps even log marshalled types and resulting sizes.
> This could be run during a stress test on a staging environment to help
> determine how to tune marshalling based on the policies above.
>
> WDYT? I think the benefit of making this pluggable is that (a) it can be
> done piecemeal - one policy at a time and (b) each one is easier to unit
> test, so fewer bugs in, say, a reservoir sampling impl.
>
> Cheers
> Manik
>
> On 25 May 2011, at 08:45, Galder Zamarreño wrote:
>
>> On May 24, 2011, at 1:08 PM, Dan Berindei wrote:
>>
>>> On Tue, May 24, 2011 at 11:57 AM, Sanne Grinovero wrote:
>>>> 2011/5/24 Galder Zamarreño:
>>>>> Guys,
>>>>>
>>>>> Some interesting discussions here, keep them coming! Let me
>>>>> summarise what I submitted yesterday as pull req for
>>>>> https://issues.jboss.org/browse/ISPN-1102
>>>>>
>>>>> - I don't think users can really provide such accurate predictions
>>>>> of the objects' sizes, because first Java does not give you an easy
>>>>> way of figuring out how much your object takes up, and most people
>>>>> don't have such knowledge. What I think could be more interesting is
>>>>> potentially having a buffer predictor that predicts sizes per type,
>>>>> so rather than calculate the next buffer size taking all objects
>>>>> into account, do that per object type. To enable doing this in the
>>>>> future, I'm gonna add the object to be marshalled as parameter to
>>>>> https://github.com/infinispan/infinispan/pull/338/files#diff-2 -
>>>>> This enhancement allows for your suggestions on externalizers
>>>>> providing an estimated size to be implemented, but I'm not keen on
>>>>> that.
>>>>>
>>>>> - For a solution to ISPN-1102, I've gone for the simpler adaptive
>>>>> buffer size algorithm that Netty uses for determining the receiver
>>>>> buffer size. The use cases are different but I liked the simplicity
>>>>> of the algorithm, since calculating the next buffer size is an O(1)
>>>>> op and can grow both ways very easily. I agree that it might not be
>>>>> as exact as reservoir sampling + percentile, but at least it's
>>>>> cheaper to compute and it resolves the immediate problem of senders
>>>>> keeping too much memory for sent buffers before STABLE comes around.
>>>>>
>>>>> - Next step would be to go and test this and compare it with what
>>>>> Bela/Dan were seeing (+1 to another interactive debugging session),
>>>>> and if we are still not too happy about the memory consumption,
>>>>> maybe we can look into providing a different implementation for
>>>>> BufferSizePredictor that uses R sampling.
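The Netty-style adaptive sizing Galder describes (O(1) next-size computation, grows quickly, shrinks cautiously) can be sketched roughly as follows. The class name, the power-of-two growth step, and the two-sample shrink hysteresis are all illustrative assumptions, not the actual Infinispan or Netty constants.

```java
public class AdaptivePredictor {
    private int next = 512;       // initial guess, matching the thread's default
    private boolean shrinkArmed;  // require two consecutive small samples before shrinking

    public int nextSize() { return next; }

    public void recordSize(int actual) {
        if (actual >= next) {
            // Grow immediately to the smallest power of two >= actual.
            // Safe: actual >= next >= 512, so actual - 1 > 0.
            next = Integer.highestOneBit(actual - 1) << 1;
            shrinkArmed = false;
        } else if (actual <= next / 2) {
            // Shrink only after two consecutive small samples (hysteresis),
            // so a single small message doesn't throw away a good estimate.
            if (shrinkArmed) {
                next = Math.max(512, next / 2);
                shrinkArmed = false;
            } else {
                shrinkArmed = true;
            }
        } else {
            shrinkArmed = false;   // size is in range; reset the shrink trigger
        }
    }
}
```

The asymmetry (instant growth, delayed shrink) is the point: over-allocating wastes some Eden space once, while under-allocating forces repeated resizes during the same serialization.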
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 6/10/11 12:20 PM, Manik Surtani wrote:
> I was referring to generating byte arrays (sending state), not
> generating objects (receiving state). A buffer is maintained in the
> AbstractMarshaller and used.
>
> I did see a comment from Bela on this thread about seeing this on the
> receiver too though - Bela, care to clarify?

This was incorrect. On the receiver side, JGroups knows exactly how long
the data buffer needs to be, as the size is sent as the first element in
the data. The incoming packet is always copied and passed up to
Infinispan.

> I presume on the receiver that JGroups calls
> RpcDispatcher.Marshaller2.objectFromByteBuffer() and passes in a byte
> array. So perhaps you are referring to how you create these byte buffers
> in JGroups for reading off a network stream?

The problem - as Sanne mentioned - is the size of the generated buffer
passed to JGroups. Infinispan uses a fixed buffer of 512 bytes and this
buffer is increased on demand (doubled by default, IIRC). Infinispan's
marshaller then passes a ref to this buffer, plus an offset and length.

However, because JGroups keeps the ref around, the entire data buffer will
not get released until STABLE allows for purging of the message. This
happens for sent messages only.

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
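The grow-by-doubling write buffer described here, plus the "last resize" Sanne suggests before the ref is handed to JGroups, could look roughly like this. A sketch only: names are hypothetical and the real marshaller hands over a ref + offset + length rather than trimming.

```java
import java.util.Arrays;

public class DoublingBuffer {
    private byte[] buf;
    private int count;

    DoublingBuffer(int initial) {     // e.g. 512, the default in the thread
        buf = new byte[initial];
    }

    void write(byte[] data) {
        // Double the capacity on demand until the write fits.
        while (count + data.length > buf.length)
            buf = Arrays.copyOf(buf, buf.length * 2);
        System.arraycopy(data, 0, buf, count, data.length);
        count += data.length;
    }

    // Trim to the exact marshalled size, so the long-lived ref that JGroups
    // keeps until STABLE purges the message wastes no slack capacity.
    byte[] toTrimmedArray() { return Arrays.copyOf(buf, count); }

    int capacity() { return buf.length; }
    int size()     { return count; }
}
```

The trade-off under discussion: the trim costs one extra copy per message, but without it a 600-byte payload can pin a 1024-byte (or larger) array until STABLE runs.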
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
2011/6/10 Manik Surtani:
>
> On 10 Jun 2011, at 04:48, Tristan Tarrant wrote:
>
>> I don't know if I'm actually contributing something here or just
>> creating noise.
>>
>> Are these buffers reused over time? If not, from a GC point of view it
>> would be better not to reduce the size of the buffer just to save a few
>> bytes: it would mean throwing to GC a perfectly valid bit of memory.
>> Increasing the size is another matter. If Infinispan is not reusing
>> buffers, why isn't it?
>
> Actually this is a good point. We're looking at holding on to the buffer
> size in a thread-local, but actually creating a new buffer each time.
>
> Somewhere in this thread there was discussion of creating a buffer per
> thread (thread-local again) but it was determined to be too much of a
> mem leak (and I agree with this).
>
> Maybe it makes sense to create a pool of buffers, to be shared? It would
> certainly save on GC overhead. But what about the cost of synchronizing
> access to this buffer pool? Maybe allocating one large buffer and
> different threads making use of different ranges here? Again, the sync
> could be pretty complex. But conceivably lots of benefits though.
>
> Thoughts and opinions?

If we go for the thread-owned buffer, maybe when they're small enough I'd
be inclined to like it, but nothing can be said without a very good test.

As Bela already mentioned, I doubt any kind of pool shared across threads
would be helpful: if we keep the buffers for a very short time they won't
leave the new zone, and there's hardly any GC overhead there. Actually GC
is going to copy over again and again the memory regions we hold on to
perform memory compaction, so it's quite likely that allocating a new one
each time is quicker, as we reuse the efficient Eden space.

So I'd work in a direction to prevent having to resize the buffer multiple
times: while it's always good to consider GC cost, it doesn't look like
the primary concern here. If we have to resize the buffer again and again
during the same serialization, we might end up consuming the Eden space
quicker, so it would be nice to have a good approximation to reduce the
frequency of resizing operations.

Cheers,
Sanne
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
> Somewhere in this thread there was discussion of creating a buffer per
> thread (thread-local again) but it was determined to be too much of a
> mem leak (and I agree with this).

We should avoid thread locals :)

> Maybe it makes sense to create a pool of buffers, to be shared? It would
> certainly save on GC overhead. But what about the cost of synchronizing
> access to this buffer pool? Maybe allocating one large buffer and
> different threads making use of different ranges here? Again, the sync
> could be pretty complex. But conceivably lots of benefits though.

I wouldn't complicate things overmuch. I believe a simple pool based on
ConcurrentLinkedQueue (with its synchronization) would be a much better
alternative to the current situation of allocation, reallocation (maybe)
and GC.

Tristan
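One way to read Tristan's suggestion is a non-blocking pool of uniform-size buffers backed by `ConcurrentLinkedQueue`. This is only a sketch of the idea under debate (the class and method names are made up), and the thread later notes it is moot unless JGroups can signal when a buffer may safely be returned.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

public class BufferPool {
    private final ConcurrentLinkedQueue<byte[]> pool = new ConcurrentLinkedQueue<>();
    private final int bufferSize;

    public BufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    public byte[] acquire() {
        byte[] b = pool.poll();          // lock-free dequeue
        return b != null ? b : new byte[bufferSize];  // allocate on miss
    }

    public void release(byte[] b) {
        if (b.length == bufferSize)      // only pool uniform sizes
            pool.offer(b);
    }
}
```

Note this pool is unbounded; a production variant would cap its size, and, per Sanne's and Bela's objections, the reused buffers may actually lose to plain Eden allocation and to the retransmission-table aliasing hazard.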
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 10 Jun 2011, at 04:48, Tristan Tarrant wrote:

> I don't know if I'm actually contributing something here or just
> creating noise.
>
> Are these buffers reused over time? If not, from a GC point of view it
> would be better not to reduce the size of the buffer just to save a few
> bytes: it would mean throwing to GC a perfectly valid bit of memory.
> Increasing the size is another matter. If Infinispan is not reusing
> buffers, why isn't it?

Actually this is a good point. We're looking at holding on to the buffer
size in a thread-local, but actually creating a new buffer each time.

Somewhere in this thread there was discussion of creating a buffer per
thread (thread-local again) but it was determined to be too much of a mem
leak (and I agree with this).

Maybe it makes sense to create a pool of buffers, to be shared? It would
certainly save on GC overhead. But what about the cost of synchronizing
access to this buffer pool? Maybe allocating one large buffer and
different threads making use of different ranges here? Again, the sync
could be pretty complex. But conceivably lots of benefits though.

Thoughts and opinions?

Cheers
Manik

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
I was referring to generating byte arrays (sending state), not generating
objects (receiving state). A buffer is maintained in the
AbstractMarshaller and used.

I did see a comment from Bela on this thread about seeing this on the
receiver too though - Bela, care to clarify?

I presume on the receiver that JGroups calls
RpcDispatcher.Marshaller2.objectFromByteBuffer() and passes in a byte
array. So perhaps you are referring to how you create these byte buffers
in JGroups for reading off a network stream?

On 9 Jun 2011, at 23:45, Sanne Grinovero wrote:

> Actually on this thread I keep getting confused about what is the
> issue we want to solve. Initially I thought it was about allocating
> the buffer to externalize known object types, as I saw the growing
> buffer logic in the MarshalledValue so the discussion seemed
> interesting to me, but I was corrected that it's about estimating the
> receiving buffer.
>
> Then why don't we prefix all messages with the length, read that long
> first, and then proceed by allocating exactly what we know will be
> needed?
> (If the point is that we can't read the long without a first buffer,
> then we can't read the type for the first marshaller either, but we
> can at least reduce it to a single estimate & resize once to exact
> size.)
>
> Seems too simple, so I would appreciate it if somebody could recap
> what the problem is about.
>
> Sanne
>
> 2011/6/9 Manik Surtani:
>>
>> On 25 May 2011, at 08:45, Galder Zamarreño wrote:
>>
>>>> Looks great Galder, although I could use some comments on how the
>>>> possible buffer sizes are chosen in your algorithm :-)
>>>
>>> I'll ping you on IRC.
>>
>> Could you make sure this is properly documented in the impl classes,
>> whether in Javadoc or comments?

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
I don't know if I'm actually contributing something here or just creating
noise.

Are these buffers reused over time? If not, from a GC point of view it
would be better not to reduce the size of the buffer just to save a few
bytes: it would mean throwing to GC a perfectly valid bit of memory.
Increasing the size is another matter. If Infinispan is not reusing
buffers, why isn't it?

Sorry if this is out of scope.

Tristan

On Fri, Jun 10, 2011 at 00:45, Sanne Grinovero wrote:
> Actually on this thread I keep getting confused about what is the
> issue we want to solve. Initially I thought it was about allocating
> the buffer to externalize known object types, as I saw the growing
> buffer logic in the MarshalledValue so the discussion seemed
> interesting to me, but I was corrected that it's about estimating the
> receiving buffer.
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
Actually on this thread I keep getting confused about what is the issue we
want to solve. Initially I thought it was about allocating the buffer to
externalize known object types, as I saw the growing buffer logic in the
MarshalledValue so the discussion seemed interesting to me, but I was
corrected that it's about estimating the receiving buffer.

Then why don't we prefix all messages with the length, read that long
first, and then proceed by allocating exactly what we know will be needed?
(If the point is that we can't read the long without a first buffer, then
we can't read the type for the first marshaller either, but we can at
least reduce it to a single estimate & resize once to exact size.)

Seems too simple, so I would appreciate it if somebody could recap what
the problem is about.

Sanne

2011/6/9 Manik Surtani:
>
> On 25 May 2011, at 08:45, Galder Zamarreño wrote:
>
>>> Looks great Galder, although I could use some comments on how the
>>> possible buffer sizes are chosen in your algorithm :-)
>>
>> I'll ping you on IRC.
>
> Could you make sure this is properly documented in the impl classes,
> whether in Javadoc or comments?
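The length-prefix scheme Sanne proposes is the standard framing trick: write the payload length first, then the receiver reads the prefix and allocates the buffer exactly once. A minimal sketch, assuming an int prefix for simplicity (Sanne says "long") and hypothetical method names, not the JGroups wire format:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class LengthPrefixed {
    static byte[] frame(byte[] payload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(out);
        dos.writeInt(payload.length);   // 4-byte length prefix
        dos.write(payload);
        return out.toByteArray();
    }

    static byte[] unframe(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(in);
        int len = dis.readInt();        // read the prefix first...
        byte[] payload = new byte[len]; // ...then allocate exactly once
        dis.readFully(payload);
        return payload;
    }
}
```

As Bela clarifies elsewhere in the thread, the receiver already works this way in JGroups; the open problem is sizing the sender-side buffer, whose final length isn't known until marshalling finishes.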
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 25 May 2011, at 08:45, Galder Zamarreño wrote:

>> Looks great Galder, although I could use some comments on how the
>> possible buffer sizes are chosen in your algorithm :-)
>
> I'll ping you on IRC.

Could you make sure this is properly documented in the impl classes,
whether in Javadoc or comments?

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 24 May 2011, at 07:12, Bela Ban wrote:

> Ah, ok. I think we should really do what we said before JBW, namely have
> an interactive debugging session, to clear this up.

+1. Let me know when you guys are planning on doing this.

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
Hi guys This is an excellent and fun discussion - very entertaining read for me. :-) So a quick summary based on everyones' ideas: I think we can't have a one size fits all solution here. I think simple array copies work well as long as the serialized forms are generally small. And while I agree with Bela that in some cases (HTTP session replication) it can be hard to determine the size of payloads, in others (Hibernate 2LC) this can be determined with a fair degree of certainty. Either way, I think this can be a bottleneck (both in terms of memory and CPU performance) if the serialized forms are large (over 100K? That's a guess... ) and buffers are sub-optimally sized. I think this should be pluggable - I haven't looked at the code paths in detail to see where the impact is, but perhaps different marshaller implementations (maybe all extending a generic JBoss Marshalling based marshaller) with different buffer/arraycopy logic? So here are the options I see: 1) Simple arraycopy (could be the default) 2) Static buffer size like we have now - but should be configurable in XML 3) Adaptive buffer (the current Netty-like policy Galder has implemented, maybe a separate one with reservoir sampling) 4) Per-Externalizer static buffer size - Externalizer to either provide a deterministic buffer size or a starting buffer size and growth factor. Option (4) would clearly be an "advanced option", reserved for use by very experienced developers who want to squeeze every drop of performance, and have intimate knowledge of their object graphs and know what this demands of their system in terms of serialization. But further, we should also have some logging in the marshaller - probably TRACE level, maybe JMX, disabled by default - to monitor samples and gather statistics on inefficiently configured buffer sizes and policies, perhaps even log marshalled types and resulting sizes. 
This could be run during a stress test on a staging environment to help determine how to tune marshalling based on the policies above. WDYT? I think the benefit of making this pluggable is that (a) it can be done piece-meal - one policy at a time and (b) each one is easier to unit test, so fewer bugs in, say, a reservoir sampling impl. Cheers Manik On 25 May 2011, at 08:45, Galder Zamarreño wrote: > > On May 24, 2011, at 1:08 PM, Dan Berindei wrote: > >> On Tue, May 24, 2011 at 11:57 AM, Sanne Grinovero >> wrote: >>> 2011/5/24 Galder Zamarreño : Guys, Some interesting discussions here, keep them coming! Let me summarise what I submitted yesterday as pull req for https://issues.jboss.org/browse/ISPN-1102 - I don't think users can really provide such accurate predictions of the objects sizes because first java does not give you an easy way of figuring out how much your object takes up and most of the people don't have such knowledge. What I think could be more interesting is potentially having a buffer predictor that predicts sizes per type, so rather than calculate the next buffer size taking all objects into account, do that per object type. To enable to do this in the future, I'm gonna add the object to be marshalled as parameter to https://github.com/infinispan/infinispan/pull/338/files#diff-2 - This enhancement allows for your suggestions on externalizers providing estimate size to be implemented, but I'm not keen on that. - For a solution to ISPN-1102, I've gone for a simpler adaptive buffer size algorithm that Netty uses for determining the receiver buffer size. The use cases are different but I liked the simplicity of the algorithm since calculating the next buffer size was an O(1) op and can grow both ways very easily. I agree that it might not be as exact as reservoir sampling+percentile, but at least it's cheaper to compute and it resolves the immediate problem of senders keeping too much memory for sent buffers before STABLE comes around. 
- Next step would be to go and test this and compare it with Bela/Dan were seeing (+1 to another interactive debugging session), and if we are still not too happy about the memory consumption, maybe we can look into providing a different implementation for BufferSizePredictor that uses R sampling. - Finally, I think once ISPN-1102 is in, we should make the BufferSizePredictor implementation configurable programmatically and via XML - I'll create a separate JIRA for this. >>> >>> great wrap up, +1 on all points. >>> BTW I definitely don't expect every user to be able to figure out the >>> proper size, just that some of them might want (need?) to provide >>> hints. >>> >> >> Looks great Galder, although I could use some comments on how the >> possible buffer sizes are chosen in your algorithm :-) > > I'll ping you on IRC. > >> I guess we were thinking of different things with the externalizer extension.
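The pluggable approach Manik outlines could be captured by a small strategy interface. A minimal sketch - `BufferSizePredictor` is named in the thread, but this signature, and the `FixedSizePredictor` implementation of option (2), are illustrative assumptions, not Infinispan's actual API:

```java
// Sketch of a pluggable buffer-size policy. Names and signatures are
// illustrative; only the BufferSizePredictor name comes from the thread.
interface BufferSizePredictor {
    int nextSize(Object toMarshall);   // suggested buffer size for this object
    void recordSize(int observedSize); // feedback after marshalling completes
}

// Option (2) from the summary: a static, XML-configurable buffer size.
class FixedSizePredictor implements BufferSizePredictor {
    private final int size;

    FixedSizePredictor(int size) { this.size = size; }

    public int nextSize(Object toMarshall) { return size; }

    public void recordSize(int observedSize) { /* no adaptation */ }
}
```

The adaptive and reservoir-sampling policies would then just be alternative implementations, which keeps each one small enough to unit test in isolation, as Manik suggests.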
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On May 24, 2011, at 1:08 PM, Dan Berindei wrote: > On Tue, May 24, 2011 at 11:57 AM, Sanne Grinovero > wrote: >> 2011/5/24 Galder Zamarreño : >>> Guys, >>> >>> Some interesting discussions here, keep them coming! Let me summarise what >>> I submitted yesterday as pull req for >>> https://issues.jboss.org/browse/ISPN-1102 >>> >>> - I don't think users can really provide such accurate predictions of the >>> objects sizes because first java does not give you an easy way of figuring >>> out how much your object takes up and most of the people don't have such >>> knowledge. What I think could be more interesting is potentially having a >>> buffer predictor that predicts sizes per type, so rather than calculate the >>> next buffer size taking all objects into account, do that per object type. >>> To enable to do this in the future, I'm gonna add the object to be >>> marshalled as parameter to >>> https://github.com/infinispan/infinispan/pull/338/files#diff-2 - This >>> enhancement allows for your suggestions on externalizers providing estimate >>> size to be implemented, but I'm not keen on that. >>> >>> - For a solution to ISPN-1102, I've gone for a simpler adaptive buffer size >>> algorithm that Netty uses for determining the receiver buffer size. The use >>> cases are different but I liked the simplicity of the algorithm since >>> calculating the next buffer size was an O(1) op and can grow both ways very >>> easily. I agree that it might not be as exact as reservoir >>> sampling+percentile, but at least it's cheaper to compute and it resolves >>> the immediate problem of senders keeping too much memory for sent buffers >>> before STABLE comes around. >>> >>> - Next step would be to go and test this and compare it with Bela/Dan were >>> seeing (+1 to another interactive debugging session), and if we are still >>> not too happy about the memory consumption, maybe we can look into >>> providing a different implementation for BufferSizePredictor that uses R >>> sampling. 
>>> >>> - Finally, I think once ISPN-1102 is in, we should make the >>> BufferSizePredictor implementation configurable programmatically and via >>> XML - I'll create a separate JIRA for this. >> >> great wrap up, +1 on all points. >> BTW I definitely don't expect every user to be able to figure out the >> proper size, just that some of them might want (need?) to provide >> hints. >> > > Looks great Galder, although I could use some comments on how the > possible buffer sizes are chosen in your algorithm :-) I'll ping you on IRC. > I guess we were thinking of different things with the externalizer > extension. I was imagining something like an ObjectOutput > implementation that doesn't really write anything but instead it just > records the size of the object that would be written. That way the > size estimate would always be accurate, but of course the performance > wouldn't be very good for complex object graphs. > > Still I'd like to play with something like this to see if we can > estimate the memory usage of the cache and base the eviction on the > (estimated) memory usage instead of a fixed number of entries, it > seems to me like that's the first question people ask when they start > using Infinispan. Sure, this is something we have considered in the past, and a cache that stores everything as binary is the easiest of the use cases to provide this type of calculation. In the case where store-as-binary is off, doing this is more complicated because even if you can marshall things at some point (i.e. at replication time), the space taken by the object in memory vs. its binary form is different. > > Cheers > Dan > > >> Cheers, >> Sanne >> >>> >>> Cheers, >>> >>> On May 24, 2011, at 8:12 AM, Bela Ban wrote: >>> On 5/23/11 11:09 PM, Dan Berindei wrote: >> No need to expose the ExposedByteArrayOutputStream, a byte[] buffer, >> offset and length will do it, and we already use this today. 
>> >> >>> In case the value is not stored in binary form, the expected life of >>> the stream is very short anyway, after being pushed directly to >>> network buffers we don't need it anymore... couldn't we pass the >>> non-truncated stream directly to JGroups without this final size >>> adjustement ? >> > > The problem is that byte[] first has to be copied to another buffer > together with the rest of the ReplicableCommand before getting to > JGroups. AFAIK in JGroups you must have 1 buffer for each message. If you use ExposedByteArrayOutputStream, you should have access to the underlying buffer, so you don't need to copy it. >> You do that, yes. >> >> However, afair, the issue is not on the *sending*, but on the >> *receiving* side. That's where the larger-than-needed buffer sticks >> around. On the sending side, as you mentioned, Infinispan passes a >> buffer/offset/length to JGroups and JGroups passes this right on to the >> network layer, which copies that data into a buffer.
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On Tue, May 24, 2011 at 11:57 AM, Sanne Grinovero wrote: > 2011/5/24 Galder Zamarreño : >> Guys, >> >> Some interesting discussions here, keep them coming! Let me summarise what I >> submitted yesterday as pull req for https://issues.jboss.org/browse/ISPN-1102 >> >> - I don't think users can really provide such accurate predictions of the >> objects sizes because first java does not give you an easy way of figuring >> out how much your object takes up and most of the people don't have such >> knowledge. What I think could be more interesting is potentially having a >> buffer predictor that predicts sizes per type, so rather than calculate the >> next buffer size taking all objects into account, do that per object type. >> To enable to do this in the future, I'm gonna add the object to be >> marshalled as parameter to >> https://github.com/infinispan/infinispan/pull/338/files#diff-2 - This >> enhancement allows for your suggestions on externalizers providing estimate >> size to be implemented, but I'm not keen on that. >> >> - For a solution to ISPN-1102, I've gone for a simpler adaptive buffer size >> algorithm that Netty uses for determining the receiver buffer size. The use >> cases are different but I liked the simplicity of the algorithm since >> calculating the next buffer size was an O(1) op and can grow both ways very >> easily. I agree that it might not be as exact as reservoir >> sampling+percentile, but at least it's cheaper to compute and it resolves >> the immediate problem of senders keeping too much memory for sent buffers >> before STABLE comes around. >> >> - Next step would be to go and test this and compare it with Bela/Dan were >> seeing (+1 to another interactive debugging session), and if we are still >> not too happy about the memory consumption, maybe we can look into providing >> a different implementation for BufferSizePredictor that uses R sampling. 
>> >> - Finally, I think once ISPN-1102 is in, we should make the >> BufferSizePredictor implementation configurable programmatically and via XML >> - I'll create a separate JIRA for this. > > great wrap up, +1 on all points. > BTW I definitely don't expect every user to be able to figure out the > proper size, just that some of them might want (need?) to provide > hints. > Looks great Galder, although I could use some comments on how the possible buffer sizes are chosen in your algorithm :-) I guess we were thinking of different things with the externalizer extension. I was imagining something like an ObjectOutput implementation that doesn't really write anything but instead it just records the size of the object that would be written. That way the size estimate would always be accurate, but of course the performance wouldn't be very good for complex object graphs. Still I'd like to play with something like this to see if we can estimate the memory usage of the cache and base the eviction on the (estimated) memory usage instead of a fixed number of entries, it seems to me like that's the first question people ask when they start using Infinispan. Cheers Dan > Cheers, > Sanne > >> >> Cheers, >> >> On May 24, 2011, at 8:12 AM, Bela Ban wrote: >> >>> >>> >>> On 5/23/11 11:09 PM, Dan Berindei wrote: >>> > No need to expose the ExposedByteArrayOutputStream, a byte[] buffer, > offset and length will do it, and we already use this today. > > >> In case the value is not stored in binary form, the expected life of >> the stream is very short anyway, after being pushed directly to >> network buffers we don't need it anymore... couldn't we pass the >> non-truncated stream directly to JGroups without this final size >> adjustement ? > The problem is that byte[] first has to be copied to another buffer together with the rest of the ReplicableCommand before getting to JGroups. AFAIK in JGroups you must have 1 buffer for each message. 
>>> >>> >>> If you use ExposedByteArrayOutputStream, you should have access to the >>> underlying buffer, so you don't need to copy it. >>> >>> > You do that, yes. > > However, afair, the issue is not on the *sending*, but on the > *receiving* side. That's where the larger-than-needed buffer sticks > around. On the sending side, as you mentioned, Infinispan passes a > buffer/offset/length to JGroups and JGroups passes this right on to the > network layer, which copies that data into a buffer. > I don't think so... on the receiving side the buffer size is controlled exclusively by JGroups, the unmarshaller doesn't create any buffers. The only buffers on the receiving side are those created by JGroups, and JGroups knows the message size before creating the buffer so it doesn't have to worry about predicting buffer sizes. On sending however I understood that JGroups keeps the buffer with the offset and length in the NakReceivingWindow exactly as it got it from Infinispan, without any trimming, until it receives a STABLE message from all the other nodes in the cluster.
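Dan's idea of an ObjectOutput implementation that records sizes without writing anything can be approximated with a counting sink stream: serialize into it and read off the exact payload size afterwards. This is an illustration of the idea, not Infinispan code:

```java
import java.io.OutputStream;

// A sink stream that counts bytes instead of storing them. Wrapping it in
// an ObjectOutputStream and serializing an object yields the exact size of
// the serialized form, at the cost of walking the whole object graph.
class CountingOutputStream extends OutputStream {
    private long count;

    @Override
    public void write(int b) { count++; }

    @Override
    public void write(byte[] b, int off, int len) { count += len; }

    public long getCount() { return count; }
}
```

As Dan notes, the estimate is always exact, but for complex object graphs this costs nearly as much as the real serialization, which is why it is more interesting for memory-based eviction than for per-message buffer sizing.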
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
2011/5/24 Galder Zamarreño : > Guys, > > Some interesting discussions here, keep them coming! Let me summarise what I > submitted yesterday as pull req for https://issues.jboss.org/browse/ISPN-1102 > > - I don't think users can really provide such accurate predictions of the > objects sizes because first java does not give you an easy way of figuring > out how much your object takes up and most of the people don't have such > knowledge. What I think could be more interesting is potentially having a > buffer predictor that predicts sizes per type, so rather than calculate the > next buffer size taking all objects into account, do that per object type. To > enable to do this in the future, I'm gonna add the object to be marshalled as > parameter to https://github.com/infinispan/infinispan/pull/338/files#diff-2 - > This enhancement allows for your suggestions on externalizers providing > estimate size to be implemented, but I'm not keen on that. > > - For a solution to ISPN-1102, I've gone for a simpler adaptive buffer size > algorithm that Netty uses for determining the receiver buffer size. The use > cases are different but I liked the simplicity of the algorithm since > calculating the next buffer size was an O(1) op and can grow both ways very > easily. I agree that it might not be as exact as reservoir > sampling+percentile, but at least it's cheaper to compute and it resolves the > immediate problem of senders keeping too much memory for sent buffers before > STABLE comes around. > > - Next step would be to go and test this and compare it with Bela/Dan were > seeing (+1 to another interactive debugging session), and if we are still not > too happy about the memory consumption, maybe we can look into providing a > different implementation for BufferSizePredictor that uses R sampling. 
> > - Finally, I think once ISPN-1102 is in, we should make the > BufferSizePredictor implementation configurable programmatically and via XML > - I'll create a separate JIRA for this. great wrap up, +1 on all points. BTW I definitely don't expect every user to be able to figure out the proper size, just that some of them might want (need?) to provide hints. Cheers, Sanne > > Cheers, > > On May 24, 2011, at 8:12 AM, Bela Ban wrote: > >> >> >> On 5/23/11 11:09 PM, Dan Berindei wrote: >> No need to expose the ExposedByteArrayOutputStream, a byte[] buffer, offset and length will do it, and we already use this today. > In case the value is not stored in binary form, the expected life of > the stream is very short anyway, after being pushed directly to > network buffers we don't need it anymore... couldn't we pass the > non-truncated stream directly to JGroups without this final size > adjustement ? >>> >>> The problem is that byte[] first has to be copied to another buffer >>> together with the rest of the ReplicableCommand before getting to >>> JGroups. AFAIK in JGroups you must have 1 buffer for each message. >> >> >> If you use ExposedByteArrayOutputStream, you should have access to the >> underlying buffer, so you don't need to copy it. >> >> You do that, yes. However, afair, the issue is not on the *sending*, but on the *receiving* side. That's where the larger-than-needed buffer sticks around. On the sending side, as you mentioned, Infinispan passes a buffer/offset/length to JGroups and JGroups passes this right on to the network layer, which copies that data into a buffer. >>> >>> I don't think so... on the receiving size the buffer size is >>> controlled exclusively by JGroups, the unmarshaller doesn't create any >>> buffers. The only buffers on the receiving side are those created by >>> JGroups, and JGroups knows the message size before creating the buffer >>> so it doesn't have to worry about predicting buffer sizes. 
>>> >>> On sending however I understood that JGroups keeps the buffer with the >>> offset and length in the NakReceivingWindow exactly as it got it from >>> Infinispan, without any trimming, until it receives a STABLE message >>> from all the other nodes in the cluster. >> >> >> Ah, ok. I think we should really do what we said before JBW, namely have >> an interactive debugging session, to clear this up. >> >> -- >> Bela Ban >> Lead JGroups / Clustering Team >> JBoss >> ___ >> infinispan-dev mailing list >> infinispan-dev@lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- > Galder Zamarreño > Sr. Software Engineer > Infinispan, JBoss Cache > > > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
Guys,

Some interesting discussions here, keep them coming! Let me summarise what I submitted yesterday as a pull req for https://issues.jboss.org/browse/ISPN-1102

- I don't think users can really provide such accurate predictions of their objects' sizes, because first, Java does not give you an easy way of figuring out how much your object takes up, and most people don't have such knowledge. What I think could be more interesting is potentially having a buffer predictor that predicts sizes per type, so rather than calculate the next buffer size taking all objects into account, do that per object type. To enable this in the future, I'm gonna add the object to be marshalled as a parameter to https://github.com/infinispan/infinispan/pull/338/files#diff-2 - This enhancement allows your suggestions on externalizers providing an estimated size to be implemented, but I'm not keen on that.

- For a solution to ISPN-1102, I've gone for the simpler adaptive buffer size algorithm that Netty uses for determining the receiver buffer size. The use cases are different, but I liked the simplicity of the algorithm since calculating the next buffer size is an O(1) op and it can grow both ways very easily. I agree that it might not be as exact as reservoir sampling + percentile, but at least it's cheaper to compute and it resolves the immediate problem of senders keeping too much memory for sent buffers before STABLE comes around.

- Next step would be to go and test this and compare it with what Bela/Dan were seeing (+1 to another interactive debugging session), and if we are still not too happy about the memory consumption, maybe we can look into providing a different implementation of BufferSizePredictor that uses reservoir sampling.

- Finally, I think once ISPN-1102 is in, we should make the BufferSizePredictor implementation configurable programmatically and via XML - I'll create a separate JIRA for this. 
Cheers, On May 24, 2011, at 8:12 AM, Bela Ban wrote: > > > On 5/23/11 11:09 PM, Dan Berindei wrote: > >>> No need to expose the ExposedByteArrayOutputStream, a byte[] buffer, >>> offset and length will do it, and we already use this today. >>> >>> In case the value is not stored in binary form, the expected life of the stream is very short anyway, after being pushed directly to network buffers we don't need it anymore... couldn't we pass the non-truncated stream directly to JGroups without this final size adjustement ? >>> >> >> The problem is that byte[] first has to be copied to another buffer >> together with the rest of the ReplicableCommand before getting to >> JGroups. AFAIK in JGroups you must have 1 buffer for each message. > > > If you use ExposedByteArrayOutputStream, you should have access to the > underlying buffer, so you don't need to copy it. > > >>> You do that, yes. >>> >>> However, afair, the issue is not on the *sending*, but on the >>> *receiving* side. That's where the larger-than-needed buffer sticks >>> around. On the sending side, as you mentioned, Infinispan passes a >>> buffer/offset/length to JGroups and JGroups passes this right on to the >>> network layer, which copies that data into a buffer. >>> >> >> I don't think so... on the receiving size the buffer size is >> controlled exclusively by JGroups, the unmarshaller doesn't create any >> buffers. The only buffers on the receiving side are those created by >> JGroups, and JGroups knows the message size before creating the buffer >> so it doesn't have to worry about predicting buffer sizes. >> >> On sending however I understood that JGroups keeps the buffer with the >> offset and length in the NakReceivingWindow exactly as it got it from >> Infinispan, without any trimming, until it receives a STABLE message >> from all the other nodes in the cluster. > > > Ah, ok. I think we should really do what we said before JBW, namely have > an interactive debugging session, to clear this up. 
> > -- > Bela Ban > Lead JGroups / Clustering Team > JBoss > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Galder Zamarreño Sr. Software Engineer Infinispan, JBoss Cache ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
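The Netty-style adaptive policy Galder describes above works roughly like this: pick the next size from a fixed table, stepping the index up immediately when the last payload overflowed the guess and down only after consecutive undershoots, so each decision is O(1). The size table, starting index, and two-strike shrink rule below are illustrative assumptions, not the actual implementation:

```java
// Simplified sketch of a Netty-style adaptive buffer-size predictor:
// grow eagerly on overflow, shrink cautiously after two consecutive
// undershoots. Constants are illustrative, not Infinispan's or Netty's.
class AdaptiveBufferSizePredictor {
    private static final int[] SIZES = {
        64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536
    };

    private int index = 3;            // start at 512, the static default
    private boolean shrinkCandidate;  // one undershoot already seen?

    int nextSize() { return SIZES[index]; }

    void recordSize(int observed) {
        if (observed > SIZES[index]) {
            if (index < SIZES.length - 1) index++;            // grow eagerly
            shrinkCandidate = false;
        } else if (index > 0 && observed <= SIZES[index - 1]) {
            if (shrinkCandidate) { index--; shrinkCandidate = false; }
            else shrinkCandidate = true;                      // shrink cautiously
        } else {
            shrinkCandidate = false;                          // good fit, hold
        }
    }
}
```

The asymmetry (fast growth, slow shrink) is what lets it recover quickly from a burst of large payloads without oscillating on every small one.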
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 5/23/11 11:09 PM, Dan Berindei wrote: >> No need to expose the ExposedByteArrayOutputStream, a byte[] buffer, >> offset and length will do it, and we already use this today. >> >> >>> In case the value is not stored in binary form, the expected life of >>> the stream is very short anyway, after being pushed directly to >>> network buffers we don't need it anymore... couldn't we pass the >>> non-truncated stream directly to JGroups without this final size >>> adjustement ? >> > > The problem is that byte[] first has to be copied to another buffer > together with the rest of the ReplicableCommand before getting to > JGroups. AFAIK in JGroups you must have 1 buffer for each message. If you use ExposedByteArrayOutputStream, you should have access to the underlying buffer, so you don't need to copy it. >> You do that, yes. >> >> However, afair, the issue is not on the *sending*, but on the >> *receiving* side. That's where the larger-than-needed buffer sticks >> around. On the sending side, as you mentioned, Infinispan passes a >> buffer/offset/length to JGroups and JGroups passes this right on to the >> network layer, which copies that data into a buffer. >> > > I don't think so... on the receiving size the buffer size is > controlled exclusively by JGroups, the unmarshaller doesn't create any > buffers. The only buffers on the receiving side are those created by > JGroups, and JGroups knows the message size before creating the buffer > so it doesn't have to worry about predicting buffer sizes. > > On sending however I understood that JGroups keeps the buffer with the > offset and length in the NakReceivingWindow exactly as it got it from > Infinispan, without any trimming, until it receives a STABLE message > from all the other nodes in the cluster. Ah, ok. I think we should really do what we said before JBW, namely have an interactive debugging session, to clear this up. 
-- Bela Ban Lead JGroups / Clustering Team JBoss ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
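Bela's point about avoiding the copy can be illustrated with a stream that exposes its backing array: the caller hands (buffer, offset, length) to the transport instead of trimming to an exact-size array, which costs an arraycopy. A plain-JDK sketch of the idea, not the actual ExposedByteArrayOutputStream:

```java
import java.io.ByteArrayOutputStream;

// Exposes the (possibly oversized) backing array of a ByteArrayOutputStream,
// so a caller can pass (rawBuffer(), 0, length()) to the transport instead
// of paying for the exact-size copy that toByteArray() makes.
class RawAccessStream extends ByteArrayOutputStream {
    RawAccessStream(int initialSize) { super(initialSize); }

    byte[] rawBuffer() { return buf; }  // protected field of the superclass
    int length() { return count; }      // number of valid bytes in rawBuffer()
}
```

The catch, raised later in the thread, is that whoever receives the untrimmed array may hold the whole oversized buffer until STABLE, and must never see it mutated afterwards.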
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On Tue, May 24, 2011 at 12:13 AM, Sanne Grinovero wrote: > 2011/5/23 Bela Ban : >> >> >> On 5/23/11 8:42 PM, Dan Berindei wrote: >>> On Mon, May 23, 2011 at 7:44 PM, Sanne Grinovero >>> wrote: To keep stuff simple, I'd add an alternative feature instead: have the custom externalizers to optionally recommend an allocation buffer size. In my experience people use a set of well known types for the key, and maybe for the value as well, for which they actually know the output byte size, so there's no point in Infinispan to try guessing the size and then adapting on it; an exception being the often used Strings, even in composite keys, but again as user of the API I have a pretty good idea of the size I'm going to need, for each object I store. >>> >>> Excellent idea, if the custom externalizer can give us the exact size >>> of the serialized object we wouldn't need to do any guesswork. >>> I'm a bit worried about over-zealous externalizers that will spend >>> just as much computing the size of a complex object graph as they >>> spend on actually serializing the whole thing, but as long as our >>> internal externalizers are good examples I think we're ok. >>> >>> Big plus: we could use the size of the serialized object to estimate >>> the memory usage of each cache entry, so maybe with this we could >>> finally constrain the cache to use a fixed amount of memory :) >> >> >> I don't think this is a good idea because most people won't be able to >> guess the right buffer sizes. Giving inncorrect buffer sizes might even >> lead to performance degradation, until the buffers have expanded... >> >> For example, would you guys be able to guess the buffer sizes of >> Infinispan used in JBoss AS ? We're placing not just session data, but >> all sorts of crap into the cache, so I for one wouldn't be able to even >> give you a best estimate... > > that's right, I have no clue on that. 
As you guessed, I was referring > to the buffers being created to marshall and send and object, > and proposing an optional method: in fact for my keys I can provide a > good estimate and the current defaults are quite far from it. > The subject of the thread is "Adaptive marshaller buffer sizes", so I'm pretty sure Galder was thinking about that as well :-) He probably wasn't thinking about MarshalledValue, but of AbstractMarshaller, where we use a default estimate of 512 bytes. I guess the fact that we have two defaults (128 and 512) for essentially the same thing proves that we need something better... For some objects, like the Lucene keys and values, it's really easy to compute the size (relying on the Infinispan internal externalizers to provide the serialized size for primitives and JDK classes). For others, navigating the object graph to get a reliable size is too complicated and we may want to use an estimate instead. > But going back to your question, it would be neat if people could > "profile" their use case by enabling some logger to output needed data > from > stress runs, and based on that provide a simple hint like initial > buffer size & thresholds to apply. > Also being able to plug in some "smart" implementation as proposed by > Galder in the first post could be useful for some, even though we > might want to avoid that in the default configuration. > > When used as Hibernate second level cache, sizes for both keys and > values are quite well defined for every cache region; could be an > interesting case to automate such optimizations. > In the general case you could have the application using 1MB values in the first 1 minute and then 1K values for the rest of the application's lifecycle, so a single estimate size won't be good for the entire lifecycle of the application. 
That's why I liked the reservoir sampling idea so much when I read about it: your value size can change dramatically and the buffer size will still adapt to the new size - but the later sizes carry a smaller weight, so the estimated size will change less and less. It's just like profiling, only you don't have to compromise on a single estimate. The other thing about using percentiles is that you know exactly what you're getting: if you set the estimate buffer size to the 90th percentile then you know that 90% of the requests will not require a buffer resize and 10% will. Sure, it's not bulletproof, but it's better than any hardcoded value could ever hope to be. Perhaps we can combine the two and use adaptive size estimation only if the custom externalizer can't provide an accurate size (or providing an accurate size would be too costly). Or maybe an externalizer for complex objects could use our buffer size predictor so that it doesn't have to compute the size every time. I don't know, but I know I'd really like to play with this stuff ;-) Cheers Dan > Cheers, > Sanne > >> >> -- >> Bela Ban >> Lead JGroups / Clustering Team >> JBoss >> ___ >> infinispan-dev mailing list >> infinispan-dev@lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev
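The reservoir sampling + percentile scheme Dan describes might look roughly like this: keep a fixed-size uniform random sample of observed payload sizes and size buffers at the 90th percentile of the sample. Reservoir capacity, percentile, fallback default, and the fixed seed are all illustrative assumptions:

```java
import java.util.Arrays;
import java.util.Random;

// Sketch of a reservoir-sampling size predictor: each observation replaces
// a random slot with probability k/n, so later sizes carry progressively
// smaller weight, yet a drastic change in value size still shifts the
// estimate over time. Parameters are illustrative.
class ReservoirSizePredictor {
    private final int[] reservoir;
    private long seen;
    private final Random random = new Random(42); // fixed seed for the demo

    ReservoirSizePredictor(int sampleSize) { reservoir = new int[sampleSize]; }

    void recordSize(int size) {
        if (seen < reservoir.length) {
            reservoir[(int) seen] = size;         // fill phase
        } else {
            long slot = (long) (random.nextDouble() * (seen + 1));
            if (slot < reservoir.length) reservoir[(int) slot] = size;
        }
        seen++;
    }

    int nextSize() {
        int n = (int) Math.min(seen, reservoir.length);
        if (n == 0) return 512;                   // fall back to a static default
        int[] sorted = Arrays.copyOf(reservoir, n);
        Arrays.sort(sorted);
        return sorted[Math.min(n - 1, (int) (n * 0.9))]; // ~90th percentile
    }
}
```

Compared with the O(1) adaptive table, this costs a sort per query (or a maintained order statistic), which is the cheapness trade-off Galder points to.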
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On Mon, May 23, 2011 at 11:55 PM, Bela Ban wrote: > > > On 5/23/11 6:50 PM, Dan Berindei wrote: > >>> From my experience, reusing and syncing on a buffer will be slower than >>> making a simple arraycopy. I used to reuse buffers in JGroups, but got >>> better perf when I simply copied the buffer. >> >> We wouldn't need any synchronization if we reused one buffer per thread ;-) > > > Dangerous for 2 reasons. First a reused buffer can grow: for example if > you send 2K messages all the time, then 1 5M message, then back to 2K, > you might have a 5M sized buffer around, unless you do resizing every > now and then. Second, you could end up with many threads, therefore many > buffers, and this is unpredictable. As I mentioned in my previous email, > the buffers we're talking about are on the receiver side, and if someone > configures a large thread pool, you could end up with many buffers. > Configuration of thread pools is outside of our control. > We can reuse just the byte[] with the estimated size, which never grows. If the buffer needs to grow we create a new byte[] which is discarded immediately after the marshalling is done. > I suggest that - whatever you guys do - measure the impact on > performance and memory usage. As I said before, my money's on simple > copying... :-) > You may be right, but we can't know for sure until we prototype it and see how it performs... > > > -- > Bela Ban > Lead JGroups / Clustering Team > JBoss > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
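The per-thread reuse Dan describes - a fixed-capacity byte[] per thread that never grows, with oversize payloads getting a throwaway allocation - could be sketched as follows. The 8K capacity is an illustrative assumption:

```java
// Sketch of Dan's per-thread buffer reuse: each thread caches one
// fixed-capacity byte[]; payloads that fit reuse it with no sync, payloads
// that don't get a fresh array that is discarded after marshalling, so the
// cached buffer never grows the way Bela warns about.
class PerThreadBuffer {
    private static final int CAPACITY = 8 * 1024; // illustrative

    private static final ThreadLocal<byte[]> CACHED =
        ThreadLocal.withInitial(() -> new byte[CAPACITY]);

    static byte[] acquire(int neededSize) {
        return neededSize <= CAPACITY ? CACHED.get() : new byte[neededSize];
    }
}
```

Bela's second objection still stands, though: with a large thread pool this multiplies the footprint by the number of threads, which is outside the library's control.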
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
2011/5/23 Bela Ban : > > > On 5/23/11 8:42 PM, Dan Berindei wrote: >> On Mon, May 23, 2011 at 7:44 PM, Sanne Grinovero >> wrote: >>> To keep stuff simple, I'd add an alternative feature instead: >>> have the custom externalizers to optionally recommend an allocation buffer >>> size. >>> >>> In my experience people use a set of well known types for the key, and >>> maybe for the value as well, for which they actually know the output >>> byte size, so there's no point in Infinispan to try guessing the size >>> and then adapting on it; an exception being the often used Strings, >>> even in composite keys, but again as user of the API I have a pretty >>> good idea of the size I'm going to need, for each object I store. >>> >> >> Excellent idea, if the custom externalizer can give us the exact size >> of the serialized object we wouldn't need to do any guesswork. >> I'm a bit worried about over-zealous externalizers that will spend >> just as much computing the size of a complex object graph as they >> spend on actually serializing the whole thing, but as long as our >> internal externalizers are good examples I think we're ok. >> >> Big plus: we could use the size of the serialized object to estimate >> the memory usage of each cache entry, so maybe with this we could >> finally constrain the cache to use a fixed amount of memory :) > > > I don't think this is a good idea because most people won't be able to > guess the right buffer sizes. Giving inncorrect buffer sizes might even > lead to performance degradation, until the buffers have expanded... > > For example, would you guys be able to guess the buffer sizes of > Infinispan used in JBoss AS ? We're placing not just session data, but > all sorts of crap into the cache, so I for one wouldn't be able to even > give you a best estimate... that's right, I have no clue on that. 
As you guessed, I was referring to the buffers being created to marshall and send an object, and proposing an optional method: in fact for my keys I can provide a good estimate and the current defaults are quite far from it. But going back to your question, it would be neat if people could "profile" their use case by enabling some logger to output needed data from stress runs, and based on that provide a simple hint like initial buffer size & thresholds to apply. Also being able to plug in some "smart" implementation as proposed by Galder in the first post could be useful for some, even though we might want to avoid that in the default configuration. When used as a Hibernate second-level cache, sizes for both keys and values are quite well defined for every cache region; could be an interesting case to automate such optimizations. Cheers, Sanne > > -- > Bela Ban > Lead JGroups / Clustering Team > JBoss > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
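An optional size hint on externalizers, as discussed above, might look like this (the interface and method names are purely illustrative, not part of Infinispan's actual Externalizer contract):

```java
// Hypothetical extension of an externalizer contract: implementations may
// recommend an allocation size for the marshalling buffer. Most would just
// return a constant, as Sanne expects.
interface SizeAwareExternalizer<T> {

    /** Hypothetical hint; -1 means "no idea, fall back to adaptive guessing". */
    default int estimatedSize(T object) {
        return -1;
    }
}

// Example: a Lucene-style composite key of (long, int, short filename).
final class ChunkKeyExternalizer implements SizeAwareExternalizer<Object> {
    @Override
    public int estimatedSize(Object key) {
        // 8 bytes for the long + 4 for the int + ~24 for a short String
        return 36;
    }
}
```

Externalizers that cannot estimate cheaply simply inherit the default and keep the adaptive behaviour.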
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On Mon, May 23, 2011 at 11:50 PM, Bela Ban wrote: > > > On 5/23/11 6:44 PM, Sanne Grinovero wrote: >> To keep stuff simple, I'd add an alternative feature instead: >> have the custom externalizers to optionally recommend an allocation buffer >> size. >> >> In my experience people use a set of well known types for the key, and >> maybe for the value as well, for which they actually know the output >> byte size, so there's no point in Infinispan to try guessing the size >> and then adapting on it; an exception being the often used Strings, >> even in composite keys, but again as user of the API I have a pretty >> good idea of the size I'm going to need, for each object I store. >> >> Also in MarshalledValue I see that an ExposedByteArrayOutputStream is >> created, and after serialization if the buffer is found to be bigger >> than the buffer we're referencing a copy is made to create an exact >> matching byte[]. >> What about revamping the interface there, to expose the >> ExposedByteArrayOutputStream instead of byte[], up to the JGroups >> level? > > > No need to expose the ExposedByteArrayOutputStream, a byte[] buffer, > offset and length will do it, and we already use this today. > > >> In case the value is not stored in binary form, the expected life of >> the stream is very short anyway, after being pushed directly to >> network buffers we don't need it anymore... couldn't we pass the >> non-truncated stream directly to JGroups without this final size >> adjustement ? > The problem is that byte[] first has to be copied to another buffer together with the rest of the ReplicableCommand before getting to JGroups. AFAIK in JGroups you must have 1 buffer for each message. > > You do that, yes. > > However, afair, the issue is not on the *sending*, but on the > *receiving* side. That's where the larger-than-needed buffer sticks > around. 
On the sending side, as you mentioned, Infinispan passes a > buffer/offset/length to JGroups and JGroups passes this right on to the > network layer, which copies that data into a buffer. > I don't think so... on the receiving side the buffer size is controlled exclusively by JGroups, the unmarshaller doesn't create any buffers. The only buffers on the receiving side are those created by JGroups, and JGroups knows the message size before creating the buffer so it doesn't have to worry about predicting buffer sizes. On sending, however, I understood that JGroups keeps the buffer with the offset and length in the NakReceivingWindow exactly as it got it from Infinispan, without any trimming, until it receives a STABLE message from all the other nodes in the cluster. > -- > Bela Ban > Lead JGroups / Clustering Team > JBoss > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
2011/5/23 Bela Ban : > > > On 5/23/11 6:50 PM, Dan Berindei wrote: > >>> From my experience, reusing and syncing on a buffer will be slower than >>> making a simple arraycopy. I used to reuse buffers in JGroups, but got >>> better perf when I simply copied the buffer. >> >> We wouldn't need any synchronization if we reused one buffer per thread ;-) > > > Dangerous for 2 reasons. First a reused buffer can grow: for example if > you send 2K messages all the time, then 1 5M message, then back to 2K, > you might have a 5M sized buffer around, unless you do resizing every > now and then. Second, you could end up with many threads, therefore many > buffers, and this is unpredictable. As I mentioned in my previous email, > the buffers we're talking about are on the receiver side, and if someone > configures a large thread pool, you could end up with many buffers. > Configuration of thread pools is outside of our control. > > I suggest that - whatever you guys do - measure the impact on > performance and memory usage. As I said before, my money's on simple > copying... :-) I'm sorry, it seems I was affecting this thread with ideas about sending, as you say; in fact I don't know how receiving works. +1 for simplicity: I can bet blindly on that anyway, unless we come up with very persuasive evidence. Sanne > > > > -- > Bela Ban > Lead JGroups / Clustering Team > JBoss > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 5/23/11 8:42 PM, Dan Berindei wrote: > On Mon, May 23, 2011 at 7:44 PM, Sanne Grinovero > wrote: >> To keep stuff simple, I'd add an alternative feature instead: >> have the custom externalizers to optionally recommend an allocation buffer >> size. >> >> In my experience people use a set of well known types for the key, and >> maybe for the value as well, for which they actually know the output >> byte size, so there's no point in Infinispan to try guessing the size >> and then adapting on it; an exception being the often used Strings, >> even in composite keys, but again as user of the API I have a pretty >> good idea of the size I'm going to need, for each object I store. >> > > Excellent idea, if the custom externalizer can give us the exact size > of the serialized object we wouldn't need to do any guesswork. > I'm a bit worried about over-zealous externalizers that will spend > just as much computing the size of a complex object graph as they > spend on actually serializing the whole thing, but as long as our > internal externalizers are good examples I think we're ok. > > Big plus: we could use the size of the serialized object to estimate > the memory usage of each cache entry, so maybe with this we could > finally constrain the cache to use a fixed amount of memory :) I don't think this is a good idea because most people won't be able to guess the right buffer sizes. Giving incorrect buffer sizes might even lead to performance degradation, until the buffers have expanded... For example, would you guys be able to guess the buffer sizes of Infinispan used in JBoss AS ? We're placing not just session data, but all sorts of crap into the cache, so I for one wouldn't be able to even give you a best estimate... -- Bela Ban Lead JGroups / Clustering Team JBoss ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 5/23/11 6:50 PM, Dan Berindei wrote: >> From my experience, reusing and syncing on a buffer will be slower than >> making a simple arraycopy. I used to reuse buffers in JGroups, but got >> better perf when I simply copied the buffer. > > We wouldn't need any synchronization if we reused one buffer per thread ;-) Dangerous for 2 reasons. First a reused buffer can grow: for example if you send 2K messages all the time, then 1 5M message, then back to 2K, you might have a 5M sized buffer around, unless you do resizing every now and then. Second, you could end up with many threads, therefore many buffers, and this is unpredictable. As I mentioned in my previous email, the buffers we're talking about are on the receiver side, and if someone configures a large thread pool, you could end up with many buffers. Configuration of thread pools is outside of our control. I suggest that - whatever you guys do - measure the impact on performance and memory usage. As I said before, my money's on simple copying... :-) -- Bela Ban Lead JGroups / Clustering Team JBoss ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On Mon, May 23, 2011 at 10:12 PM, Sanne Grinovero wrote: > 2011/5/23 Dan Berindei : >> On Mon, May 23, 2011 at 7:44 PM, Sanne Grinovero >> wrote: >>> To keep stuff simple, I'd add an alternative feature instead: >>> have the custom externalizers to optionally recommend an allocation buffer >>> size. >>> >>> In my experience people use a set of well known types for the key, and >>> maybe for the value as well, for which they actually know the output >>> byte size, so there's no point in Infinispan to try guessing the size >>> and then adapting on it; an exception being the often used Strings, >>> even in composite keys, but again as user of the API I have a pretty >>> good idea of the size I'm going to need, for each object I store. >>> >> >> Excellent idea, if the custom externalizer can give us the exact size >> of the serialized object we wouldn't need to do any guesswork. >> I'm a bit worried about over-zealous externalizers that will spend >> just as much computing the size of a complex object graph as they >> spend on actually serializing the whole thing, but as long as our >> internal externalizers are good examples I think we're ok. > > I'd expect many to return a static constant, not to compute anything > actually. Didn't prototype it, I'm just tossing some ideas so I might > miss something. > I guess it depends on how we document it... >> Big plus: we could use the size of the serialized object to estimate >> the memory usage of each cache entry, so maybe with this we could >> finally constrain the cache to use a fixed amount of memory :) > > that's very interesting! > >>> Also in MarshalledValue I see that an ExposedByteArrayOutputStream is >>> created, and after serialization if the buffer is found to be bigger >>> than the buffer we're referencing a copy is made to create an exact >>> matching byte[]. >>> What about revamping the interface there, to expose the >>> ExposedByteArrayOutputStream instead of byte[], up to the JGroups >>> level? 
>>> >>> In case the value is not stored in binary form, the expected life of >>> the stream is very short anyway, after being pushed directly to >>> network buffers we don't need it anymore... couldn't we pass the >>> non-truncated stream directly to JGroups without this final size >>> adjustment? >>> >>> Of course when values are stored in binary form it might make sense to >>> save some memory, but again if that was an option I'd make use of it; >>> in case of Lucene I can guess the size with a very good estimate (some >>> bytes off), compared to buffer sizes of potentially many megabytes >>> which I'd prefer to avoid copying - especially not interested in it to >>> save 2 bytes even if I were to store values in binary. >>> >> >> Yeah, but ExposedByteArrayOutputStream grows by 100%, so if >> you're off by 1 in your size estimate you'll waste 50% of the memory >> by keeping that buffer around. > > Not really, the current implementation has some smart logic in that > area which reduces the potential waste to 20%; > there's a "max doubling threshold" in bytes, we could set that to the > expected value or work on the idea. > The "max doubling threshold" is now at 4MB, so that wouldn't kick in in your case. But since we know we are starting from a good enough estimate I guess we could get rid of it and always grow the buffer by 25%. > Anyway if I happen to be often "off by one", I'd better add +1 to what > my externalizer computes as estimate ;) > I was thinking more of what happens when the structure of the object changes (e.g. a new field is added) and the externalizer is not updated. > >> >> Even if your estimate is perfect you're still wasting at least 32 >> bytes on a 64-bit machine: 16 bytes for the buffer object header + 8 >> for the array reference + 4 (rounded up to 8) for the count, though I >> guess you could get that down to 4 bytes by keeping the byte[] and >> count as members of MarshalledValue. 
> > this is true, but only if you have to keep the > ExposedByteArrayOutputStream for longer after the RPC, otherwise the > overhead > is zero as it's being thrown away very quickly - exactly as is done now. > > Maybe we should write a couple of alternative implementations for > MarshalledValue, depending on which options are enabled as it's > getting complex. > >> Besides, for Lucene couldn't you store the actual data separately as a >> byte[] so that Infinispan doesn't wrap it in a MarshalledValue? > > exactly as we do ;) so in the Lucene case these optimizations apply to > the keys only, which are mostly tuples like "long, int, String", > considering that the strings contain very short filenames I'd love to: > a) default to something smaller than the current 128B > b) no need to make another copy if I've overestimated by a couple of > bytes, the buffer is very short lived anyway. > Hmm, if your keys are much shorter then I'm not sure eliminating the arraycopy is going to help performance that much anyway. > Cheers, > Sanne > >>
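The growth rule discussed above (double below a threshold, grow by 25% above it, which caps the waste at 20% of the final buffer since 25/125 = 20%) can be sketched as follows; the real ExposedByteArrayOutputStream code may differ in details:

```java
// Sketch of the growth policy from the thread: double the capacity while
// it is at or below a "max doubling threshold" (4MB, per the discussion),
// and grow by only 25% above it, bounding the worst-case wasted space.
final class GrowthPolicy {
    static final int MAX_DOUBLING_SIZE = 4 * 1024 * 1024;

    static int newCapacity(int current, int required) {
        int candidate = current <= MAX_DOUBLING_SIZE
                ? current * 2                  // below threshold: double
                : current + (current >> 2);    // above threshold: +25%
        // never return less than what the write actually needs
        return Math.max(candidate, required);
    }
}
```

With a good initial estimate from an externalizer, the doubling branch would rarely run at all.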
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 5/23/11 6:44 PM, Sanne Grinovero wrote: > To keep stuff simple, I'd add an alternative feature instead: > have the custom externalizers to optionally recommend an allocation buffer > size. > > In my experience people use a set of well known types for the key, and > maybe for the value as well, for which they actually know the output > byte size, so there's no point in Infinispan to try guessing the size > and then adapting on it; an exception being the often used Strings, > even in composite keys, but again as user of the API I have a pretty > good idea of the size I'm going to need, for each object I store. > > Also in MarshalledValue I see that an ExposedByteArrayOutputStream is > created, and after serialization if the buffer is found to be bigger > than the buffer we're referencing a copy is made to create an exact > matching byte[]. > What about revamping the interface there, to expose the > ExposedByteArrayOutputStream instead of byte[], up to the JGroups > level? No need to expose the ExposedByteArrayOutputStream, a byte[] buffer, offset and length will do it, and we already use this today. > In case the value is not stored in binary form, the expected life of > the stream is very short anyway, after being pushed directly to > network buffers we don't need it anymore... couldn't we pass the > non-truncated stream directly to JGroups without this final size > adjustement ? You do that, yes. However, afair, the issue is not on the *sending*, but on the *receiving* side. That's where the larger-than-needed buffer sticks around. On the sending side, as you mentioned, Infinispan passes a buffer/offset/length to JGroups and JGroups passes this right on to the network layer, which copies that data into a buffer. -- Bela Ban Lead JGroups / Clustering Team JBoss ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
2011/5/23 Dan Berindei : > On Mon, May 23, 2011 at 7:44 PM, Sanne Grinovero > wrote: >> To keep stuff simple, I'd add an alternative feature instead: >> have the custom externalizers to optionally recommend an allocation buffer >> size. >> >> In my experience people use a set of well known types for the key, and >> maybe for the value as well, for which they actually know the output >> byte size, so there's no point in Infinispan to try guessing the size >> and then adapting on it; an exception being the often used Strings, >> even in composite keys, but again as user of the API I have a pretty >> good idea of the size I'm going to need, for each object I store. >> > > Excellent idea, if the custom externalizer can give us the exact size > of the serialized object we wouldn't need to do any guesswork. > I'm a bit worried about over-zealous externalizers that will spend > just as much computing the size of a complex object graph as they > spend on actually serializing the whole thing, but as long as our > internal externalizers are good examples I think we're ok. I'd expect many to return a static constant, not to compute anything actually. Didn't prototype it, I'm just tossing some ideas so I might miss something. > Big plus: we could use the size of the serialized object to estimate > the memory usage of each cache entry, so maybe with this we could > finally constrain the cache to use a fixed amount of memory :) that's very interesting! >> Also in MarshalledValue I see that an ExposedByteArrayOutputStream is >> created, and after serialization if the buffer is found to be bigger >> than the buffer we're referencing a copy is made to create an exact >> matching byte[]. >> What about revamping the interface there, to expose the >> ExposedByteArrayOutputStream instead of byte[], up to the JGroups >> level? 
>> >> In case the value is not stored in binary form, the expected life of >> the stream is very short anyway, after being pushed directly to >> network buffers we don't need it anymore... couldn't we pass the >> non-truncated stream directly to JGroups without this final size >> adjustement ? >> >> Of course when values are stored in binary form it might make sense to >> save some memory, but again if that was an option I'd make use of it; >> in case of Lucene I can guess the size with a very good estimate (some >> bytes off), compared to buffer sizes of potentially many megabytes >> which I'd prefer to avoid copying - especially not interested in it to >> safe 2 bytes even if I where to store values in binary. >> > > Yeah, but ExposedByteArrayOutputStream grows by 100% percent, so if > you're off by 1 in your size estimate you'll waste 50% of the memory > by keeping that buffer around. Not really, the current implementation has some smart logic in that area which reduces the potential waste to 20%; there's a "max doubling threshold" in bytes, we could set that to the expected value or work on the idea. Anyway if I happen to be often "off by one", I'd better add +1 to what my externalizer computes as estimate ;) > > Even if your estimate is perfect you're still wasting at least 32 > bytes on a 64-bit machine: 16 bytes for the buffer object header + 8 > for the array reference + 4 (rounded up to 8) for the count, though I > guess you could get that down to 4 bytes by keeping the byte[] and > count as members of MarshalledValue. this is true, but only if you have to keep the ExposedByteArrayOutputStream for longer after the RPC, otherwise the overhead is zero as it's being thrown away very quickly - exactly as is done now. Maybe we should write a couple of alternative implementations for MarshalledValue, depending on which options are enabled as it's getting complex. 
> Besides, for Lucene couldn't you store the actual data separately as a > byte[] so that Infinispan doesn't wrap it in a MarshalledValue? exactly as we do ;) so in the Lucene case these optimizations apply to the keys only, which are mostly tuples like "long, int, String", considering that the strings contain very short filenames I'd love to: a) default to something smaller than the current 128B b) no need to make another copy if I've overestimated by a couple of bytes, the buffer is very short lived anyway. Cheers, Sanne > >> Then if we just keep the ExposedByteArrayOutputStream around in the >> MarshalledValue, we could save some copying by replacing the >> "output.write(raw)" in writeObject(ObjectOutput output, >> MarshalledValue mv) by using a >> output.write( byte[] , offset, length ); >> >> Cheers, >> Sanne >> >> >> 2011/5/23 Bela Ban : >>> >>> >>> On 5/23/11 6:15 PM, Dan Berindei wrote: >>> I totally agree, combining adaptive size with buffer reuse would be really cool. I imagine when passing the buffer to JGroups we'd still make an arraycopy, but we'd get rid of a lot of arraycopy calls to resize the buffer when the average object size is> 500 bytes. At the same time, if a small percentage of
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On Mon, May 23, 2011 at 7:44 PM, Sanne Grinovero wrote: > To keep stuff simple, I'd add an alternative feature instead: > have the custom externalizers to optionally recommend an allocation buffer > size. > > In my experience people use a set of well known types for the key, and > maybe for the value as well, for which they actually know the output > byte size, so there's no point in Infinispan to try guessing the size > and then adapting on it; an exception being the often used Strings, > even in composite keys, but again as user of the API I have a pretty > good idea of the size I'm going to need, for each object I store. > Excellent idea, if the custom externalizer can give us the exact size of the serialized object we wouldn't need to do any guesswork. I'm a bit worried about over-zealous externalizers that will spend just as much computing the size of a complex object graph as they spend on actually serializing the whole thing, but as long as our internal externalizers are good examples I think we're ok. Big plus: we could use the size of the serialized object to estimate the memory usage of each cache entry, so maybe with this we could finally constrain the cache to use a fixed amount of memory :) > Also in MarshalledValue I see that an ExposedByteArrayOutputStream is > created, and after serialization if the buffer is found to be bigger > than the buffer we're referencing a copy is made to create an exact > matching byte[]. > What about revamping the interface there, to expose the > ExposedByteArrayOutputStream instead of byte[], up to the JGroups > level? > > In case the value is not stored in binary form, the expected life of > the stream is very short anyway, after being pushed directly to > network buffers we don't need it anymore... couldn't we pass the > non-truncated stream directly to JGroups without this final size > adjustement ? 
> > Of course when values are stored in binary form it might make sense to > save some memory, but again if that was an option I'd make use of it; > in case of Lucene I can guess the size with a very good estimate (some > bytes off), compared to buffer sizes of potentially many megabytes > which I'd prefer to avoid copying - especially not interested in it to > save 2 bytes even if I were to store values in binary. > Yeah, but ExposedByteArrayOutputStream grows by 100%, so if you're off by 1 in your size estimate you'll waste 50% of the memory by keeping that buffer around. Even if your estimate is perfect you're still wasting at least 32 bytes on a 64-bit machine: 16 bytes for the buffer object header + 8 for the array reference + 4 (rounded up to 8) for the count, though I guess you could get that down to 4 bytes by keeping the byte[] and count as members of MarshalledValue. Besides, for Lucene couldn't you store the actual data separately as a byte[] so that Infinispan doesn't wrap it in a MarshalledValue? > Then if we just keep the ExposedByteArrayOutputStream around in the > MarshalledValue, we could save some copying by replacing the > "output.write(raw)" in writeObject(ObjectOutput output, > MarshalledValue mv) by using a > output.write( byte[] , offset, length ); > > Cheers, > Sanne > > > 2011/5/23 Bela Ban : >> >> >> On 5/23/11 6:15 PM, Dan Berindei wrote: >> >>> I totally agree, combining adaptive size with buffer reuse would be >>> really cool. I imagine when passing the buffer to JGroups we'd still >>> make an arraycopy, but we'd get rid of a lot of arraycopy calls to >>> resize the buffer when the average object size is > 500 bytes. At the >>> same time, if a small percentage of the objects are much bigger than >>> the rest, we wouldn't reuse those huge buffers so we wouldn't waste >>> too much memory. >> >> >> From my experience, reusing and syncing on a buffer will be slower than >> making a simple arraycopy. 
I used to reuse buffers in JGroups, but got >> better perf when I simply copied the buffer. >> Plus the reservoir sampling's complexity is another source of bugs... >> >> -- >> Bela Ban >> Lead JGroups / Clustering Team >> JBoss >> ___ >> infinispan-dev mailing list >> infinispan-dev@lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On Mon, May 23, 2011 at 7:20 PM, Bela Ban wrote: > > > On 5/23/11 6:15 PM, Dan Berindei wrote: > >> I totally agree, combining adaptive size with buffer reuse would be >> really cool. I imagine when passing the buffer to JGroups we'd still >> make an arraycopy, but we'd get rid of a lot of arraycopy calls to >> resize the buffer when the average object size is> 500 bytes. At the >> same time, if a small percentage of the objects are much bigger than >> the rest, we wouldn't reuse those huge buffers so we wouldn't waste >> too much memory. > > > From my experience, reusing and syncing on a buffer will be slower than > making a simple arraycopy. I used to reuse buffers in JGroups, but got > better perf when I simply copied the buffer. We wouldn't need any synchronization if we reused one buffer per thread ;-) While working on optimizing the hotrod client I also thought about reusing the buffer inside the marshaller and passing a copy to the transport, but I didn't implement it because I wasn't sure how I could reuse just the default size buffer and drop any buffers created with bigger sizes. > Plus the reservoir sampling's complexity is another source of bugs... > Adaptive buffer size can also help during the creation of the buffer: if the value size is routinely > 5KB, in order to grow the buffer from 500 bytes to 5000 we'd have at least 3 arraycopy calls for each marshall operation. With an estimated size of 4000 we'd have just one arraycopy call. But Sanne's suggestion of an externalizer method to suggest the size may be even better. > -- > Bela Ban > Lead JGroups / Clustering Team > JBoss > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
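Dan's resize arithmetic can be checked with a quick count of doubling steps. Assuming the buffer doubles on every resize, growing from 500 bytes to hold 5000 actually takes four copies (500 → 1000 → 2000 → 4000 → 8000), consistent with his "at least 3":

```java
// Counts how many resize-and-copy steps a doubling buffer needs before it
// can hold the required size, illustrating Dan's point: a good initial
// estimate (e.g. 4000 from an externalizer hint) needs at most one copy.
final class ResizeCount {
    static int copiesNeeded(int initial, int required) {
        int copies = 0;
        int size = initial;
        while (size < required) {
            size *= 2;    // each doubling implies one arraycopy
            copies++;
        }
        return copies;
    }
}
```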
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
To keep stuff simple, I'd add an alternative feature instead: have the custom externalizers to optionally recommend an allocation buffer size. In my experience people use a set of well known types for the key, and maybe for the value as well, for which they actually know the output byte size, so there's no point in Infinispan to try guessing the size and then adapting on it; an exception being the often used Strings, even in composite keys, but again as user of the API I have a pretty good idea of the size I'm going to need, for each object I store. Also in MarshalledValue I see that an ExposedByteArrayOutputStream is created, and after serialization if the buffer is found to be bigger than the buffer we're referencing a copy is made to create an exact matching byte[]. What about revamping the interface there, to expose the ExposedByteArrayOutputStream instead of byte[], up to the JGroups level? In case the value is not stored in binary form, the expected life of the stream is very short anyway, after being pushed directly to network buffers we don't need it anymore... couldn't we pass the non-truncated stream directly to JGroups without this final size adjustment? Of course when values are stored in binary form it might make sense to save some memory, but again if that was an option I'd make use of it; in case of Lucene I can guess the size with a very good estimate (some bytes off), compared to buffer sizes of potentially many megabytes which I'd prefer to avoid copying - especially not interested in it to save 2 bytes even if I were to store values in binary. 
Then if we just keep the ExposedByteArrayOutputStream around in the MarshalledValue, we could save some copying by replacing the "output.write(raw)" in writeObject(ObjectOutput output, MarshalledValue mv) by using a output.write( byte[] , offset, length ); Cheers, Sanne 2011/5/23 Bela Ban : > > > On 5/23/11 6:15 PM, Dan Berindei wrote: > >> I totally agree, combining adaptive size with buffer reuse would be >> really cool. I imagine when passing the buffer to JGroups we'd still >> make an arraycopy, but we'd get rid of a lot of arraycopy calls to >> resize the buffer when the average object size is> 500 bytes. At the >> same time, if a small percentage of the objects are much bigger than >> the rest, we wouldn't reuse those huge buffers so we wouldn't waste >> too much memory. > > > From my experience, reusing and syncing on a buffer will be slower than > making a simple arraycopy. I used to reuse buffers in JGroups, but got > better perf when I simply copied the buffer. > Plus the reservoir sampling's complexity is another source of bugs... > > -- > Bela Ban > Lead JGroups / Clustering Team > JBoss > ___ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev > ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
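The change Sanne suggests at the end, writing only the used region of the untrimmed stream instead of first copying it into an exactly-sized byte[], would look roughly like this (a sketch, not the actual MarshalledValue code):

```java
import java.io.IOException;
import java.io.ObjectOutput;

// Sketch: instead of trimming the stream's oversized backing array into an
// exactly-sized byte[] and writing that, write only the used region of the
// original array straight to the output, skipping one arraycopy.
final class TrimlessWrite {
    static void writeRaw(ObjectOutput output, byte[] buf, int offset, int length)
            throws IOException {
        // replaces the old "output.write(raw)" on a trimmed copy
        output.write(buf, offset, length);
    }
}
```

This relies on the buffer staying untouched until the write completes; as Bela notes later in the thread, a buffer/offset/length triple is already what JGroups accepts today.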
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
On 5/23/11 6:15 PM, Dan Berindei wrote: > I totally agree, combining adaptive size with buffer reuse would be > really cool. I imagine when passing the buffer to JGroups we'd still > make an arraycopy, but we'd get rid of a lot of arraycopy calls to > resize the buffer when the average object size is> 500 bytes. At the > same time, if a small percentage of the objects are much bigger than > the rest, we wouldn't reuse those huge buffers so we wouldn't waste > too much memory. From my experience, reusing and syncing on a buffer will be slower than making a simple arraycopy. I used to reuse buffers in JGroups, but got better perf when I simply copied the buffer. Plus the reservoir sampling's complexity is another source of bugs... -- Bela Ban Lead JGroups / Clustering Team JBoss ___ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
Hi Galder Sorry I'm replying so late On Thu, May 19, 2011 at 2:02 PM, Galder Zamarreño wrote: > Hi all, > > Re: https://issues.jboss.org/browse/ISPN-1102 > > First of all thanks to Dan for his suggestion on reservoir > sampling+percentiles, very good suggestion:). So, I'm looking into this and > Trustin's > http://docs.jboss.org/netty/3.2/api/org/jboss/netty/channel/AdaptiveReceiveBufferSizePredictor.html > but in this email I wanted to discuss the reservoir sampling mechanism > (http://en.wikipedia.org/wiki/Reservoir_sampling). > > So, the way I look at it, to implement this you'd keep track of N buffer > sizes used so far, and out of those choose K samples based on reservoir > sampling, and then of those K samples left, take the 90th percentile. > > Calculating the percentile is easy with those K samples stored in an ordered > collection. Now, my problem with this is that reservoir sampling is an O(n) > operation and you would not want to be doing that per each request for a > buffer that comes in. > The way I see it the R algorithm is only O(n) because you have to do a random(1, i) call for each request i. Because random(1, i) tends to return bigger and bigger values with every request, the probability of actually modifying the collection of samples becomes smaller and smaller. So it would be ok to compute the buffer size estimate on every modification because those modifications are very rare after startup. > One option I can think of is that instead of ever letting a user thread > calculate this, the user thread could just feed the buffer size collection (a > concurrent collection) and we could have a thread in the background that > periodically or based on some threshold calculates the reservoir sample + > percentile and this is what's used as next buffer size. My biggest problem > here is the concurrent collection in the middle. You could have a priority > queue ordered by buffer sizes but it's unbounded. 
> The concurrent collection does not need to be ordered, though; the reservoir sampling could do that. But you want the collection bounded, and if it's bounded and the limit is hit, you would not want it to block; instead you'd overwrite values: remove the last element and insert again. You only care about the last N relevant buffer sizes...

Based on my assumption that modifications become very rare after startup, I was thinking of using a fixed-size array with a synchronized block around each access (we'd probably need synchronization for the Random, too). Note that you must use an array or a list, because you have to replace the element at index random(1, i), not the smallest/greatest/something else.

> Another option, which would avoid the use of a concurrent collection, would be to calculate this per thread and store it in a thread local. The calculation could be done every X requests, still in the client thread, or it could be sent to a separate thread by wrapping it in a callable and keeping the future as a thread local; you could query it the next time the thread wants to marshall something.

True, per-thread statistics could remove the need for synchronization completely. But using a separate thread for the computations would require synchronization again, and I don't think there's enough work to be done to justify it.

> I feel a bit more inclined towards the latter option, although it limits the calculation to be per-thread, for several reasons:
> - We already have org.jboss.marshalling.Marshaller and org.jboss.marshalling.Unmarshaller instances as thread locals, which have proven to perform well.
> - So we could tap into this setup to maintain not only the marshaller, but the size of the buffer too.
> - It could be extended further to avoid creating buffers all the time and instead reuse them as long as the size is constant. After a size recalculation we'd ditch the buffer and create a new one.
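The scheme discussed above — Algorithm R over a fixed-size array, synchronized access, and the 90th percentile taken from the K retained samples — might look roughly like this. This is only a sketch; the class and method names are invented, not anything from Infinispan:

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch of reservoir sampling (Algorithm R) over buffer sizes.
public class BufferSizeSampler {
    private final int[] reservoir;            // fixed-size sample, K slots
    private final Random random = new Random();
    private long seen;                        // total sizes observed so far

    public BufferSizeSampler(int k) {
        this.reservoir = new int[k];
    }

    // O(1) per observation: the i-th size replaces a random slot with
    // probability K/i, so modifications become rare as i grows.
    public synchronized void record(int bufferSize) {
        seen++;
        if (seen <= reservoir.length) {
            reservoir[(int) (seen - 1)] = bufferSize;   // still filling up
        } else {
            long slot = (long) (random.nextDouble() * seen);
            if (slot < reservoir.length) {
                reservoir[(int) slot] = bufferSize;     // rare replacement
            }
        }
    }

    // p-th percentile (e.g. 0.9) of the current sample; sorting is cheap
    // because K is small and fixed.
    public synchronized int percentile(double p) {
        int n = (int) Math.min(seen, reservoir.length);
        int[] sorted = Arrays.copyOf(reservoir, n);
        Arrays.sort(sorted);
        int idx = (int) Math.ceil(p * n) - 1;
        return sorted[Math.max(0, Math.min(idx, n - 1))];
    }
}
```

The synchronized blocks cover both the array and the shared Random, matching the concern above about needing synchronization for the Random too.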
I totally agree, combining adaptive sizing with buffer reuse would be really cool. I imagine that when passing the buffer to JGroups we'd still make an arraycopy, but we'd get rid of a lot of the arraycopy calls needed to resize the buffer when the average object size is > 500 bytes. At the same time, if a small percentage of the objects are much bigger than the rest, we wouldn't reuse those huge buffers, so we wouldn't waste too much memory.

> Thoughts?
> --
> Galder Zamarreño
> Sr. Software Engineer
> Infinispan, JBoss Cache

___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev
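The thread-local reuse idea from the bullets above could be sketched like this. All names are hypothetical, and the huge-buffer threshold is an assumed cut-off, not a value from the discussion:

```java
// Hypothetical sketch: a thread-local, reusable marshalling buffer that
// grows to the estimated size but is not retained after an unusually
// large object inflates it.
public class ReusableBuffer {
    private static final int INITIAL_SIZE = 512;           // assumption
    private static final int HUGE_THRESHOLD = 8 * 1024;    // assumption

    private static final ThreadLocal<byte[]> BUFFERS =
            ThreadLocal.withInitial(() -> new byte[INITIAL_SIZE]);

    // Borrow a buffer at least `estimate` bytes long, reusing the
    // thread-local one when it is already big enough.
    public static byte[] acquire(int estimate) {
        byte[] buf = BUFFERS.get();
        if (buf.length < estimate) {
            buf = new byte[estimate];
            BUFFERS.set(buf);       // keep the bigger buffer for next time
        }
        return buf;
    }

    // After marshalling: ditch oversized buffers so a few huge objects
    // don't pin large arrays in every thread.
    public static void release(byte[] buf) {
        if (buf.length > HUGE_THRESHOLD) {
            BUFFERS.set(new byte[INITIAL_SIZE]);
        }
    }
}
```

Note this still assumes a defensive copy is made before the buffer goes to JGroups, since (per Bela's warning in the later messages) JGroups keeps a reference to the passed array in its retransmission table.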
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
Assuming escape analysis does its job, Bela's idea makes sense. But I'm not sure it's always enabled in Java 6 or on non-Oracle VMs. What about using adaptive prediction, and copying the buffer into an array of the right size when the prediction is way off?

On 05/19/2011 10:35 PM, Bela Ban wrote:
> Have we actually measured performance when we simply do an array copy of the [offset, offset+length) range and pass the copy to JGroups? This generates a little more garbage, but most of that is collected in eden, which is very fast. Reservoir sampling might be overkill here, and it complicates the code, too.
>
> I'll take a look at the size of a PUT on the sender and receiver side in the debugger this or next week. I believe the problem is on the receiver side, but I need to verify this.
>
> On 5/19/11 1:02 PM, Galder Zamarreño wrote:
>> Hi all,
>>
>> Re: https://issues.jboss.org/browse/ISPN-1102
>>
>> First of all, thanks to Dan for his suggestion on reservoir sampling + percentiles, a very good suggestion :). So, I'm looking into this and Trustin's http://docs.jboss.org/netty/3.2/api/org/jboss/netty/channel/AdaptiveReceiveBufferSizePredictor.html but in this email I wanted to discuss the reservoir sampling mechanism (http://en.wikipedia.org/wiki/Reservoir_sampling).
>>
>> The way I look at it, to implement this you'd keep track of the N buffer sizes used so far, choose K samples out of those based on reservoir sampling, and then take the 90th percentile of those K samples.
>>
>> Calculating the percentile is easy with those K samples stored in an ordered collection. Now, my problem with this is that reservoir sampling is an O(n) operation, and you would not want to be doing that for every buffer request that comes in.
>>
>> One option I can think of is that, instead of ever letting a user thread calculate this, the user thread could just feed the buffer size collection (a concurrent collection), and we could have a thread in the background that, periodically or based on some threshold, calculates the reservoir sample + percentile, and that is what's used as the next buffer size. My biggest problem here is the concurrent collection in the middle. You could have a priority queue ordered by buffer sizes, but it's unbounded. The concurrent collection does not need to be ordered, though; the reservoir sampling could do that. But you want the collection bounded, and if it's bounded and the limit is hit, you would not want it to block; instead you'd overwrite values: remove the last element and insert again. You only care about the last N relevant buffer sizes...
>>
>> Another option, which would avoid the use of a concurrent collection, would be to calculate this per thread and store it in a thread local. The calculation could be done every X requests, still in the client thread, or it could be sent to a separate thread by wrapping it in a callable and keeping the future as a thread local; you could query it the next time the thread wants to marshall something.
>>
>> I feel a bit more inclined towards the latter option, although it limits the calculation to be per-thread, for several reasons:
>> - We already have org.jboss.marshalling.Marshaller and org.jboss.marshalling.Unmarshaller instances as thread locals, which have proven to perform well.
>> - So we could tap into this setup to maintain not only the marshaller, but the size of the buffer too.
>> - It could be extended further to avoid creating buffers all the time and instead reuse them as long as the size is constant. After a size recalculation we'd ditch the buffer and create a new one.
--
Trustin Lee, http://gleamynode.net/
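Trustin's adaptive-prediction suggestion — in the spirit of Netty's AdaptiveReceiveBufferSizePredictor, though much simplified — might look like this sketch. All names and constants are invented for illustration:

```java
// Hypothetical sketch of an adaptive buffer-size predictor: grow the
// estimate immediately when it was too small, but shrink it only after
// several consecutive observations well below it, so one small message
// doesn't throw the estimate off.
public class AdaptiveSizePredictor {
    private int estimate = 512;   // starting guess (assumption)
    private int shrinkStreak;     // consecutive sizes far below estimate

    // The size to allocate for the next marshalling buffer.
    public int nextSize() {
        return estimate;
    }

    // Feed back the actual serialized size of the last object.
    public void record(int actualSize) {
        if (actualSize > estimate) {
            // Prediction was too small: grow at once, with 50% headroom.
            estimate = actualSize + (actualSize >> 1);
            shrinkStreak = 0;
        } else if (actualSize < estimate / 2) {
            // Way off in the other direction: shrink only after a streak.
            if (++shrinkStreak >= 4) {
                estimate = Math.max(64, estimate / 2);
                shrinkStreak = 0;
            }
        } else {
            shrinkStreak = 0;     // close enough; keep the estimate
        }
    }
}
```

The asymmetry (grow fast, shrink slowly) is the key property: under-prediction costs an immediate resize-and-copy, while over-prediction only wastes some memory, so the two are deliberately not treated alike.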
Re: [infinispan-dev] Adaptive marshaller buffer sizes - ISPN-1102
Have we actually measured performance when we simply do an array copy of the [offset, offset+length) range and pass the copy to JGroups? This generates a little more garbage, but most of that is collected in eden, which is very fast. Reservoir sampling might be overkill here, and it complicates the code, too.

I'll take a look at the size of a PUT on the sender and receiver side in the debugger this or next week. I believe the problem is on the receiver side, but I need to verify this.

On 5/19/11 1:02 PM, Galder Zamarreño wrote:
> Hi all,
>
> Re: https://issues.jboss.org/browse/ISPN-1102
>
> First of all, thanks to Dan for his suggestion on reservoir sampling + percentiles, a very good suggestion :). So, I'm looking into this and Trustin's http://docs.jboss.org/netty/3.2/api/org/jboss/netty/channel/AdaptiveReceiveBufferSizePredictor.html but in this email I wanted to discuss the reservoir sampling mechanism (http://en.wikipedia.org/wiki/Reservoir_sampling).
>
> The way I look at it, to implement this you'd keep track of the N buffer sizes used so far, choose K samples out of those based on reservoir sampling, and then take the 90th percentile of those K samples.
>
> Calculating the percentile is easy with those K samples stored in an ordered collection. Now, my problem with this is that reservoir sampling is an O(n) operation, and you would not want to be doing that for every buffer request that comes in.
>
> One option I can think of is that, instead of ever letting a user thread calculate this, the user thread could just feed the buffer size collection (a concurrent collection), and we could have a thread in the background that, periodically or based on some threshold, calculates the reservoir sample + percentile, and that is what's used as the next buffer size. My biggest problem here is the concurrent collection in the middle. You could have a priority queue ordered by buffer sizes, but it's unbounded.
> The concurrent collection does not need to be ordered, though; the reservoir sampling could do that. But you want the collection bounded, and if it's bounded and the limit is hit, you would not want it to block; instead you'd overwrite values: remove the last element and insert again. You only care about the last N relevant buffer sizes...
>
> Another option, which would avoid the use of a concurrent collection, would be to calculate this per thread and store it in a thread local. The calculation could be done every X requests, still in the client thread, or it could be sent to a separate thread by wrapping it in a callable and keeping the future as a thread local; you could query it the next time the thread wants to marshall something.
>
> I feel a bit more inclined towards the latter option, although it limits the calculation to be per-thread, for several reasons:
> - We already have org.jboss.marshalling.Marshaller and org.jboss.marshalling.Unmarshaller instances as thread locals, which have proven to perform well.
> - So we could tap into this setup to maintain not only the marshaller, but the size of the buffer too.
> - It could be extended further to avoid creating buffers all the time and instead reuse them as long as the size is constant. After a size recalculation we'd ditch the buffer and create a new one.

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
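Bela's plain-copy approach is essentially a one-liner. A hedged sketch of what the hand-off to JGroups might look like (the class and method names are hypothetical):

```java
import java.util.Arrays;

// Hypothetical sketch: copy just the used window of the marshalling
// buffer before handing it to JGroups. The short-lived copy dies in
// eden, and the original buffer stays safe to reuse even though JGroups
// retains its byte[] in the NAKACK retransmission table.
public class Wire {
    public static byte[] toWire(byte[] buf, int offset, int length) {
        return Arrays.copyOfRange(buf, offset, offset + length);
    }
}
```

This trades one extra allocation per message for the freedom to reuse or mutate the marshalling buffer immediately, which is exactly the trade-off Bela argues is cheap under generational GC.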