On Sun, Mar 18, 2012 at 12:51 PM, Bela Ban <[email protected]> wrote: > > Let's take a look at an example: > > - Nodes are N1-N5, keys are 1-10 > > With the current state transfer, we'd serialize the 10 keys into a > buffer of (say) 10. That buffer is then sent to (say) all nodes, so we > have 5 messages of 10 = 50. > > Now say the keys are only sent to the nodes which should store them: > - N1: 1, 5, 9 > - N2: 1,2, 5,6, 9,10 > - N3: 2,3 6,7, 10 > - N4: 3,4 7,8 > - N5: 4,8 > > This is a total of 5 messages of sizes 3, 6, 5, 4 and 2 = 20. The > biggest message is 6, and the total size of half of the current approach. > > Wrt serialization: we could maintain a simple hashmap of the current > keys serialized, e.g. > - N1: we serialize keys 1, 5 and 9 and add them to the hashmap > - N2: we serialize keys 2, 6 and 10 and add them to the hashmap. Keys 1, > 5 and 9 have already been serialized and the byte buffer of their values > can just be reused from the hashmap. > > > If we do this intelligently, we could sort this temp hashmap (we do know > the hashwheel after all, don't we), and remove keys/values that were > already sent, as they're not going to be used again. This would reduce > the max memory spike for the temp hashmap. > > >> So when the cluster is small and all the owners would receive most of >> the keys anyway, the savings in deserialization are offset by the >> increased costs in serialization. > > > You don't serialize the same key twice, see above. You just copy the > buffer into the resulting buffer, this is very efficient. With > scatter/gather (NIO2, JGroups 3.3), we would not even have to copy the > buffers, but pass all of them into the JGroups message. >
Ok, you can avoid the multiple serialization of entries (although I think that will need some pretty big changes to our command serialization framework). But you'll still have numOwners+1 copies of the entries in memory at the same time on the command originator... > >>> If we have a TX which touches a lot of keys, unless we break the key set >>> down into their associated targets, we would have to potentially >>> transfer a *big* state ! >>> >> >> If we serialize the keys separately for each target, the originator >> will have to hold in memory numOwners * numKeys serialized entries >> until the messages are garbage collected. > > > That's 20, correct. But with the current approach, it is 50 (key set of > 10 sent to all 5 nodes). > Actually no: we use the same buffer in all 5 messages, so it's just 10. > >> With the current approach, >> we only have to hold in memory numKeys serialized entries on the >> originator - at the expense of extra load on the recipients. > > > You're sending the 10 serialized keys to 5 nodes, so you'll have 5 > messages with 10 each in memory, that's worse then my suggested approach. > There is only one copy of the buffer on the originator, although there are 5 copies on the network and on the recipients. _______________________________________________ infinispan-dev mailing list [email protected] https://lists.jboss.org/mailman/listinfo/infinispan-dev
