On Jan 28, 2013, at 12:43 PM, Manik Surtani <[email protected]> wrote:
> > On 28 Jan 2013, at 12:35, Dan Berindei <[email protected]> wrote: > >> >> >> On Mon, Jan 28, 2013 at 1:56 PM, Manik Surtani <[email protected]> wrote: >> Let me clarify a few things on this thread. THere seems to be a bit of >> confusion here. :) >> >> storeAsBinary in Infinispan was designed with the following purposes in >> mind, in order of importance: >> >> 1) Performance. Prevent serialising/deserializing an entry multiple times >> (e.g., to write through to disk, to replicate over the network, concurrent >> threads needing to read the object representation). >> >> >> TBH I don't think storeAsBinary as it works now is that good for >> performance, because MarshalledValueInterceptor compacts keys/values after >> every operation (see MarshalledValueInterceptor.java:320 and its callers). >> Once a key/value is deserialized, its serialized form is deleted, and it has >> to be serialized again if a remote node asks for it. > > That is only correct on the node where you're running the operation. The > remote node has different characteristics. The byte array is never > deserialized when reading off the wire, always kept as a byte array, and when > asked for again (via a remote GET) it just needs to do a buffer copy. Now > this is breaks the moment a thread local to that remote node looks up an > entry, but if you have some form of key affinity then you really see this > benefit. > >> >> So it would save at most one serialization compared to storing the entries >> as references (and only if the entry also needs to be written to a cache >> store). Instead it adds a bit of overhead on each operation to keep track of >> the marshalled value status. > > Well, if you have > 1 cache store enabled, etc etc. > >> 2) Classloader isolation (as Galder mentioned). This became a secondary >> purpose of this feature (originally observed as a side-effect). Enhanced by >> allowing storeKeyAsBinary and storeValueAsBinary options for more >> fine-grained control of this behaviour. >> >> >> I'd say this has become the reason most people use this feature now, even >> though a regular cache should work fine in AS7 there are still other >> environments where it is needed. > > Yes, they are both as important today; I was just stating what the original > intentions were. :) > >> >> >> Now lets consider what JSR 107 needs. Similarly named, the feature in JSR >> 107 serves a completely different purpose, and this is referential >> integrity. Think database-style isolation (repeatable read, etc) where >> concurrent threads holding object references to the same value, and mutating >> the same value, are not visible until a commit. >> >> I originally thought that Infinispan's storeAsBinary can be used for this, >> but apparently not without some additional changes/tweaks. Maybe we need: >> >> 1) A new config option for this behaviour. <storeAsBinary defensive="true" >> /> ? >> 2) If enabled, maybe use a subclass of MarshalledValue >> (DefensiveMarshalledValue?) that *always* stores a byte[] and never caches >> the object representation? >> >> >> I think we'd still need to cache the object instance while the command is >> executing, otherwise we'll have too many deserializations. But perhaps the >> new setting could control whether MarshalledValueInterceptor calls >> MarshalledValue.compact with preferSerializedRepresentation == true instead >> of false, as it does now. > > Well, you will want eager serialisation too, even in local mode. So that > would have to be built in. So maybe rather than a MarshalledValue subclass, > we really need a MarshalledValueInterceptor subclass. Even easier/better > encapsulated. :) My preference is for a new interceptor rather than changing/subclassing MarshalledValue. Created https://issues.jboss.org/browse/ISPN-2769 > >> >> >> What do you think? >> >> Cheers >> Manik >> >> On 28 Jan 2013, at 10:00, Sanne Grinovero <[email protected]> wrote: >> >> > I remember Manik and me pair-programming on that class to simplify it >> > a bit - especially as there are some performance complexities - but we >> > ended up not touching it as any change would have violated some >> > expectations of one feature or another. >> > >> > Let's put this on the list of cleanups to be performed for 6.0? >> > >> > On 28 January 2013 09:14, Galder Zamarreño <[email protected]> wrote: >> >> >> >> On Jan 25, 2013, at 11:37 AM, Sanne Grinovero <[email protected]> >> >> wrote: >> >> >> >>> On 25 January 2013 11:11, Galder Zamarreño <[email protected]> wrote: >> >>>> >> >>>> On Jan 24, 2013, at 4:26 PM, Sanne Grinovero <[email protected]> >> >>>> wrote: >> >>>> >> >>>>> It's important to note that Infinispan's implementation of storing as >> >>>>> binary isn't guaranteeing different instances of objects are returned >> >>>>> to different get() invocations (especially when they happen in >> >>>>> parallel). >> >>>> >> >>>> ^ Do you have a test for this? >> >>> >> >>> No, it's self-evident by reading the code. I'd venture saying it's a >> >>> design choice: the option was not designed to provide isolation, >> >>> people should not abuse of it for a different purpose. >> >>> >> >>>> Could this be related to the fact that a get(), unless it had received >> >>>> that entry from another node, will held as reference? >> >>>> >> >>>> It'd be interesting if that test works if after a put() you call >> >>>> compact()... >> >>>> >> >>>>> This is the reason for example that Hibernate OGM can't use this flag >> >>>>> to have safe and independent instances, but needs to make defensive >> >>>>> copies if returned values. As I read in your first post, you want to >> >>>>> use this for defensive copies: that doesn't work, especially if the >> >>>>> TCK is performing concurrent requests. >> >>>> >> >>>> ^ As I said, the storeAsBinary feature is heavily optimised for >> >>>> performance, hence why it initially keeps instances as references, so >> >>>> that if another thread requests the entry soon later, a reference is >> >>>> sent back (no need to serialize/deserialize the entry just put) >> >>> >> >>> As you say "the reference is sent back", even if it's the same >> >>> instance as a previous request. I have no doubt that's for performance >> >>> reasons: I patched that code myself and have carefully kept that >> >>> "feature" of instance reuse available. >> >>> I'm not sure it can provide much of a benefit generally speaking, but >> >>> this has always been like that and I guess there could be specific >> >>> access patterns in which this is very useful. >> >> >> >> The reason we have storeAsBinary is due to lazyDeserialization. The >> >> latter was a solution we designed to get around deserialization issues on >> >> app server environments where JGroups would attempt to deserialize data >> >> with the wrong classloader. >> >> >> >> The idea at the time was that deserialization would be delayed until a >> >> thread with the correct classloader in context would come and deserialize >> >> data, hence the name: lazy deserialization. This was needed in AS4/5/6. >> >> >> >> The design has always been the same, make sure data is kept in binary >> >> format in the receiver and only deserialize when needed. >> >> >> >> This lazy deserialization is no longer needed in AS7 cos a particular >> >> plugin is set in JBoss Marshaller which adds modular classloader info to >> >> serialized data. So, when data arrives in the receiver, it can be >> >> deserialized directly cos the classloader info allows for the correct >> >> classloader to be found. >> >> >> >> The naming change of lazyDeserialization to storeAsBinary was on purpouse >> >> and precisely with the aim of it becoming a way to provide store-as-value >> >> capabilities. The problem is that as you and Vladimir have spotted, this >> >> doesn't really work like that. >> >> >> >>> >> >>> Cheers, >> >>> Sanne >> >>> >> >>>> >> >>>>> >> >>>>> >> >>>>> >> >>>>> On 24 January 2013 16:09, Manik Surtani <[email protected]> wrote: >> >>>>>> >> >>>>>> On 24 Jan 2013, at 15:39, Vladimir Blagojevic <[email protected]> >> >>>>>> wrote: >> >>>>>> >> >>>>>>> No valid reason Manik. In summary I thought I would have gotten our >> >>>>>>> keys/values serialized even in local VM if I turn on storeAsBinary >> >>>>>>> but that does not seem to be the case. >> >>>>>> >> >>>>>> Is it not? Perhaps it is only serialised the first time a serial >> >>>>>> form is necessary. You can get around this by calling compact() >> >>>>>> >> >>>>>> http://docs.jboss.org/infinispan/5.1/apidocs/org/infinispan/Cache.html#compact() >> >>>>>> >> >>>>>> But this definitely isn't the most optimal way of doing things. >> >>>>>> Perhaps a new config option for eager serialisation might be >> >>>>>> necessary, but for now calling compact() should work. >> >>>>>> >> >>>>>>> I need to use storeAsBinary to complete a feature of JSR 107 that >> >>>>>>> allows storing of key/value pairs as serialized values rather than >> >>>>>>> simple references. >> >>>>>> >> >>>>>> Yup, I realise. >> >>>>>> >> >>>>>>> >> >>>>>>> TBH, I am not sure how can we do this given mechanisms we have in >> >>>>>>> place. I would have to implement serialization/deserialization in >> >>>>>>> our jsr 107 project but that would be a wrong path if we can somehow >> >>>>>>> turn on our own existing storeAsBinary for in VM stored objects (see >> >>>>>>> Galder's email on what is currently done) >> >>>>>>> >> >>>>>>> >> >>>>>>> Regards, >> >>>>>>> Vladimir >> >>>>>>> On 13-01-24 7:09 AM, Manik Surtani wrote: >> >>>>>>>> JSR 107's storeAsBinary and our storeAsBinary are conceptually the >> >>>>>>>> same. You get a defensive copy and this should work. >> >>>>>>>> >> >>>>>>>> But see my comment below: >> >>>>>>>> >> >>>>>>>> Also adding Mircea in cc. Any reason why you're not using >> >>>>>>>> infinispan-dev for this? >> >>>>>>>> >> >>>>>>>> On 24 Jan 2013, at 12:00, Galder Zamarreño <[email protected]> >> >>>>>>>> wrote: >> >>>>>>>> >> >>>>>>>>> Hey Vladimir, >> >>>>>>>>> >> >>>>>>>>> IIRC, for performance reasons, even with storeAsBinary, Infinispan >> >>>>>>>>> keeps the data as normal instance locally. When data is serialized >> >>>>>>>>> and sent to other nodes, again for performance reasons, it keeps >> >>>>>>>>> it as raw or byte[] format. >> >>>>>>>>> >> >>>>>>>>> So, storing objects by value only happens in counted occassions >> >>>>>>>>> when storeAsBinary is enabled. >> >>>>>>>>> >> >>>>>>>>> You can track it by using a debugger and see how the the >> >>>>>>>>> MarshalledValue instances are created. >> >>>>>>>>> >> >>>>>>>>> Not sure how to fix this without some extra configuration option. >> >>>>>>>>> >> >>>>>>>>> Cheers, >> >>>>>>>>> >> >>>>>>>>> On Jan 23, 2013, at 5:38 PM, Vladimir Blagojevic >> >>>>>>>>> <[email protected]> wrote: >> >>>>>>>>> >> >>>>>>>>>> Galder, >> >>>>>>>>>> >> >>>>>>>>>> A quick search of help from you beacuse you are more familiar >> >>>>>>>>>> with this area (storeAsBinary) than I am. There is a tck test >> >>>>>>>>>> that checks storing of objects by value not by reference in the >> >>>>>>>>>> cache [1]. I thought that if we set our underlying cache to be >> >>>>>>>>>> storeAsBinary we would handle this tck requirement (store by >> >>>>>>>>>> value if neeed rather than by reference). However, >> >>>>>>>>>> StoreByValueTest fails although I set our underlying Infinispan >> >>>>>>>>>> cache to be storeAsBinary. I am using local cache athough I tried >> >>>>>>>>>> with transport and dist_async setup as well - same result. Any >> >>>>>>>>>> ideas what is going on? >> >>>>>>>>>> >> >>>>>>>>>> Have a look at the test [1] , result I get are below: >> >>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>>> ------------------------------------------------------- >> >>>>>>>>>> Running org.jsr107.tck.StoreByValueTest >> >>>>>>>>>> Jan 23, 2013 12:35:29 PM org.jsr107.tck.util.ExcludeList <init> >> >>>>>>>>>> INFO: ===== ExcludeList >> >>>>>>>>>> url=file:/Users/vladimir/workspace/jsr107/jsr107tck/implementation-tester/target/test-classes/ExcludeList >> >>>>>>>>>> Defined org.jsr107.tck.StoreByValueTest config >> >>>>>>>>>> StoreAsBinaryConfiguration{enabled=true, storeKeysAsBinary=true, >> >>>>>>>>>> storeValuesAsBinary=true} >> >>>>>>>>>> Tests run: 6, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: >> >>>>>>>>>> 21.852 sec <<< FAILURE! >> >>>>>>>>>> >> >>>>>>>>>> Results : >> >>>>>>>>>> >> >>>>>>>>>> Failed tests: >> >>>>>>>>>> get_Existing_MutateValue(org.jsr107.tck.StoreByValueTest): >> >>>>>>>>>> expected: java.util.Date<Wed Jan 23 12:35:34 EST 2013> but was: >> >>>>>>>>>> java.util.Date<Wed Jan 23 12:35:34 EST 2013> >> >>>>>>>> ?? These seem the same to me? How is the TCK testing for these >> >>>>>>>> two values? By reference? Or using .equals()? >> >>>>>>>> >> >>>>>>>>>> get_Existing_MutateKey(org.jsr107.tck.StoreByValueTest): >> >>>>>>>>>> expected:<Wed Jan 23 12:35:38 EST 2013> but was:<null> >> >>>>>>>> This seems a bigger issue. You might want to look at Infinispan >> >>>>>>>> logs here? >> >>>>>>>> >> >>>>>>>>>> getAndPut_NotThere(org.jsr107.tck.StoreByValueTest): expected: >> >>>>>>>>>> java.util.Date<Wed Jan 23 12:35:41 EST 2013> but was: >> >>>>>>>>>> java.util.Date<Wed Jan 23 12:35:41 EST 2013> >> >>>>>>>> Again, see my first comment. >> >>>>>>>> >> >>>>>>>>>> getAndPut_Existing_MutateValue(org.jsr107.tck.StoreByValueTest): >> >>>>>>>>>> expected: java.util.Date<Wed Jan 23 12:35:45 EST 2013> but was: >> >>>>>>>>>> java.util.Date<Wed Jan 23 12:35:45 EST 2013> >> >>>>>>>> Again, see my first comment. >> >>>>>>>> >> >>>>>>>>>> getAndPut_Existing_NonSameKey_MutateValue(org.jsr107.tck.StoreByValueTest): >> >>>>>>>>>> expected: java.util.Date<Wed Jan 23 12:35:48 EST 2013> but was: >> >>>>>>>>>> java.util.Date<Wed Jan 23 12:35:48 EST 2013> >> >>>>>>>> Again, see my first comment. >> >>>>>>>> >> >>>>>>>>>> getAndPut_Existing_NonSameKey_MutateKey(org.jsr107.tck.StoreByValueTest): >> >>>>>>>>>> expected:<Wed Jan 23 12:35:51 EST 2013> but was:<null> >> >>>>>>>>>> >> >>>>>>>>>> Tests run: 6, Failures: 6, Errors: 0, Skipped: 0 >> >>>>>>>>>> >> >>>>>>>>>> [1] >> >>>>>>>>>> https://github.com/jsr107/jsr107tck/blob/master/cache-tests/src/test/java/org/jsr107/tck/StoreByValueTest.java >> >>>>>>>>> >> >>>>>>>>> -- >> >>>>>>>>> Galder Zamarreño >> >>>>>>>>> [email protected] >> >>>>>>>>> twitter.com/galderz >> >>>>>>>>> >> >>>>>>>>> Project Lead, Escalante >> >>>>>>>>> http://escalante.io >> >>>>>>>>> >> >>>>>>>>> Engineer, Infinispan >> >>>>>>>>> http://infinispan.org >> >>>>>>>>> >> >>>>>>>> -- >> >>>>>>>> Manik Surtani >> >>>>>>>> [email protected] >> >>>>>>>> twitter.com/maniksurtani >> >>>>>>>> >> >>>>>>>> Platform Architect, JBoss Data Grid >> >>>>>>>> http://red.ht/data-grid >> >>>>>>>> >> >>>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> Manik Surtani >> >>>>>> [email protected] >> >>>>>> twitter.com/maniksurtani >> >>>>>> >> >>>>>> Platform Architect, JBoss Data Grid >> >>>>>> http://red.ht/data-grid >> >>>>>> >> >>>>>> >> >>>>>> _______________________________________________ >> >>>>>> infinispan-dev mailing list >> >>>>>> [email protected] >> >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>>>> >> >>>>> _______________________________________________ >> >>>>> infinispan-dev mailing list >> >>>>> [email protected] >> >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>>> >> >>>> >> >>>> -- >> >>>> Galder Zamarreño >> >>>> [email protected] >> >>>> twitter.com/galderz >> >>>> >> >>>> Project Lead, Escalante >> >>>> http://escalante.io >> >>>> >> >>>> Engineer, Infinispan >> >>>> http://infinispan.org >> >>>> >> >>>> >> >>>> _______________________________________________ >> >>>> infinispan-dev mailing list >> >>>> [email protected] >> >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >>> >> >>> _______________________________________________ >> >>> infinispan-dev mailing list >> >>> [email protected] >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> >> -- >> >> Galder Zamarreño >> >> [email protected] >> >> twitter.com/galderz >> >> >> >> Project Lead, Escalante >> >> http://escalante.io >> >> >> >> Engineer, Infinispan >> >> http://infinispan.org >> >> >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> [email protected] >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > [email protected] >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Manik Surtani >> [email protected] >> twitter.com/maniksurtani >> >> Platform Architect, JBoss Data Grid >> http://red.ht/data-grid >> >> >> _______________________________________________ >> infinispan-dev mailing list >> [email protected] >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> _______________________________________________ >> infinispan-dev mailing list >> [email protected] >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > -- > Manik Surtani > [email protected] > twitter.com/maniksurtani > > Platform Architect, JBoss Data Grid > http://red.ht/data-grid > > _______________________________________________ > infinispan-dev mailing list > [email protected] > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Galder Zamarreño [email protected] twitter.com/galderz Project Lead, Escalante http://escalante.io Engineer, Infinispan http://infinispan.org _______________________________________________ infinispan-dev mailing list [email protected] https://lists.jboss.org/mailman/listinfo/infinispan-dev
