I don't doubt Radim's code :) but I'm pretty confident that even that implementation is limited by the constraints of the general-purpose API.
For example it seems Bela will soon allow more flexibility in JGroups regarding buffer representations. We need to commit to a stable API for end user integrations (shared cachestore implementors), but we also need to keep options open to soon play with other approaches.

That's why I think this separation should be done before Infinispan 8.0.0.Final even if I don't have a concrete proposal for what this other API should look like: I don't presume to be able to anticipate which API exactly will be best, but I think we can all see that we will want to change that.

There should be a private internal contract which we can change even in micro versions without concerns of compatibility, so as to allow R&D progress in the most performance sensitive areas w/o this being a problem for integrators and users.

Better configuration validations are additional (strong) benefits: we've seen lots of misunderstandings about which CacheStores / configuration combinations are valid.

Thanks,
Sanne

On 5 August 2015 at 22:13, Dan Berindei <dan.berin...@gmail.com> wrote:
> On Fri, Jul 31, 2015 at 3:30 PM, Sanne Grinovero <sa...@infinispan.org> wrote:
>> On 20 July 2015 at 11:02, Dan Berindei <dan.berin...@gmail.com> wrote:
>>> Sanne, I think changing the cache store API is actually the most
>>> painful part, so we should only do it if we gain a concrete advantage
>>> from doing it. From a compatibility point of view, implementing a new
>>> interface vs implementing the same interface with completely different
>>> methods is just as bad.
>>
>> Right, from that perspective it's a quite horrible proposal.
>>
>> But I think we can agree that only the "SharedCacheStore" deserves to
>> be considered an SPI, right?
>> That's the one people will normally customize to map stuff to other
>> stores one might have.
>>
>> I think it's important that beyond Infinispan 8.0 API's freeze, we can
>> make any change to the non-shared SPI
>> without affecting users who implement a custom shared cachestore.
>>
>> I highly doubt someone will implement a high-performance custom off
>> heap swap strategy, but if someone does he should contribute it and
>> will probably need to make integration level changes.
>>
>> We probably won't have the time to implement a new super efficient
>> local-only cachestore to replace the leveldb one, but I'd like to keep
>> the possibility open to do that beyond 8.0, *especially* without
>> breaking compatibility for other people.
>
> We already have a new super efficient local-only cachestore :)
>
> https://github.com/infinispan/infinispan/tree/master/persistence/soft-index
>
>
>>
>> Sanne
>>
>>
>>>
>>> On Mon, Jul 20, 2015 at 12:41 PM, Sanne Grinovero <sa...@infinispan.org>
>>> wrote:
>>>> +1 for incremental changes..
>>>>
>>>> I'd see the first step as defining two different interfaces;
>>>> essentially we need to choose two good names.
>>>>
>>>> Then we could have both interfaces still implement the same identical
>>>> methods, but go through each implementation and decide to "mark" it as
>>>> shared-only or never-shared.
>>>>
>>>> That would make it simpler to make concrete change proposals on each
>>>> of them and start taking some advantage from the split. I think you'll
>>>> need the two different interfaces to implement the validations you
>>>> mentioned.
>>>>
>>>> For Infinispan 8's goals, I'd be happy enough to keep the
>>>> "shared-only" interface quite similar to the current one, but mark the
>>>> never-shared one as a private or experimental SPI to allow ourselves
>>>> some more flexibility in performance oriented changes.
>>>>
>>>> Thanks,
>>>> Sanne
>>>>
>>>> On 20 July 2015 at 10:07, Tristan Tarrant <ttarr...@redhat.com> wrote:
>>>>> Sanne, well written.
>>>>> Before actually implementing any of the optimizations/changes you
>>>>> mention, I think the lowest-hanging fruit we should grab now is just to
>>>>> add checks to all of our cachestores to actually throw an exception when
>>>>> they are being enabled in unsupported configurations.
>>>>>
>>>>> I've created [1] to get us started
>>>>>
>>>>> Tristan
>>>>>
>>>>> [1] https://issues.jboss.org/browse/ISPN-5617
>>>>>
>>>>> On 16/07/2015 15:32, Sanne Grinovero wrote:
>>>>>> I would like to propose a clear cut separation between our shared and
>>>>>> non-shared CacheStores,
>>>>>> in all terms such as:
>>>>>> - Configuration options
>>>>>> - Integration contracts (Split the CacheStore SPI)
>>>>>> - Implementations
>>>>>> - Terminology, to avoid any further confusion around valid
>>>>>> configurations and sensible architectures
>>>>>>
>>>>>> We have loads of examples of users who get in trouble by configuring
>>>>>> one incorrectly, but also there are plenty of efficiency improvements
>>>>>> we could take advantage of by clearly splitting the integration points
>>>>>> and the implementations in two categories.
>>>>>>
>>>>>> Not least, it's a very common and dangerous pitfall to assume that
>>>>>> Infinispan is able to restore a consistent state after having stopped
>>>>>> a DIST cluster which passivated into non-shared CacheStore instances,
>>>>>> or even REPL clusters when they don't shut down all at the same exact
>>>>>> time (and "exact same time" is a strange concept at least..). We need
>>>>>> to clarify the different options, tradeoffs and their consequences..
>>>>>> to users and ourselves, as a clearly defined use case will avoid bugs
>>>>>> and simplify implementations.
>>>>>>
>>>>>> # The purpose of each
>>>>>> I think that people should use a non-shared (local?) CacheStore for
>>>>>> the sole purpose of expanding the storage capacity of each single
>>>>>> node..
>>>>>> be it because you don't have enough memory at all, or be it
>>>>>> because you prefer some extra safety margin because either your
>>>>>> estimates are complex, or maybe because we live in a real world where
>>>>>> the hashing function might not be perfect in practice. I hope we all
>>>>>> agree that Infinispan should be able to take such situations with at
>>>>>> worst a graceful performance degradation, rather than complain
>>>>>> sending OOMs to the admin and setting the service on strike.
>>>>>>
>>>>>> A Shared CacheStore is useful for very different purposes; primarily
>>>>>> to implement a Cache on some other service - for example your (single,
>>>>>> shared) RDBMS, a slow (or expensive) webservice your organization has
>>>>>> to call frequently, etc.. Or it's useful even as a write-through cache
>>>>>> on a similar service, maybe internal but not able to handle the high
>>>>>> variation of load spikes which Infinispan can handle better.
>>>>>> Finally, a great use case is to have a consistent backup of all your
>>>>>> data-grid content, possibly in some "reference" form such as JPA
>>>>>> mapped entities.
>>>>>>
>>>>>> # Benefits of a Non-Shared
>>>>>> A non-shared CacheStore implementor should be able to take advantage
>>>>>> of *its purpose*, among the big ones I see:
>>>>>> - Exclusive usage -> locking of a specific entry can be handled at
>>>>>> datacontainer level, can simplify quite some internal code.
>>>>>> - Reliability -> since a clustered node needs to wipe its state at
>>>>>> reboot (after a crash), it's much simpler to code any such CacheStore
>>>>>> to avoid any form of disk synch or persistence guarantees.
>>>>>> - Encoding format -> this can be controlled entirely by Infinispan,
>>>>>> and no need to take factors like rolling upgrade compatible encodings
>>>>>> in mind. JBoss Marshalling would be good enough, or some
>>>>>> implementations might not need to serialize at all.
>>>>>>
>>>>>> Our non-shared CacheStore implementation(s) could take advantage of
>>>>>> lower level more complex code optimisations and interfaces, as users
>>>>>> would rarely want to customize one of these, while the use case of
>>>>>> mapping data to a shared service needs a more user friendly SPI so as to
>>>>>> keep it simple to plug in custom stores: custom data formats, custom
>>>>>> connectors, get some help in implementing concurrency correctly.
>>>>>> Proper Transaction integration for the CacheStore has been on our
>>>>>> wishlist for some time too. I suspect that accepting that we have been
>>>>>> mixing up two different things under the same name so far would make it
>>>>>> simpler to implement further improvements such as transactions: the
>>>>>> way to do such a thing is very different in each of these use cases,
>>>>>> so it would help at least to implement it on a subset first, or maybe
>>>>>> only if it turns out there's no need for such things in the context of
>>>>>> the local-only-dedicated "swapfile".
>>>>>>
>>>>>> # Mixed types should be killed
>>>>>> I'm aware that some of our current implementations _could_ work both as
>>>>>> shared or non-shared, for example the JDBC or JPACacheStore or the
>>>>>> Remote Cachestore.. but in most cases it doesn't make much sense. Why
>>>>>> would you ever want to use the JPACacheStore if not to share data with
>>>>>> a _shared_ database?
>>>>>>
>>>>>> We should take such options away, and by doing so focus on the use
>>>>>> cases which actually matter and simplify the implementations and
>>>>>> improve the configuration validations.
>>>>>>
>>>>>> If ever a compelling storage technology is identified which we'd like to
>>>>>> offer as an option for both shared or non-shared, I would still
>>>>>> recommend making two different implementations, as there certainly are
>>>>>> different requirements and assumptions when coding such a thing.
>>>>>>
>>>>>> Not least, I would very much like to see a default local CacheStore:
>>>>>> picking one for local "emergency swapping" should be a no-brainer for
>>>>>> users; we could set one up by default and not bother newcomers with
>>>>>> complex choices.
>>>>>>
>>>>>> If we simplify the requirement of such a thing, it should be easy to
>>>>>> write one on standard Java NIO2 APIs and get rid of the complexities of
>>>>>> maintaining the native integration with things like LevelDB, not least
>>>>>> the inefficiency of Java to make such native calls.
>>>>>>
>>>>>> Then as a second step, we should attack the other use case: backups;
>>>>>> from a *purpose driven perspective* I'd then see us revive the Cassandra
>>>>>> integration; obviously as a shared-only option.
>>>>>>
>>>>>> Cheers,
>>>>>> Sanne
>>>>>>
>>>>>
>>>>> --
>>>>> Tristan Tarrant
>>>>> Infinispan Lead
>>>>> JBoss, a division of Red Hat
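
To make the split discussed in this thread a bit more concrete, here is a rough Java sketch of two separate contracts plus the kind of fail-fast configuration check Tristan mentions. Everything in it is hypothetical: SharedStore, LocalStore, JpaLikeStore, SwapFileStore and validate() are names made up for illustration and are not part of the Infinispan API, and real contracts would of course carry the actual load/store methods.

```java
// Hypothetical sketch only: none of these types exist in Infinispan.
final class StoreSplitSketch {

    /** Stable, user-facing contract for stores backed by one shared external service. */
    interface SharedStore { }

    /** Private contract for node-local "swap" stores; free to change between micro versions. */
    interface LocalStore { }

    /** Stand-in for a store that only makes sense when shared (e.g. a JPA mapping). */
    static final class JpaLikeStore implements SharedStore { }

    /** Stand-in for a node-local overflow store. */
    static final class SwapFileStore implements LocalStore { }

    /** Fails fast when the "shared" flag contradicts the contract the store implements. */
    static void validate(Object store, boolean configuredAsShared) {
        if (store instanceof SharedStore && !configuredAsShared) {
            throw new IllegalStateException(
                store.getClass().getSimpleName() + " is shared-only but was configured as non-shared");
        }
        if (store instanceof LocalStore && configuredAsShared) {
            throw new IllegalStateException(
                store.getClass().getSimpleName() + " is local-only but was configured as shared");
        }
    }

    public static void main(String[] args) {
        validate(new JpaLikeStore(), true);    // valid combination, no exception
        validate(new SwapFileStore(), false);  // valid combination, no exception
        try {
            validate(new SwapFileStore(), true); // invalid: local-only store marked as shared
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With two distinct types the check needs no per-store knowledge: the contract a store implements already says which configurations are valid, and the local-only contract can evolve without touching the one integrators depend on.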
_______________________________________________ infinispan-dev mailing list infinispan-dev@lists.jboss.org https://lists.jboss.org/mailman/listinfo/infinispan-dev