On Oct 16, 2013, at 3:11 PM, Sanne Grinovero <[email protected]> wrote:
> On 16 October 2013 12:21, Galder Zamarreño <[email protected]> wrote: >> >> Guys, there's some context to this topic that Sanne has left out. >> >> This discussion origins comes from Scott Marlow who's found a bug in a >> Hibernate 2LC test. The test itself can be found in [1]. >> >> Basically, the test loads a couple of entities into the 2LC, and then calls >> evictAll in the cache that evicts all entities from the 2LC. Then, it >> expects the size of the cache to be 0 and entities to be gone from the 2LC. >> >> The 2LC has a peculiar way to deal with evictEntityRegions or evictAll, it >> doesn't just call clear() and hope for the best. Instead, it marks the cache >> as invalid, and next time someone queries the cache, it clears it in a >> separate transaction. This is done cluster wide too by sending messages to >> other node that the cache region is invalid. >> >> The problem here is that because it's done in a separate transaction, it >> doesn't have an impact on the on going transaction, so the assertions after >> evict fail. IOW, we implemented evict as a "eventually clear", where >> eventually all transactions will stop using the cached data, but any current >> transactions will still see data. We did this for performance reasons and to >> avoids the issues Sanne mentiones below. >> >> The problem is that we didn't envision this evict method to be used in such >> way. I always assumed this would be run outside of a transaction. IOW, you'd >> use this when you've updated the database separately (i.e. script, PL/SQL) >> and you want to clear the 2LC. I always thought that'd be something run via >> JMX or separate from another transaction. > > That's indeed one of the use cases for a clear operation, but in this > case - which I'd classify as maintenance/manual control - I don't > think the user even expects a transaction to happen. > It could also happen by explicit invocation of user code, but I think > we're in the same use case. > > >> But, as Scott has found out, the TCK has some particular expectations of >> what happens after a evictEntityRegions or evictAll call within a >> transaction :( > > To clarify, that's not just a TCK weirdness that we need to "hack > around" to pass it, it's an important use case, and as you say very > different than the "manual" control. > The scenario is that the user is invoking some batch update statement > on the database which potentially changes a set of entities, whose ids > might be unknown. To make a simple example, say the statement adds one > day to the age of all users and you want such a statement to run on > each midnight. > Such statements exist and are quite common (probably within the scope > of a better use case), and the point is that in such cases, and > evictAll(users) need to happen, but it should happen in the scope of > the transaction: isolation-wise, the current Session shouldn't be > loading any more stale data - and IF it loads from the DB fresh data, > this newly cached entries need to be confined to the current TX, and > the changes need to be made visible to other nodes and Sessions only > at commit: in other words, ideally you should roll back the eviction > from happening if the statement is rolled back. > > Still considering it's a cache, with "ideally" I mean that it would > still be correct - although suboptimal - to evict all caches > immediately (out of transaction): worse case other sessions will have > slightly more cache misses, but they would still be correct. > > So to summarise: Hibernate expects to be able to use it both in a > transactional mode and in a non-transactional mode. Steve Ebersole > pointed out that he would be open to discuss using two differently > named methods to make the expectation clear, as it currently just > relies on the fact that a transaction is active or not. > > Dan has a great point about JCache, still we'd need to investigate > what the spec is saying about transactionality of such a method, it > would surprise me if it mandated transactionality? ^ Transactions have been left out of the spec [1]. > > I'll call myself out on how you're all thinking to implement it, as > long as the requirements described above are clear: I think they make > much sense for any general purpose cache and it's not just a weirdness > of the TCK designer or Hibernate. So the trouble for us is that > Infinispan no longer can operate on a cache mixing transactional and > non-transactional operations. That looks like something that needs to > be addressed, and considering the general use case for a clear I think > it would suffice to clearly mark clear as a non-transactional > operation. > > BTW If I really need to clear it.. I don't care about it to be > consistently applied "all or nothing", it would be much more of a sane > decision to apply it at least where it worked fine. Being this an XA > operation seems nonsense. > > That said, in the way a distributed key value store which aims at > bolting on eventual consistency & co, I would consider it damn > dangerous to embark in designing operations whose scope is not > restricted to a well defined set of keys. I tried to pitch my concern > at that level rather than jumping in the details of how to make the > TCK work. Conceptually, I think it's very tricky as you won't be able > to track any form of causality between this operation and subsequent > write operations. At some point you'll need to introduce tombstone > markers, and you need a to define what is your "keyset universe" to > implement such a method. [1] https://groups.google.com/forum/?fromgroups=#!topic/jsr107/bEHJ5m10HT4 > > Cheers, > Sanne > >> >> With this in mind, let me comment on the specifics below: >> >> [1] >> https://github.com/scottmarlow/wildfly/blob/b61dc1dddb1f70847d1d82206a366eedf6bc80db/testsuite/integration/basic/src/test/java/org/jboss/as/test/integration/jpa/secondlevelcache/SFSB2LC.java#L82 >> >> On Oct 2, 2013, at 1:03 AM, Sanne Grinovero <[email protected]> wrote: >> >>> I'd love to brainstorm about the clear() operation and what it means >>> on Infinispan. >>> >>> I'm not sure to what extent, but it seems that clear() is designed to >>> work in a TX, or even create an implicit transaction if needed, but >>> I'm not understanding how that can work. >>> >>> Obviously a clear() operation isn't listing all keys explicitly. Which >>> implies that it's undefined on which keys it's going to operate when >>> it's fired.. that seems like terribly wrong in a distributed key/value >>> store as we can't possibly freeze the global state and somehow define >>> a set of keys which are going to be affected, while an explicit >>> enumeration is needed to acquire the needed locks. >>> >>> It might give a nice safe feeling that, when invoking a clear() >>> operation in a transaction, I can still abort the transaction to make >>> it cancel the operation; that's the only good part I can think of: we >>> can cancel it. >>> >>> I don't think it has anything to do with consistency though? To make >>> sure you're effectively involving all replicas of all entries in a >>> consistent way, a lock would need to be acquired on each affected key, >>> which again implies a need to enumerate all keys, including the >>> unknown keys which might be hiding in a CacheStore: it's not enough to >>> broadcast the clear() operation to all nodes and have them simply wipe >>> their local state as that's never going to deal correctly >>> (consistently) with in-flight transactions working on different nodes >>> at different times (I guess enabling Total Order could help but you'd >>> need to make it mandatory). >>> >>> So let's step back a second and consider what is the use case for >>> clear() ? I suspect it's primarily a method needed during testing, or >>> maybe cleanup before a backup is restored (operations), maybe a >>> manually activated JMX operation to clear the cache in exceptional >>> cases. >> >> ^ Well, that's precisely the problem of second level cache >> evictEntityRegions or evictAll. In some scenarios, the clear is expected to >> affect the ongoing transaction too, see above use case from the TCK. >> >>> I don't think there would ever be a need for a clear() operation to >>> interact with other transactions, so I'd rather make it illegal to >>> invoke a clear() inside a transaction, >> >> -1, unless you have other ideas to deal with evict calls that I mentioned >> above that satisfy what the TCK is after. >> >>> or simply ignore the >>> transactional scope and have an immediate and distributed effect. >> >> That'd be more inline with what's needed above. >> >>> I'm likely missing something. What terrible consequences would this have? >> >> p.s. For the 2LC use case, I'm trying to see if the clear() could be done in >> the same transaction and see if that helps, but no luck yet as that causes >> other failures. >> >>> >>> Cheers, >>> Sanne >>> _______________________________________________ >>> infinispan-dev mailing list >>> [email protected] >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> -- >> Galder Zamarreño >> [email protected] >> twitter.com/galderz >> >> Project Lead, Escalante >> http://escalante.io >> >> Engineer, Infinispan >> http://infinispan.org >> >> >> _______________________________________________ >> infinispan-dev mailing list >> [email protected] >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > _______________________________________________ > infinispan-dev mailing list > [email protected] > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Galder Zamarreño [email protected] twitter.com/galderz Project Lead, Escalante http://escalante.io Engineer, Infinispan http://infinispan.org _______________________________________________ infinispan-dev mailing list [email protected] https://lists.jboss.org/mailman/listinfo/infinispan-dev
