Considering the frequency of "How do I get the number of entries in cache", "How do I get all keys" on all forums, I think that backing to runtime exception would not satisfy the users.
On 10/07/2014 03:16 PM, Sanne Grinovero wrote: > Considering all these very valid concerns I'd return on my proposal > for throwing runtime exceptions via an (optional) decorator. > > I'd have such a decorator in place by default, so that we make it very > clear that - while you can remove it - the behaviour of such methods > is "unusual" and that a user would be better off avoiding them unless > he's into the advanced stuff. > > As said before, that worked very well for me in the past and it was > great that - even while I did know - I had a safety guard to highlight > unintended refactorings by others on my team who didn't know the black > art of using Infinispan correctly. > > Sanne > > > > On 7 October 2014 13:43, Radim Vansa <[email protected]> wrote: >> On 10/07/2014 02:21 PM, William Burns wrote: >>> On Tue, Oct 7, 2014 at 7:32 AM, Radim Vansa <[email protected]> wrote: >>>> If you have one local and one shared cache store, how should the command >>>> behave? >>>> >>>> a) distexec/MR sum of cache.withFlags(SKIP_REMOTE_LOOKUP, >>>> SKIP_BACKUP_ENTRIES).size() from all nodes? (note that there's no >>>> SKIP_BACKUP_ENTRIES flag right now), where this method returns >>>> localStore.size() for first non-shared cache store + passivation ? >>>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0) >>> Calling the size method in either distexec or MR will give you >>> inflated numbers as you need to pay attention to the numOwners to get >>> a proper count. >> That's what I meant by the SKIP_BACKUP_ENTRIES - dataContainer should be >> able to report only primary-owned entries, or we have to iterate and >> apply the filtering outside. >> >>>> b) distexec/MR sum of sharedStore.size() + passivation ? sum of >>>> dataContainer.size(SKIP_BACKUP_ENTRIES) : 0 >>> Calling the size on a shared cache actually should work somewhat well >>> (assuming all entries are stored in the shared cache). The problem is >>> if passivation is enabled as you point out because you also have to >>> check the data container which means you can also have an issue with >>> concurrent activations and passivations (which you can't verify >>> properly in either case without knowing the keys). >>> >>>> c) MR that would count the entries >>> This is the only reliable way to do this with MR. And unfortunately >>> if a rehash occurs I am not sure if you would get inconsistent numbers >>> or an Exception. In the latter at least you should be able to make >>> sure that you have the proper number when it does return without >>> exception. I can't say how it works with multiple loaders though, my >>> guess is that it may process the entry more than once so it depends on >>> if your mapper is smart enough to realize it. >> I don't think that reporting incorrect size is *that* harmful - even >> ConcurrentMap interface says that it's just a wild guess and when things >> are changing, you can't rely on that. >> >>>> d) wrapper on distributed entry iteration with converters set to return >>>> 0-sized entries >>> Entry iterator can't return 0 sized entries (just the values). The >>> keys are required to make sure that the count is correct and also to >>> ensure that if a rehash happens in the middle it can properly continue >>> to operate without having to start over. Entry iterator should work >>> properly irrespective of the number of stores/loaders that are >>> configured, since it keep track of already seen keys (so duplicates >>> are ignored). >> Ok, I was simplifying that a bit. And by the way, I don't really like >> the fact that for distributed entry iteration you need to be able to >> keep all keys from one segment at one moment in memory. But fine - >> distributed entry iteration is probably not the right way. >> >>> >>>> And what about nodes with different configuration? >>> Hard to know without knowing what the differences are. >> I had in my mind different loaders and passivation configuration (e.g. >> some node could use shared store and some don't - do we want to handle >> such obscure configs? Can we design that without the need to have >> complicated decision trees what to include and what not?). >> >> Radim >> >>>> Radim >>>> >>>> On 10/06/2014 01:57 PM, Sanne Grinovero wrote: >>>>> On 6 October 2014 12:44, Tristan Tarrant <[email protected]> wrote: >>>>>> I think we should provide correct implementations of size() (and others) >>>>>> and provide shortcut implementations using our usual Flag API (e.g. >>>>>> SKIP_REMOTE_LOOKUP). >>>>> Right that would be very nice. Same for CacheStore interaction: all >>>>> cachestores should be included unless skipped explicitly. >>>>> >>>>> Sanne >>>>> >>>>>> Tristan >>>>>> >>>>>> On 06/10/14 12:57, Sanne Grinovero wrote: >>>>>>> On 3 October 2014 18:38, Dennis Reed <[email protected]> wrote: >>>>>>>> Since size() is defined by the ConcurrentMap interface, it already has >>>>>>>> a >>>>>>>> precisely defined meaning. The only "correct" implementation is E. >>>>>>> +1 >>> This is one of the things I have been wanting to do is actually >>> implement the other Map methods across the entire cache. However to >>> do a lot of these in a memory conscious way they would need to be ran >>> ignoring any ongoing transactions. Actually having this requirement >>> allows these methods to be implemented quite easily especially in >>> conjunction with the EntryIterator. I almost made a PR for it a while >>> back, but it seemed a little zealous to do at the same time and it >>> didn't seem that people were pushing for it very hard (maybe that was >>> a wrong assumption). Also I wasn't quite sure the transactional part >>> not being functional anymore would be a deterrent. >>> >>>>>>>> The current non-correct implementation was just because it's expensive >>>>>>>> to calculate correctly. I'm not sure the current impl is really that >>>>>>>> useful for anything. >>>>>>> +1 >>>>>>> >>>>>>> And not just size() but many others from ConcurrentMap. >>>>>>> The question is if we should drop the interface and all the methods >>>>>>> which aren't efficiently implementable, or fix all those methods. >>>>>>> >>>>>>> In the past I loved that I could inject "Infinispan superpowers" into >>>>>>> an application making extensive use of Map and ConcurrentMap without >>>>>>> changes, but that has been deceiving and required great care such as >>>>>>> verifying that these features would not be used anywhere in the code. >>>>>>> I ended up wrapping the Cache implementation in a custom adapter which >>>>>>> would also implement ConcurrentMap but would throw a RuntimeException >>>>>>> if any of the "unallowed" methods was called, at least I would detect >>>>>>> violations safely. >>>>>>> >>>>>>> I still think that for the time being - until a better solution is >>>>>>> planned - we should throw exceptions.. alas that's an old conversation >>>>>>> and it was never done. >>>>>>> >>>>>>> Sanne >>>>>>> >>>>>>> >>>>>>>> -Dennis >>>>>>>> >>>>>>>> On 10/03/2014 03:30 AM, Radim Vansa wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> recently we had a discussion about what size() returns, but I've >>>>>>>>> realized there are more things that users would like to know. My >>>>>>>>> question is whether you think that they would really appreciate it, or >>>>>>>>> whether it's just my QA point of view where I sometimes compute the >>>>>>>>> 'checksums' of cache to see if I didn't lost anything. >>>>>>>>> >>>>>>>>> There are those sizes: >>>>>>>>> A) number of owned entries >>>>>>>>> B) number of entries stored locally in memory >>>>>>>>> C) number of entries stored in each local cache store >>>>>>>>> D) number of entries stored in each shared cache store >>>>>>>>> E) total number of entries in cache >>>>>>>>> >>>>>>>>> So far, we can get >>>>>>>>> B via withFlags(SKIP_CACHE_LOAD).size() >>>>>>>>> (passivation ? B : 0) + firstNonZero(C, D) via size() >>>>>>>>> E via distributed iterators / MR >>>>>>>>> A via data container iteration + distribution manager query, but only >>>>>>>>> without cache store >>>>>>>>> C or D through >>>>>>>>> getComponentRegistry().getLocalComponent(PersistenceManager.class).getStores() >>>>>>>>> >>>>>>>>> I think that it would go along with users' expectations if size() >>>>>>>>> returned E and for the rest we should have special methods on >>>>>>>>> AdvancedCache. That would of course change the meaning of size(), but >>>>>>>>> I'd say that finally to something that has firm meaning. >>>>>>>>> >>>>>>>>> WDYT? >>>>>>>>> >>>>>>>>> Radim >>>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> infinispan-dev mailing list >>>>>>>> [email protected] >>>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> _______________________________________________ >>>>>>> infinispan-dev mailing list >>>>>>> [email protected] >>>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>>>> >>>>>>> >>>>>> _______________________________________________ >>>>>> infinispan-dev mailing list >>>>>> [email protected] >>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>>> _______________________________________________ >>>>> infinispan-dev mailing list >>>>> [email protected] >>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>>> -- >>>> Radim Vansa <[email protected]> >>>> JBoss DataGrid QA >>>> >>>> _______________________________________________ >>>> infinispan-dev mailing list >>>> [email protected] >>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >>> _______________________________________________ >>> infinispan-dev mailing list >>> [email protected] >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> -- >> Radim Vansa <[email protected]> >> JBoss DataGrid QA >> >> _______________________________________________ >> infinispan-dev mailing list >> [email protected] >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > _______________________________________________ > infinispan-dev mailing list > [email protected] > https://lists.jboss.org/mailman/listinfo/infinispan-dev -- Radim Vansa <[email protected]> JBoss DataGrid QA _______________________________________________ infinispan-dev mailing list [email protected] https://lists.jboss.org/mailman/listinfo/infinispan-dev
