The spill/unspill option could be pretty tricky to implement, since you would have to do a lot of fancy logic during the transition period. I think Jason's suggestion of making it configurable might make more sense.

-Dan

On Wed, Feb 15, 2017 at 1:12 PM, Jason Huynh <jhu...@pivotal.io> wrote:

> With the suggestion from Wes, the constraint on the names would have to
> apply for both small and large. We wouldn't want the thing to explode when
> it gets converted...
>
> Is there a way to just make it configurable? If they know they want a
> "large" set, somehow let them specify it. Otherwise go with the "small"
> set?
>
> On Wed, Feb 15, 2017 at 1:01 PM Real Wes <thereal...@outlook.com> wrote:
>
> > Thinking about this, I think that the “spill”/ “unspill” option may
> > actually be the best solution. If the criteria waffles back and forth
> > along the threshold, well, that’s the acceptable worst case.
> >
> > How’s this?:
> >
> > 1) Create a separate region for the collection key
> >    - for fat collections that are updated frequently
> >    ADVANTAGE: speed of replication
> >    DISADVANTAGE: constraint on key name
> >
> > 2) Put the collection as an entry value:
> >    - for small collections and read-only fat collections
> >    ADVANTAGE: no need to create a separate region
> >
> > We would track the metrics and automatically convert based on a
> > combination of frequency of updates and size.
> >
> > We next define what a fat collection is, such as over nnMB.
> >
> >
> > On Feb 14, 2017, at 8:12 PM, Jason Huynh <jhu...@pivotal.io> wrote:
> >
> > The concern about the threshold to spill over would be do you "unspill"
> > over? Like what if the collection contracts under the threshold and
> > teeters around the threshold. If the user can configure this size, then
> > wouldn't they just know they want a "large" vs a "small?"
> >
> > I think Swapnil makes a good point that our value add would be that we can
> > scale those structures, whereas redis can already do what the "new"
> > implementation is doing.
> >
> >
> > On Tue, Feb 14, 2017 at 4:59 PM Galen M O'Sullivan <gosulli...@pivotal.io> wrote:
> >
> > If we put them in separate regions, we'll have the overhead of looking up
> > in two regions added to each and every operation, and the overhead of
> > creating all these regions.
> >
> > If we really wanted to we could have some threshold at which we spill
> > collections over into their own regions, and have something like the best
> > of both worlds. It's more complex, though, and I don't know how many people
> > actually use truly huge collections.
> >
> > On Tue, Feb 14, 2017 at 4:21 PM, Hitesh Khamesra <hitesh...@yahoo.com.invalid> wrote:
> >
> > > Jason/Dan: Sorry to hear about that. But both of you have asked the right
> > > question.
> > > it depends on your use-case(item 2,3,4,5) . For example "hashes" can be
> > > used to define key-value pair or java bean. In this case probably it is
> > > better to keep that hash at region-entry level. But if you want to know
> > > top 10 tweets which are trending then probably you want to use
> > > partition-region for "sorted-set".
> > >
> > >
> > > From: Jason Huynh <jhu...@pivotal.io>
> > > To: dev@geode.apache.org; "u...@geode.apache.org" <u...@geode.apache.org>;
> > > Hitesh Khamesra <hitesh...@yahoo.com>
> > > Sent: Tuesday, February 14, 2017 3:15 PM
> > > Subject: Re: GeodeRedisAdapter improvments/feedback
> > >
> > > Hi Hitesh,
> > >
> > > Not sure about everyone else, but I had a hard time reading this, however
> > > I think I figured out what you were describing... the only part I still am
> > > unsure about is Feedback/vote: both behaviour is desirable. Do you mean
> > > you want feedback and voting on whether both behaviors are desired? As in
> > > old implementation and new implementation?
> > >
> > > 2,3,4) The new implementation would mean all the data for a specific data
> > > structure is contained in a single bucket. So the individual data
> > > structures are not quite scalable. How would you allow scaling of a single
> > > data structure?
> > >
> > > On Tue, Feb 14, 2017 at 3:05 PM Real Wes <thereal...@outlook.com> wrote:
> > >
> > > > In what format do you want the feedback Hitesh? For now I’ll just comment:
> > > >
> > > > 1. Redis Type String
> > > > No comments except that a future Geode value-add would be to extend the
> > > > Jedis client so that the K/V’s are not compressed. In this way OQL and CQ
> > > > will work. The tradeoff of this is that the data cannot be read by a
> > > > native redis client but for Geode users it’s great. Call the new client
> > > > Geodis.
> > > >
> > > > 2. List/ Hash/ Set/ SortedSet
> > > > Creating a separate region for each creates a constraint that the keys are
> > > > limited to the characters for region names, which are A-z/0-9/ - and _.
> > > > Everything else is out. Redis users might start asking questions why their
> > > > list named ++^^/## throws an error. Your suggestion to make it a key rather
> > > > than a region solves this. Furthermore, creating a new region every time a
> > > > new Redis collection is created is going to be slow. I’m not sure why a
> > > > region was created but I’m sure it made sense to the developer at the time.
> > > >
> > > > 7. Default Config
> > > > Can’t we configure a gfsh option to default to the region types we want?
> > > > Customer A will want PARTITION but Customer B will want
> > > > PARTITION_REDUNDANT_EXPIRATION_PERSISTENT. I wonder if we can consider a
> > > > geode> create region --redisType=PARTITION_REDUNDANT_EXPIRATION_PERSISTENT
> > > > that makes _all_ Redis regions of that type?
> > > >
> > > >
> > > > On Feb 14, 2017, at 5:36 PM, Hitesh Khamesra <hitesh...@yahoo.com> wrote:
> > > >
> > > > Current GeodeRedisAdapter implementation is based on
> > > > https://cwiki.apache.org/confluence/display/GEODE/Geode+Redis+Adapter+Proposal .
> > > > We are looking for some feedback on Redis commands and their mapping to
> > > > geode region.
> > > >
> > > > 1. Redis Type String
> > > >   a. Usage Set k1 v1
> > > >   b. Current implementation creates "STRING_REGION" geode-partition-region
> > > >      upfront
> > > >   c. This k1/v1 are geode-region key/value
> > > >   d. Any feedback?
> > > >
> > > > 2. List Type
> > > >   a. usage "rpush mylist A"
> > > >   b. Current implementation maps each list to geode-partition-region(i.e.
> > > >      mylist is geode-partition-region); with the ability to get item from
> > > >      head/tail
> > > >   c. Feedback/vote
> > > >      -- List type operation at region-entry level;
> > > >      -- region-key = "mylist"
> > > >      -- region-value = Arraylist (will support all redis list ops)
> > > >   d. Feedback/vote: both behavior is desirable
> > > >
> > > > 3. Hashes
> > > >   a. this represents field-value or java bean object
> > > >   b. usage "hmset user1000 username antirez birthyear 1977 verified 1"
> > > >   c. Current implementation maps each hashes to
> > > >      geode-partition-region(i.e. user1000 is geode-partition-region)
> > > >   d. Feedback/vote
> > > >      -- Should we map hashes to region-entry
> > > >      -- region-key = user1000
> > > >      -- region-value = map
> > > >      -- This will provide java bean sort to behaviour with 10s of
> > > >         field-value
> > > >      -- Personally I would prefer this..
> > > >   e. Feedback/vote: both behaviour is desirable
> > > >
> > > > 4. Sets
> > > >   a. This represents unique keys in set
> > > >   b. usage "sadd myset 1 2 3"
> > > >   c. Current implementation maps each sadd to geode-partition-region(i.e.
> > > >      myset is geode-partition-region)
> > > >   d. Feedback/vote
> > > >      -- Should we map set to region-entry
> > > >      -- region-key = myset
> > > >      -- region-value = Hashset
> > > >   e. Feedback/vote: both behaviour is desirable
> > > >
> > > > 5. SortedSets
> > > >   a. This represents unique keys in set with score (usecase Query top-10)
> > > >   b. usage "zadd hackers 1940 "Alan Kay""
> > > >   c. Current implementation maps each zadd to geode-partition-region(i.e.
> > > >      hackers is geode-partition-region)
> > > >   d. Feedback/vote
> > > >      -- Should we map set to region-entry
> > > >      -- region-key = hackers
> > > >      -- region-value = Sorted Hashset
> > > >   e. Feedback/vote: both behaviour is desirable
> > > >
> > > > 6. HyperLogLogs
> > > >   a. A HyperLogLog is a probabilistic data structure used in order to
> > > >      count unique things (technically this is referred to estimating the
> > > >      cardinality of a set).
> > > >   b. usage "pfadd hll a b c d"
> > > >   c. Current implementation creates "HLL_REGION" geode-partition-region
> > > >      upfront
> > > >   d. hll becomes region-key and value is HLL object
> > > >   e. any feedback?
> > > >
> > > > 7. Default config for geode-region (vote)
> > > >   a. partition region
> > > >   b. 1 redundant copy
> > > >   c. Persistence
> > > >   d. Eviction
> > > >   e. Expiration
> > > >   f. ?
> > > >
> > > > 8. It seems; redis knows type(list, hashes, string ,set ..) of each key.
> > > > Thus for each operation we need to make sure type of key. In current
> > > > implementation we have different region for each redis type. Thus we have
> > > > another region(metaTypeRegion) which keeps type for each key. This makes
> > > > any operation in geode slow as it needs to verify that type. For instance,
> > > > creating new key need to make sure its already there or not. Whether we
> > > > should allow type change or not.
> > > >   a. Feedback/vote
> > > >      -- type change of key
> > > >      -- Can we allow two key with same name but two different type (as it
> > > >         will end up in two different geode-region)
> > > >         String type "key1" in string region
> > > >         HLL type "key1" in HLL region
> > > >   b. any other feedback
> > > >
> > > > 9. Transactions:
> > > >   a. we will not support transaction in redisAdapter as geode transaction
> > > >      are limited to single node.
> > > >   b. feedback?
> > > >
> > > > 10. Redis COMMAND (https://redis.io/commands/command)
> > > >   a. should we implement this "COMMAND" ?
> > > >
> > > > 11. Any other redis command we should consider?
> > > >
> > > >
> > > > Thanks.
> > > > Hitesh
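
For reference, the two list mappings debated in this thread, and the "just make it configurable" option, can be sketched roughly as below. This is a toy model in plain Python, not Geode or GeodeRedisAdapter code: `CollectionStore`, its methods, the `"small"`/`"large"` mode names, and the `REGION_NAME` pattern are all hypothetical, and the pattern is only an approximation of the region-name rule described above (letters, digits, `-` and `_`). The point it illustrates is that with an explicit mode, only collections that get their own region need a region-safe name.

```python
import re

# Assumption: approximates the region-name constraint mentioned in the thread
# (letters, digits, '-' and '_'). The real Geode rule may differ.
REGION_NAME = re.compile(r"[A-Za-z0-9_-]+")


class CollectionStore:
    """Toy model of the two list mappings discussed in the thread.

    "small": the whole list is stored as one entry value in a shared region.
    "large": the list gets its own region, with one entry per element.
    """

    def __init__(self, default_mode="small"):
        self.default_mode = default_mode
        self.shared_region = {}  # key -> list (entry-value mapping)
        self.regions = {}        # region name -> {index: element}
        self.modes = {}          # key -> "small" or "large"

    def create_list(self, key, mode=None):
        mode = mode or self.default_mode
        if mode not in ("small", "large"):
            raise ValueError(f"unknown mode {mode!r}")
        if key in self.modes:
            raise KeyError(f"{key!r} already exists")
        # Only the region-per-collection mapping constrains the key name.
        if mode == "large" and not REGION_NAME.fullmatch(key):
            raise ValueError(f"{key!r} is not usable as a region name")
        if mode == "small":
            self.shared_region[key] = []
        else:
            self.regions[key] = {}
        self.modes[key] = mode

    def rpush(self, key, value):
        """Append a value and return the new length, like Redis RPUSH."""
        if key not in self.modes:
            self.create_list(key)
        if self.modes[key] == "small":
            self.shared_region[key].append(value)
            return len(self.shared_region[key])
        region = self.regions[key]
        region[len(region)] = value
        return len(region)

    def lrange_all(self, key):
        """Return every element in insertion order."""
        if self.modes[key] == "small":
            return list(self.shared_region[key])
        region = self.regions[key]
        return [region[i] for i in range(len(region))]


store = CollectionStore()
store.rpush("++^^/##", "A")                 # fine: stored as an entry value
store.create_list("mylist", mode="large")   # explicitly configured as large
store.rpush("mylist", "A")
store.rpush("mylist", "B")
```

An automatic spill/unspill scheme would have to apply the name check to every key up front, since any "small" collection might later be converted; that is the constraint Jason points out above.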