Here is a quick summary of our current implementation to start the
conversation on requirements:
- Disk store remains as is; all other persistent stores become pluggable
- Overflow is not changed; it always uses the disk store
- A CacheStoreManager is created to manage persistence stores
- Cache stores implement a defined interface
- The cache store interface includes methods for getting details on a cache
store and for statistics
- Define categories of storage types (e.g., distributed FS)
- The cache store interface supports a standard configuration interface for a
given store type
- Cache stores are added in configuration via the classpath and loaded by the
storage manager
- The storage manager exposes JMX interfaces such that gfsh and dashboards
can: list the storage types available; list the configuration parameters for a
given store; and list details and statistics for a given store.
Overflow is an open question: since it is a disk cache rather than a
persistent store, we decided to leave it as is, but this is still up for
discussion.
Here is a short sequence illustrating the model in code:
------
CacheFactory cacheFactory = new CacheFactory();
Cache cache = cacheFactory.create();

// get the CacheStoreManager and create a persistent store
CacheStoreManager cacheStoreManager = cache.getCacheStoreManager();
CacheStore cacheStore = cacheStoreManager.createCacheStore(
    CacheStoreType.HBASE, new HBaseStoreConfig());

// create a region and attach the newly created cache store to the region
RegionFactory regionFactory =
    cache.createRegionFactory(RegionShortcut.PARTITION);
regionFactory.setCacheStore(cacheStore);
Region region = regionFactory.create("test");
------
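For the JMX side mentioned in the bullets, the manager's management surface might look something like this. Again only a sketch; the MXBean name and its operations are assumptions, not a proposed final API:

```java
import java.util.List;
import java.util.Map;

// Hypothetical MXBean for the proposed CacheStoreManager (assumed names).
// gfsh and dashboards would use these operations to discover and inspect
// the pluggable stores loaded from the classpath.
interface CacheStoreManagerMXBean {
    // List the storage types available (e.g. "HBASE", "HDFS").
    List<String> listStorageTypes();

    // List the configuration parameter names for a given storage type.
    List<String> listConfigurationParameters(String storageType);

    // Details and statistics for a given store instance, by name.
    Map<String, String> showStoreDetails(String storeName);
}
```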
I’ll let the developers chime in to provide more detail :)
Bob
On 3/20/16, 5:23 PM, "Udo Kohlmeyer" <[email protected]> wrote:
>+1 Bob. I think making persistence a pluggable module would be awesome.
>
>I think persistence should be configured on a disk store level. Which
>raises a few more questions like, how would HDFS storage work for
>overflow vs traditional persistence? Given its large block sizes, HDFS
>becomes less useful with smaller commits and smaller reads, which could
>undoubtedly impact OLTP.
>
>--Udo
>
>On 19/03/2016 11:41 am, Dan Smith wrote:
>> Hi Bob,
>>
>> Pluggable persistence sounds like a great feature! Help cleaning up this
>> HDFS feature to be more pluggable is also most welcome :)
>>
>> I'm interested to hear more about what your ideas are for pluggable
>> persistence - such as whether you're thinking about swapping out the
>> persistence at an individual node level (redundancy is still managed by
>> geode) or at the cluster level (like this HDFS layer, where redundancy of
>> the persistent data is managed elsewhere). I'd love to see your proposal!
>>
>> -Dan
>>
>> On Fri, Mar 18, 2016 at 4:24 PM, robert geiger <[email protected]> wrote:
>>
>>> +1
>>>
>>> Also would like to see the storage layer move to a pluggable model for
>>> stores other than disk (we would like to contribute this). Would be willing
>>> to take on the work of turning HDFS into a separate pluggable module as part
>>> of this effort. If responses are positive, I will open a Jira to capture the
>>> pluggable store proposal.
>>>
>>> Bob
>>>
>>>
>>>
>>> On 3/18/16, 4:15 PM, "William Markito" <[email protected]> wrote:
>>>
>>>> +1 to move it to "HDFS feature branch". I'd rather have an eviction
>>>> class (or classes) specific to HDFS.
>>>>
>>>> On Fri, Mar 18, 2016 at 4:03 PM, Dan Smith <[email protected]> wrote:
>>>>
>>>>> While looking to the HDFS related changes, I noticed that a new custom
>>>>> eviction feature was also added related to those changes. Unlike the
>>>>> already existing CustomExpiry which returns an expiration time for a
>>> single
>>>>> key, this takes an EvictionCriteria that is polled periodically and
>>>>> returns a list of keys to evict.
>>>>>
>>>>> I noticed we currently have no tests for this so I'm not sure if it
>>>>> actually works or not. Is this something we actually want in geode or
>>>>> should it get removed? My inclination is to move it to the HDFS branch,
>>>>> assuming we create one, since it came in with that functionality, and
>>>>> then not merge it back to develop until there are tests associated with
>>>>> it.
>>>>>
>>>>> -Dan
>>>>>
>>>>> /**
>>>>>  * Set custom {@link EvictionCriteria} for the region with start time and
>>>>>  * interval of evictor task to be run in milliseconds, or evict incoming
>>>>>  * rows in case both start and frequency are specified as zero.
>>>>>  *
>>>>>  * @param criteria
>>>>>  *          an {@link EvictionCriteria} to be used for eviction for HDFS
>>>>>  *          persistent regions
>>>>>  * @param start
>>>>>  *          the start time at which the periodic evictor task should be
>>>>>  *          first fired to apply the provided {@link EvictionCriteria};
>>>>>  *          if this is zero then the current time is used for the first
>>>>>  *          invocation of the evictor
>>>>>  * @param interval
>>>>>  *          the periodic frequency at which to run the evictor task after
>>>>>  *          the initial start; if both start and frequency are zero then
>>>>>  *          {@link EvictionCriteria} is applied on incoming insert/update
>>>>>  *          to determine whether it is to be retained
>>>>>  */
>>>>> public RegionFactory<K, V> setCustomEvictionAttributes(
>>>>>     EvictionCriteria<K, V> criteria, long start, long interval) {
>>>>>
>>>>> /**
>>>>>  * Interface implemented by an EVICTION BY CRITERIA of
>>>>>  * {@link CustomEvictionAttributes}. This will be invoked by the periodic
>>>>>  * evictor task that will get the keys to be evicted using this and then
>>>>>  * destroy them from the region to which this is attached.
>>>>>  *
>>>>>  * @author swale
>>>>>  * @since gfxd 1.0
>>>>>  */
>>>>> public interface EvictionCriteria<K, V> {
>>>>>
>>>>>   /**
>>>>>    * Get the (key, routing object) of the entries to be evicted from the
>>>>>    * region satisfying EVICTION BY CRITERIA at this point of time.
>>>>>    * <p>
>>>>>    * The Map.Entry object returned by the Iterator may be reused
>>>>>    * internally so the caller must extract the key and routing object
>>>>>    * from the entry on each iteration.
>>>>>    */
>>>>>   Iterator<Map.Entry<K, Object>> getKeysToBeEvicted(long currentMillis,
>>>>>       Region<K, V> region);
>>>>>
>>>>>   /**
>>>>>    * Last-moment check whether an entry should be evicted, applying the
>>>>>    * EVICTION BY CRITERIA again under the region entry lock in case the
>>>>>    * entry has changed after the check in {@link #getKeysToBeEvicted}.
>>>>>    */
>>>>>   boolean doEvict(EntryEvent<K, V> event);
>>>>>
>>>>>   /**
>>>>>    * Return true if this eviction criteria is equivalent to the other
>>>>>    * one. This is used to ensure that custom eviction is configured
>>>>>    * identically on all the nodes of a cluster hosting the region to
>>>>>    * which this eviction criteria has been attached.
>>>>>    */
>>>>>   boolean isEquivalent(EvictionCriteria<K, V> other);
>>>>> }
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> ~/William
>