Got it, thanks a lot. I guess my setup will be as follows:
- OVERFLOW_TO_DISK
- Small oplog size (about 500 MB)
- compaction-threshold = 95% (I'll experiment with the number)
- Try to add a listener for a low-disk-space warning (if possible; if not, just check the disk space periodically), find the LRU objects, and delete them synchronously (concurrency-checks = false) to keep about 2 GB of disk space free.
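A minimal sketch of the watcher from the last bullet, under stated assumptions. DiskSpaceGuard, touch and evictIfLow are hypothetical names, not Geode APIs. The free-space check and the destroy call are passed in as lambdas, so in a real setup they could be something like new File(dir).getUsableSpace() and region.destroy(key). Since a destroy only marks the value on disk as deleted and the space comes back after compaction, each pass is capped at a fixed number of destroys rather than looping until space appears.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of Geode: remembers key access order and
// destroys the least recently used keys while free disk space is too low.
class DiskSpaceGuard<K> {
    // Access-ordered map: iteration starts at the least recently touched key.
    private final LinkedHashMap<K, Boolean> accessOrder =
            new LinkedHashMap<>(16, 0.75f, true);
    private final LongSupplier freeBytes;  // assumption: e.g. () -> new File(dir).getUsableSpace()
    private final Consumer<K> destroyer;   // assumption: e.g. key -> region.destroy(key)
    private final long minFreeBytes;       // e.g. 2 GB
    private final int maxPerPass;          // cap, because space is reclaimed only after compaction

    DiskSpaceGuard(LongSupplier freeBytes, Consumer<K> destroyer,
                   long minFreeBytes, int maxPerPass) {
        this.freeBytes = freeBytes;
        this.destroyer = destroyer;
        this.minFreeBytes = minFreeBytes;
        this.maxPerPass = maxPerPass;
    }

    // Call on every read or write of a key to mark it as recently used.
    synchronized void touch(K key) {
        accessOrder.put(key, Boolean.TRUE);
    }

    // One watcher pass: destroys LRU keys until free space is back above the
    // threshold, the per-pass cap is hit, or no tracked keys remain.
    synchronized List<K> evictIfLow() {
        List<K> evicted = new ArrayList<>();
        Iterator<K> it = accessOrder.keySet().iterator();
        while (evicted.size() < maxPerPass
                && freeBytes.getAsLong() < minFreeBytes
                && it.hasNext()) {
            K key = it.next();
            it.remove();
            destroyer.accept(key);
            evicted.add(key);
        }
        return evicted;
    }
}
```

evictIfLow() could be called from a ScheduledExecutorService every few seconds. The per-pass cap is a guess and would need tuning together with compaction-threshold.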
This way I'll be able to use as much disk space as possible. I'm worried about performance, but I'll see how it goes.

Another quick question: I've created a CacheLoader which fetches files from AWS S3 and returns them as byte[]. If the Geode cluster receives several requests for the same object from different nodes, would the CacheLoader block the concurrent requests, download the file from S3 once, and distribute it to all clients? In other words, would there be only one call to the AWS S3 origin even if many requests for the object arrive at the same time?

Thanks a lot,
Eugene

On Wed, Apr 20, 2016 at 1:42 PM, Dan Smith <[email protected]> wrote:

> My cache could grow infinitely, so I need some mechanism to evict the
> objects from overflow space as well, not just from memory.
>
> Unfortunately, I don't think there is a built-in way to evict entries when
> your disk space is starting to get low. For your eviction action, you
> basically have a choice of whether to evict an entry from memory
> (OVERFLOW_TO_DISK) or completely destroy the entry (LOCAL_DESTROY).
>
> I suppose you could have your own thread that watches the disk space and
> starts issuing destroys if the disk space gets low.
>
> -Dan
>
> On Wed, Apr 20, 2016 at 10:12 AM, Darrel Schneider <[email protected]>
> wrote:
>
>> Something to keep in mind is that when you have an LRU whose eviction
>> action is overflow to disk, an eviction does not do a delete. After
>> an overflow to disk, the region entry and its key are still in the JVM,
>> consuming memory; only the entry value has overflowed to disk.
>>
>> When you say "I need some mechanism to evict the objects from overflow
>> space as well", are you saying that you no longer want that object in your
>> region at all? The way to do that is to do an entry delete operation on the
>> region. That will mark the value that overflowed to disk as being deleted,
>> and the entry and key will be removed from memory.
>> (Actually, if concurrency-checks=true on your region, then the delete
>> operation does not immediately remove the entry and key. Instead it changes
>> the value of the region entry to a special value we call a TOMBSTONE.
>> Eventually a background process will remove the entry and key of these
>> tombstones.)
>>
>>
>> On Wed, Apr 20, 2016 at 6:40 AM, Eugene Strokin <[email protected]>
>> wrote:
>>
>>> Udo, thanks for the link. But my concern was not about memory but
>>> disk space. My cache could grow infinitely, so I need some mechanism to
>>> evict the objects from overflow space as well, not just from memory.
>>> I couldn't find any pointers that Geode could do this out of the box, or
>>> even a way to implement this myself.
>>> If you do know something about this, please let me know. It looks like
>>> Geode could do everything I need except this one thing.
>>>
>>> Thanks,
>>> Eugene
>>>
>>> On Tue, Apr 19, 2016 at 9:47 PM, Udo Kohlmeyer <[email protected]>
>>> wrote:
>>>
>>>> Hi there Eugene,
>>>>
>>>> Please look at
>>>> http://geode.docs.pivotal.io/docs/reference/topics/cache_xml.html#lru-memory-size
>>>> .
>>>> When configuring this eviction policy, you should be able to specify
>>>> the amount of memory that this region holds in memory before it overflows
>>>> the value.
>>>>
>>>> I am at this stage uncertain whether this policy only takes the size of
>>>> the value into account, or whether it is inclusive of the key as well.
>>>> If so, this setting might cause the region to keep fewer and fewer values
>>>> in memory as the number of entries in the region increases.
>>>>
>>>> --Udo
>>>>
>>>> On 20/04/2016 11:39 am, Eugene Strokin wrote:
>>>>
>>>> Dan, thanks for the response. Yes, you're right, 512 MB of course. My
>>>> mistake.
>>>> The idea is to use as much disk space as possible. I understand the
>>>> downside of using a high compaction threshold. I'll play with that and see
>>>> how bad it could be.
>>>> But what about eviction?
>>>> Would Geode remove objects from the overflow automatically once it
>>>> reaches a certain size?
>>>> Ideally, I'd like to configure Geode to start kicking LRU objects out
>>>> once the free disk space drops to 1 GB. Is that possible? If so, please
>>>> point me in the right direction.
>>>>
>>>> Thanks again,
>>>> Eugene
>>>>
>>>>
>>>> On Tue, Apr 19, 2016 at 8:25 PM, Dan Smith <[email protected]> wrote:
>>>>
>>>>> I'm guessing you mean 512MB of RAM, not KB? Otherwise, you are
>>>>> definitely going to have problems :)
>>>>>
>>>>> Regarding conserving disk space - I think only allowing for 1 GB of free
>>>>> space is probably going to run into issues. I think you would be better
>>>>> off having fewer droplets with more space if that's possible. And only
>>>>> leaving 5% disk space for compaction and as a buffer to avoid running out
>>>>> of disk is probably not enough.
>>>>>
>>>>> By default, geode will compact oplogs when they get to be 50% garbage,
>>>>> which means needing maybe 2X the amount of actual disk space. You can
>>>>> configure the compaction-threshold to something like 95%, but that means
>>>>> geode will be doing a lot of extra work cleaning up garbage on disk.
>>>>> Regardless, you'll probably want to tune down the max-oplog-size to
>>>>> something much smaller than 1GB.
>>>>>
>>>>> -Dan
>>>>>
>>>>> On Tue, Apr 19, 2016 at 4:26 PM, Eugene Strokin <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hello, I'm seriously considering using Geode as the core of a
>>>>>> distributed file cache system. But I have a few questions.
>>>>>> But first, this is what needs to be done: a scalable file system with
>>>>>> an LRU eviction policy, utilizing the disk space as much as possible.
>>>>>> The idea is to have around 50 small Droplets from DigitalOcean, which
>>>>>> provide 512Kb RAM and 20Gb storage each. The client should call the
>>>>>> cluster and get a byte array by a key. If needed, the cluster should be
>>>>>> expanded.
>>>>>> The byte arrays originate from files in AWS S3.
>>>>>> It looks like everything could be done using Geode, but:
>>>>>> - It looks like compaction requires a lot of free hard drive
>>>>>> space. All I can allow is about 1Gb. Would this work in my case? How
>>>>>> could it be done?
>>>>>> - Would the objects be evicted automatically from overflow storage
>>>>>> using an LRU policy?
>>>>>>
>>>>>> Thanks in advance for your answers, ideas, suggestions.
>>>>>> Eugene
