> In other words, would I have only one call to the AWS S3 origin even if
> many requests come in for the object at the same time?

Yes, you'll only have a single call to S3. Geode keeps track of in-progress
loads, and another request for the same key will wait for the in-progress
load to complete.
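To illustrate the idea (this is not Geode's actual code, just a plain-Java
sketch of the same request-coalescing behavior):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of "single in-progress load per key": concurrent requests for the
// same key share one origin fetch. ConcurrentHashMap.computeIfAbsent invokes
// the loader at most once per absent key; other callers block until it's done.
public class CoalescingLoader {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final AtomicInteger loaderCalls = new AtomicInteger();

    // Stands in for the expensive origin fetch (e.g. an S3 download).
    private byte[] fetchFromOrigin(String key) {
        loaderCalls.incrementAndGet();
        return ("payload-for-" + key).getBytes();
    }

    public byte[] get(String key) {
        // At most one fetch per key, even under concurrent requests.
        return cache.computeIfAbsent(key, this::fetchFromOrigin);
    }

    public int loaderCallCount() {
        return loaderCalls.get();
    }
}
```

With Geode, of course, you don't write this yourself; the CacheLoader you
registered is simply invoked once and the other requests wait on it.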
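And for the disk-space watcher idea mentioned earlier in the thread, a rough
sketch (plain Java; a LinkedHashMap in access order stands in for the region,
and where it removes a key you would call region.destroy(key) with Geode):

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of "watch free disk space and destroy LRU entries when it gets low".
// An access-ordered LinkedHashMap iterates least-recently-used entries first.
public class DiskSpaceEvictor {
    private final Map<String, byte[]> region =
            new LinkedHashMap<>(16, 0.75f, true); // true = access order (LRU first)
    private final long minFreeBytes;

    public DiskSpaceEvictor(long minFreeBytes) {
        this.minFreeBytes = minFreeBytes;
    }

    public void put(String key, byte[] value) { region.put(key, value); }
    public byte[] get(String key) { return region.get(key); }
    public int size() { return region.size(); }

    // Destroy LRU entries until free space is back above the threshold.
    // In a real watcher, freeBytes would come from File.getUsableSpace().
    public int evictIfLow(long freeBytes) {
        int destroyed = 0;
        Iterator<Map.Entry<String, byte[]>> it = region.entrySet().iterator();
        while (freeBytes < minFreeBytes && it.hasNext()) {
            Map.Entry<String, byte[]> lru = it.next();
            freeBytes += lru.getValue().length; // assume destroy frees the value's size
            it.remove();                        // with Geode: region.destroy(key)
            destroyed++;
        }
        return destroyed;
    }
}
```

The threshold and the assumption that a destroy frees roughly the value's
size on disk are both simplifications; compaction is what actually reclaims
the space.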
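If it helps, the kind of setup you described could be sketched in cache.xml
roughly like this (element names and values here are from memory and
illustrative, so double-check them against the docs for your Geode version):

```xml
<cache>
  <!-- Small oplogs and an aggressive compaction threshold, as discussed
       in this thread. compaction-threshold is the % of garbage an oplog
       must reach before it is compacted; max-oplog-size is in megabytes. -->
  <disk-store name="overflow-store"
              max-oplog-size="512"
              compaction-threshold="95"
              auto-compact="true">
    <disk-dirs>
      <disk-dir>/var/geode/overflow</disk-dir>
    </disk-dirs>
  </disk-store>

  <region name="files">
    <region-attributes refid="PARTITION" disk-store-name="overflow-store">
      <eviction-attributes>
        <!-- Overflow values to disk once the region holds ~256Mb in memory. -->
        <lru-memory-size maximum="256" action="overflow-to-disk"/>
      </eviction-attributes>
    </region-attributes>
  </region>
</cache>
```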

-Dan


On Wed, Apr 20, 2016 at 11:00 AM, Eugene Strokin <[email protected]>
wrote:

> Got it, thanks a lot. I guess my setup will be as follows:
> - OVERFLOW_TO_DISK
> - small oplog size (about 500Mb)
> - compaction-threshold = 95% (experiment with the number)
> - try to add a listener on the low-disk-space warning (if possible; if not,
> just check the disk size periodically) and find LRU objects to delete
> synchronously (concurrency-checks = false) to keep free disk space ~2Gb.
>
> This way I'll be able to use as much disk space as possible. I'm worried
> about the performance, but we'll see how it goes.
>
> Another quick question: I've created a CacheLoader which gets files from
> AWS S3 and provides them as byte[]. If the Geode cluster receives several
> requests for the same object from different nodes, will the CacheLoader
> block the other requests, download the file from S3 once, and distribute
> it to all clients? In other words, would I have only one call to the AWS
> S3 origin even if many requests for the object come in at the same time?
>
> Thanks a lot,
> Eugene
>
> On Wed, Apr 20, 2016 at 1:42 PM, Dan Smith <[email protected]> wrote:
>
>> > My cache could grow infinitely, so I need some mechanism to evict the
>> objects from overflow space as well, not just from memory.
>>
>> Unfortunately, I don't think there is a built-in way to evict entries
>> when your disk space starts to get low. For your eviction action, you
>> basically have a choice of whether to evict an entry from memory
>> (OVERFLOW_TO_DISK) or completely destroy the entry (LOCAL_DESTROY).
>>
>> I suppose you could have your own thread that watches the disk space and
>> starts issuing destroys if the disk space gets low.
>>
>> -Dan
>>
>> On Wed, Apr 20, 2016 at 10:12 AM, Darrel Schneider <[email protected]
>> > wrote:
>>
>>> Something to keep in mind is that when you have an LRU whose eviction
>>> action is overflow-to-disk, each eviction does not do a delete. After
>>> an overflow to disk, the region entry and its key are still in the JVM
>>> consuming memory; only the entry value is overflowed to disk.
>>>
>>> When you say "I need some mechanism to evict the objects from overflow
>>> space as well", are you saying that you no longer want that object in your
>>> region at all? The way to do that is an entry delete operation on the
>>> region. That will mark the value that overflowed to disk as deleted, and
>>> the entry and key will be removed from memory. (Actually, if
>>> concurrency-checks=true on your region, the delete operation does not
>>> immediately remove the entry and key. Instead it changes the value of the
>>> entry to a special value we call a TOMBSTONE. Eventually a background
>>> process will remove the entry and key of these tombstones.)
>>>
>>>
>>> On Wed, Apr 20, 2016 at 6:40 AM, Eugene Strokin <[email protected]>
>>> wrote:
>>>
>>>> Udo, thanks for the link. But my concern was not about memory but
>>>> disk space. My cache could grow infinitely, so I need some mechanism to
>>>> evict the objects from overflow space as well, not just from memory.
>>>> I couldn't find any pointers that Geode can do this out of the box,
>>>> or even a way to implement it myself.
>>>> If you do know something about this, please let me know. It looks like
>>>> Geode can do everything I need except this one thing.
>>>>
>>>> Thanks,
>>>> Eugene
>>>>
>>>> On Tue, Apr 19, 2016 at 9:47 PM, Udo Kohlmeyer <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi there Eugene,
>>>>>
>>>>> Please look at
>>>>> http://geode.docs.pivotal.io/docs/reference/topics/cache_xml.html#lru-memory-size.
>>>>> When configuring this eviction policy, you should be able to specify
>>>>> the amount of memory that the region holds before it overflows values
>>>>> to disk.
>>>>>
>>>>> I am at this stage uncertain whether this policy only takes the size of
>>>>> the value into account, or whether it includes the key as well. If so,
>>>>> this setting might cause the region to keep fewer and fewer values
>>>>> in memory as the number of entries in the region increases.
>>>>>
>>>>> --Udo
>>>>>
>>>>> On 20/04/2016 11:39 am, Eugene Strokin wrote:
>>>>>
>>>>> Dan, thanks for the response. Yes, you're right, 512Mb of course. My
>>>>> mistake.
>>>>> The idea is to use as much disk space as possible. I understand the
>>>>> downside of using a high compaction threshold. I'll play with that and
>>>>> see how bad it could be.
>>>>> But what about eviction? Would Geode remove objects from the overflow
>>>>> storage automatically once it reaches a certain size?
>>>>> Ideally, I'd like Geode to start kicking LRU objects out once the free
>>>>> disk space drops to 1Gb. Is that possible? If so, please point me in
>>>>> the right direction.
>>>>>
>>>>> Thanks again,
>>>>> Eugene
>>>>>
>>>>>
>>>>> On Tue, Apr 19, 2016 at 8:25 PM, Dan Smith <[email protected]> wrote:
>>>>>
>>>>>> I'm guessing you mean 512MB of RAM, not KB? Otherwise, you are
>>>>>> definitely going to have problems :)
>>>>>>
>>>>>> Regarding conserving disk space - I think only allowing 1GB of free
>>>>>> space is probably going to run into issues. I think you would be better
>>>>>> off having fewer droplets with more space, if that's possible. And only
>>>>>> leaving 5% of the disk for compaction and as a buffer against running
>>>>>> out of disk is probably not enough.
>>>>>>
>>>>>> By default, Geode will compact oplogs when they get to be 50%
>>>>>> garbage, which means you need maybe 2X the amount of actual disk space.
>>>>>> You can configure the compaction-threshold to something like 95%, but
>>>>>> that means Geode will be doing a lot of extra work to clean up garbage
>>>>>> on disk. Regardless, you'll probably want to tune the max-oplog-size
>>>>>> down to something much smaller than 1GB.
>>>>>>
>>>>>> -Dan
>>>>>>
>>>>>> On Tue, Apr 19, 2016 at 4:26 PM, Eugene Strokin <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello, I'm seriously considering using Geode as the core of a
>>>>>>> distributed file cache system, but I have a few questions.
>>>>>>> First, this is what needs to be done: a scalable file cache with an
>>>>>>> LRU eviction policy, utilizing the disk space as much as possible. The
>>>>>>> idea is to have around 50 small Droplets from DigitalOcean, which
>>>>>>> provide 512Kb RAM and 20Gb storage. The client should call the cluster
>>>>>>> and get a byte array by a key. If needed, the cluster should be
>>>>>>> expanded. The origin of the byte arrays is files in AWS S3.
>>>>>>> It looks like everything could be done using Geode, but:
>>>>>>> - it looks like compaction requires a lot of free hard drive space.
>>>>>>> All I can allow is about 1Gb. Would this work in my case? How could
>>>>>>> it be done?
>>>>>>> - Would the objects be evicted automatically from overflow storage
>>>>>>> using an LRU policy?
>>>>>>>
>>>>>>> Thanks in advance for your answers, ideas, suggestions.
>>>>>>> Eugene
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
