Thanks a lot!
Just want to make sure I understood you correctly: if I set the compaction
threshold to 5% and the garbage is only 3% of the data, forcing a compaction
will not remove that 3%? So forcing a compaction only asks Geode to check the
threshold right now, nothing more?
I've studied the docs and ended up with this configuration:
DiskStore diskStore = cache.createDiskStoreFactory()
    .setMaxOplogSize(512)
    .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") },
                         new int[] { 18000 })
    .setAllowForceCompaction(true)
    .setCompactionThreshold(5)
    .create("-ccio-store");
I know it looks dangerous, but in my case the cache constantly grows: no
updates, no deletes, just writes and many reads. So auto-compaction should
never happen until my custom Disk Space Checker detects that free disk space
is below 1 GB; at that point it kicks off a scan of the local view of the
region, finds the LRU records, and deletes them. The oplogs would then only
grow, consuming even more space, but I'm forcing a compaction right away.
With the settings above I'm planning to utilize as much disk space as possible.
I'll be testing on our staging environment and tweaking the numbers; we'll see
how it goes.
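For reference, here is a rough sketch of the checker I have in mind. The class
and method names, the 1 GB limit, and the LRU scan are my own placeholders; the
Geode calls are the ones we've been discussing in this thread:

import java.io.File;
import com.gemstone.gemfire.cache.Cache;
import com.gemstone.gemfire.cache.DiskStore;
import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.cache.partition.PartitionRegionHelper;

public class DiskSpaceChecker {

    private static final long MIN_FREE_BYTES = 1024L * 1024L * 1024L; // ~1 GB

    // Runs periodically on each server.
    public void check(Cache cache, Region<String, byte[]> region) {
        File storeDir = new File("/opt/ccio/geode/store");
        if (storeDir.getUsableSpace() >= MIN_FREE_BYTES) {
            return; // enough free space, nothing to do
        }
        // Only consider the entries this node is primary for.
        Region<String, byte[]> localView = PartitionRegionHelper.getLocalPrimaryData(region);
        String lruKey = findLeastRecentlyUsedKey(localView); // placeholder for my LRU scan
        if (lruKey != null) {
            region.destroy(lruKey);
        }
        // Force a compaction right away so the destroyed records can be reclaimed on disk.
        DiskStore diskStore = cache.findDiskStore("-ccio-store");
        diskStore.forceCompaction();
    }

    private String findLeastRecentlyUsedKey(Region<String, byte[]> localView) {
        // placeholder: scan localView and return the least recently used key
        return null;
    }
}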
Thanks again! Geode looks very promising! I tried several solutions before I
ended up with Geode. All of them had problems I couldn't get around, and it
looks like Geode is the one I'll be married to :-))
Eugene
On Mon, May 2, 2016 at 6:52 PM, Barry Oglesby <[email protected]> wrote:
Answers / comments below.
Thanks,
Barry Oglesby
On Mon, May 2, 2016 at 8:58 AM, Eugene Strokin <[email protected]> wrote:
Barry, I've tried your code.
It looks like the function call actually waits until all the nodes
have completed the function, which I don't really need, but it was
fun to watch how everything works in the cluster.
Yes, the way that function is currently implemented causes it to
wait for all results.
Functions can either return a result or not. If they return a
result, then the caller will wait for that result from all members
processing that function.
You can change the function to not return a result
(fire-and-forget), as in the sketch after this list, by:
- changing hasResult to return false
- not returning a result from the execute method (remove
context.getResultSender().lastResult(true))
- not expecting a result in the client (remove collector.getResult())
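A minimal sketch of that fire-and-forget shape (sketch only, not the attached
file; isHA is also set to false, since a function that returns no result can't
be retried as an HA function):

import com.gemstone.gemfire.cache.execute.FunctionAdapter;
import com.gemstone.gemfire.cache.execute.FunctionContext;

public class CheckLastAccessedTimeFunction extends FunctionAdapter {

    public void execute(FunctionContext context) {
        // ... the same iterate-and-destroy logic, just without sending a result ...
        // note: no context.getResultSender().lastResult(true) here
    }

    public boolean hasResult() {
        return false; // the caller does not wait for results
    }

    public boolean isHA() {
        return false; // required when hasResult() is false
    }

    public boolean optimizeForWrite() {
        return true; // still execute on the primary buckets
    }

    public String getId() {
        return getClass().getSimpleName();
    }
}

On the client you would then just call
execution.execute("CheckLastAccessedTimeFunction") and drop the
collector.getResult() call.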
Even though I didn't use the function call, everything else
works just fine.
I was able to iterate through the local cache and find the LRU
entity.
Because I had to finish the whole loop before actually
destroying the item from the cache, I used:
region.destroy(toBeDeleted);
"region" is the region I've created using cache, not the
region I used for iterating the data:
Region<String, byte[]> localView =
PartitionRegionHelper.getLocalPrimaryData(region);
"localView" contains the local data which I actually iterate
through. I've tried to do:
localView.destroy(toBeDeleted);
But it didn't work for some reason. "region.destroy" works,
but I'm not sure this is the right way to do this. If not,
please let me know.
PartitionRegionHelper.getLocalPrimaryData(region) just returns a
LocalDataSet that wraps the local primary buckets. Most operations
on it (including destroy) are delegated to the underlying
partitioned region.
So, invoking destroy on either region should work. What exception
are you seeing with localView.destroy(toBeDeleted)?
The main problem is that even though I'm destroying some data from
the cache, I don't see the available hard drive space getting any
bigger, even when I force a compaction every time I destroy an item.
I've destroyed about 300 items, and no free disk space was gained.
I'm guessing that if I delete enough items from the cache it will
actually free up some space on disk. But what is this magical
number? Is it the size of a bucket or something else?
Whenever a new cache operation occurs, a record is added to the
end of the current oplog for that operation. Any previous
record(s) for that entry are no longer valid, but they still exist
in the oplogs. For example, a create followed by a destroy will
cause the oplog to contain 2 records for that entry.
The invalid records aren't removed until (a) the oplog containing
the invalid records is a configurable (default=50) percent garbage
and (b) a compaction occurs.
So, forcing a compaction after each destroy probably won't do much
(as you've seen). The key is to get the oplog to be N% garbage so
that when a compaction occurs, it is actually compacted.
The percentage is configurable via the compaction-threshold
attribute. The lower you set this attribute, the faster oplogs
will be compacted. You need to be a bit careful though. If you set
this attribute too low, you'll be constantly copying data between
oplogs.
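One way to tell whether a forced compaction actually did anything is to check
its return value; if I remember correctly, it returns true only when at least
one oplog was compacted (and allow-force-compaction must be enabled on the
disk store). Roughly:

Cache cache = CacheFactory.getAnyInstance();
DiskStore diskStore = cache.findDiskStore("yourDiskStoreName"); // your disk store name
// Only oplogs that have crossed the compaction-threshold are compacted.
boolean compacted = diskStore.forceCompaction();
System.out.println("Compacted anything: " + compacted);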
Check out these docs pages regarding the compaction-threshold and
compaction:
http://geode.docs.pivotal.io/docs/managing/disk_storage/disk_store_configuration_params.html
http://geode.docs.pivotal.io/docs/managing/disk_storage/compacting_disk_stores.html
Thanks,
Eugene
On Thu, Apr 28, 2016 at 1:53 PM, Barry Oglesby <[email protected]> wrote:
I think I would use a function to iterate all the local
region entries, pretty much like Udo suggested.
I attached an example that iterates all the local primary
entries and, based on the last accessed time, removes
them. In this example the test is '< now', so all entries
are removed. Of course, you can do whatever you want with
that test.
The call to PartitionRegionHelper.getLocalDataForContext
returns only primary entries since optimizeForWrite
returns true.
This function currently returns 'true', but it could
easily be changed to return an info object containing the
number of entries checked and removed (or something similar).
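The attachment isn't reproduced here, but a minimal sketch along those lines
might look like this (statistics must be enabled on the region, otherwise
getStatistics() throws):

import java.util.ArrayList;
import java.util.List;
import com.gemstone.gemfire.cache.CacheStatistics;
import com.gemstone.gemfire.cache.Region;
import com.gemstone.gemfire.cache.execute.FunctionAdapter;
import com.gemstone.gemfire.cache.execute.FunctionContext;
import com.gemstone.gemfire.cache.execute.RegionFunctionContext;
import com.gemstone.gemfire.cache.partition.PartitionRegionHelper;

public class CheckLastAccessedTimeFunction extends FunctionAdapter {

    public void execute(FunctionContext context) {
        RegionFunctionContext rfc = (RegionFunctionContext) context;
        // Only the primary entries hosted on this member (because optimizeForWrite() is true).
        Region<Object, Object> localData = PartitionRegionHelper.getLocalDataForContext(rfc);
        long cutoff = System.currentTimeMillis(); // '< now' removes everything; adjust as needed
        List<Object> toDestroy = new ArrayList<Object>();
        for (Object key : localData.keySet()) {
            Region.Entry<Object, Object> entry = localData.getEntry(key);
            if (entry != null) {
                CacheStatistics stats = entry.getStatistics();
                if (stats.getLastAccessedTime() < cutoff) {
                    toDestroy.add(key);
                }
            }
        }
        // Destroy after the iteration so the region isn't modified while it is being walked.
        for (Object key : toDestroy) {
            localData.destroy(key);
        }
        context.getResultSender().lastResult(true);
    }

    public boolean optimizeForWrite() {
        return true; // route execution to the primary buckets
    }

    public String getId() {
        return getClass().getSimpleName();
    }
}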
Execute it on the region like:
Execution execution = FunctionService.onRegion(this.region);
ResultCollector collector = execution.execute("CheckLastAccessedTimeFunction");
Object result = collector.getResult();
Thanks,
Barry Oglesby
On Wed, Apr 27, 2016 at 5:36 PM, Eugene Strokin <[email protected]> wrote:
Udo, thanks a lot. Yes, I had the same idea: run the process on
each node, and once it finds that there is not much space left, it
kicks the old records out on that server. I'll give your code a try
first thing tomorrow. Looks like this is exactly what I need.
Anil, Udo is right, I've managed to set up eviction from heap to an
overflow disk store. It looks fine now. I'm running a performance
test currently and it looks stable so far. But my cache is
ever-growing, and I could run out of space. The nature of the data
allows me to remove old cached items without any problem, and if
they are needed again, I can always get them from storage.
So, Geode evicts from memory to overflow, but I also need to evict
items completely off the cache.
On Apr 27, 2016 6:02 PM, "Udo Kohlmeyer" <[email protected]> wrote:
Anil,
Eugene's use case is such that his memory is low (300 MB) but his
disk space is larger.
He has already configured eviction to manage the memory aspect. He
is just trying to clean up some local disk space. This is a
continuation of a previous thread, "System Out of Memory".
But yes, eviction could fulfill the same requirement if his memory
were larger.
--Udo
On 28/04/2016 7:41 am, Anilkumar Gingade wrote:
Any reason why the supported eviction/expiration
does not work for your case...
-Anil.
On Wed, Apr 27, 2016 at 1:49 PM, Udo Kohlmeyer <[email protected]> wrote:
Hi there Eugene,
The free space checking code, is that running as a separate
process or as part of each of the server JVMs?
I would run the free space check as part of each server (deployed
as part of the server code). This way each server will monitor its
own free space.
I'm not sure how to get the last access time of each item, but if
you can get hold of that information, then you can run some code
that uses
PartitionRegionHelper.getLocalData(Region) or
PartitionRegionHelper.getLocalPrimaryData(Region)
to get the local data.
Then you could remove/invalidate the data entries.
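If statistics are enabled on the region, something along these
lines should get you the last access time (sketch only;
getStatistics() throws if statistics are disabled):

Region<String, byte[]> localView = PartitionRegionHelper.getLocalPrimaryData(region);
for (String key : localView.keySet()) {
    Region.Entry<String, byte[]> entry = localView.getEntry(key);
    if (entry != null) {
        long lastAccessed = entry.getStatistics().getLastAccessedTime();
        // compare lastAccessed against your cutoff and remember the oldest keys
    }
}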
Also, disk store compaction now plays a role. So you might have to
trigger a compaction of the disk store in order to avoid
unnecessary data being held in the disk stores.
The simplest way you could do this is by running the following
(as per the DiskStore API
<http://geode.incubator.apache.org/releases/latest/javadoc/com/gemstone/gemfire/cache/DiskStore.html>):
Cache cache = CacheFactory.getAnyInstance();
DiskStore diskstore = cache.findDiskStore("diskStoreName");
diskstore.forceCompaction();
The forceCompaction method is blocking, so please do not make this
call part of some critical processing step.
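For example, you could hand it off to a background thread instead
of calling it inline (sketch only, using a plain
java.util.concurrent executor):

final Cache cache = CacheFactory.getAnyInstance();
ExecutorService compactionExecutor = Executors.newSingleThreadExecutor();
compactionExecutor.submit(new Runnable() {
    public void run() {
        // the blocking forceCompaction() now runs off the critical path
        cache.findDiskStore("diskStoreName").forceCompaction();
    }
});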
--Udo
On 28/04/2016 6:25 am, Eugene Strokin wrote:
I'm running a periodic check of the free space on each node
of my cluster. The cluster contains a partitioned region.
If some node is getting full, I'd like to remove the least
recently used items to free up space. New items are being
loaded constantly.
I've enabled statistics, so it looks like I can get the last
access time of each item, but I'd like to iterate through
only the "local" items, the items which are stored on the
local node only. I'm trying different things, but none of
them seems right.
Is it even possible? If so, could you please point me in the
right direction?
Thank you,
Eugene