How about this?
#!/bin/sh
...
# Send a forced merge for each directory under the data root to the running node.
for i in data/expiring_bitcask/* ; do
  $ERTS_PATH/to_erl $PIPE_DIR <<EOF
bitcask:merge("$i/", [{dead_bytes_merge_trigger, 0}, {dead_bytes_threshold, 0},
                      {small_file_threshold, 16#80000000}, {expiry_secs, 86400}]).
EOF
done
I have a backend (part of a multi_backend) used exclusively for storing data
with 9-hour TTL (and so my config expires the data after 24 hours). K/v pairs
are seldom deleted, so merging during my nightly merge window only slightly
compacts the data files into nice, huge, zero-dead-byte files that will never
again trigger a merge on their own. They are still included in merges triggered
by new files, since they fall under the small_file_threshold (which is huge for
this reason).
From app.config:
{<<"expiring_bitcask">>, riak_kv_bitcask_backend, [
{expiry_secs, 86400},
{max_file_size, 134217728},
{dead_bytes_merge_trigger, 10485760},
{dead_bytes_threshold, 10485760},
{small_file_threshold, 16#80000000},
{merge_window, {1, 5}},
{data_root, "data/expiring_bitcask"}
I use dead-byte trigger and threshold values of 0 for manual merges in order to
force a merge, and 10 MB values in the config to avoid continual merging during
my overnight window.
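
For reference, the raw numbers in the config and in the manual call can be decoded with plain sh arithmetic. This is only a sanity-check sketch: the helper name to_mib is made up for illustration and is not part of Riak or Bitcask, and 16#80000000 is Erlang's base-16 notation, which the shell writes as 0x80000000.

```shell
#!/bin/sh
# Hypothetical helper (not part of Riak or Bitcask): bytes -> whole MiB.
to_mib() {
    echo $(( $1 / 1048576 ))
}

# Values copied from the app.config excerpt in this message.
max_file_size=134217728
dead_bytes=10485760
small_file_threshold=$((0x80000000))   # written 16#80000000 in Erlang
expiry_secs=86400

echo "max_file_size:        $(to_mib "$max_file_size") MiB"
echo "dead_bytes settings:  $(to_mib "$dead_bytes") MiB"
echo "small_file_threshold: $(to_mib "$small_file_threshold") MiB"
echo "expiry_secs:          $(( expiry_secs / 3600 )) hours"
```

So the small-file threshold (2048 MiB) is far above max_file_size (128 MiB), which is why every data file counts as "small" and is eligible for inclusion in a merge.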
Does this make sense? I'd have thought my app.config would keep the bitcasks
small, but it seems they just grow bigger and never get cleared of much expired
data. The manual config I use above immediately shrinks the data files
dramatically.
Any suggestions for a cleaner approach?
-Nate
________________________________
From: [email protected] [[email protected]]
On Behalf Of Anthony Molinaro [[email protected]]
Sent: Monday, August 22, 2011 10:18 AM
To: Dan Reverri
Cc: raghwani sohil; [email protected]
Subject: Re: Riak Bitcask merging
While I didn't ask this time, I'll explain why I think manual
merging as an option would be great.
As far as I know specifying a merge window doesn't guarantee the
merging happens, only that it might if other thresholds are met.
With our cassandra cluster we've ended up scheduling twice weekly
full compactions (the close equivalent to merging I believe), via
cron. The days and times are specifically chosen based on traffic
patterns, and can be changed without restarting the servers.
We don't have this convenience with riak. We can set it up to only
merge during a window, but can't guarantee everything was merged
or even that any merges will occur. If I wanted to change when I
do the merging, I have to restart the servers to pick up the new
config (at least I think I would). I have no way to stagger the
merging (have different nodes merge at different times), unless
I have slightly different config on each node.
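
If merges could be triggered from cron, staggering would not need per-node config. A minimal sketch, under assumptions: each node derives a stable delay from its own host name and waits that long before merging. The function name stagger_minutes and the 240-minute window are made up for illustration; cksum, cut and uname are standard POSIX tools. The merge step itself is not shown and would have to be a script such as the to_erl loop at the top of this thread.

```shell
#!/bin/sh
# Hypothetical helper: map a host name to a stable delay in [0, window) minutes.
stagger_minutes() {
    host=$1
    window=$2
    # cksum prints "<crc> <byte count>"; keep only the CRC.
    crc=$(printf '%s' "$host" | cksum | cut -d' ' -f1)
    echo $(( crc % window ))
}

# Example: spread nodes over an assumed 240-minute window.
delay=$(stagger_minutes "$(uname -n)" 240)
echo "this node would wait $delay minutes before merging"
```

The same crontab entry can then be deployed to every node; the delay differs per host but stays the same from run to run, and changing the schedule only means editing cron, not restarting the servers.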
If there were a riak-admin command called merge/compact/cleanup/expire
or something which triggered a manual merge I know we would use it.
And while I'm pining for additional command line tools, any idea if
transfers will ever work without disrupting the actual transfers?
It's sort of annoying that it's listed as one of the steps for a
rolling upgrade/restart but if you actually use it, it can cause
upgrade or startup to take longer. Also, it tends to time out on
a highly trafficked cluster.
Anyway, sorry about hijacking someone else's question, but
figured more information from users is usually welcome?
-Anthony
On Mon, Aug 22, 2011 at 09:14:06AM -0700, Dan Reverri wrote:
> There is no way to manually trigger a Bitcask merge. What's the use case for
> needing to manually trigger the merge? Are you concerned about the size of
> the data files? Are you trying to avoid merging at a particular time?
>
> Not sure if this will help but you can restrict Bitcask merging to a
> specified window of time:
> http://wiki.basho.com/Bitcask-Configuration.html#Merge-Window
>
> Thanks,
> Dan
>
> Daniel Reverri
> Developer Advocate
> Basho Technologies, Inc.
> [email protected]
>
>
> On Sun, Aug 21, 2011 at 11:36 PM, raghwani sohil <[email protected]> wrote:
>
> >
> > Hi All,
> >
> > Is there any way to run the Bitcask merging process manually?
> >
> > thanks ,
> > Sohil Raghwani .
> >
> >
> > _______________________________________________
> > riak-users mailing list
> > [email protected]
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
> >
> >
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
--
------------------------------------------------------------------------
Anthony Molinaro <[email protected]>