Hi Steve,

This Knowledge Base article may be related:
https://help.basho.com/entries/20141178-why-does-it-seem-that-bitcask-merging-is-only-triggered-when-a-riak-node-is-restarted

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
[email protected]

On Mon, Jun 13, 2011 at 10:25 AM, Steve Webb <[email protected]> wrote:

> Justin -
>
> My current bitcask settings are:
>
> %% Bitcask Config
> {bitcask, [
>     {data_root, "/var/lib/riaksearch/bitcask" },
>     {dead_bytes_merge_trigger, 10242880 },
>     {dead_bytes_threshold, 5242880 },
>     {expiry_secs, 86400}
> ]},
>
> My understanding of these settings mean that the data should auto-expire
> after one day. Also, once each bitcask file in
> .../riaksearch/bitcask/xxx/*.data once it has 10M of "dead" or expired data
> in it, should be merged, right?
>
> I'm collecting the spritzer twitter stream and loading it into two buckets
> (one non-indexed bucket holds the full tweet, one indexed bucket holds the
> tweet string, id, date and username). I used to see about 10 GB of data
> total, but it's growing and currently at 26GB of data total.
>
> I'm seeing these in the logs:
>
> INFO REPORT==== 13-Jun-2011::08:28:19 ===
> Pid <0.6844.0> compacted 3 segments for 942232 bytes in 4.900694 seconds,
> 0.18 MB/sec
>
> =INFO REPORT==== 13-Jun-2011::08:29:01 ===
> Pid <0.6267.0> compacted 3 segments for 1721790 bytes in 9.690511 seconds,
> 0.17 MB/sec
>
> =INFO REPORT==== 13-Jun-2011::08:31:23 ===
> Pid <0.6924.0> compacted 3 segments for 6988416 bytes in 44.659753 seconds,
> 0.15 MB/sec
>
> ... but I'm not seeing any "merging" related entries.
>
>
> - Steve
>
> --
> Steve Webb - Senior System Administrator for gnip.com
> http://twitter.com/GnipWebb
>
> On Wed, 8 Jun 2011, Justin Sheehy wrote:
>
>> Hi, Steve.
>>
>> Check out this page:
>> http://wiki.basho.com/Bitcask-Configuration.html#Disk-Usage-and-Merging-Settings
>>
>> Basically, a "merge trigger" must be met in order to have the merge
>> process occur. When it does occur, it will affect all existing files that
>> meet a "merge threshold."
>>
>> One note that is relevant for your specific use: the expiry_secs parameter
>> will cause a given item to disappear from the client API immediately after
>> expiry, and to be cleaned if it is in a file already being merged, but will
>> not currently contribute toward merge triggers or thresholds on its own if
>> not otherwise "dead".
>>
>> -Justin
>>
>>
>> On Jun 7, 2011, at 4:29 PM, Steve Webb wrote:
>>
>>> Hello there.
>>>
>>>
>>> I'm curious - I'm up to about 10GB of storage and I'm guessing that I'll
>>> be full in 3-4 more days of ingesting data. I have no idea if/when a merge
>>> will run to expire the older data.
>>>
>>> I'm loading a 2-node (1GB mem, 20GB storage, vmware VMs) riaksearch
>>> cluster with the spritzer twitter feed. I used the bitcask 'expiry_secs' to
>>> expire data after 3 days. Q: Is there a method or command to force a merge
>>> at any time? Q: Is there a way to run a merge when the storage size reaches
>>> a specific threshold?
>>>
>>>
>>> - Steve
>>>
>>> --
>>> Steve Webb - Senior System Administrator for gnip.com
>>> http://twitter.com/GnipWebb
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> [email protected]
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>
>
