Hi Steve,

This Knowledge Base article may be related:
https://help.basho.com/entries/20141178-why-does-it-seem-that-bitcask-merging-is-only-triggered-when-a-riak-node-is-restarted
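
As a rough illustration of the trigger/threshold behaviour Justin describes in the quoted message below, here is a minimal Python sketch. It is not Bitcask's actual code. The names (DataFile, files_to_merge), the per-file byte counts and the use of ">=" for the comparisons are assumptions made for the example, and only the two dead-bytes settings from Steve's config are modelled.

```python
from dataclasses import dataclass

# Values copied from the quoted bitcask config.
DEAD_BYTES_MERGE_TRIGGER = 10242880
DEAD_BYTES_THRESHOLD = 5242880


@dataclass
class DataFile:
    """Hypothetical per-file bookkeeping, not a Bitcask structure."""
    name: str
    dead_bytes: int     # bytes superseded by newer writes or deletes
    expired_bytes: int  # bytes past expiry_secs but not otherwise dead


def files_to_merge(files):
    """Return the names of files a merge would include, or [] if none is triggered."""
    # Step 1: a merge only runs if at least one file meets the trigger.
    # Expired-but-not-dead bytes are deliberately ignored, per Justin's note.
    # The ">=" comparison is an assumption of this sketch.
    if not any(f.dead_bytes >= DEAD_BYTES_MERGE_TRIGGER for f in files):
        return []
    # Step 2: once triggered, every file meeting the lower threshold is included.
    return [f.name for f in files if f.dead_bytes >= DEAD_BYTES_THRESHOLD]


files = [
    DataFile("1.bitcask.data", dead_bytes=0, expired_bytes=900_000_000),
    DataFile("2.bitcask.data", dead_bytes=6_000_000, expired_bytes=0),
]
print(files_to_merge(files))  # [] because no file reaches the trigger

files.append(DataFile("3.bitcask.data", dead_bytes=11_000_000, expired_bytes=0))
print(files_to_merge(files))  # ['2.bitcask.data', '3.bitcask.data']
```

In this model, a file holding only expired data never causes a merge by itself, which would match data continuing to grow on disk even though expiry_secs is set.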

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
[email protected]


On Mon, Jun 13, 2011 at 10:25 AM, Steve Webb <[email protected]> wrote:

> Justin -
>
> My current bitcask settings are:
>
>  %% Bitcask Config
>  {bitcask, [
>             {data_root, "/var/lib/riaksearch/bitcask" },
>             {dead_bytes_merge_trigger, 10242880 },
>             {dead_bytes_threshold, 5242880 },
>             {expiry_secs, 86400}
>           ]},
>
> My understanding of these settings is that the data should auto-expire
> after one day.  Also, each bitcask file in
> .../riaksearch/bitcask/xxx/*.data should be merged once it has 10M of
> "dead" or expired data in it, right?
>
> I'm collecting the spritzer twitter stream and loading it into two buckets
> (one non-indexed bucket holds the full tweet, one indexed bucket holds the
> tweet string, id, date and username).  I used to see about 10 GB of data
> total, but it's growing and currently at 26GB of data total.
>
> I'm seeing these in the logs:
>
> =INFO REPORT==== 13-Jun-2011::08:28:19 ===
> Pid <0.6844.0> compacted 3 segments for 942232 bytes in 4.900694 seconds,
> 0.18 MB/sec
>
> =INFO REPORT==== 13-Jun-2011::08:29:01 ===
> Pid <0.6267.0> compacted 3 segments for 1721790 bytes in 9.690511 seconds,
> 0.17 MB/sec
>
> =INFO REPORT==== 13-Jun-2011::08:31:23 ===
> Pid <0.6924.0> compacted 3 segments for 6988416 bytes in 44.659753 seconds,
> 0.15 MB/sec
>
> ... but I'm not seeing any "merging" related entries.
>
>
> - Steve
>
> --
> Steve Webb - Senior System Administrator for gnip.com
> http://twitter.com/GnipWebb
>
> On Wed, 8 Jun 2011, Justin Sheehy wrote:
>
>> Hi, Steve.
>>
>> Check out this page:
>> http://wiki.basho.com/Bitcask-Configuration.html#Disk-Usage-and-Merging-Settings
>>
>> Basically, a "merge trigger" must be met in order to have the merge
>> process occur.  When it does occur, it will affect all existing files that
>> meet a "merge threshold."
>>
>> One note that is relevant for your specific use: the expiry_secs parameter
>> will cause a given item to disappear from the client API immediately after
>> expiry, and to be cleaned if it is in a file already being merged, but will
>> not currently contribute toward merge triggers or thresholds on its own if
>> not otherwise "dead".
>>
>> -Justin
>>
>>
>> On Jun 7, 2011, at 4:29 PM, Steve Webb wrote:
>>
>>> Hello there.
>>>
>>>
>>> I'm curious - I'm up to about 10GB of storage and I'm guessing that I'll
>>> be full in 3-4 more days of ingesting data.  I have no idea if/when a merge
>>> will run to expire the older data.
>>>
>>> I'm loading a 2-node (1GB mem, 20GB storage, vmware VMs) riaksearch
>>> cluster with the spritzer twitter feed.  I used the bitcask 'expiry_secs' to
>>> expire data after 3 days. Q: Is there a method or command to force a merge
>>> at any time? Q: Is there a way to run a merge when the storage size reaches
>>> a specific threshold?
>>>
>>>
>>> - Steve
>>>
>>> --
>>> Steve Webb - Senior System Administrator for gnip.com
>>> http://twitter.com/GnipWebb
>>>
>>> _______________________________________________
>>> riak-users mailing list
>>> [email protected]
>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>
>>