Hi Jeremy,

The max_file_size parameter controls when Bitcask closes the currently
active data file and starts a new one. The active data file is never
considered when determining whether a merge should occur. The default
max_file_size is 2 GB, which means each partition can accumulate up to
2 GB in its active file before that data becomes eligible for merging.
This is likely what you are seeing in your situation.

You can lower max_file_size in the app.config file, under the bitcask
section. The value is specified in bytes.
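For example, something like the following in app.config would roll over to a
new data file at roughly 100 MB instead of 2 GB (the 100 MB figure and the
data_root path are just illustrations, not recommendations for your setup):

```
%% app.config -- bitcask section
{bitcask, [
    {data_root, "/var/lib/riak/bitcask"},
    %% Close the active file and open a new one after ~100 MB.
    %% Value is in bytes; the default is 2147483648 (2 GB).
    {max_file_size, 104857600}
]},
```

A smaller max_file_size means closed files appear sooner, so the frag and
dead-bytes merge triggers can fire sooner, at the cost of more files on disk.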

This article is related to your issue:
https://help.basho.com/entries/20141178-why-does-it-seem-that-bitcask-merging-is-only-triggered-when-a-riak-node-is-restarted

Thanks,
Dan

Daniel Reverri
Developer Advocate
Basho Technologies, Inc.
[email protected]


On Wed, Sep 14, 2011 at 8:49 AM, Jeremy Raymond <[email protected]> wrote:

> If I'm reading the docs correctly, only files smaller
> than small_file_threshold will be included in a merge. So
> does small_file_threshold need to be bigger than max_file_size for a merge
> to happen?
>
> - Jeremy
>
>
>
> On Wed, Sep 14, 2011 at 10:23 AM, Jeremy Raymond <[email protected]> wrote:
>
>> Maybe I just need to tweak the Bitcask parameters to merge more often?
>>
>> I have approx 17000 keys which get overwritten once an hour. After each
>> update the /var/lib/riak/bitcask folder grows by 20 MB (so about 1200 bytes
>> per key). With the default frag_merge_trigger at 60 I should get a merge
>> every 3 hours as I would have > 60% of the keys being dead? This would also
>> meet the default frag_threshold of 40 since > 40% of the keys are dead? I'm
>> not seeing the merging happening.
>>
>> - Jeremy
>>
>>
>>
>> On Wed, Sep 14, 2011 at 9:28 AM, Jeremiah Peschka <
>> [email protected]> wrote:
>>
>>> I would think that the InnoDB backend would be a better backend for the
>>> use case you're describing.
>>> ---
>>> Jeremiah Peschka - Founder, Brent Ozar PLF, LLC
>>> Microsoft SQL Server MVP
>>>
>>> On Sep 14, 2011, at 8:09 AM, Jeremy Raymond wrote:
>>>
>>> > Hi,
>>> >
>>> > I store data in Riak whose keys constantly get overwritten with new
>>> data. I'm currently using Bitcask as the back-end and recently noticed the
>>> Bitcask data folder grow to 24GB. After restarting the nodes, which I think
>>> triggered a Bitcask merge, the data went down to 96MB. Today the data dirs are
>>> back up to around 500MB. Would an alternate backend better suit this type of
>>> use case where keys are constantly being overwritten?
>>> >
>>> > - Jeremy
>>> > _______________________________________________
>>> > riak-users mailing list
>>> > [email protected]
>>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>
>>>
>>
>
>
