Ahh, OK I get it now... if _any_ of these thresholds are met _and_ the files are not active (i.e. they have grown larger than max_file_size) they'll be merged. Thanks!
- Jeremy

On Wed, Sep 14, 2011 at 3:16 PM, Dan Reverri <[email protected]> wrote:

> At any point in time Bitcask may have data spread across a number of data
> files. Bitcask occasionally runs a merge process which reads the data from
> those files and writes a merged set of data to a new file. Once completed
> the old files can be removed and the new file is used for future read
> operations.
>
> The merge process chooses which files to merge based on a number of
> thresholds. The thresholds are:
>
> {frag_threshold, 40}, % >= 40% fragmentation
> {dead_bytes_threshold, 134217728}, % Dead bytes > 128 MB
> {small_file_threshold, 10485760}, % File is < 10 MB
>
> If a data file meets any of the thresholds it will be included in the
> merge process. The small_file_threshold means that any inactive data file
> that is less than 10 MB will be included in the merge process.
>
> Thanks,
> Dan
>
> Daniel Reverri
> Developer Advocate
> Basho Technologies, Inc.
> [email protected]
>
>
> On Wed, Sep 14, 2011 at 9:50 AM, Jeremy Raymond <[email protected]> wrote:
>
>> Ok, thanks, I'll give that a try.
>>
>> What does small_file_threshold do then?
>>
>> - Jeremy
>>
>>
>> On Wed, Sep 14, 2011 at 12:48 PM, Dan Reverri <[email protected]> wrote:
>>
>>> Hi Jeremy,
>>>
>>> The max_file_size parameter controls when Bitcask will close the
>>> currently active data file and start a new data file. The active data
>>> file will not be considered when determining if a merge should occur.
>>> The default max_file_size is 2 GB. This means that each partition in the
>>> system can grow to 2 GB before the data files are considered for merging.
>>> This is likely what you are seeing in your situation.
>>>
>>> You can lower the max_file_size in the app.config file under the bitcask
>>> section. This parameter should be specified in bytes.
>>>
>>> This article is related to your issue:
>>>
>>> https://help.basho.com/entries/20141178-why-does-it-seem-that-bitcask-merging-is-only-triggered-when-a-riak-node-is-restarted
>>>
>>> Thanks,
>>> Dan
>>>
>>> Daniel Reverri
>>> Developer Advocate
>>> Basho Technologies, Inc.
>>> [email protected]
>>>
>>>
>>> On Wed, Sep 14, 2011 at 8:49 AM, Jeremy Raymond <[email protected]> wrote:
>>>
>>>> If I'm reading the docs correctly, only files smaller than
>>>> small_file_threshold will be included in a merge. So must
>>>> small_file_threshold be bigger than max_file_size for a merge to
>>>> happen?
>>>>
>>>> - Jeremy
>>>>
>>>>
>>>> On Wed, Sep 14, 2011 at 10:23 AM, Jeremy Raymond <[email protected]> wrote:
>>>>
>>>>> Maybe I just need to tweak the Bitcask parameters to merge more often?
>>>>>
>>>>> I have approx 17000 keys which get overwritten once an hour. After each
>>>>> update the /var/lib/riak/bitcask folder grows by 20 MB (so about 1200
>>>>> bytes per key). With the default frag_merge_trigger at 60 I should get
>>>>> a merge every 3 hours as I would have > 60% of the keys being dead?
>>>>> This would also meet the default frag_threshold of 40 since > 40% of
>>>>> the keys are dead? I'm not seeing the merging happening.
>>>>>
>>>>> - Jeremy
>>>>>
>>>>>
>>>>> On Wed, Sep 14, 2011 at 9:28 AM, Jeremiah Peschka <[email protected]> wrote:
>>>>>
>>>>>> I would think that the InnoDB backend would be a better backend for
>>>>>> the use case you're describing.
>>>>>> ---
>>>>>> Jeremiah Peschka - Founder, Brent Ozar PLF, LLC
>>>>>> Microsoft SQL Server MVP
>>>>>>
>>>>>> On Sep 14, 2011, at 8:09 AM, Jeremy Raymond wrote:
>>>>>>
>>>>>> > Hi,
>>>>>> >
>>>>>> > I store data in Riak whose keys constantly get overwritten with new
>>>>>> > data. I'm currently using Bitcask as the back-end and recently
>>>>>> > noticed the Bitcask data folder grow to 24GB. After restarting the
>>>>>> > nodes, which I think triggered Bitcask merge, the data went down to
>>>>>> > 96MB. Today the data dirs are back up to around 500MB. Would an
>>>>>> > alternate backend better suit this type of use case where keys are
>>>>>> > constantly being overwritten?
>>>>>> >
>>>>>> > - Jeremy
>>>>>> > _______________________________________________
>>>>>> > riak-users mailing list
>>>>>> > [email protected]
>>>>>> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>>>>>>
>>>>
>>>> _______________________________________________
>>>> riak-users mailing list
>>>> [email protected]
>>>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
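
For anyone skimming the archive, the file-selection rule described in this thread can be sketched roughly as follows. This is only an illustration in Python, not Bitcask's actual code: the DataFile record, its field names, and the two helper functions are hypothetical, and the sketch assumes the fragmentation percentage and dead-byte count for each file are already known. It also leaves out the separate merge triggers (such as frag_merge_trigger) that decide whether a merge runs at all.

```python
from dataclasses import dataclass

# Threshold values as quoted in the thread.
FRAG_THRESHOLD = 40               # percent fragmentation
DEAD_BYTES_THRESHOLD = 134217728  # 128 MB of dead bytes
SMALL_FILE_THRESHOLD = 10485760   # 10 MB file size


@dataclass
class DataFile:
    """Hypothetical summary of one data file (not a Bitcask structure)."""
    name: str
    size_bytes: int
    dead_bytes: int
    fragmentation_pct: int
    active: bool = False


def needs_merge(f: DataFile) -> bool:
    """Return True if the file would be included in a merge under the
    rule described in the thread."""
    # The currently active (still being written) file is never considered.
    if f.active:
        return False
    # Meeting any single threshold is enough.
    return (
        f.fragmentation_pct >= FRAG_THRESHOLD
        or f.dead_bytes > DEAD_BYTES_THRESHOLD
        or f.size_bytes < SMALL_FILE_THRESHOLD
    )


def select_for_merge(files):
    """Return the names of the files that would be merged, in input order."""
    return [f.name for f in files if needs_merge(f)]
```

Under these assumptions, a 2 GB inactive file at 70% fragmentation is selected, a 5 MB inactive file is selected because it is small, and the active file is skipped no matter how fragmented it is. That matches the observation above that nothing is merged until the active file is closed, either by reaching max_file_size or by a node restart.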
_______________________________________________ riak-users mailing list [email protected] http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
