On 26.09.2012 23:14, Anthony Molinaro wrote:
On Wed, Sep 26, 2012 at 08:33:24PM +0200, Nico Meyer wrote:
Great, this explains why bitcask went merging crazy after we updated
from 0.14 to 1.1! The workaround was to constrain merging to the
nighttime, and since it kind of worked(TM) since then, I never got
around to looking for the reason for the increased I/O load after the
update.
Yeah, this is how we noticed as well, moving from 0.14.2 to 1.1.4
caused IO to go crazy.  It was partially fixed with 1.2, but not
completely, and we ended up with the grace time fix.

I just checked, and this behaviour was introduced in 1.0.3, but
nothing was mentioned in the release notes :-(. In older versions,
key expiration really had no impact on triggering merges. This might
cause problems on its own. If, say, all you ever do is add new
keys, merging is never triggered, which is bad if you are recording
time series, for example.
I assume that this is the reason Basho changed the merge triggering
logic. No disrespect, but the current solution (including the
proposed fix) is just wrong. At least if there is a continuous,
evenly spaced (in time) stream of expiring keys, as should be the
case in most scenarios where one would use key expiration, it
amounts to triggering a merge each time, just like setting
frag_merge_trigger to zero.
The proposed fix will not improve anything, if we are talking about
the expiry_grace_time option
(https://github.com/basho/bitcask/blob/master/src/bitcask.erl#L569).
It just shifts the time, which makes no difference if you have an infinite
stream of expiring keys.
At the very least, a freshly merged bitcask file should be exempt
from being merged for a certain time, to limit the rate of merging
the same keys.
Hmm, well we haven't actually tried the fix yet (many other things
are competing for time).  I did mention on our support ticket that
I didn't think it was a great solution, as I still suspected it
would have a stampeding herd sort of mode.  But if things are as
you suspect it won't make a difference, and we'll still see the
issue, :(
If the final fix is contained in the most recent version on github, then all it effectively does is to add an offset to the expiration time (for the purpose of triggering merges, not actual expiration).
I believe the real solution has to involve some form of counting the
number of expired keys for each file, and then adding those counts to
the number of deleted/outdated keys.
If scanning the key dir in memory periodically is too expensive, it
is easy enough to do it approximately by storing a histogram of
timestamps for each file. At least if the expiration time is large
compared to the time between file rotations, this should be fairly
accurate, and the accuracy vs. CPU/mem can be easily tuned by
changing the size of the histogram. But talk is cheap. I will have
to look into this anyway, now that I know about it, so hopefully a
patch might fall out at the end.
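
The histogram idea above could be sketched roughly as follows. This is
only an illustration in Python, not bitcask code: FileStats,
record_write, estimate_expired and fragmentation are hypothetical
names, and the sketch assumes each file is rotated before its
histogram range (start + bucket_width * num_buckets) runs out.

```python
from dataclasses import dataclass, field


@dataclass
class FileStats:
    """Hypothetical per-file bookkeeping; not part of bitcask."""
    start: int          # timestamp of the first write to this file
    bucket_width: int   # seconds covered by one histogram bucket
    num_buckets: int    # more buckets: better accuracy, more memory
    dead_keys: int = 0  # keys already deleted or overwritten
    total_keys: int = 0
    buckets: list = field(default_factory=list)

    def __post_init__(self):
        self.buckets = [0] * self.num_buckets

    def record_write(self, tstamp):
        # Clamp so out-of-range timestamps land in the first or last bucket.
        idx = (tstamp - self.start) // self.bucket_width
        idx = min(max(idx, 0), self.num_buckets - 1)
        self.buckets[idx] += 1
        self.total_keys += 1

    def estimate_expired(self, now, expiry_secs):
        # Keys written before the cutoff are expired. Buckets entirely
        # before the cutoff count in full; the bucket containing the
        # cutoff is counted proportionally (assumes evenly spaced writes).
        cutoff = now - expiry_secs
        expired = 0.0
        for i, count in enumerate(self.buckets):
            lo = self.start + i * self.bucket_width
            hi = lo + self.bucket_width
            if hi <= cutoff:
                expired += count
            elif lo < cutoff:
                expired += count * (cutoff - lo) / self.bucket_width
        return expired

    def fragmentation(self, now, expiry_secs):
        # Dead keys that have also expired are counted twice here,
        # so the ratio is capped at 1.0.
        if self.total_keys == 0:
            return 0.0
        stale = self.dead_keys + self.estimate_expired(now, expiry_secs)
        return min(stale / self.total_keys, 1.0)


stats = FileStats(start=0, bucket_width=10, num_buckets=4)
for t in range(40):  # one write per second for 40 seconds
    stats.record_write(t)
```

The resulting ratio could then be compared against the same threshold
that is used for deleted/outdated keys, so a file only triggers a merge
once enough of it has actually expired.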
I'm not sure why the old behavior of only having expire matter when
you merge would be an issue.  I would think that having keys expire
from the in memory index would cause those entries in the files to
be marked as dead (as they are not indexed), then your merge trigger
is on dead byte percentage.  But maybe that's not the way indexing
works.  Unfortunately I haven't had, and don't have, the time right now to
dig in and have been leaving it up to basho support, but maybe you'll
end up with a better solution (and thus a better patch).

Good Luck,

-Anthony

The old behaviour can be a problem if you never update or delete any keys explicitly, which is mostly the case if one records time series. There is/was no background thread that scans for expired keys in memory, so bitcask does not know how many keys are expired at any given moment. Background scanning would indeed be one solution, but maybe one that is too expensive CPU-wise.

After sleeping on the problem, I do believe the easiest solution would be to look at the file creation time, and only trigger merges for files that are at least an hour old or so (could be configurable). But in your and my case this will most likely just degenerate to merging every hour anyway, which is still much better than merging every 3 minutes, as is the case now.
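
A minimal sketch of that file-age check, again in Python for
illustration only. The function files_to_merge, the min_age_secs
parameter and the dict keys are all made up for this example, and the
expiry test is a simplified stand-in for whatever bitcask really does.

```python
def files_to_merge(files, now, expiry_secs, min_age_secs=3600):
    """Return names of files that hold expired keys AND are old enough.

    files: list of dicts with 'name', 'created' (file creation time)
    and 'oldest_tstamp' (timestamp of the oldest key in the file).
    All of these names are hypothetical.
    """
    eligible = []
    for f in files:
        has_expired = f["oldest_tstamp"] + expiry_secs <= now
        old_enough = now - f["created"] >= min_age_secs
        if has_expired and old_enough:
            eligible.append(f["name"])
    return eligible


files = [
    # An ordinary old file whose oldest key has expired.
    {"name": "1.data", "created": 0, "oldest_tstamp": 0},
    # Output of a recent merge: it still carries old keys, but the file
    # itself was created only 1000 seconds ago.
    {"name": "2.data", "created": 9000, "oldest_tstamp": 100},
]
print(files_to_merge(files, now=10000, expiry_secs=5000))
```

With these made-up numbers only 1.data is selected. 2.data becomes
eligible once it is an hour old, which is the "merging every hour"
behaviour described above.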

Come to think of it, increasing the interval between merge checks (make it configurable!) might just work as well for most people. So maybe that's the quickest solution.

Cheers,
Nico


_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com