Where by "competition", I meant "compaction". Derp.

Best Regards,

Armon Dadgar

On Tuesday, May 8, 2012 at 3:54 PM, Armon Dadgar wrote:

> Hey Scott,
>
> My mistake, I was not sure if the claimant was responsible for convergence.
>
> If this was a competition, it was not one that would ever finish… The node
> went down at about 1AM, and by 9AM, when I started to resolve the issue, it
> was in the same state. I was unable to investigate the state of that
> machine, as it was refusing any SSH connections.
>
> Thanks for mentioning the keys. We've been thinking of doing just that
> to get keys lexicographically near each other.
>
> Best Regards,
>
> Armon Dadgar
>
>
> On Tuesday, May 8, 2012 at 3:26 PM, Scott Lystig Fritchie wrote:
>
> > "ar" == Armon Dadgar <[email protected]> wrote:
> >
> > ar> All the nodes appeared to have been blocked trying to talk to riak
> > ar> 001 which was the ring claimant at the time. Doing this seems to
> > ar> have cleared the state enough for the cluster to make progress
> > ar> again.
> >
> > Armon, it's quite unlikely that the ring claimant was doing anything
> > special because the claimant only acts when cluster membership changes.
> >
> > Instead, it's quite likely that riak001 was busy doing a set of LevelDB
> > compactions. There have been a number of changes recently to reduce the
> > amount of time that we've seen worst-case LevelDB compaction blocking
> > Erlang process schedulers, which blocks *everything*, including the
> > keep-alives that are sent between Erlang nodes. The longest
> > LevelDB-related stoppage that I've seen was 7.5 minutes. :-( When that
> > happens on a node X, then all other nodes will complain (almost
> > simultaneously) that node X is down. It's not *down*, it's just
> > reallyreallyreally slow to respond to messages ... which is effectively
> > the same as being down.
> >
> > Checking for big LevelDB compaction storms is pretty easy using
> > DTrace or SystemTap, but you're probably not using a kernel that
> > has user-space SystemTap available. There are compaction messages
> > in the "LOG" file of each LevelDB data directory. The hassle is the
> > need to look at all of them in parallel.
> >
> > A secondary effect can be seen by watching write ops via "iostat -x 1":
> > the amount of data written spikes much higher than writes triggered only
> > by Riak client operations. (Read ops would go higher too, except that
> > many files input to a compaction are already cached by the OS.)
> >
> > Your primary keys look UID'ish. If they are not lexicographically
> > adjacent to other keys inserted at the same time, you will cause many
> > more LevelDB compaction events than if your keys were adjacent (e.g. by
> > prefixing them with a wall-clock timestamp).
> >
> > -Scott
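
For illustration, Scott's suggestion of prefixing keys with a wall-clock timestamp could be sketched roughly as below. This is only a sketch in Python and not anything taken from Riak or its client libraries: the function name time_prefixed_key, the microsecond resolution, the 16-digit zero padding and the "-" separator are all assumptions made for the example.

```python
import time
import uuid


def time_prefixed_key(uid, now=None):
    """Return uid prefixed with a fixed-width wall-clock timestamp.

    Hypothetical helper: the name, the microsecond resolution and the
    16-digit padding are assumptions for this sketch, not a Riak API.
    """
    seconds = time.time() if now is None else now
    micros = int(seconds * 1_000_000)
    # Zero-pad to a fixed width so that plain string comparison orders
    # keys the same way as their timestamps.
    return "%016d-%s" % (micros, uid)


if __name__ == "__main__":
    # Keys generated close together in time share a long common prefix,
    # so they sort next to each other.
    print(time_prefixed_key(uuid.uuid4().hex))
```

The fixed-width padding is what keeps string order and time order in agreement; how much this reduces compaction work in practice would need to be measured on the cluster in question.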
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
