I used the standard agitation intervals. I don't understand the system well enough yet to determine why the tablets stayed unbalanced. One possibility is the timing of the checks and how it interacted with the 15-minute time allowance and the minimum failure count:
1. The first failure condition occurred at 11:36, starting the 15-minute clock.
2. The second failure condition was at the next check 30 minutes later.
3. A rapid succession of checks in the next two minutes pushed the failure count up high enough.

It's possible that the tablets became balanced, and then unbalanced again, between steps 1 and 2, so the time allowance was defeated.

Anyway, I restarted the randomwalk and it ran successfully for over 24 hours with agitation.

On Sun, Feb 9, 2014 at 7:25 PM, Josh Elser <[email protected]> wrote:

> Interesting - I think I might have run into that once a whole bunch of RW
> runs.
>
> I assume you didn't change the agitation intervals from what's in the
> example? The parameters as they stand are, I think, acceptable. Being
> unbalanced for that long doesn't seem right. Did you identify why you were
> unbalanced?
>
> I'm not sure making that configurable is good either as you're now skewing
> one randomwalk test to another (in addition to the variance you already
> have from resources available).
>
> Personally, if you run into this, and you can identify that there was a
> legitimate reason to be unbalanced across that time and those checks, I'd
> be more in favor of just restarting that RW client.
>
>
> On 2/8/14, 11:50 AM, Bill Havanki wrote:
>
>> While running 1.5.1 rc1 through randomwalk I hit a failure in the
>> Concurrent test due to the tablet servers being "unbalanced". See
>> ACCUMULO-2198 for some background on the last time I ran into this.
>>
>> What is the general feeling on dealing with this failure? Is a 15-minute
>> period too short to wait for balancing, or three consecutive failures too
>> few to allow? I'm using only a 7-node cluster with 5 tservers, maybe an
>> unbalanced condition is more tolerable then?
>>
>> The parameters defining "unbalanced" aren't configurable at the moment,
>> and I'm inclined to file a JIRA to make them so, to shepherd the test through,
>> but I'd love to hear what you think about the importance and proper
>> parameters for this check.
>>
>> Thanks,
>> Bill
>>

--
| - - -
| Bill Havanki
| Solutions Architect, Cloudera Government Solutions
| - - -
