*To:* user@accumulo.apache.org
*Subject:* Re: Straggler problem in Accumulo BatchScans
A new balancer is a plug-in class that instructs the Master process where
to place tablets.
If you know you need your tablets spread out over servers based on time
(row id), you can do that. It's pretty common
David,
Each tablet is hosted by one tablet server, and there's no way around
that. (This is actually quite reasonable; otherwise, we would receive
duplicate results from multiple tablet servers.)
One strategy to deal with imbalanced data is to add a random partition
prefix to your row Ids.
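A rough sketch of what that prefixing could look like, in plain Java with no Accumulo dependencies. The class name, method names, the shard count of 16, and the "NN_" prefix format are all made up for illustration. It also swaps in a hash of the row ID for a truly random prefix (an assumption on my part), so that a reader can recompute the prefix for a known row; a purely random prefix spreads writes just as well but forces every lookup to check all shards.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

public class ShardedRowIds {

    // Hypothetical shard count; the two-digit prefix format below assumes at most 100 shards.
    static final int NUM_SHARDS = 16;

    // Map a row ID to a shard number in [0, numShards) using a CRC32 hash.
    static int shardOf(String rowId, int numShards) {
        CRC32 crc = new CRC32();
        crc.update(rowId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % numShards); // getValue() is non-negative
    }

    // Build the row ID that would actually be written, e.g. "07_20130821-event42".
    static String shardedRow(String rowId, int numShards) {
        return String.format("%02d_%s", shardOf(rowId, numShards), rowId);
    }

    // A range over the original row IDs becomes one {start, end} pair per shard.
    static List<String[]> shardedRanges(String startRow, String endRow, int numShards) {
        List<String[]> ranges = new ArrayList<>();
        for (int shard = 0; shard < numShards; shard++) {
            String prefix = String.format("%02d_", shard);
            ranges.add(new String[] { prefix + startRow, prefix + endRow });
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(shardedRow("20130821-event42", NUM_SHARDS));
        for (String[] range : shardedRanges("20130821", "20130822", NUM_SHARDS)) {
            System.out.println(range[0] + " .. " + range[1]);
        }
    }
}
```

The trade-off is that a scan over a time window is no longer one contiguous range: it turns into one range per shard, which is the kind of range list a BatchScanner is meant to take, and the work is then spread across more tablets instead of piling up on one.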
D
*From:* James Hughes [mailto:jn...@virginia.edu]
*Sent:* Wednesday, August 21, 2013 7:29 PM
*To:* user@accumulo.apache.org
*Subject:* Re: Straggler problem in Accumulo BatchScans
?
Thanks,
David
*From:* Dave Marion [mailto:dlmar...@comcast.net]
*Sent:* Wednesday, August 21, 2013 7:29 PM
*To:* user@accumulo.apache.org
*Subject:* RE: Straggler problem in Accumulo BatchScans
How is the table organized?
What percent of the table are you scanning
Sent: Wednesday, August 21, 2013 8:12:46 PM
Subject: RE: Straggler problem in Accumulo BatchScans
Thanks Eric,
Just to make sure I’m going in the right direction, this would involve
extending the TabletBalancer class, correct? How do I add it to the table after
that (and remove the old