You need to estimate the size of a split. First, get the id of the table
with "tables -l" in the accumulo shell. Then, find out the size of the
table in hdfs:

  $ hadoop fs -dus /accumulo/tables/<id>

Divide by 7 (one tablet per node), and use that as the split threshold:

  shell> config -t mytable -s table.split.threshold=newsize

The table will automatically split out. Afterwards, you can raise the
threshold again to keep the table from splitting until it gets much
bigger:

  shell> config -t mytable -s table.split.threshold=1G
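Put together, the whole thing can be scripted. This is a rough, untested
sketch: the credentials and the table id "2" are placeholders, and it
assumes "hadoop fs -dus" prints the byte count in the second field:

  # size of the table's files in hdfs, in bytes (table id is a placeholder)
  $ SIZE=$(hadoop fs -dus /accumulo/tables/2 | awk '{print $2}')

  # one tablet per node on a 7-node cluster
  $ THRESHOLD=$((SIZE / 7))

  $ accumulo shell -u root -p secret -e \
      "config -t mytable -s table.split.threshold=$THRESHOLD"

  # wait for the splits to appear (check with "getsplits" or the monitor
  # page) before raising the threshold back up:
  $ accumulo shell -u root -p secret -e \
      "config -t mytable -s table.split.threshold=1G"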
-Eric

On Mon, May 21, 2012 at 12:24 PM, Perko, Ralph J <[email protected]> wrote:

> Hi,
>
> I am looking for advice on how to best lay out my table splits. I have a
> 7-node cluster and my table contains ~10M records. I would like to split
> the table equally across all the servers, but I see no utility to do
> this. I understand I can create splits for some letter range, but I was
> hoping for a way to have accumulo create "n" equal splits. Is this
> possible? Right now the best way I see to handle this is to write a
> utility that iterates the table, keeps a count, and at some given value
> (table size / split count) spits out the beginning and end rows, and then
> I create the splits manually.
>
> Thanks,
> Ralph
>
> __________________________________________________
> Ralph Perko
> Pacific Northwest National Laboratory
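As for the utility Ralph describes, something close to it can be
improvised from the shell instead of writing a custom program. Another
untested sketch: the credentials are placeholders, it assumes rows contain
no whitespace (so the first field of the scan output is the row), and it
counts entries rather than distinct rows, which is close enough for
picking approximate split points:

  # keep the row of every 1,428,571st entry (~10M entries / 7 tablets)
  $ accumulo shell -u root -p secret -e "scan -t mytable" \
      | awk 'NR % 1428571 == 0 {print $1}' > splits.txt

  # add the collected rows as split points in one shot
  $ accumulo shell -u root -p secret -e "addsplits -t mytable -sf splits.txt"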
