Re: Presplitting tables for the YCSB workloads

dlmarion Fri, 18 Sep 2015 06:08:19 -0700

I don't have a script for you, but if you need to create one you could use the 
script command in the Accumulo shell to do something similar to the HBase 
script. Some examples are in the comments on the JIRA issue[1]. If you can 
figure out how you want the table split, and it can be scripted, I might have 
time this weekend to write it up for you.

[1] https://issues.apache.org/jira/browse/ACCUMULO-1399 
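
For reference, a minimal sketch of one scriptable approach, assuming the same 
split formula as the HBase README snippet quoted below. The helper name 
ycsb_splits and the default of 200 splits are illustrative, not taken from 
YCSB or Accumulo. It prints one split point per line:

```ruby
# Sketch only: generate YCSB-style split points, one per line.
# Assumption: same formula as the HBase README snippet, i.e.
#   "user" + (1000 + i * (9999 - 1000) / n_splits)  for i in 1..n_splits
# ycsb_splits is a hypothetical helper name, not part of YCSB or Accumulo.

def ycsb_splits(n_splits)
  raise ArgumentError, "n_splits must be positive" unless n_splits.positive?

  # Integer division keeps every split point a four-digit number,
  # so lexicographic order matches numeric order.
  (1..n_splits).map { |i| "user#{1000 + i * (9999 - 1000) / n_splits}" }
end

if __FILE__ == $PROGRAM_NAME
  # First argument: number of splits (defaults to 200, as in the HBase example).
  n_splits = Integer(ARGV.fetch(0, "200"))
  puts ycsb_splits(n_splits)
end
```

Redirecting the output to a file (for example ycsb-splits.txt, a placeholder 
name) gives something the Accumulo shell can load. Assuming the table is named 
usertable, as in the HBase example, that would look roughly like:

    createtable usertable
    addsplits -t usertable -sf /path/to/ycsb-splits.txt

Check the addsplits options against the Accumulo version in use.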

----- Original Message -----

From: "Sean Busbey" <[email protected]> 
To: "Accumulo User List" <[email protected]> 
Sent: Thursday, September 17, 2015 10:10:29 PM 
Subject: Presplitting tables for the YCSB workloads 

YCSB is gearing up for its next monthly release, and I really want to 
add in an Accumulo specific README for running workloads. 

This is generally so that folks have an easier time running tests 
themselves. It's also because I keep testing Accumulo for the YCSB 
releases, and with a README file in place we'd get an Accumulo-specific 
convenience binary. Avoiding the bulk of the dependencies that get 
included in the generic YCSB distribution artifact is a big win. 

The thing I keep getting hung up on is remembering how to properly 
split the Accumulo table for YCSB workloads. The HBase README has a 
great hbase shell snippet for doing this (because users can copy/paste 
it)[1]: 

---- 
3. Create a HBase table for testing 

For best results, use the pre-splitting strategy recommended in HBASE-4163: 

hbase(main):001:0> n_splits = 200 # HBase recommends (10 * number of regionservers)
hbase(main):002:0> create 'usertable', 'family', {SPLITS => (1..n_splits).map {|i| "user#{1000+i*(9999-1000)/n_splits}"}}

Failing to do so will cause all writes to initially target a single 
region server. 
---- 

Anyone have a work up of an equivalent for Accumulo that I can include 
under an ASLv2 license? I seem to recall madrob had something done in 
a bash script, but I can't find it anywhere. 

[1]: http://s.apache.org/CFe 

-- 
Sean
