YCSB is gearing up for its next monthly release, and I'd really like to
add an Accumulo-specific README for running workloads.
The main goal is to make it easier for folks to run tests themselves.
It would also help because I keep testing Accumulo for the YCSB
releases, and that testing, coupled with a README, would get us an
Accumulo-specific convenience binary. Avoiding the bulk of the
dependencies that get included in the generic YCSB distribution
artifact is a big win.
The thing I keep getting hung up on is remembering how to properly
pre-split the Accumulo table for YCSB workloads. The HBase README has a
great hbase shell snippet for this that users can copy/paste [1]:
----
3. Create a HBase table for testing
For best results, use the pre-splitting strategy recommended in HBASE-4163:
hbase(main):001:0> n_splits = 200 # HBase recommends (10 * number of regionservers)
hbase(main):002:0> create 'usertable', 'family', {SPLITS => (1..n_splits).map {|i| "user#{1000+i*(9999-1000)/n_splits}"}}
Failing to do so will cause all writes to initially target a single
region server.
----
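To make the arithmetic in that snippet concrete, here is a minimal plain-Ruby sketch of the split points it generates. It runs outside the hbase shell. The helper name split_points and its low/high parameters are my own, for illustration only; they are not part of YCSB, HBase, or Accumulo.

```ruby
# Sketch of the split-point arithmetic from the HBase README snippet.
# split_points is a hypothetical helper, not an existing API.
def split_points(n_splits, low = 1000, high = 9999)
  # Same integer arithmetic as the shell one-liner: evenly spaced row
  # keys from just above "user1000" up to "user9999".
  (1..n_splits).map { |i| "user#{low + i * (high - low) / n_splits}" }
end

splits = split_points(200)

# Show the first few and the last split point.
puts splits.first(3)   # user1044, user1089, user1134
puts splits.last       # user9999
```

The same list, written one key per line, is the sort of input a splits file would take. Whether that maps cleanly onto the Accumulo shell is an assumption on my part and would need checking against the Accumulo docs.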
Does anyone have a worked-up equivalent for Accumulo that I can include
under an ASLv2 license? I seem to recall madrob had something along
these lines in a bash script, but I can't find it anywhere.
[1]: http://s.apache.org/CFe
--
Sean