I have a persistent issue that I am trying to diagnose. In our use of Riak we
have multiple data creators writing into a 7-node cluster. The values are
fairly large, at around 2MB each. The behavior I am seeing is that if I delete
all data out of bitcask and then test performance, I get fast writes. As I keep
doing the same work of writing to the cluster, the Riak write times
start to degrade and eventually get really bad.

Initial write times seen by my application: 0.5 seconds for 100MB worth of
values (~200MB/s)
Subsequent write times: 11 seconds for 100MB worth of values (~9MB/s)
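
As a sanity check on the two figures above, here is a minimal sketch of the throughput arithmetic. The function and variable names are just for illustration; only the sizes and timings come from my measurements.

```python
def throughput_mb_per_s(megabytes, seconds):
    """Return write throughput in MB/s for a batch of the given size."""
    return megabytes / seconds

# Figures as seen by the application for 100MB worth of values.
fast = throughput_mb_per_s(100, 0.5)  # right after wiping bitcask
slow = throughput_mb_per_s(100, 11)   # after sustained writing

# Ratio between the fresh and the degraded write throughput.
slowdown = fast / slow

print(f"fast: {fast:.0f} MB/s, slow: {slow:.1f} MB/s, slowdown: {slowdown:.0f}x")
```

The ratio comes out at roughly 22x, which matches the ~20x difference I see between a fresh and a degraded cluster.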

This slowdown develops over roughly 20-40 minutes of writing, or about
200GB worth of key/value pairs written.

I can reset the cluster to get fast performance again by stopping Riak,
deleting the bitcask directories, and then starting Riak again. This is
not feasible in production, but in testing it brings the write speed
back up by about 20x.

Watching iostat, I see the disk io jump to ~11% every few seconds, so from
a cursory look the disks don't seem heavily loaded. Watching top, I see
that beam.smp runs at around 100 for CPU% or less when heavily loaded. I am
not sure how to tell what it is doing though :-)

Thanks for any suggestions!!

-Matt



================
System Description


avg value size = 2MB
Riak version = 1.4.1
n_val = 2
client threads total = 105
backend = bitcask
ring_creation_size = 128
node count = 7
node OS = RHEL 6.2
server RAM = 128GB
RAID = RAID0 across 8 SAS drives
FS = ext4
FS options = /dev/mapper/vg0-lv0 / ext4 rw,noatime,barrier=0,stripe=512,data=ordered 0 0
bitcask size on one server = 133GB
AAE = off
interface = protobuf
client library = riak java client
file-max = 65536
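
For a rough sense of the write load per node implied by these settings, here is a back-of-the-envelope sketch. It assumes replicas are spread evenly across all nodes and that 1GB = 1024MB; both are assumptions for illustration, not something I have measured.

```python
# Figures from the description above.
logical_gb = 200         # key/value data written before the slowdown
n_val = 2                # replicas per object
node_count = 7
window_minutes = (20, 40)

# Assumption: replicas are distributed evenly across all nodes.
per_node_gb = logical_gb * n_val / node_count

# Assumption: 1GB = 1024MB.
per_node_mb = per_node_gb * 1024

# Average per-node write rate if the data arrives over 20 or 40 minutes.
rate_if_20_min = per_node_mb / (window_minutes[0] * 60)
rate_if_40_min = per_node_mb / (window_minutes[1] * 60)

print(f"~{per_node_gb:.0f}GB written per node")
print(f"~{rate_if_40_min:.0f}-{rate_if_20_min:.0f} MB/s average per node")
```

Under those assumptions each node takes roughly 57GB of replica data before the slowdown appears, at an average of somewhere around 24-49 MB/s.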