Re: Benchmarking Cassandra with YCSB

Markus Klems Sat, 19 Feb 2011 10:30:32 -0800

Sure will do. We are currently running a couple of benchmarks on
differently configured EC2 landscapes. We will share our results in
the coming weeks.


On Sat, Feb 19, 2011 at 6:53 PM, Lior Golan <lio...@taboola.com> wrote:
> Can you share what numbers you are now getting?
>
> -----Original Message-----
> From: markuskl...@gmail.com [mailto:markuskl...@gmail.com] On Behalf Of 
> Markus Klems
> Sent: Saturday, February 19, 2011 10:53 AM
> To: user@cassandra.apache.org
> Subject: Re: Benchmarking Cassandra with YCSB
>
> Hi,
>
> we sorted out the performance problems and tuned the cluster. In
> particular, we identified the following weak spot in our setup:
> ConcurrentReads and ConcurrentWrites were set to their default values
> which were much too low for our setup. Now, we get some serious
> numbers.
>
> Thanks,
>
> Markus
>
> On Tue, Feb 15, 2011 at 9:09 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>> My initial thought is that you are overloading the cluster. Are there any log
>> lines about dropping messages?
>>
>> What is the schema, what settings do you have in the Cassandra yaml, and what
>> are the CF stats telling you? E.g., are you switching Memtables too quickly? What
>> are the write latency numbers?
>>
>> Also 0.7 is much faster.
>>
>> Aaron
>>
>> On 16/02/2011, at 8:59 AM, Thibaut Britz <thibaut.br...@trendiction.com> 
>> wrote:
>>
>>> Cassandra is very CPU hungry so you might be hitting a CPU bottleneck.
>>> What's your CPU usage during these tests?
>>>
>>>
>>> On Tue, Feb 15, 2011 at 8:45 PM, Markus Klems <mar...@klems.eu> wrote:
>>>> Hi there,
>>>>
>>>> we are currently benchmarking a Cassandra 0.6.5 cluster with 3
>>>> High-Mem Quadruple Extra Large EC2 nodes
>>>> (http://aws.amazon.com/ec2/#instance) using Yahoo's YCSB tool
>>>> (replication factor is 3, random partitioner). We assigned 32 GB RAM
>>>> to the JVM and left 32 GB RAM for the Ubuntu Linux filesystem buffer.
>>>> We also raised the per-user process limit to a very large number via
>>>> ulimit -u 999999.
>>>>
>>>> Our goal is to achieve max throughput by increasing YCSB's threadcount
>>>> parameter (i.e. the number of parallel benchmarking client threads).
>>>> However, this only improves Cassandra throughput for low numbers
>>>> of threads. If we move to higher threadcounts, throughput does not
>>>> increase and even decreases. Do you have any idea why this is
>>>> happening, and possibly suggestions for how to scale throughput to much
>>>> higher numbers? Why is throughput hitting a wall, anyway? And where
>>>> does the latency/throughput tradeoff come from?
>>>>
>>>> Here is our YCSB configuration:
>>>> recordcount=300000
>>>> operationcount=1000000
>>>> workload=com.yahoo.ycsb.workloads.CoreWorkload
>>>> readallfields=true
>>>> readproportion=0.5
>>>> updateproportion=0.5
>>>> scanproportion=0
>>>> insertproportion=0
>>>> threadcount=500
>>>> target=10000
>>>> hosts=EC2-1,EC2-2,EC2-3
>>>> requestdistribution=uniform
>>>>
>>>> These are typical results for threadcount=1:
>>>> Loading workload...
>>>> Starting test.
>>>>  0 sec: 0 operations;
>>>>  10 sec: 11733 operations; 1168.28 current ops/sec; [UPDATE
>>>> AverageLatency(ms)=0.64] [READ AverageLatency(ms)=1.03]
>>>>  20 sec: 24246 operations; 1251.68 current ops/sec; [UPDATE
>>>> AverageLatency(ms)=0.48] [READ AverageLatency(ms)=1.11]
>>>>
>>>> These are typical results for threadcount=10:
>>>> 10 sec: 30428 operations; 3029.77 current ops/sec; [UPDATE
>>>> AverageLatency(ms)=2.11] [READ AverageLatency(ms)=4.32]
>>>>  20 sec: 60838 operations; 3041.91 current ops/sec; [UPDATE
>>>> AverageLatency(ms)=2.15] [READ AverageLatency(ms)=4.37]
>>>>
>>>> These are typical results for threadcount=100:
>>>> 10 sec: 29070 operations; 2895.42 current ops/sec; [UPDATE
>>>> AverageLatency(ms)=20.53] [READ AverageLatency(ms)=44.91]
>>>>  20 sec: 53621 operations; 2455.84 current ops/sec; [UPDATE
>>>> AverageLatency(ms)=23.11] [READ AverageLatency(ms)=55.39]
>>>>
>>>> These are typical results for threadcount=500:
>>>> 10 sec: 30655 operations; 3053.59 current ops/sec; [UPDATE
>>>> AverageLatency(ms)=72.71] [READ AverageLatency(ms)=187.19]
>>>>  20 sec: 68846 operations; 3814.14 current ops/sec; [UPDATE
>>>> AverageLatency(ms)=65.36] [READ AverageLatency(ms)=191.75]
>>>>
>>>> We never measured more than ~6000 ops/sec. Are there ways to tune
>>>> Cassandra that we are not aware of? We made some modifications to the
>>>> Cassandra 0.6.5 core for experimental reasons, so it's not easy to
>>>> switch to 0.7.x or 0.8.x. However, if this would solve the scaling
>>>> issues, we would consider porting our modifications to a newer
>>>> Cassandra version...
>>>>
>>>> Thanks,
>>>>
>>>> Markus Klems
>>>>
>>>> Karlsruhe Institute of Technology, Germany
>>>>
>>
>
>
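
The throughput wall in the quoted YCSB output is roughly what Little's law
predicts for a closed-loop client, where each thread waits for a reply before
sending its next request: throughput is about threadcount divided by mean
latency. The sketch below is only a back-of-the-envelope check, not part of
the original benchmark. It assumes an exact 50/50 read/update mix (taken from
readproportion and updateproportion), ignores client-side overhead, and uses
the first 10-second sample of each run. The name predicted_ops_per_sec is made
up for illustration.

```python
# Little's law for a closed-loop benchmark:
#   throughput (ops/sec) ~= concurrent threads / mean latency (sec)
# Values are the 10-second samples quoted in the thread:
#   threadcount -> (update latency ms, read latency ms, reported ops/sec)
SAMPLES = {
    1: (0.64, 1.03, 1168.28),
    10: (2.11, 4.32, 3029.77),
    100: (20.53, 44.91, 2895.42),
    500: (72.71, 187.19, 3053.59),
}


def predicted_ops_per_sec(threads, update_ms, read_ms, read_fraction=0.5):
    """Estimate throughput from thread count and per-operation latency.

    Assumption: every thread always has exactly one request in flight and
    the read/update mix matches read_fraction.
    """
    mean_ms = read_fraction * read_ms + (1.0 - read_fraction) * update_ms
    return threads / (mean_ms / 1000.0)


for threads, (update_ms, read_ms, reported) in SAMPLES.items():
    predicted = predicted_ops_per_sec(threads, update_ms, read_ms)
    print(f"threads={threads:3d}  predicted={predicted:8.1f}  reported={reported:8.1f}")
```

If predicted and reported figures stay close while latency grows in step with
threadcount, the extra threads are most likely queueing at the servers rather
than adding throughput, which would be consistent with the later finding that
ConcurrentReads and ConcurrentWrites were set too low.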
