On Wed, Dec 11, 2013 at 10:49 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> It is the write latency; read latency is OK. Interestingly, the latency is
> low when there is only one node. When I join other nodes, performance drops
> by about a third. To be specific, when I start sending traffic to the other
> nodes the latency on all the nodes increases; if I stop traffic to the other
> nodes the latency drops again. I checked, and this is not node-specific, it
> happens on any node.
>
> Is this the local write latency or the cluster-wide write request latency?
>

This is the cluster-wide write latency.
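
For reference, the coordinator (cluster-wide) numbers can be separated from
the local ones with something like this; the host, keyspace and column
family names below are placeholders:

  # coordinator-level (cluster-wide) read/write request latency on a node
  nodetool -h <node> proxyhistograms

  # local read/write latency for a specific column family
  nodetool -h <node> cfstats
  nodetool -h <node> cfhistograms <keyspace> <column_family>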


>
> What sort of numbers are you seeing?
>


I have a custom application that writes data to the Cassandra node, so the
numbers might be different from the standard stress test, but they should be
good enough for comparison. With the previous release, 1.0.12, I was getting
around 10K requests/sec, and with 1.2.12 I am getting around 6K requests/sec.
Everything else is the same. This is a three-node cluster.

With a single node I get 3K for both Cassandra 1.0.12 and 1.2.12, so I
suspect there is some network chatter between the nodes. I have started
looking at the sources, hoping to find something.
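
In case it helps with the comparison, the per-stage backlog and inter-node
message queues can also be watched while the traffic is running; a sketch,
the host name is a placeholder:

  # per-stage thread pool activity, pending work and dropped messages
  nodetool -h <node> tpstats

  # inter-node command/response backlog and any streaming
  nodetool -h <node> netstats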

-sandeep


> Cheers
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 12/12/2013, at 3:39 pm, srmore <comom...@gmail.com> wrote:
>
> Thanks Aaron
>
>
> On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton <aa...@thelastpickle.com> wrote:
>
>> Changed memtable_total_space_in_mb to 1024, still no luck.
>>
>> Reducing memtable_total_space_in_mb will increase the frequency of
>> flushing to disk, which will create more work for compaction and result
>> in increased IO.
>>
>> You should return it to the default.
>>
>
> You are right, I had to revert it back to the default.
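
(For the record, this is the line that was changed and is now reverted in
cassandra.yaml; leaving it commented out falls back to the default of 1/3
of the heap:)

  # cassandra.yaml
  #memtable_total_space_in_mb: 1024    <- the earlier override, now removed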
>
>
>>
>> when I send traffic to one node its performance is about 2x better than
>> when I send traffic to all the nodes.
>>
>>
>>
>> What are you measuring, request latency or local read/write latency?
>>
>> If it’s write latency it’s probably GC; if it’s read latency it’s probably
>> IO or the data model.
>>
>
> It is the write latency; read latency is OK. Interestingly, the latency is
> low when there is only one node. When I join other nodes, performance drops
> by about a third. To be specific, when I start sending traffic to the other
> nodes the latency on all the nodes increases; if I stop traffic to the other
> nodes the latency drops again. I checked, and this is not node-specific, it
> happens on any node.
>
> I don't see any GC activity in the logs. I tried to control compaction by
> reducing the number of threads, but it did not help much.
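
(For concreteness, these are the kinds of knobs involved; a sketch, the
values below are only examples, not recommendations:)

  # cassandra.yaml - fewer parallel compactions, plus an IO cap
  concurrent_compactors: 2
  compaction_throughput_mb_per_sec: 16

  # the throughput cap can also be changed at runtime
  nodetool -h <node> setcompactionthroughput 16
  nodetool -h <node> compactionstats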
>
>
>> Hope that helps.
>>
>>  -----------------
>> Aaron Morton
>> New Zealand
>> @aaronmorton
>>
>> Co-Founder & Principal Consultant
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> On 7/12/2013, at 8:05 am, srmore <comom...@gmail.com> wrote:
>>
>> Changed memtable_total_space_in_mb to 1024, still no luck.
>>
>>
>> On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak <vicky....@gmail.com> wrote:
>>
>>> Can you set the memtable_total_space_in_mb value? It defaults to 1/3 of
>>> the heap, which is 8/3 ~ 2.6 GB in your case.
>>>
>>> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
>>>
>>> Flushing 2.6 GB to disk might slow performance if it happens frequently;
>>> maybe you have lots of write operations going on.
>>>
>>>
>>>
>>> On Fri, Dec 6, 2013 at 10:06 PM, srmore <comom...@gmail.com> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak <vicky....@gmail.com> wrote:
>>>>
>>>>> You have passed the JVM configuration, not the Cassandra configuration,
>>>>> which is in cassandra.yaml.
>>>>>
>>>>
>>>> Apologies, I was tuning the JVM and that's what was on my mind.
>>>> Here are the cassandra settings http://pastebin.com/uN42GgYT
>>>>
>>>>
>>>>
>>>>> The spikes are not that significant in our case and we are running the
>>>>> cluster with a 1.7 GB heap.
>>>>>
>>>>> Are these spikes causing any issue at your end?
>>>>>
>>>>
>>>> There are no big spikes, but the overall performance seems to be about
>>>> 40% lower.
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Dec 6, 2013 at 9:10 PM, srmore <comom...@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak <vicky....@gmail.com> wrote:
>>>>>>
>>>>>>> Hard to say much without knowing the Cassandra configuration.
>>>>>>>
>>>>>>
>>>>>> The cassandra configuration is
>>>>>> -Xms8G
>>>>>> -Xmx8G
>>>>>> -Xmn800m
>>>>>> -XX:+UseParNewGC
>>>>>> -XX:+UseConcMarkSweepGC
>>>>>> -XX:+CMSParallelRemarkEnabled
>>>>>> -XX:SurvivorRatio=4
>>>>>> -XX:MaxTenuringThreshold=2
>>>>>> -XX:CMSInitiatingOccupancyFraction=75
>>>>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Yes, compactions/GCs could spike the CPU; I had similar behavior
>>>>>>> with my setup.
>>>>>>>
>>>>>>
>>>>>> Were you able to get around it ?
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> -VK
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Dec 6, 2013 at 7:40 PM, srmore <comom...@gmail.com> wrote:
>>>>>>>
>>>>>>>> We have a 3-node cluster running Cassandra 1.2.12; they are pretty
>>>>>>>> big machines, 64 GB RAM with 16 cores, and the Cassandra heap is 8 GB.
>>>>>>>>
>>>>>>>> The interesting observation is that when I send traffic to one node
>>>>>>>> its performance is about 2x better than when I send traffic to all
>>>>>>>> the nodes. We ran 1.0.11 on the same box and observed a slight dip,
>>>>>>>> but not half as with 1.2.12. In both cases we were writing with
>>>>>>>> LOCAL_QUORUM. Changing the CL to ONE made a slight improvement but
>>>>>>>> not much.
>>>>>>>>
>>>>>>>> The read_repair_chance is 0.1. We see some compactions running.
>>>>>>>>
>>>>>>>> Following is my iostat -x output; sda is the SSD (for the commit log)
>>>>>>>> and sdb is the spinner.
>>>>>>>>
>>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>>           66.46    0.00    8.95    0.01    0.00   24.58
>>>>>>>>
>>>>>>>> Device:         rrqm/s   wrqm/s   r/s    w/s    rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>>> sda               0.00    27.60  0.00   4.40     0.00   256.00    58.18     0.01    2.55   1.32   0.58
>>>>>>>> sda1              0.00     0.00  0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>>> sda2              0.00    27.60  0.00   4.40     0.00   256.00    58.18     0.01    2.55   1.32   0.58
>>>>>>>> sdb               0.00     0.00  0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>>> sdb1              0.00     0.00  0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>>> dm-0              0.00     0.00  0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>>> dm-1              0.00     0.00  0.00   0.60     0.00     4.80     8.00     0.00    5.33   2.67   0.16
>>>>>>>> dm-2              0.00     0.00  0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>>> dm-3              0.00     0.00  0.00  24.80     0.00   198.40     8.00     0.24    9.80   0.13   0.32
>>>>>>>> dm-4              0.00     0.00  0.00   6.60     0.00    52.80     8.00     0.01    1.36   0.55   0.36
>>>>>>>> dm-5              0.00     0.00  0.00   0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>>> dm-6              0.00     0.00  0.00  24.80     0.00   198.40     8.00     0.29   11.60   0.13   0.32
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I can see I am CPU bound here but couldn't figure out exactly what
>>>>>>>> is causing it. Is this caused by GC or compaction? I am thinking it
>>>>>>>> is compaction; I see a lot of context switches and interrupts in my
>>>>>>>> vmstat output.
>>>>>>>>
>>>>>>>> I don't see GC activity in the logs but I do see some compaction
>>>>>>>> activity. Has anyone seen this, or does anyone know what can be done
>>>>>>>> to free up the CPU?
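
(A sketch of how GC could be separated from compaction here; the GC log
path, host and pid below are placeholders:)

  # turn on GC logging via the JVM options (e.g. in cassandra-env.sh),
  # then restart the node
  JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/path/to/gc.log"

  # watch compaction activity separately
  nodetool -h <node> compactionstats

  # per-thread CPU inside the Cassandra process, to see whether GC threads
  # or CompactionExecutor threads are the hot ones
  top -H -p <cassandra_pid>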
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sandeep
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>
>
