Thanks, Aaron.
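Quick note on the GC question: as far as I can tell the GCInspector only logs pauses longer than about 200 ms, so a quiet system.log does not completely rule out GC. To be sure I am turning on verbose GC logging on one node; a minimal sketch of the (normally commented-out) lines in cassandra-env.sh, with the log path being just an example:

# Verbose GC logging, to confirm or rule out long ParNew/CMS pauses.
# These options ship commented out in the stock cassandra-env.sh;
# the gc.log path below is only an example.
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"

If the GC log shows pauses that line up with the latency spikes, that would point back at the heap rather than compaction.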
On Wed, Dec 11, 2013 at 8:15 PM, Aaron Morton <aa...@thelastpickle.com> wrote:

> Changed memtable_total_space_in_mb to 1024, still no luck.
>
> Reducing memtable_total_space_in_mb will increase the frequency of
> flushing to disk, which will create more work for compaction and result
> in increased IO.
>
> You should return it to the default.

You are right, I had to revert it back to the default.

> when I send traffic to one node its performance is 2x more than when I
> send traffic to all the nodes.
>
> What are you measuring, request latency or local read/write latency?
>
> If it's write latency it's probably GC; if it's read, it's probably IO
> or data model.

It is the write latency; read latency is OK. Interestingly, the latency is
low when there is only one node, and when I join the other nodes it gets
worse by about 1/3. To be specific, when I start sending traffic to the
other nodes the latency for all the nodes increases, and if I stop traffic
to the other nodes the latency drops again. I checked, and this is not node
specific; it happens on any node. I don't see any GC activity in the logs.
I tried to control compaction by reducing the number of threads, but it did
not help much.
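For reference, these are the compaction knobs I have been experimenting with; the numbers below are just examples of throttling down, not recommendations:

# Throttle compaction IO at runtime, in MB/s for the whole node (0 = unthrottled)
nodetool setcompactionthroughput 8

# The equivalent cassandra.yaml settings (these need a restart):
# concurrent_compactors: 2
# compaction_throughput_mb_per_sec: 8

As noted above, dialing these down has not changed the picture much.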
> Hope that helps.
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 7/12/2013, at 8:05 am, srmore <comom...@gmail.com> wrote:
>
> Changed memtable_total_space_in_mb to 1024, still no luck.
>
> On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak <vicky....@gmail.com> wrote:
>
>> Can you set the memtable_total_space_in_mb value? It defaults to 1/3 of
>> the heap, which with an 8 GB heap is ~2.7 GB.
>>
>> http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management
>>
>> Flushing ~2.7 GB to disk might slow performance if it happens
>> frequently; maybe you have lots of write operations going on.
>>
>> On Fri, Dec 6, 2013 at 10:06 PM, srmore <comom...@gmail.com> wrote:
>>
>>> On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak <vicky....@gmail.com> wrote:
>>>
>>>> You have passed the JVM configuration and not the cassandra
>>>> configuration, which is in cassandra.yaml.
>>>
>>> Apologies, I was tuning the JVM and that's what was on my mind.
>>> Here are the cassandra settings: http://pastebin.com/uN42GgYT
>>>
>>>> The spikes are not that significant in our case and we are running
>>>> the cluster with a 1.7 GB heap.
>>>>
>>>> Are these spikes causing any issue at your end?
>>>
>>> There are no big spikes, but the overall performance seems to be about
>>> 40% lower.
>>>
>>>> On Fri, Dec 6, 2013 at 9:10 PM, srmore <comom...@gmail.com> wrote:
>>>>
>>>>> On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak <vicky....@gmail.com> wrote:
>>>>>
>>>>>> Hard to say much without knowing the cassandra configuration.
>>>>>
>>>>> The cassandra configuration is:
>>>>> -Xms8G
>>>>> -Xmx8G
>>>>> -Xmn800m
>>>>> -XX:+UseParNewGC
>>>>> -XX:+UseConcMarkSweepGC
>>>>> -XX:+CMSParallelRemarkEnabled
>>>>> -XX:SurvivorRatio=4
>>>>> -XX:MaxTenuringThreshold=2
>>>>> -XX:CMSInitiatingOccupancyFraction=75
>>>>> -XX:+UseCMSInitiatingOccupancyOnly
>>>>>
>>>>>> Yes, compactions/GCs could spike the CPU; I had similar behavior
>>>>>> with my setup.
>>>>>
>>>>> Were you able to get around it?
>>>>>
>>>>>> -VK
>>>>>>
>>>>>> On Fri, Dec 6, 2013 at 7:40 PM, srmore <comom...@gmail.com> wrote:
>>>>>>
>>>>>>> We have a 3 node cluster running cassandra 1.2.12. They are pretty
>>>>>>> big machines, 64 GB RAM with 16 cores, and the cassandra heap is
>>>>>>> 8 GB.
>>>>>>>
>>>>>>> The interesting observation is that when I send traffic to one
>>>>>>> node its performance is 2x better than when I send traffic to all
>>>>>>> the nodes. We ran 1.0.11 on the same boxes and observed a slight
>>>>>>> dip, but nothing like the halving seen with 1.2.12. In both cases
>>>>>>> we were writing with LOCAL_QUORUM. Changing the CL to ONE made a
>>>>>>> slight improvement, but not much.
>>>>>>>
>>>>>>> The read_repair_chance is 0.1. We see some compactions running.
>>>>>>>
>>>>>>> The following is my iostat -x output; sda is the SSD (for the
>>>>>>> commit log) and sdb is the spinner.
>>>>>>>
>>>>>>> avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
>>>>>>>           66.46   0.00     8.95     0.01    0.00  24.58
>>>>>>>
>>>>>>> Device:  rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
>>>>>>> sda        0.00   27.60  0.00   4.40    0.00  256.00     58.18      0.01   2.55   1.32   0.58
>>>>>>> sda1       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
>>>>>>> sda2       0.00   27.60  0.00   4.40    0.00  256.00     58.18      0.01   2.55   1.32   0.58
>>>>>>> sdb        0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
>>>>>>> sdb1       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
>>>>>>> dm-0       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
>>>>>>> dm-1       0.00    0.00  0.00   0.60    0.00    4.80      8.00      0.00   5.33   2.67   0.16
>>>>>>> dm-2       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
>>>>>>> dm-3       0.00    0.00  0.00  24.80    0.00  198.40      8.00      0.24   9.80   0.13   0.32
>>>>>>> dm-4       0.00    0.00  0.00   6.60    0.00   52.80      8.00      0.01   1.36   0.55   0.36
>>>>>>> dm-5       0.00    0.00  0.00   0.00    0.00    0.00      0.00      0.00   0.00   0.00   0.00
>>>>>>> dm-6       0.00    0.00  0.00  24.80    0.00  198.40      8.00      0.29  11.60   0.13   0.32
>>>>>>>
>>>>>>> I can see I am CPU bound here, but I couldn't figure out exactly
>>>>>>> what is causing it. Is this caused by GC or compaction? I am
>>>>>>> thinking it is compaction, as I see a lot of context switches and
>>>>>>> interrupts in my vmstat output.
>>>>>>>
>>>>>>> I don't see GC activity in the logs but do see some compaction
>>>>>>> activity. Has anyone seen this, or does anyone know what can be
>>>>>>> done to free up the CPU?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sandeep
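P.S. For anyone else trying to separate GC from compaction on a busy node, this is roughly the checklist I have been running on each node while the traffic is on; plain nodetool commands, with the keyspace/column family names as placeholders:

# Pending and active compactions: is compaction the CPU hog right now?
nodetool compactionstats
# Thread pool backlog: are MutationStage or FlushWriter pending counts growing?
nodetool tpstats
# Coordinator-level latency vs. local column family latency:
nodetool proxyhistograms
nodetool cfhistograms <keyspace> <column_family>

Comparing proxyhistograms against cfhistograms should show whether the extra write latency is local to the node or in the inter-node coordination.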