Here are some calculated 'latency' results reported by cassandra-stress when asked to write 10M rows, i.e.

cassandra-stress -d <ip1>,<ip2> -n 10000000

(we actually had cassandra-stress running in daemon mode for the below tests)
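For scale, Aaron's rule of thumb quoted below (3,000 to 4,000 non-counter writes per second per core) gives a rough floor on how long a 10M-row run should take; a back-of-envelope sketch, assuming the midpoint of ~3,500 writes/s per core on the 8-core node:

```shell
# Rough expected duration for 10M writes on an 8-core node,
# assuming ~3,500 non-counter writes/s per core (rule-of-thumb midpoint).
rows=10000000
cores=8
per_core=3500
echo "$(( rows / (cores * per_core) )) seconds"   # prints "357 seconds"
```

Anything much slower than that suggests the bottleneck is not raw CPU on the server side.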
avg_latency (percentile)                   90           99           99.9         99.99
Write: 8 cores, 32 GB, 3-disk RAID 0       0.002982182  0.003963931  0.004692996  0.004792326
Write: 32 cores, 128 GB, 7-disk RAID 0     0.003157515  0.003763181  0.005184429  0.005441946
Read: 8 cores, 32 GB, 3-disk RAID 0        0.002289879  0.057178021  0.173753058  0.24386912
Read: 32 cores, 128 GB, 7-disk RAID 0      0.002317525  0.010937648  0.013205977  0.014270511

The client was another node on the same network with the 8 core, 32 GB RAM specs. I wouldn't expect it to bottleneck, but I can monitor it while generating the load. In general, what would you expect it to bottleneck at?

>> Another interesting thing is that the linux disk cache doesn't seem to be
>> growing in spite of a lot of free memory available.
> Things will only get paged in when they are accessed.

Hmm, interesting. I did a test where I just wrote large files to disk, e.g.

dd if=/dev/zero of=bigfile18 bs=1M count=10000

and checked the disk cache, and it increased by exactly the size of the file written (no reads were done in this case).

-----Original Message-----
From: Aaron Morton [mailto:aa...@thelastpickle.com]
Sent: Monday, November 25, 2013 11:55 AM
To: Cassandra User
Subject: Re: Config changes to leverage new hardware

> However, for both writes and reads there was virtually no difference in the
> latencies.
What sort of latency were you getting?

> I'm still not very sure where the current *write* bottleneck is though.
What numbers are you getting?

Could the bottleneck be the client? Can it send writes fast enough to saturate the nodes? As a rule of thumb you should get 3,000 to 4,000 (non counter) writes per second per core.

> Sample iostat data (captured every 10s) for the dedicated disk where commit
> logs are written is below. Does this seem like a bottleneck?
Does not look too bad.

> Another interesting thing is that the linux disk cache doesn't seem to be
> growing in spite of a lot of free memory available.
Things will only get paged in when they are accessed.
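The dd observation above can be reproduced with a before/after snapshot of the kernel's page cache; a minimal sketch, assuming a Linux host with /proc/meminfo (the file name and 1 GB size are arbitrary):

```shell
# Page cache size (kB) before and after writing a 1 GB file;
# a write-only workload should grow the cache by roughly the file size.
before=$(awk '/^Cached:/ {print $2}' /proc/meminfo)
dd if=/dev/zero of=bigfile-test bs=1M count=1024 2>/dev/null
after=$(awk '/^Cached:/ {print $2}' /proc/meminfo)
echo "cache grew by $(( (after - before) / 1024 )) MB"
rm -f bigfile-test
```

Note the growth can be smaller than the file if memory pressure causes eviction in between the two snapshots.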
Cheers

-----------------
Aaron Morton
New Zealand
@aaronmorton

Co-Founder & Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com

On 21/11/2013, at 12:42 pm, Arindam Barua <aba...@247-inc.com> wrote:

>
> Thanks for the suggestions Aaron.
>
> As a follow up, we ran a bunch of tests with different combinations of these
> changes on a 2-node ring. The load was generated using cassandra-stress, run
> with default values to write 30 million rows, and read them back.
> However, for both writes and reads there was virtually no difference in the
> latencies.
>
> The different combinations attempted:
> 1. Baseline test with none of the below changes.
> 2. Grabbing the TLAB setting from 1.2
> 3. Moving the commit logs too to the 7 disk RAID 0.
> 4. Increasing the concurrent_read to 32, and concurrent_write to 64
> 5. (3) + (4), i.e. moving commit logs to the RAID + increasing
>    concurrent_read and concurrent_write config to 32 and 64.
>
> The write latencies were very similar, except for being ~3x worse at the
> 99.9th percentile and above for scenario (5) above.
> The read latencies were also similar, with (3) and (5) being a little worse
> at the 99.99th percentile.
>
> Overall, not making any changes, i.e. (1), performed as well as or slightly
> better than any of the other changes.
>
> Running cassandra-stress on both the old and new hardware without making any
> config changes, the write performance was very similar, but the new hardware
> did show ~10x improvement in the read for the 99.9th percentile and higher.
> After thinking about this, the reason why we were not seeing any difference
> with our test framework was perhaps the nature of the test, where we write
> the rows and then immediately do a bunch of reads for the rows just written.
> The data is read back from the memtables, and never from the disk/sstables.
> Hence the new hardware's increased RAM and size of the disk cache, or its
> higher number of disks, never helps.
>
> I'm still not very sure where the current *write* bottleneck is though. The
> new hardware has 32 cores vs 8 cores of the old hardware. Moving the commit
> log from a dedicated disk to a 7-disk RAID 0 system (where it would be shared
> by other data though) didn't make a difference either (unless the extra
> contention on the RAID nullified the positive effects of the RAID).
>
> Sample iostat data (captured every 10s) for the dedicated disk where commit
> logs are written is below. Does this seem like a bottleneck? When the commit
> logs are written the await/svctm ratio is high.
>
> Device:  rrqm/s  wrqm/s  r/s   w/s    rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  svctm  %util
>          0.00    8.09    0.04  8.85   0.00   0.07   15.74     0.00      0.12   0.03   0.02
>          0.00    768.03  0.00  9.49   0.00   3.04   655.41    0.04      4.52   0.33   0.31
>          0.00    8.10    0.04  8.85   0.00   0.07   15.75     0.00      0.12   0.03   0.02
>          0.00    752.65  0.00  10.09  0.00   2.98   604.75    0.03      3.00   0.26   0.26
>
> Another interesting thing is that the linux disk cache doesn't seem to be
> growing in spite of a lot of free memory available. The total disk cache used
> reported by 'free' is less than the size of the sstables written, with over
> 100 GB of RAM unused.
> Even in production, where we have the older hardware running with 32 GB RAM
> for a long time now, looking at 5 hosts in 1 DC, only 2.5 GB to 8 GB was used
> for the disk cache. The Cassandra java process uses the 8 GB allocated to it,
> and at least 10-15 GB on all the hosts is not used at all.
>
> Thanks,
> Arindam
>
> From: Aaron Morton [mailto:aa...@thelastpickle.com]
> Sent: Wednesday, November 06, 2013 8:34 PM
> To: Cassandra User
> Subject: Re: Config changes to leverage new hardware
>
> Running Cassandra 1.1.5 currently, but evaluating to upgrade to 1.2.11 soon.
> You will make more use of the extra memory moving to 1.2 as it moves bloom
> filters and compression data off heap.
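To put a number on "the await/svctm ratio is high", the ratio can be computed directly from a captured iostat -x log; a sketch, assuming the 11-column data layout shown in the sample above (await in field 9, svctm in field 10) and a hypothetical capture file named iostat.log:

```shell
# Print the await/svctm ratio for each iostat sample line.
# A large ratio means requests spend far longer queued than being serviced.
# Field positions (9 = await, 10 = svctm) match the -x layout quoted above.
awk 'NF == 11 && $10 > 0 { printf "await/svctm = %.1f\n", $9 / $10 }' iostat.log
```

On the second sample above (await 4.52, svctm 0.33) this prints a ratio of about 13.7, which is noticeable but, as Aaron says, the low %util suggests the disk is far from saturated.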
>
> Also grab the TLAB setting from cassandra-env.sh in v1.2
>
> As of now, our performance tests (our application specific as well as
> cassandra-stress) are not showing any significant difference in the
> hardwares, which is a little disheartening, since the new hardware has a lot
> more RAM and CPU.
> For reads or writes or both?
>
> Writes tend to scale with cores as long as the commit log can keep up.
> Reads improve with disk IO and page cache size when the hot set is in memory.
>
> Old Hardware: 8 cores (2 quad core), 32 GB RAM, four 1-TB disks
> (1 disk used for commitlog and 3 disks RAID 0 for data)
> New Hardware: 32 cores (2 8-core with hyperthreading), 128 GB RAM, eight 1-TB disks
> (1 disk used for commitlog and 7 disks RAID 0 for data)
> Is the disk IO on the commit log volume keeping up?
> You cranked up the concurrent writers and the commit log may not keep up. You
> could put the commit log on the same RAID volume to see if that improves
> writes.
>
> The config we tried modifying so far was concurrent_reads to (16 *
> number of drives) and concurrent_writes to (8 * number of cores) as per
> 256 write threads is a lot. Make sure the commit log can keep up, I would put
> it back to 32, maybe try 64. Not sure the concurrent list for the commit log
> will work well with that many threads.
>
> May want to put the reads down as well.
>
> It's easier to tune the system if you can provide some info on the workload.
>
> Cheers
>
> -----------------
> Aaron Morton
> New Zealand
> @aaronmorton
>
> Co-Founder & Principal Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> On 7/11/2013, at 12:35 pm, Arindam Barua <aba...@247-inc.com> wrote:
>
> We want to upgrade our Cassandra cluster to have newer hardware, and were
> wondering if anyone has suggestions on Cassandra or linux config changes that
> will prove to be beneficial.
> As of now, our performance tests (our application specific as well as
> cassandra-stress) are not showing any significant difference in the
> hardwares, which is a little disheartening, since the new hardware has a lot
> more RAM and CPU.
>
> Old Hardware: 8 cores (2 quad core), 32 GB RAM, four 1-TB disks
> (1 disk used for commitlog and 3 disks RAID 0 for data)
> New Hardware: 32 cores (2 8-core with hyperthreading), 128 GB RAM, eight 1-TB disks
> (1 disk used for commitlog and 7 disks RAID 0 for data)
>
> Most of the cassandra config currently is the default, and we are using
> LeveledCompaction strategy. Default key cache, row cache turned off.
> The config we tried modifying so far was concurrent_reads to (16 * number of
> drives) and concurrent_writes to (8 * number of cores) as per the
> recommendation in cassandra.yaml, but that didn't make much difference.
> We were hoping that at least the extra RAM in the new hardware would be used
> for Linux file caching and hence an improvement in performance would be
> observed.
>
> Running Cassandra 1.1.5 currently, but evaluating to upgrade to 1.2.11 soon.
>
> Thanks,
> Arindam
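[Editor's note] For reference, the formulae discussed in the thread map onto a cassandra.yaml fragment like the following; the values are illustrative only, computed for the old hardware (8 cores, 3 data disks), and note Aaron's advice above to dial concurrent_writes back toward 32-64 rather than trusting the formula blindly:

```yaml
# cassandra.yaml -- illustrative values derived from the rules of thumb above
# concurrent_reads  = 16 * data disks = 16 * 3 = 48   (old hardware, 3-disk RAID 0)
# concurrent_writes = 8  * cores      = 8 * 8  = 64
concurrent_reads: 48
concurrent_writes: 64
```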