[https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14249085#comment-14249085]
Ariel Weisberg commented on CASSANDRA-8457:
-------------------------------------------
I have some code and results. https://github.com/aweisberg/cassandra/tree/C-8457
I tested on AWS using a 3 node cluster of c3.8xlarge instances in the same
placement group using HVM with Ubuntu 14.04. Other than
/etc/security/limits.conf, I made no changes to the install, which was the
RightScale Base ServerTemplate for Linux (v14.1.0).
The config provided to cstar bootstrap was:
{code:JavaScript}
{
    "revision": "aweisberg/C-8457",
    "label": "test",
    "yaml": "key_cache_size_in_mb: 256\nrow_cache_size_in_mb: 2000\ncommitlog_sync: periodic\ncommitlog_sync_batch_window_in_ms: null\ncommitlog_sync_period_in_ms: 10000\ncompaction_throughput_mb_per_sec: 0\nconcurrent_compactors: 4",
    "env": "MAX_HEAP_SIZE=8g\nHEAP_NEWSIZE=2g",
    "options": {
        "use_vnodes": true
    }
}
{
    "commitlog_directory": "/mnt/ephemeral/commitlog",
    "data_file_directories": [
        "/mnt/ephemeral/datadir"
    ],
    "block_devices": [
        "/dev/mapper/vg--data-ephemeral0"
    ],
    "blockdev_readahead": "128",
    "hosts": {
        "ec2-54-175-1-84.compute-1.amazonaws.com": {
            "internal_ip": "172.31.49.199",
            "hostname": "ec2-54-175-1-84.compute-1.amazonaws.com",
            "seed": true
        },
        "ec2-54-175-32-238.compute-1.amazonaws.com": {
            "internal_ip": "172.31.53.77",
            "hostname": "ec2-54-175-32-238.compute-1.amazonaws.com",
            "seed": true
        },
        "ec2-54-175-32-206.compute-1.amazonaws.com": {
            "internal_ip": "172.31.57.63",
            "hostname": "ec2-54-175-32-206.compute-1.amazonaws.com",
            "seed": true
        }
    },
    "user": "ariel_weisberg",
    "name": "example1",
    "saved_caches_directory": "/mnt/ephemeral/caches"
}
{code}
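Note that the {{yaml}} field is a single string of newline-separated cassandra.yaml overrides. A minimal sketch (Python, my own illustration, not part of cstar) of how that string expands into individual settings:

```python
# Expand the newline-separated "yaml" override string from the cstar
# bootstrap config above into individual cassandra.yaml settings.
# The string content is copied verbatim from the config.
yaml_overrides = ("key_cache_size_in_mb: 256\n"
                  "row_cache_size_in_mb: 2000\n"
                  "commitlog_sync: periodic\n"
                  "commitlog_sync_batch_window_in_ms: null\n"
                  "commitlog_sync_period_in_ms: 10000\n"
                  "compaction_throughput_mb_per_sec: 0\n"
                  "concurrent_compactors: 4")

# Split each "key: value" line into a key/value pair.
settings = dict(line.split(": ", 1) for line in yaml_overrides.splitlines())
print(settings["commitlog_sync"])         # periodic
print(settings["concurrent_compactors"])  # 4
```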
To populate data I used:
bq. ./cassandra-stress write n=100000 -pop seq=1...100000 no-wrap -rate threads=50 -col 'n=fixed(1)' -schema 'replication(factor=3)' -node file=$HOME/hosts
To read I used:
bq. ./cassandra-stress read n=10000000 cl=ALL -pop 'dist=UNIFORM(1...100000)' -rate threads=200 -col 'n=fixed(1)' -schema 'replication(factor=3)' -node file=~/hosts
I ran two stress client instances, one each on two additional c3.8xlarge nodes
in the same placement group.
Unmodified trunk
{noformat}
op rate : 87497
partition rate : 87497
row rate : 87497
latency mean : 2.3
latency median : 2.1
latency 95th percentile : 3.2
latency 99th percentile : 3.7
latency 99.9th percentile : 4.5
latency max : 124.0
total gc count : 28
total gc mb : 44299
total gc time (s) : 1
avg gc time(ms) : 21
stdev gc time(ms) : 17
Total operation time : 00:01:54
END
op rate : 87598
partition rate : 87598
row rate : 87598
latency mean : 2.3
latency median : 2.1
latency 95th percentile : 3.2
latency 99th percentile : 3.8
latency 99.9th percentile : 4.4
latency max : 124.8
total gc count : 133
total gc mb : 211358
total gc time (s) : 3
avg gc time(ms) : 20
stdev gc time(ms) : 17
Total operation time : 00:01:54
END
{noformat}
Modified
{noformat}
Results:
op rate : 87476
partition rate : 87476
row rate : 87476
latency mean : 2.3
latency median : 2.1
latency 95th percentile : 3.2
latency 99th percentile : 3.7
latency 99.9th percentile : 4.0
latency max : 130.2
total gc count : 102
total gc mb : 165487
total gc time (s) : 3
avg gc time(ms) : 25
stdev gc time(ms) : 21
Total operation time : 00:01:54
END
Results:
op rate : 87347
partition rate : 87347
row rate : 87347
latency mean : 2.3
latency median : 2.1
latency 95th percentile : 3.1
latency 99th percentile : 3.6
latency 99.9th percentile : 3.9
latency max : 129.2
total gc count : 59
total gc mb : 93416
total gc time (s) : 1
avg gc time(ms) : 23
stdev gc time(ms) : 16
Total operation time : 00:01:54
END
{noformat}
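For a quick side-by-side comparison, combining the two client instances per run (all numbers copied from the outputs above; this aggregation script is my addition, not part of the stress harness):

```python
# Aggregate the two stress-client outputs per configuration and compare.
# Every number below is copied verbatim from the results above.
trunk    = {"op_rate": [87497, 87598], "gc_mb": [44299, 211358], "gc_s": [1, 3]}
modified = {"op_rate": [87476, 87347], "gc_mb": [165487, 93416], "gc_s": [3, 1]}

for name, run in (("trunk", trunk), ("modified", modified)):
    print(f"{name}: combined op rate {sum(run['op_rate'])}, "
          f"total GC {sum(run['gc_mb'])} MB in {sum(run['gc_s'])} s")

# The combined throughput difference is well under 1%.
delta = (sum(modified["op_rate"]) - sum(trunk["op_rate"])) / sum(trunk["op_rate"])
print(f"op-rate delta: {delta:+.2%}")
```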
[~benedict] Can you look at the code and the stress params and validate that
you think I am measuring what I think I am measuring?
I am going to profile the client and server tomorrow to get my bearings on what
executing this workload actually looks like.
> nio MessagingService
> --------------------
>
> Key: CASSANDRA-8457
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8457
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Assignee: Ariel Weisberg
> Labels: performance
> Fix For: 3.0
>
>
> Thread-per-peer (actually two each incoming and outbound) is a big
> contributor to context switching, especially for larger clusters. Let's look
> at switching to nio, possibly via Netty.