[
https://issues.apache.org/jira/browse/CASSANDRA-15717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eddy Truyen updated CASSANDRA-15717:
------------------------------------
Description:
Sorry for the slightly irrelevant post. This is not an issue with Cassandra but
possibly with the interaction between Cassandra and Kubernetes.
We experienced a performance degradation when running a single Cassandra
instance inside kubeadm 1.14 in comparison with running the Docker container
stand-alone.
We ran a write-only workload (YCSB benchmark workload A, load phase) against the
following user table:
{{ cqlsh> create keyspace ycsb
WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE ycsb;
cqlsh> create table usertable (
y_id varchar primary key,
field0 varchar,
field1 varchar,
field2 varchar,
field3 varchar,
field4 varchar,
field5 varchar,
field6 varchar,
field7 varchar,
field8 varchar,
field9 varchar);}}
And using the following script:
{{python ./bin/ycsb load cassandra2-cql -P workloads/workloada \
  -p recordcount=1500000 -p operationcount=1500000 -p measurementtype=raw \
  -p cassandra.connecttimeoutmillis=60000 -p cassandra.readtimeoutmillis=60000 \
  -target 1500 -threads 20 -p hosts=localhost \
  > results/cassandra-docker/cassandra-docker-load-workloada-1-records-1500000-rnd-1762034446.txt
sleep 15}}
We used the following image: {{decomads/cassandra:2.2.16}}, which uses the
official {{cassandra:2.2.16}} as base image and adds a readinessProbe to it.
We used identical Docker configuration parameters by ensuring that the output
of {{docker inspect}} is as much as possible the same. First we ran the YCSB
benchmark in a container that is co-located with the Cassandra container in one
pod. Kubernetes then starts these containers with network mode
{{net=container:...}}: a separate container links up the YCSB and Cassandra
containers within the same network namespace so they can talk via localhost. By
this we hope to avoid interference from the CNI network plugin.
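As a sketch, the configuration comparison can be scripted as follows (the
container names {{cassandra-docker}} and {{cassandra-k8s}} are placeholders for
the actual containers in the two setups):
{{# Dump and diff the runtime configuration of the two Cassandra containers.
# cassandra-docker = stand-alone container, cassandra-k8s = the container
# started by the kubelet; both names are placeholders.
docker inspect cassandra-docker > /tmp/inspect-docker.json
docker inspect cassandra-k8s > /tmp/inspect-k8s.json
diff /tmp/inspect-docker.json /tmp/inspect-k8s.json}}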
We ran the Docker-only containers within the Kubernetes node using the default
bridge network.
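A minimal sketch of how that stand-alone baseline can be started (image and tag
as above; the host data path is a placeholder):
{{# Stand-alone Cassandra on the default bridge network, writing its data
# to a host-mounted filesystem (the host path is a placeholder):
docker run -d --name cassandra-docker \
  -v /srv/cassandra-data:/var/lib/cassandra \
  decomads/cassandra:2.2.16}}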
We first performed the experiment on an OpenStack VM with Ubuntu 16.04 (4GB
RAM, 4 CPU cores, 50GB disk) that runs on a physical node with 16 CPU cores.
Its storage is Ceph, however, and therefore distributed.
To rule out Ceph's distributed storage as a factor, we repeated the experiment
on minikube+VirtualBox (12GB RAM, 4 CPU cores, 30GB disk) on a Windows 10
laptop with 4 cores/8 logical processors and 16GB RAM. The same performance
degradation was measured.
Observations (on Ubuntu-OpenStack):
* Docker:
** Mean response latency of the YCSB benchmark: 1.5-1.7 ms
* Kubernetes:
** Mean response latency of the YCSB benchmark: 2.7-3.0 ms
* CPU usage of the Cassandra daemon JVM is considerably lower under Docker than
under Kubernetes (see my position paper:
[https://lirias.kuleuven.be/2788169?limo=0]).
Possible causes:
* Network overhead of the virtual bridge in Kubernetes is, in our opinion, not
the cause of the problem.
** We repeated the experiment running the Docker-only containers inside a
Kubernetes node, linking the containers with the {{--net=container:}} mechanism
as closely as we could to the pod setup (see the first sketch after this
list). The YCSB latency stayed the same.
* Disk I/O bottleneck: nodetool tablestats output is very similar in both
setups (see the second sketch after this list). Cassandra containers are
configured to write data to a filesystem that is mounted from the host inside
the container, and exactly the same Docker mount type is used.
** Write latency is very stable over multiple runs:
*** Kubernetes, ycsb usertable: 0.0167 ms.
*** Docker, ycsb usertable: 0.0150 ms.
** Compaction_history/compaction_in_progress is also very similar (see
attached files).
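A sketch of the {{--net=container:}} linking mentioned above (the YCSB client
image name is a placeholder; the Cassandra container is started as in the
baseline sketch earlier):
{{# Attach the YCSB container to the Cassandra container's network
# namespace, mimicking how Kubernetes wires containers in one pod.
# "ycsb-client" is a placeholder image containing the YCSB tree.
docker run --rm --net=container:cassandra-docker ycsb-client \
  python ./bin/ycsb load cassandra2-cql -P workloads/workloada \
  -p hosts=localhost -threads 20 -target 1500}}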
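And a sketch of how the table-level statistics can be compared between the two
setups (on Cassandra 2.2 the nodetool subcommand is {{cfstats}}; container
names as in the sketches above):
{{# Collect and diff per-table statistics for ycsb.usertable:
docker exec cassandra-docker nodetool cfstats ycsb.usertable > /tmp/stats-docker.txt
docker exec cassandra-k8s nodetool cfstats ycsb.usertable > /tmp/stats-k8s.txt
diff /tmp/stats-docker.txt /tmp/stats-k8s.txt}}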
Do you know of any other causes that might explain the difference in reported
YCSB response latency? Could it be that the Cassandra session is closed by
Kubernetes after each request? How can I diagnose this?
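For example, would counting the established connections on the CQL port while
the load runs be a valid way to check this? A stable count would suggest the
session is reused; a churning set of connections would suggest per-request
reconnects (a sketch; assumes {{ss}} is available inside the node):
{{# Sample the number of established connections to the CQL port (9042)
# once per second while the workload runs:
while true; do
  ss -tan | grep ':9042' | grep -c ESTAB
  sleep 1
done}}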
was:
This is my first JIRA issue. Sorry if I get something wrong in the reporting.
I experienced a performance degradation when running a single Cassandra
instance inside Kubernetes in comparison with running the Docker container
stand-alone. I used the following image decomads/cassandra:2.2.16, which uses
cassandra:2.2.16 as base image and adds a readinessProbe to it.
I used identical Docker configuration parameters by ensuring that the output of
docker inspect is as much as possible the same. First we ran the YCSB
benchmark in a container that is co-located with the cassandra container in one
pod. Kubernetes then starts these containers with network mode
"net=container:...": a separate container links up the ycsb and cassandra
containers within the same network namespace so they can talk via localhost;
by this we hope to avoid interference from the CNI network plugin.
We ran the docker-only container within the Kubernetes node using the default
bridge network.
Experiment (repeated on minikube+VirtualBox (12GB, 4 CPU cores, 30GB) on a
physical laptop with 4 cores/8 logical processors and 16GB RAM, and on an
OpenStack VM with Ubuntu 16.04 (4GB, 4 CPU cores, 50GB) that runs on a physical
node with 16 CPU cores. Storage is Ceph.)
* A write-only workload (YCSB benchmark workload A - Load phase) using the
following user table:
cqlsh> create keyspace ycsb
WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor': 1};
cqlsh> USE ycsb;
cqlsh> create table usertable (
y_id varchar primary key,
field0 varchar,
field1 varchar,
field2 varchar,
field3 varchar,
field4 varchar,
field5 varchar,
field6 varchar,
field7 varchar,
field8 varchar,
field9 varchar);
* And using the following script: python ./bin/ycsb load cassandra2-cql \
  -P workloads/workloada -p recordcount=1500000 -p operationcount=1500000 \
  -p measurementtype=raw -p cassandra.connecttimeoutmillis=60000 \
  -p cassandra.readtimeoutmillis=60000 -target 1500 -threads 20 -p hosts=localhost \
  > results/cassandra-docker/cassandra-docker-load-workloada-1-records-1500000-rnd-1762034446.txt
sleep 15
Observations (on Ubuntu-OpenStack):
* Docker:
** Mean response latency of the YCSB benchmark: 1.5-1.7 ms
* Kubernetes:
** Mean response latency of the YCSB benchmark: 2.7-3.0 ms
* CPU usage of the Cassandra daemon JVM is considerably lower under Docker than
under Kubernetes (see my position paper:
[https://lirias.kuleuven.be/2788169?limo=0]).
Possible causes:
* Network overhead of the virtual bridge in the container orchestrator is, in
our opinion, not the cause of the problem.
** We repeated the experiment running the Docker-only containers inside a
Kubernetes node, linking the containers with the --net=container: mechanism as
closely as we could to the pod setup. The YCSB latency stayed the same.
* Disk I/O bottleneck: nodetool tablestats output is very similar in both
setups.
** Cassandra containers are configured to write data to a filesystem that is
mounted from the host inside the container. Exactly the same Docker mount type
is used.
** Write latency is very stable over multiple runs:
*** Kubernetes, ycsb usertable: 0.0167 ms.
*** Docker, ycsb usertable: 0.0150 ms.
** Compaction_history/compaction_in_progress is also very similar (as opposed
to earlier versions of the issue – sorry for the confusion!).
Do you know of any other causes that might explain the difference in reported
YCSB response latency?
> Benchmark performance difference between Docker and Kubernetes when running
> Cassandra:2.2.16 official Docker image
> ------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-15717
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15717
> Project: Cassandra
> Issue Type: Bug
> Components: Test/benchmark
> Reporter: Eddy Truyen
> Priority: Normal
> Attachments: nodetool-compaction-history-docker-cassandra.txt,
> nodetool-compaction-history-kubeadm-cassandra.txt