[https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871656#comment-16871656]
Joseph Lynch edited comment on CASSANDRA-15175 at 6/24/19 7:23 PM:
-------------------------------------------------------------------
I have completed the {{LOCAL_ONE}} scaling test and summarized the results in
the following graph:
!trunk_vs_30x_summary.png!
As we can see, even with the extra TLS CPU requirements, trunk was able to
significantly outperform the status quo 3.0.x cluster across the load spectrum
for this consistency level.
I am proceeding with other consistency levels and gathering additional data.
So far I have noticed the following issues during these tests, which I will
gather more data on and follow up on in separate tickets (and edit here with
ticket numbers once I have them):
# JDK Netty TLS appears significantly more CPU intensive than the previous
Java Sockets implementation. [~norman] is taking a look from the Netty side, and
we can follow up to make sure we're not setting up the TLS pipeline improperly
(looking at the flamegraphs it looks like we may have a buffer sizing issue);
see the provider sketch after this list.
# When a node was terminated and replaced, the new node appeared to sit for a
very long time waiting for schema pulls to complete (I think it was waiting on
the node it was replacing but I haven't fully debugged this).
# Nodetool netstats doesn't report progress properly for the file count
(percent, single file, and size still seem right; this is probably
CASSANDRA-14192)
# When we re-load NTS keyspaces from disk we throw warnings about "Ignoring
Unrecognized strategy option" for datacenters that we are not in
# After a node shuts down there is a burst of re-connections on the urgent
port prior to actual shutdown (I _think_ this is pre-existing and I'm just
noticing it because of the new logging)
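To make the JDK-vs-tcnative comparison in item 1 concrete, here is a minimal standalone sketch (not the actual Cassandra internode code path; the class name and PEM paths are placeholders) of flipping a Netty {{SslContext}} between the JDK engine and tcnative/OpenSSL:
{code:java}
import java.io.File;
import javax.net.ssl.SSLException;

import io.netty.handler.ssl.OpenSsl;
import io.netty.handler.ssl.SslContext;
import io.netty.handler.ssl.SslContextBuilder;
import io.netty.handler.ssl.SslProvider;

public final class TlsProviderCheck
{
    public static void main(String[] args) throws SSLException
    {
        // Use tcnative/OpenSSL when it is on the classpath, otherwise fall back to the JDK engine.
        SslProvider provider = OpenSsl.isAvailable() ? SslProvider.OPENSSL : SslProvider.JDK;
        System.out.println("provider=" + provider + " (OpenSSL available: " + OpenSsl.isAvailable() + ")");

        // Placeholder PEM files; substitute the real internode certificate/key when reproducing.
        SslContext serverCtx = SslContextBuilder
                               .forServer(new File("node-cert.pem"), new File("node-key.pem"))
                               .sslProvider(provider)
                               .build();
        System.out.println("negotiable cipher suites: " + serverCtx.cipherSuites());
    }
}
{code}
The printout is only there so it is unambiguous which engine a given flamegraph was captured against.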
Also, while setting up the {{LOCAL_QUORUM}} test I found the following while
trying to understand why I was seeing a higher number of blocking read repairs
on the trunk cluster than the 30x cluster:
# When I stop and start nodes, it appears that hints may not always play back.
In particular the high blocking read repairs were coming from neighbors of the
node I had restarted a few times to test tcnative openssl integration. I
checked the neighbors' hints directories and sure enough there were pending
hints there that were not playing at all (they had been there for over 8 hours
and still had not played); see the hints-directory sketch after the repair log
below.
# -Repair appears to fail on the default system_traces when run with {{-full}}
and {{-os}}- (Edit: this is operator error, we shouldn't pass {{-local}} to a
SimpleStrategy keyspace)
{noformat}
cass-perf-trunk-14746--useast1c-i-00a32889835534b75:~$ nodetool repair -os -full -local
[2019-06-23 23:29:30,210] Starting repair command #1
(bfbc7ba0-960e-11e9-b238-77fd1c2e9b1c), repairing keyspace perftest with repair
options (parallelism: parallel, primary range: false, incremental: false, job
threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [],
previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false,
optimise streams: true)
[2019-06-23 23:52:08,248] Repair session c0573500-960e-11e9-b238-77fd1c2e9b1c
for range [(384307168575030403,384307170010857891],
(192153585909716729,384307168575030403]] finished (progress: 10%)
[2019-06-23 23:52:26,393] Repair session c0307320-960e-11e9-b238-77fd1c2e9b1c
for range [(1808575567,192153584473889241],
(192153584473889241,192153585909716729]] finished (progress: 20%)
[2019-06-23 23:52:28,298] Repair session c059f420-960e-11e9-b238-77fd1c2e9b1c
for range [(576460752676171565,576460754111999053],
(384307170010857891,576460752676171565]] finished (progress: 30%)
[2019-06-23 23:52:28,302] Repair completed successfully
[2019-06-23 23:52:28,310] Repair command #1 finished in 22 minutes 58 seconds
[2019-06-23 23:52:28,331] Replication factor is 1. No repair is needed for
keyspace 'system_auth'
[2019-06-23 23:52:28,350] Starting repair command #2
(f52c1c70-9611-11e9-b238-77fd1c2e9b1c), repairing keyspace system_traces with
repair options (parallelism: parallel, primary range: false, incremental:
false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [],
previewKind: NONE, # of ranges: 2, pull repair: false, force repair: false,
optimise streams: true)
[2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not
be empty
[2019-06-23 23:52:28,351] Repair command #2 finished with error
error: Repair job has failed with the error message: [2019-06-23 23:52:28,351]
Repair command #2 failed with error Endpoints can not be empty. Check the logs
on the repair participants for further details
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message:
[2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not
be empty. Check the logs on the repair participants for further details
at
org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:122)
at
org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
at
com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)
{noformat}
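On the hint playback issue (item 1 above), this is roughly the check I did by hand on the neighbors, written as a small sketch; the default {{/var/lib/cassandra/hints}} path and the one hour threshold are placeholder assumptions:
{code:java}
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;

public final class StaleHintsCheck
{
    public static void main(String[] args) throws IOException
    {
        // Hints directory and age threshold are placeholders; pass the real directory as the first argument.
        Path hintsDir = Paths.get(args.length > 0 ? args[0] : "/var/lib/cassandra/hints");
        Duration maxAge = Duration.ofHours(1);

        // Hint files (<host-id>-<timestamp>-<version>.hints) should disappear once they are dispatched.
        try (DirectoryStream<Path> hints = Files.newDirectoryStream(hintsDir, "*.hints"))
        {
            for (Path hint : hints)
            {
                Duration age = Duration.between(Files.getLastModifiedTime(hint).toInstant(), Instant.now());
                if (age.compareTo(maxAge) > 0)
                    System.out.printf("stale hint file %s: %d minutes old, %d bytes%n",
                                      hint.getFileName(), age.toMinutes(), Files.size(hint));
            }
        }
    }
}
{code}
Anything that keeps showing up here for hours after the target node is back up would suggest the dispatcher never resumed, which matches what I saw.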
> Evaluate 200 node, compression=on, encryption=all
> -------------------------------------------------
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, ShortbufferExceptions.png,
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg,
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png,
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg,
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png,
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup is at the following link (reproduced below):
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |