[jira] [Created] (CASSANDRA-15311) Fix flakey test_13595 - consistency_test.TestConsistency

2019-09-06 Thread Joseph Lynch (Jira)
Joseph Lynch created CASSANDRA-15311:


 Summary: Fix flakey  test_13595 - consistency_test.TestConsistency
 Key: CASSANDRA-15311
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15311
 Project: Cassandra
  Issue Type: Bug
  Components: Test/dtest
Reporter: Joseph Lynch


Example failure: 
[https://circleci.com/gh/jolynch/cassandra/559#tests/containers/29]
{noformat}
Your job ran 1007 tests with 1 failure
test_13595 - consistency_test.TestConsistency
consistency_test.py
AssertionError: assert 9 == 4
  +  where 4 = >('org.apache.cassandra.metrics:type=Table,name=ShortReadProtectionRequests,keyspace=test,scope=test', 'Count')
  +    where > = .read_attribute
self = 

    @since('3.0')
    def test_13595(self):
        """
        @jira_ticket CASSANDRA-13595
        """
        cluster = self.cluster

        # disable hinted handoff and set batch commit log so this doesn't interfere with the test
        cluster.set_configuration_options(values={'hinted_handoff_enabled': False})
        cluster.set_batch_commitlog(enabled=True)

        cluster.populate(2)
        node1, node2 = cluster.nodelist()
        remove_perf_disable_shared_mem(node1)  # necessary for jmx
        cluster.start(wait_other_notice=True)

        session = self.patient_cql_connection(node1)

        query = "CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 2};"
        session.execute(query)

        query = 'CREATE TABLE IF NOT EXISTS test.test (id int PRIMARY KEY);'
        session.execute(query)

        # populate the table with 10 partitions,
        # then delete a bunch of them on different nodes
        # until we get the following pattern:

        #                token | k | 1 | 2 |
        # -7509452495886106294 | 5 | n | y |
        # -4069959284402364209 | 1 | y | n |
        # -3799847372828181882 | 8 | n | y |
        # -3485513579396041028 | 0 | y | n |
        # -3248873570005575792 | 2 | n | y |
        # -2729420104000364805 | 4 | y | n |
        #  1634052884888577606 | 7 | n | y |
        #  2705480034054113608 | 6 | y | n |
        #  3728482343045213994 | 9 | n | y |
        #  9010454139840013625 | 3 | y | y |

        stmt = session.prepare('INSERT INTO test.test (id) VALUES (?);')
        for id in range(0, 10):
            session.execute(stmt, [id], ConsistencyLevel.ALL)

        # delete every other partition on node1 while node2 is down
        node2.stop(wait_other_notice=True)
        session.execute('DELETE FROM test.test WHERE id IN (5, 8, 2, 7, 9);')
        node2.start(wait_other_notice=True, wait_for_binary_proto=True)

        session = self.patient_cql_connection(node2)

        # delete every other alternate partition on node2 while node1 is down
        node1.stop(wait_other_notice=True)
        session.execute('DELETE FROM test.test WHERE id IN (1, 0, 4, 6);')
        node1.start(wait_other_notice=True, wait_for_binary_proto=True)

        session = self.patient_exclusive_cql_connection(node1)

        # until #13595 the query would incorrectly return [1]
        assert_all(session,
                   'SELECT id FROM test.test LIMIT 1;',
                   [[3]],
                   cl=ConsistencyLevel.ALL)

        srp = make_mbean('metrics', type='Table', name='ShortReadProtectionRequests', keyspace='test', scope='test')
        with JolokiaAgent(node1) as jmx:
            # 4 srp requests for node1 and 5 for node2, total of 9
>           assert 9 == jmx.read_attribute(srp, 'Count')
E   AssertionError: assert 9 == 4
E+  where 4 = >('org.apache.cassandra.metrics:type=Table,name=ShortReadProtectionRequests,keyspace=test,scope=test',
 'Count')
E+where > = 
.read_attribute

consistency_test.py:1288: AssertionError {noformat}
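
For readers following the partition table in the test's comments, here is a minimal sketch that recomputes the expected data pattern in plain Python. It models only which keys each replica still holds; it does not talk to Cassandra and does not model how short read protection requests are counted. The names TOKEN_ORDER, live_on and merged_live are hypothetical, and the merge rule (a tombstone on either replica hides the key at ConsistencyLevel.ALL) is a simplifying assumption that holds here because every delete was issued after every insert.

```python
# Partition keys in token order, copied from the table in the test's comments.
TOKEN_ORDER = [5, 1, 8, 0, 2, 4, 7, 6, 9, 3]

# Keys deleted on each node while the other was down, copied from the two DELETE statements.
DELETED_ON_NODE1 = {5, 8, 2, 7, 9}
DELETED_ON_NODE2 = {1, 0, 4, 6}


def live_on(deleted):
    """Keys a single replica still holds as live, in token order."""
    return [k for k in TOKEN_ORDER if k not in deleted]


def merged_live():
    """Keys that are live after merging both replicas (assumed: any tombstone wins)."""
    hidden = DELETED_ON_NODE1 | DELETED_ON_NODE2
    return [k for k in TOKEN_ORDER if k not in hidden]


if __name__ == "__main__":
    print("node1 live:", live_on(DELETED_ON_NODE1))
    print("node2 live:", live_on(DELETED_ON_NODE2))
    print("LIMIT 1 after merge:", merged_live()[:1])
```

Under these assumptions the only key live on both replicas is 3, which matches the [[3]] the test expects, and node1's own first live key is 1, which matches the pre-13595 result mentioned in the test comment.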



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15311) Fix flakey test_13595 - consistency_test.TestConsistency

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15311:
-
Fix Version/s: 4.0-alpha

> Fix flakey  test_13595 - consistency_test.TestConsistency
> -
>
> Key: CASSANDRA-15311
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15311
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> Example failure: 
> [https://circleci.com/gh/jolynch/cassandra/559#tests/containers/29]
> {noformat}
> Your job ran 1007 tests with 1 failure
> test_13595 - 
> consistency_test.TestConsistencyconsistency_test.pyAssertionError: assert 9 
> == 4  +  where 4 =   0x7f9f0775b160>>('org.apache.cassandra.metrics:type=Table,name=ShortReadProtectionRequests,keyspace=test,scope=test',
>  'Count')  +where  > = 
> .read_attribute
> self = 
> @since('3.0')
> def test_13595(self):
> """
> @jira_ticket CASSANDRA-13595
> """
> cluster = self.cluster
> 
> # disable hinted handoff and set batch commit log so this doesn't 
> interfere with the test
> cluster.set_configuration_options(values={'hinted_handoff_enabled': 
> False})
> cluster.set_batch_commitlog(enabled=True)
> 
> cluster.populate(2)
> node1, node2 = cluster.nodelist()
> remove_perf_disable_shared_mem(node1)  # necessary for jmx
> cluster.start(wait_other_notice=True)
> 
> session = self.patient_cql_connection(node1)
> 
> query = "CREATE KEYSPACE IF NOT EXISTS test WITH replication = 
> {'class': 'NetworkTopologyStrategy', 'datacenter1': 2};"
> session.execute(query)
> 
> query = 'CREATE TABLE IF NOT EXISTS test.test (id int PRIMARY KEY);'
> session.execute(query)
> 
> # populate the table with 10 partitions,
> # then delete a bunch of them on different nodes
> # until we get the following pattern:
> 
> #token | k | 1 | 2 |
> # -7509452495886106294 | 5 | n | y |
> # -4069959284402364209 | 1 | y | n |
> # -3799847372828181882 | 8 | n | y |
> # -3485513579396041028 | 0 | y | n |
> # -3248873570005575792 | 2 | n | y |
> # -2729420104000364805 | 4 | y | n |
> #  1634052884888577606 | 7 | n | y |
> #  2705480034054113608 | 6 | y | n |
> #  3728482343045213994 | 9 | n | y |
> #  9010454139840013625 | 3 | y | y |
> 
> stmt = session.prepare('INSERT INTO test.test (id) VALUES (?);')
> for id in range(0, 10):
> session.execute(stmt, [id], ConsistencyLevel.ALL)
> 
> # delete every other partition on node1 while node2 is down
> node2.stop(wait_other_notice=True)
> session.execute('DELETE FROM test.test WHERE id IN (5, 8, 2, 7, 9);')
> node2.start(wait_other_notice=True, wait_for_binary_proto=True)
> 
> session = self.patient_cql_connection(node2)
> 
> # delete every other alternate partition on node2 while node1 is down
> node1.stop(wait_other_notice=True)
> session.execute('DELETE FROM test.test WHERE id IN (1, 0, 4, 6);')
> node1.start(wait_other_notice=True, wait_for_binary_proto=True)
> 
> session = self.patient_exclusive_cql_connection(node1)
> 
> # until #13595 the query would incorrectly return [1]
> assert_all(session,
>'SELECT id FROM test.test LIMIT 1;',
>[[3]],
>cl=ConsistencyLevel.ALL)
> 
> srp = make_mbean('metrics', type='Table', 
> name='ShortReadProtectionRequests', keyspace='test', scope='test')
> with JolokiaAgent(node1) as jmx:
> # 4 srp requests for node1 and 5 for node2, total of 9
> >   assert 9 == jmx.read_attribute(srp, 'Count')
> E   AssertionError: assert 9 == 4
> E+  where 4 =   0x7f9f0775b160>>('org.apache.cassandra.metrics:type=Table,name=ShortReadProtectionRequests,keyspace=test,scope=test',
>  'Count')
> E+where  > = 
> .read_attribute
> consistency_test.py:1288: AssertionError {noformat}






[jira] [Updated] (CASSANDRA-15310) Fix flakey - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15310:
-
Platform: All,Java11  (was: All)

> Fix flakey - testIdleDisconnect - 
> org.apache.cassandra.transport.IdleDisconnectTest
> ---
>
> Key: CASSANDRA-15310
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15310
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> Example run: 
> [https://circleci.com/gh/jolynch/cassandra/561#tests/containers/86]
>  
> {noformat}
> Your job ran 4428 tests with 1 failure
> - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
> junit.framework.AssertionFailedError
>   at org.apache.cassandra.transport.IdleDisconnectTest.testIdleDisconnect(IdleDisconnectTest.java:56)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  {noformat}






[jira] [Updated] (CASSANDRA-15310) Fix flakey - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15310:
-
Fix Version/s: 4.0-alpha

> Fix flakey - testIdleDisconnect - 
> org.apache.cassandra.transport.IdleDisconnectTest
> ---
>
> Key: CASSANDRA-15310
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15310
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> Example run: 
> [https://circleci.com/gh/jolynch/cassandra/561#tests/containers/86]
>  
> {noformat}
> Your job ran 4428 tests with 1 failure
> - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
> junit.framework.AssertionFailedError
>   at org.apache.cassandra.transport.IdleDisconnectTest.testIdleDisconnect(IdleDisconnectTest.java:56)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  {noformat}






[jira] [Created] (CASSANDRA-15310) Fix flakey - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest

2019-09-06 Thread Joseph Lynch (Jira)
Joseph Lynch created CASSANDRA-15310:


 Summary: Fix flakey - testIdleDisconnect - 
org.apache.cassandra.transport.IdleDisconnectTest
 Key: CASSANDRA-15310
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15310
 Project: Cassandra
  Issue Type: Bug
  Components: Test/unit
Reporter: Joseph Lynch


Example run: [https://circleci.com/gh/jolynch/cassandra/561#tests/containers/86]

 
{noformat}
Your job ran 4428 tests with 1 failure
- testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
junit.framework.AssertionFailedError
    at org.apache.cassandra.transport.IdleDisconnectTest.testIdleDisconnect(IdleDisconnectTest.java:56)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 {noformat}






[jira] [Updated] (CASSANDRA-15309) Make the upgrade tests run on trunk

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15309:
-
Fix Version/s: 4.0-alpha

> Make the upgrade tests run on trunk
> ---
>
> Key: CASSANDRA-15309
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15309
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> The upgrade tests (j8_upgradetests-no-vnodes CircleCI target) do not appear to 
> work on trunk right now. The failures look like a JAVA_HOME issue: the trace 
> below ends in KeyError: 'JAVA8_HOME'. Example run: https://circleci.com/gh/jolynch/cassandra/553
> {noformat}
>  Your job ran 4412 tests with 3923 failures
> - test_IN_clause_on_last_key - 
> upgrade_tests.cql_tests.TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_2_1_xupgrade_tests/cql_tests.pymajor_version_int
>  = 8
> def switch_jdks(major_version_int):
> """
> Changes the jdk version globally, by setting JAVA_HOME = JAVA[N]_HOME.
> This means the environment must have JAVA[N]_HOME set to switch to 
> jdk version N.
> """
> new_java_home = 'JAVA{}_HOME'.format(major_version_int)
> 
> try:
> >   os.environ[new_java_home]
> upgrade_tests/upgrade_base.py:25: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = environ({'PYTHONUNBUFFERED': 'true', 'DEFAULT_DIR': 
> '/home/cassandra/cassandra-dtest', 'CIRCLE_NODE_INDEX': '47', 
> 'CIR...ade_tests/cql_tests.py::TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_2_1_x::()::test_IN_clause_on_last_key
>  (call)'})
> key = 'JAVA8_HOME'
> def __getitem__(self, key):
> try:
> value = self._data[self.encodekey(key)]
> except KeyError:
> # raise KeyError with the original key value
> >   raise KeyError(key) from None
> E   KeyError: 'JAVA8_HOME'{noformat}






[jira] [Created] (CASSANDRA-15309) Make the upgrade tests run on trunk

2019-09-06 Thread Joseph Lynch (Jira)
Joseph Lynch created CASSANDRA-15309:


 Summary: Make the upgrade tests run on trunk
 Key: CASSANDRA-15309
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15309
 Project: Cassandra
  Issue Type: Bug
  Components: Test/dtest
Reporter: Joseph Lynch


The upgrade tests (j8_upgradetests-no-vnodes CircleCI target) do not appear to 
work on trunk right now. The failures look like a JAVA_HOME issue: the trace 
below ends in KeyError: 'JAVA8_HOME'. Example run: https://circleci.com/gh/jolynch/cassandra/553
{noformat}
 Your job ran 4412 tests with 3923 failures

- test_IN_clause_on_last_key - upgrade_tests.cql_tests.TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_2_1_x
upgrade_tests/cql_tests.py

major_version_int = 8

    def switch_jdks(major_version_int):
        """
        Changes the jdk version globally, by setting JAVA_HOME = JAVA[N]_HOME.
        This means the environment must have JAVA[N]_HOME set to switch to jdk version N.
        """
        new_java_home = 'JAVA{}_HOME'.format(major_version_int)

        try:
>           os.environ[new_java_home]

upgrade_tests/upgrade_base.py:25: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = environ({'PYTHONUNBUFFERED': 'true', 'DEFAULT_DIR': 
'/home/cassandra/cassandra-dtest', 'CIRCLE_NODE_INDEX': '47', 
'CIR...ade_tests/cql_tests.py::TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_2_1_x::()::test_IN_clause_on_last_key
 (call)'})
key = 'JAVA8_HOME'

def __getitem__(self, key):
try:
value = self._data[self.encodekey(key)]
except KeyError:
# raise KeyError with the original key value
>   raise KeyError(key) from None
E   KeyError: 'JAVA8_HOME'{noformat}
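
One way to confirm the suspected cause before a full run is to check which JAVA[N]_HOME variables the environment actually defines. The sketch below is only an illustration: missing_java_homes is a hypothetical helper, not part of cassandra-dtest, and the list of versions to check is an assumption (8 comes from the trace above, 11 is included purely as an example).

```python
import os


def missing_java_homes(major_versions, environ=None):
    """Return the JAVA[N]_HOME variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    names = ['JAVA{}_HOME'.format(v) for v in major_versions]
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    # 8 is the major_version_int from the trace; 11 is only an illustration.
    missing = missing_java_homes([8, 11])
    if missing:
        print("unset:", ", ".join(missing))
    else:
        print("all requested JAVA[N]_HOME variables are set")
```

Running this inside the CI container would show directly whether JAVA8_HOME is absent, which is what the KeyError suggests.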






[jira] [Updated] (CASSANDRA-15308) Fix flakey testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15308:
-
Fix Version/s: 4.0-alpha

> Fix flakey testAcquireReleaseOutbound - 
> org.apache.cassandra.net.ConnectionTest
> ---
>
> Key: CASSANDRA-15308
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15308
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> Example failure: 
> https://circleci.com/gh/jolynch/cassandra/554#tests/containers/61
> {noformat}
> Your job ran 4428 tests with 1 failure
> - testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest
> junit.framework.AssertionFailedError
>   at org.apache.cassandra.net.ConnectionTest.lambda$testAcquireReleaseOutbound$53(ConnectionTest.java:770)
>   at org.apache.cassandra.net.ConnectionTest.lambda$doTest$8(ConnectionTest.java:238)
>   at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
>   at org.apache.cassandra.net.ConnectionTest.doTest(ConnectionTest.java:236)
>   at org.apache.cassandra.net.ConnectionTest.test(ConnectionTest.java:225)
>   at org.apache.cassandra.net.ConnectionTest.testAcquireReleaseOutbound(ConnectionTest.java:767)
>  {noformat}






[jira] [Created] (CASSANDRA-15308) Fix flakey testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest

2019-09-06 Thread Joseph Lynch (Jira)
Joseph Lynch created CASSANDRA-15308:


 Summary: Fix flakey testAcquireReleaseOutbound - 
org.apache.cassandra.net.ConnectionTest
 Key: CASSANDRA-15308
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15308
 Project: Cassandra
  Issue Type: Bug
  Components: Test/unit
Reporter: Joseph Lynch


Example failure: 
https://circleci.com/gh/jolynch/cassandra/554#tests/containers/61
{noformat}
Your job ran 4428 tests with 1 failure
- testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest

junit.framework.AssertionFailedError
    at org.apache.cassandra.net.ConnectionTest.lambda$testAcquireReleaseOutbound$53(ConnectionTest.java:770)
    at org.apache.cassandra.net.ConnectionTest.lambda$doTest$8(ConnectionTest.java:238)
    at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
    at org.apache.cassandra.net.ConnectionTest.doTest(ConnectionTest.java:236)
    at org.apache.cassandra.net.ConnectionTest.test(ConnectionTest.java:225)
    at org.apache.cassandra.net.ConnectionTest.testAcquireReleaseOutbound(ConnectionTest.java:767)
 {noformat}






[jira] [Created] (CASSANDRA-15307) Fix flakey test_remote_query - cql_test.TestCQLSlowQuery test

2019-09-06 Thread Joseph Lynch (Jira)
Joseph Lynch created CASSANDRA-15307:


 Summary: Fix flakey  test_remote_query - cql_test.TestCQLSlowQuery 
test
 Key: CASSANDRA-15307
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15307
 Project: Cassandra
  Issue Type: Bug
  Components: Test/dtest
Reporter: Joseph Lynch


Example failure: 
[https://circleci.com/gh/jolynch/cassandra/554#tests/containers/61]

 
{noformat}
Your job ran 959 tests with 1 failure
- test_remote_query cql_test.TestCQLSlowQuery
cql_test.py

ccmlib.node.TimeoutError: 05 Sep 2019 23:05:07 [node2] Missing: ['operations were slow', 'SELECT \\* FROM ks.test2 WHERE id = 1']: DEBUG [BatchlogTasks:1] 2019-09-05 23:04:24,437 Ba. See debug.log for remainder
self = 

    def test_remote_query(self):
        """
        Check that a query running on a node other than the coordinator is reported as slow:

        - populate the cluster with 2 nodes
        - start one node without having it join the ring
        - start the other one node with slow_query_log_timeout_in_ms set to a small value
          and the read request timeouts set to a large value (to ensure the query is not aborted) and
          read_iteration_delay set to a value big enough for the query to exceed slow_query_log_timeout_in_ms
          (this will cause read queries to take longer than the slow query timeout)
        - CREATE a table
        - INSERT 5000 rows on a session on the node that is not a member of the ring
        - run SELECT statements and check that the slow query messages are present in the debug logs
          (we cannot check the logs at info level because the no spam logger has unpredictable results)

        @jira_ticket CASSANDRA-12403
        """
        cluster = self.cluster

        cluster.set_configuration_options(values={'slow_query_log_timeout_in_ms': 10,
                                                  'request_timeout_in_ms': 12,
                                                  'read_request_timeout_in_ms': 12,
                                                  'range_request_timeout_in_ms': 12})

        cluster.populate(2)
        node1, node2 = cluster.nodelist()

        node1.start(wait_for_binary_proto=True, join_ring=False)  # ensure other node executes queries
        node2.start(wait_for_binary_proto=True,
                    jvm_args=["-Dcassandra.monitoring_report_interval_ms=10",
                              "-Dcassandra.test.read_iteration_delay_ms=1"])  # see above for explanation

        session = self.patient_exclusive_cql_connection(node1)

        create_ks(session, 'ks', 1)
        session.execute("""
            CREATE TABLE test2 (
                id int,
                col int,
                val text,
                PRIMARY KEY(id, col)
            );
        """)

        for i, j in itertools.product(list(range(100)), list(range(10))):
            session.execute("INSERT INTO test2 (id, col, val) VALUES ({}, {}, 'foo')".format(i, j))

        # only check debug logs because at INFO level the no-spam logger has unpredictable results
        mark = node2.mark_log(filename='debug.log')
        session.execute(SimpleStatement("SELECT * from test2",
                                        consistency_level=ConsistencyLevel.ONE,
                                        retry_policy=FallthroughRetryPolicy()))
        node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2"],
                            from_mark=mark, filename='debug.log', timeout=60)


        mark = node2.mark_log(filename='debug.log')
        session.execute(SimpleStatement("SELECT * from test2 where id = 1",
                                        consistency_level=ConsistencyLevel.ONE,
                                        retry_policy=FallthroughRetryPolicy()))
        node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2 WHERE id = 1"],
>                           from_mark=mark, filename='debug.log', timeout=60)

cql_test.py:1150: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
exprs = ['operations were slow', 'SELECT \\* FROM ks.test2 WHERE id = 1']
from_mark = 166214, timeout = 60, process = None, verbose = False
filename = 'debug.log'

def watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, 
verbose=False, filename='system.log'):
"""
Watch the log until one or more (regular) expression are found.
This methods when all the expressions have been found or the method
timeouts (a TimeoutError is then raised). On successful completion,
a list of pair (line matched, match object) is returned.
"""
start = time.time()
tofind = [exprs] if 

[jira] [Updated] (CASSANDRA-15307) Fix flakey test_remote_query - cql_test.TestCQLSlowQuery test

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15307:
-
Fix Version/s: 4.0-alpha

> Fix flakey  test_remote_query - cql_test.TestCQLSlowQuery test
> --
>
> Key: CASSANDRA-15307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15307
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> Example failure: 
> [https://circleci.com/gh/jolynch/cassandra/554#tests/containers/61]
>  
> {noformat}
> Your job ran 959 tests with 1 failure
> - test_remote_query cql_test.TestCQLSlowQuerycql_test.py
> ccmlib.node.TimeoutError: 05 Sep 2019 23:05:07 [node2] Missing: ['operations 
> were slow', 'SELECT \\* FROM ks.test2 WHERE id = 1']: DEBUG [BatchlogTasks:1] 
> 2019-09-05 23:04:24,437 Ba. See debug.log for remainder
> self = 
> def test_remote_query(self):
> """
> Check that a query running on a node other than the coordinator 
> is reported as slow:
> 
> - populate the cluster with 2 nodes
> - start one node without having it join the ring
> - start the other one node with slow_query_log_timeout_in_ms set 
> to a small value
>   and the read request timeouts set to a large value (to ensure 
> the query is not aborted) and
>   read_iteration_delay set to a value big enough for the query to 
> exceed slow_query_log_timeout_in_ms
>   (this will cause read queries to take longer than the slow 
> query timeout)
> - CREATE a table
> - INSERT 5000 rows on a session on the node that is not a member 
> of the ring
> - run SELECT statements and check that the slow query messages 
> are present in the debug logs
>   (we cannot check the logs at info level because the no spam 
> logger has unpredictable results)
> 
> @jira_ticket CASSANDRA-12403
> """
> cluster = self.cluster
> 
> cluster.set_configuration_options(values={'slow_query_log_timeout_in_ms': 10,
>   'request_timeout_in_ms': 
> 12,
>   
> 'read_request_timeout_in_ms': 12,
>   
> 'range_request_timeout_in_ms': 12})
> 
> cluster.populate(2)
> node1, node2 = cluster.nodelist()
> 
> node1.start(wait_for_binary_proto=True, join_ring=False)  # ensure 
> other node executes queries
> node2.start(wait_for_binary_proto=True,
> jvm_args=["-Dcassandra.monitoring_report_interval_ms=10",
>   "-Dcassandra.test.read_iteration_delay_ms=1"])  
> # see above for explanation
> 
> session = self.patient_exclusive_cql_connection(node1)
> 
> create_ks(session, 'ks', 1)
> session.execute("""
> CREATE TABLE test2 (
> id int,
> col int,
> val text,
> PRIMARY KEY(id, col)
> );
> """)
> 
> for i, j in itertools.product(list(range(100)), list(range(10))):
> session.execute("INSERT INTO test2 (id, col, val) VALUES ({}, {}, 
> 'foo')".format(i, j))
> 
> # only check debug logs because at INFO level the no-spam logger has 
> unpredictable results
> mark = node2.mark_log(filename='debug.log')
> session.execute(SimpleStatement("SELECT * from test2",
> 
> consistency_level=ConsistencyLevel.ONE,
> 
> retry_policy=FallthroughRetryPolicy()))
> node2.watch_log_for(["operations were slow", "SELECT \* FROM 
> ks.test2"],
> from_mark=mark, filename='debug.log', timeout=60)
> 
> 
> mark = node2.mark_log(filename='debug.log')
> session.execute(SimpleStatement("SELECT * from test2 where id = 1",
> 
> consistency_level=ConsistencyLevel.ONE,
> 
> retry_policy=FallthroughRetryPolicy()))
> node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2 
> WHERE id = 1"],
> >   from_mark=mark, filename='debug.log', timeout=60)
> cql_test.py:1150: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
> _ 
> self = 
> exprs = ['operations were slow', 'SELECT \\* FROM ks.test2 WHERE id = 1']
> from_mark = 166214, timeout = 60, process = None, verbose = False
> filename = 'debug.log'
> def watch_log_for(self, 

[jira] [Updated] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15306:
-
Description: 
While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the 
following in the logs
{noformat}
INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB 
{noformat}
This was with about 150 WPS against an LCS table containing 4 KiB partitions. It 
seemed that compaction proceeded just fine, but I don't remember seeing this in 
previous testing runs and I'd like to make sure it's not a bug (otherwise we 
may want to reduce the logging). 

  was:
While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the 
following in the logs
{noformat}
INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB 
{noformat}
This was with about 150 WPS against a LCS table containing 4kib data. It seemed 
that compaction proceeded just fine but I don't remember seeing this in 
previous testing runs and I'd like to make sure it's not a bug (otherwise we 
may want to reduce the logging). 


> Investigate why we are allocating 8MiB chunks and reaching the maximum 
> BufferPool size
> --
>
> Key: CASSANDRA-15306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15306
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Logging, Test/benchmark
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-beta
>
>
> While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the 
> following in the logs
> {noformat}
> INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB 
> {noformat}
> This was with about 150 WPS against an LCS table containing 4 KiB partitions. 
> It seemed that compaction proceeded just fine, but I don't remember seeing 
> this in previous testing runs and I'd like to make sure it's not a bug 
> (otherwise we may want to reduce the logging). 
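
As a quick sanity check on the numbers in the quoted log lines, the following sketch computes the spacing between the four messages and how many 8 MiB chunks fit under the 512 MiB limit. It only does arithmetic on values copied from the log; the names STAMPS, POOL_LIMIT_MIB, CHUNK_MIB and gaps_in_minutes are hypothetical, and nothing here inspects BufferPool or NoSpamLogger themselves.

```python
from datetime import datetime

# Timestamps copied from the four log lines in the description.
STAMPS = [
    "2019-09-06 11:40:31,419",
    "2019-09-06 11:55:31,419",
    "2019-09-06 12:10:31,419",
    "2019-09-06 12:25:31,421",
]

POOL_LIMIT_MIB = 512  # from "Maximum memory usage reached (512.000MiB)"
CHUNK_MIB = 8         # from "cannot allocate chunk of 8.000MiB"


def gaps_in_minutes(stamps):
    """Minutes between consecutive log timestamps, rounded to two decimals."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S,%f") for s in stamps]
    return [round((b - a).total_seconds() / 60, 2) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print("gaps (min):", gaps_in_minutes(STAMPS))
    print("chunks under the limit:", POOL_LIMIT_MIB // CHUNK_MIB)
```

The evenly spaced gaps suggest, but do not prove, that the message is being rate-limited rather than emitted once per failed allocation, so the log alone may understate how often the limit is hit.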






[jira] [Updated] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15306:
-
Component/s: Observability/Logging

> Investigate why we are allocating 8MiB chunks and reaching the maximum 
> BufferPool size
> --
>
> Key: CASSANDRA-15306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15306
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Logging, Test/benchmark
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-beta
>
>
> While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the 
> following in the logs
> {noformat}
> INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB 
> {noformat}
> This was with about 150 WPS against an LCS table containing 4 KiB data. It 
> seemed that compaction proceeded just fine but I don't remember seeing this 
> in previous testing runs and I'd like to make sure it's not a bug (otherwise 
> we may want to reduce the logging). 






[jira] [Updated] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15306:
-
Fix Version/s: 4.0-alpha

> Investigate why we are allocating 8MiB chunks and reaching the maximum 
> BufferPool size
> --
>
> Key: CASSANDRA-15306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15306
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the 
> following in the logs
> {noformat}
> INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB 
> {noformat}
> This was with about 150 WPS against an LCS table containing 4 KiB data. It 
> seemed that compaction proceeded just fine but I don't remember seeing this 
> in previous testing runs and I'd like to make sure it's not a bug (otherwise 
> we may want to reduce the logging). 






[jira] [Updated] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size

2019-09-06 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15306:
-
Fix Version/s: (was: 4.0-alpha)
   4.0-beta

> Investigate why we are allocating 8MiB chunks and reaching the maximum 
> BufferPool size
> --
>
> Key: CASSANDRA-15306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15306
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-beta
>
>
> While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the 
> following in the logs
> {noformat}
> INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - 
> Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB 
> {noformat}
> This was with about 150 WPS against an LCS table containing 4 KiB data. It 
> seemed that compaction proceeded just fine but I don't remember seeing this 
> in previous testing runs and I'd like to make sure it's not a bug (otherwise 
> we may want to reduce the logging). 






[jira] [Created] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size

2019-09-06 Thread Joseph Lynch (Jira)
Joseph Lynch created CASSANDRA-15306:


 Summary: Investigate why we are allocating 8MiB chunks and 
reaching the maximum BufferPool size
 Key: CASSANDRA-15306
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15306
 Project: Cassandra
  Issue Type: Bug
  Components: Test/benchmark
Reporter: Joseph Lynch


While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the 
following in the logs
{noformat}
INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB

INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - 
Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB 
{noformat}
This was with about 150 WPS against an LCS table containing 4 KiB data. It seemed 
that compaction proceeded just fine but I don't remember seeing this in 
previous testing runs and I'd like to make sure it's not a bug (otherwise we 
may want to reduce the logging). 






[jira] [Commented] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)

2019-08-29 Thread Joseph Lynch (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918975#comment-16918975
 ] 

Joseph Lynch commented on CASSANDRA-13938:
--

I might have cycles to tackle this shortly; if someone else has cycles first, 
please take it.

> Default repair is broken, crashes other nodes participating in repair (in 
> trunk)
> 
>
> Key: CASSANDRA-13938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13938
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Nate McCall
>Assignee: Jason Brown
>Priority: Urgent
> Fix For: 4.0-alpha
>
> Attachments: 13938.yaml, test.sh
>
>
> Running through a simple scenario to test some of the new repair features, I 
> was not able to make a repair command work. Further, the exception seemed to 
> trigger a nasty failure state that basically shuts down the netty connections 
> for messaging *and* CQL on the nodes transferring back data to the node being 
> repaired. The following steps reproduce this issue consistently.
> Cassandra stress profile (probably not necessary, but this one provides a 
> really simple schema and consistent data shape):
> {noformat}
> keyspace: standard_long
> keyspace_definition: |
>   CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', 
> 'replication_factor':3};
> table: test_data
> table_definition: |
>   CREATE TABLE test_data (
>   key text,
>   ts bigint,
>   val text,
>   PRIMARY KEY (key, ts)
>   ) WITH COMPACT STORAGE AND
>   CLUSTERING ORDER BY (ts DESC) AND
>   bloom_filter_fp_chance=0.01 AND
>   caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND
>   comment='' AND
>   dclocal_read_repair_chance=0.00 AND
>   gc_grace_seconds=864000 AND
>   read_repair_chance=0.00 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> columnspec:
>   - name: key
>     population: uniform(1..5000) # 50 million records available
>   - name: ts
>     cluster: gaussian(1..50) # Up to 50 inserts per record
>   - name: val
>     population: gaussian(128..1024) # varying size of value data
> insert:
>   partitions: fixed(1) # only one insert per batch for individual partitions
>   select: fixed(1)/1 # each insert comes in one at a time
>   batchtype: UNLOGGED
> queries:
>   single:
>     cql: select * from test_data where key = ? and ts = ? limit 1;
>   series:
>     cql: select key,ts,val from test_data where key = ? limit 10;
> {noformat}
> The commands to build and run:
> {noformat}
> ccm create 4_0_test -v git:trunk -n 3 -s
> ccm stress user profile=./histo-test-schema.yml ops\(insert=20,single=1,series=1\) duration=15s -rate threads=4
> # flush the memtable just to get everything on disk
> ccm node1 nodetool flush
> ccm node2 nodetool flush
> ccm node3 nodetool flush
> # disable hints for nodes 2 and 3
> ccm node2 nodetool disablehandoff
> ccm node3 nodetool disablehandoff
> # stop node1
> ccm node1 stop
> ccm stress user profile=./histo-test-schema.yml ops\(insert=20,single=1,series=1\) duration=45s -rate threads=4
> # wait 10 seconds
> ccm node1 start
> # Note that we are local to ccm's nodetool install 'cause repair preview is not reported yet
> node1/bin/nodetool repair --preview
> node1/bin/nodetool repair standard_long test_data
> {noformat} 
> The error outputs from the last repair command follow. First, this is stdout 
> from node1:
> {noformat}
> $ node1/bin/nodetool repair standard_long test_data
> objc[47876]: Class JavaLaunchHelper is implemented in both 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java 
> (0x10274d4c0) and 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib
>  (0x1047b64e0). One of the two will be used. Which one is undefined.
> [2017-10-05 14:31:52,425] Starting repair command #4 
> (7e1a9150-a98e-11e7-ad86-cbd2801b8de2), repairing keyspace standard_long with 
> repair options (parallelism: parallel, primary range: false, incremental: 
> true, job threads: 1, ColumnFamilies: [test_data], dataCenters: [], hosts: 
> [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair: 
> false)
> [2017-10-05 14:32:07,045] Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2 
> for range [(3074457345618258602,-9223372036854775808], 
> (-9223372036854775808,-3074457345618258603], 
> (-3074457345618258603,3074457345618258602]] failed with error Stream failed
> [2017-10-05 14:32:07,048] null
> [2017-10-05 14:32:07,050] Repair command #4 finished in 14 seconds
> error: Repair job has failed with the error message: [2017-10-05 
> 14:32:07,048] null
> -- StackTrace --
> java.lang.RuntimeException: Repair 

[jira] [Commented] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11

2019-08-29 Thread Joseph Lynch (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918959#comment-16918959
 ] 

Joseph Lynch commented on CASSANDRA-15262:
--

This could slip to 4.0-beta if it had to, but it is going to be annoying for 
folks testing with TLS (it was for us).

> server_encryption_options is not backwards compatible with 3.11
> ---
>
> Key: CASSANDRA-15262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Fix For: 4.0, 4.0-alpha
>
>
> The current `server_encryption_options` configuration options are as follows:
> {noformat}
> server_encryption_options:
> # set to true for allowing secure incoming connections
> enabled: false
> # If enabled and optional are both set to true, encrypted and unencrypted 
> connections are handled on the storage_port
> optional: false
> # if enabled, will open up an encrypted listening socket on 
> ssl_storage_port. Should be used
> # during upgrade to 4.0; otherwise, set to false.
> enable_legacy_ssl_storage_port: false
> # on outbound connections, determine which type of peers to securely 
> connect to. 'enabled' must be set to true.
> internode_encryption: none
> keystore: conf/.keystore
> keystore_password: cassandra
> truststore: conf/.truststore
> truststore_password: cassandra
> # More advanced defaults below:
> # protocol: TLS
> # store_type: JKS
> # cipher_suites: 
> [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
> # require_client_auth: false
> # require_endpoint_verification: false
> {noformat}
> A couple of issues here:
> 1. optional defaults to false, which will break existing TLS configurations 
> for (from what I can tell) no particularly good reason
> 2. The provided protocol and cipher suites are not good ideas (in particular, 
> encouraging anyone to use CBC ciphers is a bad plan)
> I propose that before the 4.0 cut we fix up server_encryption_options and even 
> client_encryption_options:
> # Change the default {{optional}} setting to true. As the new Netty code 
> intelligently decides to open a TLS connection or not this is the more 
> sensible default (saves operators a step while transitioning to TLS as well)
> # Update the defaults to what netty actually defaults to
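To make the CBC concern concrete, here is a small sketch (not from the ticket) that filters the commented-out cipher_suites default quoted above. The helper name without_cbc is hypothetical, and matching on the suite name is only a heuristic, not a full security review of each suite.

```python
# Cipher suites copied from the commented-out cipher_suites default quoted above.
DEFAULT_CIPHER_SUITES = [
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
]


def without_cbc(suites):
    """Return the suites whose name does not mark them as CBC-mode (name-based heuristic)."""
    return [suite for suite in suites if "_CBC_" not in suite]


# Every suggested suite is a CBC suite, so nothing survives the filter.
print(without_cbc(DEFAULT_CIPHER_SUITES))  # prints []
```

Since the filtered list is empty, an operator who uncomments the suggested list gets CBC suites only, which is the problem described in item 2.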






[jira] [Commented] (CASSANDRA-15294) Allow easy use of custom security providers

2019-08-29 Thread Joseph Lynch (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918955#comment-16918955
 ] 

Joseph Lynch commented on CASSANDRA-15294:
--

Yes, I think after the alpha cuts I should have cycles to add this in; since it 
doesn't involve any backwards-incompatible API changes, I can do it before beta. 
I'd like to add the configuration capability to 3.0/3.11/trunk if possible, but 
I think people might object to it being in 3.0. If no one objects, I'll just 
make patches for all three.

> Allow easy use of custom security providers
> ---
>
> Key: CASSANDRA-15294
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15294
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Joseph Lynch
>Priority: Normal
>
> As more users are switching to using {{AES-GCM}} TLS they are increasingly 
> running into extremely poor performance with the JDK implementations (e.g. 
> [JDK-8046943|https://bugs.openjdk.java.net/browse/JDK-8046943]). It's not 
> just TLS either, generally speaking Java crypto can be really slow, including 
> for example MD5 hashing which powers our digests (CASSANDRA-14611).
> There have been a few community attempts to fix this via custom Java 
> security providers, for example Google's 
> [conscrypt|https://github.com/google/conscrypt] and recently Amazon's 
> [ACCP|https://github.com/corretto/amazon-corretto-crypto-provider] which are 
> basically portions of OpenSSL/BoringSSL that are statically linked in and 
> exposed via JNI. These approaches are similar in spirit to what 
> [netty-tcnative|https://github.com/netty/netty-tcnative] is doing for TLS in 
> C* trunk.
> Since there may be tradeoffs to using various providers for various functions 
> (e.g. {{conscrypt}} may be faster or slower than {{accp}} in certain use 
> cases and in other cases you may want to use JDK providers for ease of 
> upgrading) it would be useful if Cassandra supported pluggable providers per 
> use case. For example we could use {{conscrypt}} for TLS, {{accp}} for MD5 
> digesting, and the {{SUN}} provider for everything else. There is a small 
> amount of JVM wiring that needs to be done for this and it could unlock 
> 10-25% CPU capacity improvements.
> We can then use this pluggability to test different providers, and if one is 
> strictly dominant we can just check that one into libs and default to it.
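The per-use-case selection described above could be modelled roughly as follows. This is a sketch of the idea only, not Cassandra's configuration or the JVM provider API: the use-case keys and the provider_for helper are hypothetical, and the provider names are the ones mentioned in the description.

```python
# Fallback when no override is configured; the description names the SUN provider for this.
DEFAULT_PROVIDER = "SUN"

# Hypothetical per-use-case overrides, following the example in the description.
PROVIDER_OVERRIDES = {
    "tls": "conscrypt",
    "md5_digest": "accp",
}


def provider_for(use_case, overrides=PROVIDER_OVERRIDES):
    """Return the provider configured for use_case, or the default if none is set."""
    return overrides.get(use_case, DEFAULT_PROVIDER)


print(provider_for("tls"))
print(provider_for("anything_else"))
```

Keeping the mapping as data rather than code is what would make it cheap to swap providers while benchmarking them against each other.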






[jira] [Updated] (CASSANDRA-15146) Transitional TLS server configuration options are overly complex

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15146:
-
Fix Version/s: 4.0-beta

> Transitional TLS server configuration options are overly complex
> 
>
> Key: CASSANDRA-15146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption, Local/Config
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> It appears as part of the port from transitional client TLS to transitional 
> server TLS in CASSANDRA-10404 (the ability to switch a cluster to using 
> {{internode_encryption}} without listening on two ports and without downtime) 
> we carried the {{enabled}} setting over from the client implementation. I 
> believe that the {{enabled}} option is redundant to {{internode_encryption}} 
> and {{optional}} and it should therefore be removed prior to the 4.0 release 
> where we will have to start respecting that interface. 
> Current trunk yaml:
> {noformat}
> server_encryption_options:
>     # set to true for allowing secure incoming connections
>     enabled: false
>     # If enabled and optional are both set to true, encrypted and unencrypted connections are handled on the storage_port
>     optional: false
>     # if enabled, will open up an encrypted listening socket on ssl_storage_port. Should be used
>     # during upgrade to 4.0; otherwise, set to false.
>     enable_legacy_ssl_storage_port: false
>     # on outbound connections, determine which type of peers to securely connect to. 'enabled' must be set to true.
>     internode_encryption: none
>     keystore: conf/.keystore
>     keystore_password: cassandra
>     truststore: conf/.truststore
>     truststore_password: cassandra
> {noformat}
> I propose we eliminate {{enabled}} and just use {{optional}} and 
> {{internode_encryption}} to determine the listener setup. I also propose we 
> change the default of {{optional}} to true. We could also re-name 
> {{optional}} since it's a new option but I think it's good to stay consistent 
> with the client and use {{optional}}.
> ||optional||internode_encryption||description||
> |true|none|(default) No encryption is used but if a server reaches out with it we'll use it|
> |false|dc|Encryption is required for inter-dc communication, but not intra-dc|
> |false|all|Encryption is required for all communication|
> |false|none|We only listen for unencrypted connections|
> |true|dc|Encryption is used for inter-dc communication but is not required|
> |true|all|Encryption is used for all communication but is not required|
> From these states it is clear when we should be accepting TLS connections 
> (all except for false and none) as well as when we must enforce it.
> To transition without downtime from an un-encrypted cluster to an encrypted 
> cluster the user would do the following:
> 1. After adding valid truststores, change {{internode_encryption}} to the 
> desired level of encryption (recommended {{all}}) and restart Cassandra
>  2. Change {{optional=false}} and restart Cassandra to enforce #1
> If {{optional}} defaulted to {{false}} as it does right now we'd need a third 
> restart to first change {{optional}} to {{true}}, which given my 
> understanding of the OptionalSslHandler isn't really relevant.
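The state table above can be condensed into two rules. The sketch below is one reading of that table, not the actual listener code: ListenerPolicy and listener_policy are hypothetical names, and only the three internode_encryption values that appear in the table are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListenerPolicy:
    accept_tls: bool      # should the listener accept encrypted connections at all?
    require_tls_for: str  # "none", "dc" (inter-dc traffic only) or "all"


def listener_policy(optional: bool, internode_encryption: str) -> ListenerPolicy:
    """Derive the listener behaviour from the two proposed settings."""
    if internode_encryption not in ("none", "dc", "all"):
        raise ValueError(f"unsupported internode_encryption: {internode_encryption!r}")
    # TLS is accepted in every state except optional=false with internode_encryption=none.
    accept_tls = optional or internode_encryption != "none"
    # Encryption is only enforced when it is not optional.
    require_tls_for = "none" if optional else internode_encryption
    return ListenerPolicy(accept_tls, require_tls_for)


for optional in (True, False):
    for level in ("none", "dc", "all"):
        print(optional, level, listener_policy(optional, level))
```

Under this reading the two-restart transition is visible directly: setting internode_encryption to all with optional left at true turns encryption on without enforcing it, and flipping optional to false then enforces it.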






[jira] [Updated] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15262:
-
Fix Version/s: 4.0-alpha

> server_encryption_options is not backwards compatible with 3.11
> ---
>
> Key: CASSANDRA-15262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Fix For: 4.0, 4.0-alpha
>
>
> The current `server_encryption_options` configuration options are as follows:
> {noformat}
> server_encryption_options:
> # set to true for allowing secure incoming connections
> enabled: false
> # If enabled and optional are both set to true, encrypted and unencrypted 
> connections are handled on the storage_port
> optional: false
> # if enabled, will open up an encrypted listening socket on 
> ssl_storage_port. Should be used
> # during upgrade to 4.0; otherwise, set to false.
> enable_legacy_ssl_storage_port: false
> # on outbound connections, determine which type of peers to securely 
> connect to. 'enabled' must be set to true.
> internode_encryption: none
> keystore: conf/.keystore
> keystore_password: cassandra
> truststore: conf/.truststore
> truststore_password: cassandra
> # More advanced defaults below:
> # protocol: TLS
> # store_type: JKS
> # cipher_suites: 
> [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
> # require_client_auth: false
> # require_endpoint_verification: false
> {noformat}
> A couple of issues here:
> 1. optional defaults to false, which will break existing TLS configurations 
> for (from what I can tell) no particularly good reason
> 2. The provided protocol and cipher suites are not good ideas (in particular, 
> encouraging anyone to use CBC ciphers is a bad plan)
> I propose that before the 4.0 cut we fix up server_encryption_options and even 
> client_encryption_options:
> # Change the default {{optional}} setting to true. As the new Netty code 
> intelligently decides to open a TLS connection or not this is the more 
> sensible default (saves operators a step while transitioning to TLS as well)
> # Update the defaults to what netty actually defaults to






[jira] [Updated] (CASSANDRA-14764) Evaluate 12 Node Breaking Point, compression=none, encryption=none, coalescing=off

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-14764:
-
Fix Version/s: 4.0-beta

> Evaluate 12 Node Breaking Point, compression=none, encryption=none, 
> coalescing=off
> --
>
> Key: CASSANDRA-14764
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14764
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Legacy/Streaming and Messaging
>Reporter: Joseph Lynch
>Assignee: Vinay Chella
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: i-03341e1c52de6ea3e-after-queue-change.svg, 
> i-07cd92e844d66d801-after-queue-bound.svg, i-07cd92e844d66d801-hint-play.svg, 
> i-07cd92e844d66d801-uninlined-with-jvm-methods.svg, ttop.txt
>
>
> *Setup:*
>  * Cassandra: 12 (2*6) node i3.xlarge AWS instance (4 cpu cores, 30GB ram) 
> running cassandra trunk off of jasobrown/14503 jdd7ec5a2 (Jason's patched 
> internode messaging branch) vs the same footprint running 3.0.17
>  * Two datacenters with 100ms latency between them
>  * No compression, encryption, or coalescing turned on
> *Test #1:*
> ndbench sent 1.5k QPS at a coordinator level to one datacenter (RF=3*2 = 6 so 
> 3k global replica QPS) of 4kb single partition BATCH mutations at LOCAL_ONE. 
> This represents about 250 QPS per coordinator in the first datacenter or 60 
> QPS per core. The goal was to observe P99 write and read latencies under 
> various QPS.
> *Result:*
> The good news is since the CASSANDRA-14503 changes, instead of keeping the 
> mutations on heap we put the message into hints instead and don't run out of 
> memory. The bad news is that the {{MessagingService-NettyOutbound-Thread}} threads 
> would occasionally enter a degraded state where they would just spin on a 
> core. I've attached flame graphs showing the CPU state as [~jasobrown] 
> applied fixes to the {{OutboundMessagingConnection}} class.
>  *Follow Ups:*
> [~jasobrown] has committed a number of fixes onto his 
> {{jasobrown/14503-collab}} branch including:
> 1. Limiting the amount of time spent dequeuing messages if they are expired 
> (previously if messages entered the queue faster than we could dequeue them 
> we'd just loop infinitely on the consumer side)
> 2. Not calling {{dequeueMessages}} from within callbacks created by 
> {{dequeueMessages}}.
> We're continuing to use CPU flamegraphs to figure out where we're looping and 
> fixing bugs as we find them.
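The first fix above, bounding the time spent discarding expired messages, can be illustrated with a small sketch. This is not the OutboundMessagingConnection code; it is a Python stand-in under assumed names, where messages are hypothetical (expires_at, payload) tuples and drain_expired is made up for this example.

```python
import time
from collections import deque


def drain_expired(queue, now, budget_seconds, clock=time.monotonic):
    """Drop expired messages from the head of queue, giving up after budget_seconds.

    Each entry is an (expires_at, payload) tuple. Without the deadline check, a
    producer that enqueues expired messages faster than they can be dropped would
    keep this loop running forever.
    """
    deadline = clock() + budget_seconds
    dropped = 0
    while queue and queue[0][0] <= now:
        if clock() >= deadline:
            break  # out of budget: leave the rest for the next pass
        queue.popleft()
        dropped += 1
    return dropped


pending = deque((0.0, f"msg-{i}") for i in range(5))  # all expired at now=1.0
print(drain_expired(pending, now=1.0, budget_seconds=0.05), len(pending))
```

The clock parameter exists only so the budget can be exercised deterministically in a test; a caller would normally leave it at the default.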






[jira] [Updated] (CASSANDRA-14747) Evaluate 200 node, compression=none, encryption=none, coalescing=off

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-14747:
-
Fix Version/s: 4.0-beta

> Evaluate 200 node, compression=none, encryption=none, coalescing=off 
> -
>
> Key: CASSANDRA-14747
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14747
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Legacy/Testing
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 3.0.17-QPS.png, 4.0.1-QPS.png, 
> 4.0.11-after-jolynch-tweaks.svg, 4.0.12-after-unconditional-flush.svg, 
> 4.0.15-after-sndbuf-fix.svg, 4.0.7-before-my-changes.svg, 
> 4.0_errors_showing_heap_pressure.txt, 
> 4.0_heap_histogram_showing_many_MessageOuts.txt, 
> i-0ed2acd2dfacab7c1-after-looping-fixes.svg, 
> trunk_14503_v2_cpuflamegraph.svg, trunk_vs_3.0.17_latency_under_load.png, 
> ttop_NettyOutbound-Thread_spinning.txt, 
> useast1c-i-0e1ddfe8b2f769060-mutation-flame.svg, 
> useast1e-i-08635fa1631601538_flamegraph_96node.svg, 
> useast1e-i-08635fa1631601538_ttop_netty_outbound_threads_96nodes, 
> useast1e-i-08635fa1631601538_uninlinedcpuflamegraph.0_96node_60sec_profile.svg
>
>
> Tracks evaluating a 200 node cluster with all internode settings off (no 
> compression, no encryption, no coalescing).






[jira] [Updated] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-14746:
-
Fix Version/s: 4.0-beta

> Ensure Netty Internode Messaging Refactor is Solid
> --
>
> Key: CASSANDRA-14746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
>  Labels: 4.0-QA
> Fix For: 4.0, 4.0-beta
>
>
> Before we release 4.0 let's ensure that the internode messaging refactor is 
> 100% solid. As internode messaging is naturally used in many code paths and 
> widely configurable we have a large number of cluster configurations and test 
> configurations that must be vetted.
> We plan to vary the following:
>  * Version of Cassandra 3.0.17 vs 4.0-alpha
>  * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes
>  * Client request rates varying between 1k QPS and 100k QPS of varying sizes 
> and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...)
>  * Internode compression
>  * Internode SSL (as well as openssl vs jdk)
>  * Internode Coalescing options
> We are looking to measure the following as appropriate:
>  * Latency distributions of reads and writes (lower is better)
>  * Scaling limit, aka maximum throughput before violating p99 latency 
> deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% 
> writes, 100% reads and 50-50 writes+reads (higher is better)
>  * Thread counts (lower is better)
>  * Context switches (lower is better)
>  * On-CPU time of tasks (longer periods without a context switch are better)
>  * GC allocation rates / throughput for a fixed size heap (lower allocation 
> is better)
>  * Streaming recovery time for a single node failure, i.e. can Cassandra 
> saturate the NIC
>  
> The goal is that 4.0 should have better latency, more throughput, fewer 
> threads, fewer context switches, less GC allocation, and faster recovery 
> time. I'm putting Jason Brown as the reviewer since he implemented most of 
> the internode refactor.
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey 
> Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown
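The "scaling limit" metric above (maximum throughput before the p99 latency deadline of 10ms is violated) can be computed along these lines. This is a minimal sketch, not the harness used for this QA effort: it assumes latency samples in milliseconds have already been collected per offered request rate, uses a nearest-rank p99, and the function names are hypothetical.

```python
def p99(samples_ms):
    """Nearest-rank 99th percentile of a non-empty list of latency samples."""
    ordered = sorted(samples_ms)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n) in integer arithmetic
    return ordered[rank - 1]


def scaling_limit(latencies_by_qps, deadline_ms=10.0):
    """Return the highest offered QPS reached before p99 first exceeds deadline_ms.

    latencies_by_qps maps an offered request rate to its latency samples.
    Returns None if even the lowest rate violates the deadline.
    """
    limit = None
    for qps in sorted(latencies_by_qps):
        if p99(latencies_by_qps[qps]) > deadline_ms:
            break
        limit = qps
    return limit
```

Stopping at the first violation, rather than taking the largest passing rate overall, keeps a lucky run at a higher rate from hiding an earlier breach of the deadline.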






[jira] [Updated] (CASSANDRA-15181) Ensure Nodes can Start and Stop

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15181:
-
Fix Version/s: 4.0-beta

> Ensure Nodes can Start and Stop
> ---
>
> Key: CASSANDRA-15181
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15181
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Legacy/Streaming and Messaging, Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Vinay Chella
>Priority: High
> Fix For: 4.0-beta
>
>
> Let's load a cluster up with data and start killing nodes. We can do hard 
> failures (node terminations) and soft failures (process kills). We plan to 
> observe the following:
> * Can nodes successfully bootstrap?
> * How long does it take to bootstrap?
> * What are the effects of TLS on and off (e.g. on stream time)?
> * Are hints properly played after a node restart?
> * Do nodes properly shut down and start back up?






[jira] [Updated] (CASSANDRA-14688) Update protocol spec and class level doc with protocol checksumming details

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-14688:
-
Fix Version/s: 4.0-beta

> Update protocol spec and class level doc with protocol checksumming details
> ---
>
> Key: CASSANDRA-14688
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14688
> Project: Cassandra
>  Issue Type: Task
>  Components: Legacy/Documentation and Website
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> CASSANDRA-13304 provides an option to add checksumming to the frame body of 
> native protocol messages. The native protocol spec needs to be updated to 
> reflect this ASAP. We should also verify that the javadoc comments describing 
> the on-wire format in 
> {{o.a.c.transport.frame.checksum.ChecksummingTransformer}} are up to date.






[jira] [Updated] (CASSANDRA-15228) Commit Log should not use sync markers

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15228:
-
Fix Version/s: 4.0-alpha

> Commit Log should not use sync markers
> --
>
> Key: CASSANDRA-15228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15228
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0, 4.0-alpha
>
>
> The sync markers existed to permit file re-use.  Since we no longer re-use 
> files, they no longer provide any value.  However, they _can_ corrupt the 
> commit log for replay in the event of a process crash.  Before we release 
> 4.0, we should ideally remove the sync markers entirely.






[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements

2019-08-29 Thread Joseph Lynch (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918949#comment-16918949
 ] 

Joseph Lynch commented on CASSANDRA-14801:
--

[~benedict] do you think this should block the first alpha, or can it wait for 
beta?

> calculatePendingRanges no longer safe for multiple adjacent range movements
> ---
>
> Key: CASSANDRA-14801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination, Legacy/Distributed Metadata
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0
>
>
> Correctness depended upon the narrowing to a {{Set}}, 
> which we no longer do - we maintain a collection of all {{Replica}}.  Our 
> {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result 
> contain the same endpoint multiple times; and our {{EndpointsForToken}} 
> obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, 
> resulting in cluster-wide failures for writes to the affected token ranges 
> for the duration of the range movement.






[jira] [Updated] (CASSANDRA-10190) Python 3 support for cqlsh

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-10190:
-
Fix Version/s: 4.0-alpha

> Python 3 support for cqlsh
> --
>
> Key: CASSANDRA-10190
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10190
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Tools
>Reporter: Andrew Pennebaker
>Assignee: Patrick Bannister
>Priority: Normal
>  Labels: cqlsh
> Fix For: 4.0-alpha
>
> Attachments: coverage_notes.txt
>
>
> Users who operate in a Python 3 environment may have trouble launching cqlsh. 
> Could we please update cqlsh's syntax to run in Python 3?
> As a workaround, users can set up pyenv, and cd to a directory with a 
> .python-version containing "2.7". But it would be nice if cqlsh supported 
> modern Python versions out of the box.






[jira] [Updated] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-13938:
-
Fix Version/s: (was: 4.0)

> Default repair is broken, crashes other nodes participating in repair (in 
> trunk)
> 
>
> Key: CASSANDRA-13938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13938
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Nate McCall
>Assignee: Jason Brown
>Priority: Urgent
> Attachments: 13938.yaml, test.sh
>
>
> Running through a simple scenario to test some of the new repair features, I 
> was not able to make a repair command work. Further, the exception seemed to 
> trigger a nasty failure state that basically shuts down the netty connections 
> for messaging *and* CQL on the nodes transferring back data to the node being 
> repaired. The following steps reproduce this issue consistently.
> Cassandra stress profile (probably not necessary, but this one provides a 
> really simple schema and consistent data shape):
> {noformat}
> keyspace: standard_long
> keyspace_definition: |
>   CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', 
> 'replication_factor':3};
> table: test_data
> table_definition: |
>   CREATE TABLE test_data (
>   key text,
>   ts bigint,
>   val text,
>   PRIMARY KEY (key, ts)
>   ) WITH COMPACT STORAGE AND
>   CLUSTERING ORDER BY (ts DESC) AND
>   bloom_filter_fp_chance=0.01 AND
>   caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND
>   comment='' AND
>   dclocal_read_repair_chance=0.00 AND
>   gc_grace_seconds=864000 AND
>   read_repair_chance=0.00 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'LZ4Compressor'};
> columnspec:
>   - name: key
>     population: uniform(1..5000) # 50 million records available
>   - name: ts
>     cluster: gaussian(1..50) # Up to 50 inserts per record
>   - name: val
>     population: gaussian(128..1024) # varying size of value data
> insert:
>   partitions: fixed(1) # only one insert per batch for individual partitions
>   select: fixed(1)/1 # each insert comes in one at a time
>   batchtype: UNLOGGED
> queries:
>   single:
>     cql: select * from test_data where key = ? and ts = ? limit 1;
>   series:
>     cql: select key,ts,val from test_data where key = ? limit 10;
> {noformat}
> The commands to build and run:
> {noformat}
> ccm create 4_0_test -v git:trunk -n 3 -s
> ccm stress user profile=./histo-test-schema.yml ops\(insert=20,single=1,series=1\) duration=15s -rate threads=4
> # flush the memtable just to get everything on disk
> ccm node1 nodetool flush
> ccm node2 nodetool flush
> ccm node3 nodetool flush
> # disable hints for nodes 2 and 3
> ccm node2 nodetool disablehandoff
> ccm node3 nodetool disablehandoff
> # stop node1
> ccm node1 stop
> ccm stress user profile=./histo-test-schema.yml ops\(insert=20,single=1,series=1\) duration=45s -rate threads=4
> # wait 10 seconds
> ccm node1 start
> # Note that we are local to ccm's nodetool install 'cause repair preview is not reported yet
> node1/bin/nodetool repair --preview
> node1/bin/nodetool repair standard_long test_data
> {noformat} 
> The error outputs from the last repair command follow. First, this is stdout 
> from node1:
> {noformat}
> $ node1/bin/nodetool repair standard_long test_data
> objc[47876]: Class JavaLaunchHelper is implemented in both 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java 
> (0x10274d4c0) and 
> /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib
>  (0x1047b64e0). One of the two will be used. Which one is undefined.
> [2017-10-05 14:31:52,425] Starting repair command #4 
> (7e1a9150-a98e-11e7-ad86-cbd2801b8de2), repairing keyspace standard_long with 
> repair options (parallelism: parallel, primary range: false, incremental: 
> true, job threads: 1, ColumnFamilies: [test_data], dataCenters: [], hosts: 
> [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair: 
> false)
> [2017-10-05 14:32:07,045] Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2 
> for range [(3074457345618258602,-9223372036854775808], 
> (-9223372036854775808,-3074457345618258603], 
> (-3074457345618258603,3074457345618258602]] failed with error Stream failed
> [2017-10-05 14:32:07,048] null
> [2017-10-05 14:32:07,050] Repair command #4 finished in 14 seconds
> error: Repair job has failed with the error message: [2017-10-05 
> 14:32:07,048] null
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error message: 
> [2017-10-05 14:32:07,048] null
> at 

[jira] [Updated] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-13938:
-
Fix Version/s: 4.0

> Default repair is broken, crashes other nodes participating in repair (in 
> trunk)
> 
>
> Key: CASSANDRA-13938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13938
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Nate McCall
>Assignee: Jason Brown
>Priority: Urgent
> Fix For: 4.0
>
> Attachments: 13938.yaml, test.sh
>
>

[jira] [Updated] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-13938:
-
Fix Version/s: (was: 4.x)
   4.0

> Default repair is broken, crashes other nodes participating in repair (in 
> trunk)
> 
>
> Key: CASSANDRA-13938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13938
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Nate McCall
>Assignee: Jason Brown
>Priority: Urgent
> Fix For: 4.0
>
> Attachments: 13938.yaml, test.sh
>
>

[jira] [Updated] (CASSANDRA-15146) Transitional TLS server configuration options are overly complex

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15146:
-
Fix Version/s: 4.0

> Transitional TLS server configuration options are overly complex
> 
>
> Key: CASSANDRA-15146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Encryption, Local/Config
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Fix For: 4.0
>
>
> It appears that, as part of the port from transitional client TLS to 
> transitional server TLS in CASSANDRA-10404 (the ability to switch a cluster to 
> using {{internode_encryption}} without listening on two ports and without 
> downtime), we carried the {{enabled}} setting over from the client 
> implementation. I believe that the {{enabled}} option is redundant with 
> {{internode_encryption}} and {{optional}} and it should therefore be removed 
> prior to the 4.0 release, where we will have to start respecting that interface. 
> Current trunk yaml:
> {noformat}
> server_encryption_options:
>     # set to true for allowing secure incoming connections
>     enabled: false
>     # If enabled and optional are both set to true, encrypted and unencrypted connections are handled on the storage_port
>     optional: false
>     # if enabled, will open up an encrypted listening socket on ssl_storage_port. Should be used
>     # during upgrade to 4.0; otherwise, set to false.
>     enable_legacy_ssl_storage_port: false
>     # on outbound connections, determine which type of peers to securely connect to. 'enabled' must be set to true.
>     internode_encryption: none
>     keystore: conf/.keystore
>     keystore_password: cassandra
>     truststore: conf/.truststore
>     truststore_password: cassandra
> {noformat}
> I propose we eliminate {{enabled}} and just use {{optional}} and 
> {{internode_encryption}} to determine the listener setup. I also propose we 
> change the default of {{optional}} to true. We could also re-name 
> {{optional}} since it's a new option but I think it's good to stay consistent 
> with the client and use {{optional}}.
> ||optional||internode_encryption||description||
> |true|none|(default) No encryption is used but if a server reaches out with it we'll use it|
> |false|dc|Encryption is required for inter-dc communication, but not intra-dc|
> |false|all|Encryption is required for all communication|
> |false|none|We only listen for unencrypted connections|
> |true|dc|Encryption is used for inter-dc communication but is not required|
> |true|all|Encryption is used for all communication but is not required|
> From these states it is clear when we should be accepting TLS connections 
> (all except for false and none) as well as when we must enforce it.
> To transition without downtime from an un-encrypted cluster to an encrypted 
> cluster the user would do the following:
> 1. After adding valid truststores, change {{internode_encryption}} to the 
> desired level of encryption (recommended {{all}}) and restart Cassandra
> 2. Set {{optional}} to {{false}} and restart Cassandra to enforce #1
> If {{optional}} defaulted to {{false}} as it does right now we'd need a third 
> restart to first change {{optional}} to {{true}}, which given my 
> understanding of the OptionalSslHandler isn't really relevant.
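
The proposed state table can be sketched as a small decision function. This is only an illustration of the proposal, not Cassandra's implementation: the function name and the returned tuple are hypothetical, and only the three {{internode_encryption}} values that appear in the table are handled.

```python
# Hypothetical sketch of the proposed optional / internode_encryption table.
# Names here are illustrative and do not exist in Cassandra.
VALID_SCOPES = ("none", "dc", "all")


def tls_listener_policy(optional, internode_encryption):
    """Return (accepts_tls, enforced_scope) for one row of the table.

    accepts_tls    -- whether inbound TLS connections are accepted
    enforced_scope -- the scope for which TLS is required ("none" if never)
    """
    if internode_encryption not in VALID_SCOPES:
        raise ValueError("unsupported scope: %r" % (internode_encryption,))
    # TLS is accepted in every state except optional=false with scope none.
    accepts_tls = optional or internode_encryption != "none"
    # TLS is only enforced once optional is switched off.
    enforced_scope = "none" if optional else internode_encryption
    return accepts_tls, enforced_scope


if __name__ == "__main__":
    for optional in (True, False):
        for scope in VALID_SCOPES:
            print(optional, scope, tls_listener_policy(optional, scope))
```

Under this reading, the two-restart transition moves a node from (true, none) to (true, all) and then to (false, all); only the last step starts enforcing encryption.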






[jira] [Updated] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15262:
-
Fix Version/s: 4.0

> server_encryption_options is not backwards compatible with 3.11
> ---
>
> Key: CASSANDRA-15262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Fix For: 4.0
>
>
> The current `server_encryption_options` configuration options are as follows:
> {noformat}
> server_encryption_options:
>     # set to true for allowing secure incoming connections
>     enabled: false
>     # If enabled and optional are both set to true, encrypted and unencrypted connections are handled on the storage_port
>     optional: false
>     # if enabled, will open up an encrypted listening socket on ssl_storage_port. Should be used
>     # during upgrade to 4.0; otherwise, set to false.
>     enable_legacy_ssl_storage_port: false
>     # on outbound connections, determine which type of peers to securely connect to. 'enabled' must be set to true.
>     internode_encryption: none
>     keystore: conf/.keystore
>     keystore_password: cassandra
>     truststore: conf/.truststore
>     truststore_password: cassandra
>     # More advanced defaults below:
>     # protocol: TLS
>     # store_type: JKS
>     # cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
>     # require_client_auth: false
>     # require_endpoint_verification: false
> {noformat}
> A couple of issues here:
> 1. optional defaults to false, which will break existing TLS configurations 
> for (from what I can tell) no particularly good reason
> 2. The provided protocol and cipher suites are not good ideas (in particular, 
> encouraging anyone to use CBC ciphers is a bad plan)
> I propose that before the 4.0 cut we fix up server_encryption_options and even 
> client_encryption_options:
> # Change the default {{optional}} setting to true. As the new Netty code 
> intelligently decides whether to open a TLS connection or not, this is the more 
> sensible default (it saves operators a step while transitioning to TLS as well)
> # Update the defaults to what Netty actually defaults to
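
To make the CBC concern concrete, here is a minimal sketch that flags CBC-mode suites by name. The helper name is hypothetical, the suite list is copied from the commented-out {{cipher_suites}} default above, and matching on the "_CBC_" substring is an assumption about the standard suite naming rather than a full cipher audit.

```python
# Suite names copied from the commented-out cipher_suites default above.
COMMENTED_DEFAULTS = [
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
]


def find_cbc_suites(cipher_suites):
    """Return the suite names that advertise a CBC block-cipher mode."""
    return [name for name in cipher_suites if "_CBC_" in name]


if __name__ == "__main__":
    flagged = find_cbc_suites(COMMENTED_DEFAULTS)
    print("%d of %d listed suites use CBC" % (len(flagged), len(COMMENTED_DEFAULTS)))
```

A GCM suite such as TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (added here only for contrast, it is not in the yaml above) would not be flagged.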






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-29 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Fix Version/s: 4.0

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
>  Labels: 4.0-QA
> Fix For: 4.0
>
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> cassandra_comparative_performance_all_flamegraphs.html, 
> image-2019-08-06-14-20-25-140.png, odd_netty_jdk_tls_cpu_usage.png, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_LQ_14400cRPS-14400cWPS.svg, trunk_LQ_21600cRPS-14400cWPS.svg, 
> trunk_Q_21600cRPS-7200cWPS.svg, trunk_allocation_Q_21k_cRPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png, write_scaling_local_one_summary.png, 
> write_scaling_lq_eq_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]:
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15294) Allow easy use of custom security providers

2019-08-28 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15294:
-
Impacts: Security  (was: None)

> Allow easy use of custom security providers
> ---
>
> Key: CASSANDRA-15294
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15294
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Joseph Lynch
>Priority: Normal
>
> As more users are switching to using {{AES-GCM}} TLS they are increasingly 
> running into extremely poor performance with the JDK implementations (e.g. 
> [JDK-8046943|https://bugs.openjdk.java.net/browse/JDK-8046943]). It's not 
> just TLS either; generally speaking, Java crypto can be really slow, including, 
> for example, the MD5 hashing which powers our digests (CASSANDRA-14611).
> There have been a few community attempts to fix this via custom Java 
> security providers, for example Google's 
> [conscrypt|https://github.com/google/conscrypt] and recently Amazon's 
> [ACCP|https://github.com/corretto/amazon-corretto-crypto-provider] which are 
> basically portions of OpenSSL/BoringSSL that are statically linked in and 
> exposed via JNI. These approaches are similar in spirit to what 
> [netty-tcnative|https://github.com/netty/netty-tcnative] is doing for TLS in 
> C* trunk.
> Since there may be tradeoffs to using various providers for various functions 
> (e.g. {{conscrypt}} may be faster or slower than {{accp}} in certain use 
> cases and in other cases you may want to use JDK providers for ease of 
> upgrading) it would be useful if Cassandra supported pluggable providers per 
> use case. For example we could use {{conscrypt}} for TLS, {{accp}} for MD5 
> digesting, and the {{SUN}} provider for everything else. There is a small 
> amount of JVM wiring that needs to be done for this and it could unlock 
> 10-25% CPU capacity improvements.
> We can then use this pluggability to test different providers and if one is 
> strictly dominant we can just check that one into libs and default to it.






[jira] [Created] (CASSANDRA-15294) Allow easy use of custom security providers

2019-08-28 Thread Joseph Lynch (Jira)
Joseph Lynch created CASSANDRA-15294:


 Summary: Allow easy use of custom security providers
 Key: CASSANDRA-15294
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15294
 Project: Cassandra
  Issue Type: Improvement
  Components: Local/Config
Reporter: Joseph Lynch


As more users are switching to using {{AES-GCM}} TLS they are increasingly 
running into extremely poor performance with the JDK implementations (e.g. 
[JDK-8046943|https://bugs.openjdk.java.net/browse/JDK-8046943]). It's not just 
TLS either, generally speaking Java crypto can be really slow, including for 
example MD5 hashing which powers our digests (CASSANDRA-14611).

There have been a few community attempts to fix this via custom Java security 
providers, for example Google's [conscrypt|https://github.com/google/conscrypt] 
and recently Amazon's 
[ACCP|https://github.com/corretto/amazon-corretto-crypto-provider] which are 
basically portions of OpenSSL/BoringSSL that are statically linked in and 
exposed via JNI. These approaches are similar in spirit to what 
[netty-tcnative|https://github.com/netty/netty-tcnative] is doing for TLS in C* 
trunk.

Since there may be tradeoffs to using various providers for various functions 
(e.g. {{conscrypt}} may be faster or slower than {{accp}} in certain use cases 
and in other cases you may want to use JDK providers for ease of upgrading) it 
would be useful if Cassandra supported pluggable providers per use case. For 
example we could use {{conscrypt}} for TLS, {{accp}} for MD5 digesting, and the 
{{SUN}} provider for everything else. There is a small amount of JVM wiring 
that needs to be done for this and it could unlock 10-25% CPU capacity 
improvements.

We can then use this pluggability to test different providers and if one is 
strictly dominant we can just check that one into libs and default to it.
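
To make the "pluggable providers per use case" idea concrete, here is a minimal sketch of what the JVM wiring might look like for digests. It is not Cassandra code: the class name {{PluggableProviders}}, the use-case key {{"digest"}} and the fallback policy are all assumptions made for illustration, and the provider name string is assumed to match whatever the installed provider registers itself as. Only standard {{java.security}} calls are used.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;
import java.security.Security;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper, not part of Cassandra: maps a use case to a preferred provider. */
public final class PluggableProviders {
    // use case (e.g. "digest") -> name of the preferred security provider
    private static final Map<String, String> PREFERRED = new ConcurrentHashMap<>();

    private PluggableProviders() {}

    /** Record which provider a given use case should try first. */
    public static void prefer(String useCase, String providerName) {
        PREFERRED.put(useCase, providerName);
    }

    /**
     * Return a MessageDigest from the preferred provider for this use case,
     * falling back to the JVM's default provider order when the preferred
     * provider is not installed or does not offer the algorithm.
     */
    public static MessageDigest digestFor(String useCase, String algorithm)
            throws NoSuchAlgorithmException {
        String name = PREFERRED.get(useCase);
        // Security.getProvider returns null when no such provider is installed
        Provider provider = (name == null) ? null : Security.getProvider(name);
        if (provider != null) {
            try {
                return MessageDigest.getInstance(algorithm, provider);
            } catch (NoSuchAlgorithmException e) {
                // preferred provider lacks this algorithm; use the default below
            }
        }
        return MessageDigest.getInstance(algorithm);
    }

    public static void main(String[] args) throws Exception {
        // Assumed provider name; if it is not installed the JDK default is used.
        prefer("digest", "AmazonCorrettoCryptoProvider");
        MessageDigest md5 = digestFor("digest", "MD5");
        byte[] out = md5.digest("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(md5.getProvider().getName() + " produced " + out.length + " bytes");
    }
}
```

Falling back silently keeps a node usable when a native provider is missing, at the cost of hiding a misconfiguration; a real implementation might prefer to log or fail fast instead.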






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-28 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Labels: 4.0-QA  (was: )

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
>  Labels: 4.0-QA
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> cassandra_comparative_performance_all_flamegraphs.html, 
> image-2019-08-06-14-20-25-140.png, odd_netty_jdk_tls_cpu_usage.png, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_LQ_14400cRPS-14400cWPS.svg, trunk_LQ_21600cRPS-14400cWPS.svg, 
> trunk_Q_21600cRPS-7200cWPS.svg, trunk_allocation_Q_21k_cRPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png, write_scaling_local_one_summary.png, 
> write_scaling_lq_eq_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-27 Thread Joseph Lynch (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Resolution: Fixed
Status: Resolved  (was: Open)

I posted our final analysis. We still need to do more testing, but I think that 
can be done at smaller scales, and we have enough follow-ups from this test as 
it is.




[jira] [Updated] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11

2019-08-06 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15262:
-
 Severity: Low
   Complexity: Low Hanging Fruit
Discovered By: Performance Regression Test
 Bug Category: Parent values: Correctness(12982)Level 1 values: Semantic 
Failure(12988)
   Status: Open  (was: Triage Needed)

> server_encryption_options is not backwards compatible with 3.11
> ---
>
> Key: CASSANDRA-15262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15262
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
>
> The current `server_encryption_options` configuration options are as follows:
> {noformat}
> server_encryption_options:
>     # set to true for allowing secure incoming connections
>     enabled: false
>     # If enabled and optional are both set to true, encrypted and unencrypted connections are handled on the storage_port
>     optional: false
>     # if enabled, will open up an encrypted listening socket on ssl_storage_port. Should be used
>     # during upgrade to 4.0; otherwise, set to false.
>     enable_legacy_ssl_storage_port: false
>     # on outbound connections, determine which type of peers to securely connect to. 'enabled' must be set to true.
>     internode_encryption: none
>     keystore: conf/.keystore
>     keystore_password: cassandra
>     truststore: conf/.truststore
>     truststore_password: cassandra
>     # More advanced defaults below:
>     # protocol: TLS
>     # store_type: JKS
>     # cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
>     # require_client_auth: false
>     # require_endpoint_verification: false
> {noformat}
> A couple of issues here:
> 1. optional defaults to false, which will break existing TLS configurations 
> for (from what I can tell) no particularly good reason.
> 2. The provided protocol and cipher suites are not good ideas (in particular, 
> encouraging anyone to use CBC ciphers is a bad plan).
> I propose that before the 4.0 cut we fix up server_encryption_options and 
> even client_encryption_options:
> # Change the default {{optional}} setting to true. As the new Netty code 
> intelligently decides whether to open a TLS connection, this is the more 
> sensible default (it also saves operators a step while transitioning to TLS).
> # Update the defaults to what Netty actually defaults to.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15214) OOMs caught and not rethrown

2019-08-06 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901555#comment-16901555
 ] 

Joseph Lynch commented on CASSANDRA-15214:
--

[~yifanc] If you are ok with it I can add your test cases to 
[jvmquake|https://github.com/Netflix-Skunkworks/jvmquake/tree/master/tests] to 
ensure it handles all edge cases. For what it's worth jvmquake is a strict 
superset of jvmkill and I wouldn't advocate for using jvmkill (I'm biased 
though). In my production experience jvmquake actually works at detecting GC 
spirals of death that C* runs into while jvmkill simply doesn't work as C* 
doesn't actually go OOM, it just death spirals. See the "hard oom"  [test 
cases|https://github.com/Netflix-Skunkworks/jvmquake/blob/master/tests/test_hard_ooms.py]
 for example where jvmkill won't work while jvmquake will work.

> OOMs caught and not rethrown
> 
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client, Messaging/Internode
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0
>
> Attachments: oom-experiments.zip
>
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, 
> so presently there is no way to ensure that an OOM reaches the JVM handler to 
> trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a 
> single thread spawned at startup that waits for any exceptions we must 
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed 
> future proof approach, it may be worth paying the cost of a single thread.
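
The single-thread idea in the quoted description can be sketched roughly as below. This is only an illustration under stated assumptions: the class {{FatalErrorRelay}} and its methods are hypothetical names, not existing Cassandra or Netty API. Code that catches an {{Error}} hands it to a queue, and one dedicated thread rethrows it so that it leaves {{run()}} uncaught and reaches that thread's uncaught exception handler. The sketch shows the hand-off only; it does not establish whether a rethrown error triggers JVM-level OOM handling such as heap dumps.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: one thread that rethrows Errors swallowed elsewhere. */
public final class FatalErrorRelay {
    private final BlockingQueue<Error> pending = new LinkedBlockingQueue<>();
    private final Thread relay;

    /** Starts the relay thread; the handler sees whatever Error is reported first. */
    public FatalErrorRelay(Thread.UncaughtExceptionHandler handler) {
        relay = new Thread(() -> {
            Error fatal;
            try {
                fatal = pending.take(); // block until some component reports an Error
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Rethrow outside any catch-all so the Error escapes run() uncaught.
            throw fatal;
        }, "fatal-error-relay");
        relay.setDaemon(true);
        relay.setUncaughtExceptionHandler(handler);
        relay.start();
    }

    /** Called from code that caught an Error it must not swallow. */
    public void report(Error e) {
        pending.offer(e);
    }

    /** Wait up to the given number of milliseconds for the relay thread to exit. */
    public void join(long millis) throws InterruptedException {
        relay.join(millis);
    }

    public static void main(String[] args) throws InterruptedException {
        FatalErrorRelay r = new FatalErrorRelay(
                (thread, error) -> System.err.println(thread.getName() + " saw " + error));
        r.report(new OutOfMemoryError("simulated"));
        r.join(1000);
    }
}
```

The relay handles one error and then exits, which fits a design where the handler is expected to terminate the process.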






[jira] [Created] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11

2019-08-06 Thread Joseph Lynch (JIRA)
Joseph Lynch created CASSANDRA-15262:


 Summary: server_encryption_options is not backwards compatible 
with 3.11
 Key: CASSANDRA-15262
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15262
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Config
Reporter: Joseph Lynch
Assignee: Joseph Lynch


The current `server_encryption_options` configuration options are as follows:
{noformat}
server_encryption_options:
    # set to true for allowing secure incoming connections
    enabled: false
    # If enabled and optional are both set to true, encrypted and unencrypted connections are handled on the storage_port
    optional: false
    # if enabled, will open up an encrypted listening socket on ssl_storage_port. Should be used
    # during upgrade to 4.0; otherwise, set to false.
    enable_legacy_ssl_storage_port: false
    # on outbound connections, determine which type of peers to securely connect to. 'enabled' must be set to true.
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra
    # More advanced defaults below:
    # protocol: TLS
    # store_type: JKS
    # cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
    # require_client_auth: false
    # require_endpoint_verification: false
{noformat}

A couple of issues here:
1. optional defaults to false, which will break existing TLS configurations for 
(from what I can tell) no particularly good reason.
2. The provided protocol and cipher suites are not good ideas (in particular, 
encouraging anyone to use CBC ciphers is a bad plan).

I propose that before the 4.0 cut we fix up server_encryption_options and even 
client_encryption_options:
# Change the default {{optional}} setting to true. As the new Netty code 
intelligently decides whether to open a TLS connection, this is the more 
sensible default (it also saves operators a step while transitioning to TLS).
# Update the defaults to what Netty actually defaults to.
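
As a rough way to see what a CBC-free suite list would look like on a given JVM, the sketch below filters the JDK's supported cipher suites by name. It is an illustration only: {{CipherSuiteFilter}} and {{withoutCbc}} are hypothetical names, matching on the substring {{_CBC_}} is an assumed heuristic, and the result is not the Netty default list that the proposal refers to.

```java
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import javax.net.ssl.SSLContext;

/** Hypothetical helper: drop CBC-mode suites from a list of cipher suite names. */
public final class CipherSuiteFilter {
    private CipherSuiteFilter() {}

    /** Keep every suite whose name does not contain "_CBC_", preserving order. */
    public static List<String> withoutCbc(String[] suites) {
        List<String> kept = new ArrayList<>();
        for (String suite : suites) {
            if (!suite.contains("_CBC_")) {
                kept.add(suite);
            }
        }
        return kept;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Suites the default JDK TLS implementation supports on this JVM
        String[] supported =
                SSLContext.getDefault().getSupportedSSLParameters().getCipherSuites();
        for (String suite : withoutCbc(supported)) {
            System.out.println(suite);
        }
    }
}
```

Applied to the commented-out {{cipher_suites}} list above, this filter would return an empty list, since every suite listed there is a CBC suite.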






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: cassandra_comparative_performance_all_flamegraphs.html




[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901482#comment-16901482
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

The analysis so far:
 [^cassandra_comparative_performance_all_flamegraphs.html] 




[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: write_scaling_local_one_summary.png




[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: write_scaling_lq_eq_summary.png




[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901476#comment-16901476
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

Write scaling test LOCAL_ONE looks good:
 !write_scaling_local_one_summary.png! 
 
Write scaling test at LQ reads + EQ writes looks ok, but not great:
 !write_scaling_lq_eq_summary.png! 

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> image-2019-08-06-14-20-25-140.png, odd_netty_jdk_tls_cpu_usage.png, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_LQ_14400cRPS-14400cWPS.svg, trunk_LQ_21600cRPS-14400cWPS.svg, 
> trunk_Q_21600cRPS-7200cWPS.svg, trunk_allocation_Q_21k_cRPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png, write_scaling_local_one_summary.png, 
> write_scaling_lq_eq_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: image-2019-08-06-14-20-25-140.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> image-2019-08-06-14-20-25-140.png, odd_netty_jdk_tls_cpu_usage.png, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_LQ_14400cRPS-14400cWPS.svg, trunk_LQ_21600cRPS-14400cWPS.svg, 
> trunk_Q_21600cRPS-7200cWPS.svg, trunk_allocation_Q_21k_cRPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: (was: trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png)

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-08-06 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-31 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, 
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-31 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-30 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Comment Edited] (CASSANDRA-15214) OOMs caught and not rethrown

2019-07-16 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886475#comment-16886475
 ] 

Joseph Lynch edited comment on CASSANDRA-15214 at 7/16/19 9:36 PM:
---

We've (Netflix) found handling OOMs to be generally hard to do correctly in all 
the various Java codebases we have, so we built an agent solution which attaches 
to the JVM: [https://github.com/Netflix-Skunkworks/jvmquake]. I think the 
only reason that we couldn't just directly include that in C* is because it's a 
C JVMTI agent instead of a Java one, but perhaps we could just solve this with 
some documentation and making it really easy to include agents (which is useful 
regardless)?  I can also spend some time and see if I can make it a Java agent 
instead of a C one.

The following is the patch for supporting easy pluggable agents for C*:
{noformat}
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index d6c48be0a3..92061db3ab 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -134,6 +134,29 @@ do
   JVM_OPTS="$JVM_OPTS $opt"
 done
 
+# Pull in any agents present in CASSANDRA_HOME
+for agent_file in ${CASSANDRA_HOME}/agents/*.jar; do
+  if [ -e "${agent_file}" ]; then
+    base_file="${agent_file%.jar}"
+    if [ -s "${base_file}.options" ]; then
+      options=`cat ${base_file}.options`
+      agent_file="${agent_file}=${options}"
+    fi
+    JVM_OPTS="$JVM_OPTS -javaagent:${agent_file}"
+  fi
+done
+
+for agent_file in ${CASSANDRA_HOME}/agents/*.so; do
+  if [ -e "${agent_file}" ]; then
+    base_file="${agent_file%.so}"
+    if [ -s "${base_file}.options" ]; then
+      options=`cat ${base_file}.options`
+      agent_file="${agent_file}=${options}"
+    fi
+    JVM_OPTS="$JVM_OPTS -agentpath:${agent_file}"
+  fi
+done
{noformat}
Then we can just drop agents into the {{CASSANDRA_HOME/agents}} folder and they 
are loaded automatically by Cassandra. From a security perspective this is 
identical to "drop a jar".
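
As a rough illustration of the loading rule described above, here is a standalone sketch that can be run outside of {{cassandra-env.sh}}. It is not part of the patch: the function name {{build_agent_opts}}, the file names {{example.jar}} and {{native.so}}, and the option string {{key=value}} are all made up for the demo. It assumes the same convention as the patch, i.e. {{*.jar}} files become {{-javaagent}} flags, {{*.so}} files become {{-agentpath}} flags, and a non-empty {{<name>.options}} file next to an agent supplies its argument string.

```shell
#!/bin/sh
# Hypothetical helper (not in the patch): print the JVM flags for every
# agent found in the directory given as $1.
build_agent_opts() {
  agents_dir="$1"
  opts=""
  for agent_file in "${agents_dir}"/*.jar "${agents_dir}"/*.so; do
    # An unmatched glob stays literal, so skip anything that does not exist.
    [ -e "${agent_file}" ] || continue
    case "${agent_file}" in
      *.jar) flag="-javaagent"; base_file="${agent_file%.jar}" ;;
      *.so)  flag="-agentpath"; base_file="${agent_file%.so}" ;;
    esac
    # A non-empty <name>.options file supplies the agent's argument string.
    if [ -s "${base_file}.options" ]; then
      agent_file="${agent_file}=$(cat "${base_file}.options")"
    fi
    opts="${opts} ${flag}:${agent_file}"
  done
  # Drop the leading separator space before printing.
  printf '%s\n' "${opts# }"
}

# Demo with made-up agent files in a temporary directory.
demo_dir="$(mktemp -d)"
touch "${demo_dir}/example.jar" "${demo_dir}/native.so"
printf '%s' 'key=value' > "${demo_dir}/native.options"
AGENT_OPTS="$(build_agent_opts "${demo_dir}")"
echo "${AGENT_OPTS}"
rm -rf "${demo_dir}"
```

Because the flags are joined into a single space-separated string (as {{JVM_OPTS}} is), this sketch, like the patch, would not cope with agent paths or option strings that contain spaces.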


was (Author: jolynch):
We've (Netflix) found handling OOMs to be generally hard to do correctly in all 
the various Java codebases we have so we built an agent solution which attaches 
to the JVM in [https://github.com/Netflix-Skunkworks/jvmquake]. I think the 
only reason that we couldn't just directly include that in C* is because it's a 
C JVMTI agent instead of a Java one, but perhaps we could just solve this with 
some documentation and making it really easy to include agents (which is useful 
regardless)?

The following is the patch for supporting easy pluggable agents for C*:
{noformat}
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index d6c48be0a3..92061db3ab 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -134,6 +134,29 @@ do
   JVM_OPTS="$JVM_OPTS $opt"
 done
 
+# Pull in any agents present in CASSANDRA_HOME
+for agent_file in ${CASSANDRA_HOME}/agents/*.jar; do
+  if [ -e "${agent_file}" ]; then
+    base_file="${agent_file%.jar}"
+    if [ -s "${base_file}.options" ]; then
+      options=`cat ${base_file}.options`
+      agent_file="${agent_file}=${options}"
+    fi
+    JVM_OPTS="$JVM_OPTS -javaagent:${agent_file}"
+  fi
+done
+
+for agent_file in ${CASSANDRA_HOME}/agents/*.so; do
+  if [ -e "${agent_file}" ]; then
+    base_file="${agent_file%.so}"
+    if [ -s "${base_file}.options" ]; then
+      options=`cat ${base_file}.options`
+      agent_file="${agent_file}=${options}"
+    fi
+    JVM_OPTS="$JVM_OPTS -agentpath:${agent_file}"
+  fi
+done
{noformat}
Then we can just drop agents into the {{CASSANDRA_HOME/agents}} folder and they 
are loaded automatically by Cassandra. From a security perspective this is 
identical to "drop a jar".

> OOMs caught and not rethrown
> 
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client, Messaging/Internode
>Reporter: Benedict
>Priority: Normal
> Fix For: 4.0
>
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, 
> so presently there is no way to ensure that an OOM reaches the JVM handler to 
> trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a 
> single thread spawned at startup that waits for any exceptions we must 
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed 
> future proof approach, it may be worth paying the cost of a single thread.






[jira] [Commented] (CASSANDRA-15214) OOMs caught and not rethrown

2019-07-16 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886475#comment-16886475
 ] 

Joseph Lynch commented on CASSANDRA-15214:
--

We've (Netflix) found handling OOMs to be generally hard to do correctly in all 
the various Java codebases we have so we built an agent solution which attaches 
to the JVM in [https://github.com/Netflix-Skunkworks/jvmquake]. I think the 
only reason that we couldn't just directly include that in C* is because it's a 
C JVMTI agent instead of a Java one, but perhaps we could just solve this with 
some documentation and making it really easy to include agents (which is useful 
regardless)?

The following is the patch for supporting easy pluggable agents for C*:
{noformat}
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index d6c48be0a3..92061db3ab 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -134,6 +134,29 @@ do
   JVM_OPTS="$JVM_OPTS $opt"
 done
 
+# Pull in any agents present in CASSANDRA_HOME
+for agent_file in ${CASSANDRA_HOME}/agents/*.jar; do
+  if [ -e "${agent_file}" ]; then
+    base_file="${agent_file%.jar}"
+    if [ -s "${base_file}.options" ]; then
+      options=`cat ${base_file}.options`
+      agent_file="${agent_file}=${options}"
+    fi
+    JVM_OPTS="$JVM_OPTS -javaagent:${agent_file}"
+  fi
+done
+
+for agent_file in ${CASSANDRA_HOME}/agents/*.so; do
+  if [ -e "${agent_file}" ]; then
+    base_file="${agent_file%.so}"
+    if [ -s "${base_file}.options" ]; then
+      options=`cat ${base_file}.options`
+      agent_file="${agent_file}=${options}"
+    fi
+    JVM_OPTS="$JVM_OPTS -agentpath:${agent_file}"
+  fi
+done
{noformat}
Then we can just drop agents into the {{CASSANDRA_HOME/agents}} folder and they 
are loaded automatically by Cassandra. From a security perspective this is 
identical to "drop a jar".
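
To check what flags a given agents folder would produce without starting Cassandra, the two loops can be folded into one small helper. This is only a sketch under assumptions: {{build_agent_opts}} is a hypothetical name that is not part of the patch, and it prints the flags instead of appending them to {{JVM_OPTS}}.

```shell
#!/bin/sh
# Sketch only: build_agent_opts is a hypothetical helper, not part of Cassandra.
# It mirrors the patch: *.jar files become -javaagent flags, *.so files become
# -agentpath flags, and a non-empty <name>.options file is appended as =<options>.
build_agent_opts() {
  agents_dir="$1"
  opts=""
  for agent_file in "${agents_dir}"/*.jar "${agents_dir}"/*.so; do
    # An unmatched glob stays literal, so skip anything that does not exist.
    [ -e "${agent_file}" ] || continue
    case "${agent_file}" in
      *.jar) flag="-javaagent"; base_file="${agent_file%.jar}" ;;
      *.so)  flag="-agentpath"; base_file="${agent_file%.so}" ;;
    esac
    if [ -s "${base_file}.options" ]; then
      agent_file="${agent_file}=$(cat "${base_file}.options")"
    fi
    opts="${opts} ${flag}:${agent_file}"
  done
  # Drop the leading separator space before printing.
  printf '%s\n' "${opts# }"
}
```

Running {{build_agent_opts "${CASSANDRA_HOME}/agents"}} would then print the flags the patch is expected to add, which may help when debugging an options file.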







[jira] [Commented] (CASSANDRA-15224) DynamicSnitch.applyConfigChanges can corrupt snitch state

2019-07-16 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886472#comment-16886472
 ] 

Joseph Lynch commented on CASSANDRA-15224:
--

This is at least fixed in the trunk patch for CASSANDRA-14459. I may be able to 
isolate those changes for backport to the 3.x series.

> DynamicSnitch.applyConfigChanges can corrupt snitch state
> -
>
> Key: CASSANDRA-15224
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15224
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Benedict
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> This method is not synchronised, and doesn’t wait for the cancelled task to 
> complete (which could already be running), so we could have two updates in 
> flight simultaneously and corrupt the internal state of the collection






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-05 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: (was: trunk_vs_30x_write_LO_108kcWPS_7kcRPS.png)

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-05 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_write_LO_108kcWPS_7kcRPS.png







[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-05 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png







[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-04 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-03 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Description: 
Tracks evaluating a 192 node cluster with compression and encryption on.

First test is a read scaling test  
[https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

 
|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |
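
The ticket does not say how the OS settings rows above were applied. As a rough sketch under assumptions, the helper below prints (and does not run) commands that would set them on a typical Linux host. {{print_os_tuning}}, the block device name and the interface name are all hypothetical, and the kyber scheduler is only available on kernels built with it.

```shell
#!/bin/sh
# Sketch only: print_os_tuning is a hypothetical helper. It prints the commands
# rather than running them, since they need root and real device names.
print_os_tuning() {
  dev="$1"    # block device name (assumption, e.g. nvme0n1)
  iface="$2"  # network interface name (assumption, e.g. eth0)
  # IO scheduler: kyber
  echo "echo kyber > /sys/block/${dev}/queue/scheduler"
  # readahead: 32kb (read_ahead_kb is expressed in kilobytes)
  echo "echo 32 > /sys/block/${dev}/queue/read_ahead_kb"
  # Net qdisc: fq as the root queueing discipline
  echo "tc qdisc replace dev ${iface} root fq"
}
```

For example, {{print_os_tuning nvme0n1 eth0}} lists the three commands for review before anyone runs them by hand.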


Second test is a [write scaling 
test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:


  was:
Tracks evaluating a 192 node cluster with compression and encryption on.

Test setup at (reproduced below)

[https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

 
|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |


> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a read scaling test  
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-03 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Description: 
Tracks evaluating a 192 node cluster with compression and encryption on.

First test is a [read scaling test 
|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

 
|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |


Second test is a [write scaling 
test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:


  was:
Tracks evaluating a 192 node cluster with compression and encryption on.

First test is a read scaling test  
[https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

 
|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |


Second test is a [write scaling 
test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:



> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test 
> |https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling 
> test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:




[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-03 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878164#comment-16878164
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

The QUORUM test is completed, results look pretty good on the read side:

 !trunk_vs_30x_Q_tcnative_summary.png! 

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-03 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_Q_tcnative_summary.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, 
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-07-03 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_summary.png, 
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-29 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875587#comment-16875587
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

[~norman] Yes, I think we have enough evidence that the latency regression was 
probably not caused by Netty TLS itself (although I am somewhat surprised that 
CPU time with tcnative is about the same as with pre-netty jdk TLS) and is more 
likely caused by how we're using it. The default cipher choice for netty on 
Java 8 may need to be revised, or the documentation should at least note that 
Java 8 has a performance limitation with GCM ciphers. 
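
A minimal sketch of the kind of check the comment above hints at: warn when the first-preference cipher suite is AES-GCM and the runtime is Java 8. The function names `java_major` and `gcm_warning` and the warning text are hypothetical, and treating only Java 8 as affected is an assumption taken from the wording of the comment, not a measured result.

```python
def java_major(version):
    """Extract the major Java version from a version string.

    Handles the legacy "1.8.0_202" scheme as well as "11.0.2" style strings.
    """
    parts = version.split(".")
    if parts[0] == "1" and len(parts) > 1:
        # Legacy scheme: "1.8.0_202" means Java 8.
        return int(parts[1])
    return int(parts[0])


def gcm_warning(java_version, first_suite):
    """Return a warning string if a GCM suite is preferred on Java 8, else None."""
    # Assumption: only Java 8 is flagged, following the comment above.
    if java_major(java_version) == 8 and "_GCM_" in first_suite:
        return ("%s is the preferred suite on Java %s; "
                "GCM may be slow with JDK TLS on this version"
                % (first_suite, java_version))
    return None


# Version string taken from the test setup table (OpenJDK 1.8.0_202).
print(gcm_warning("1.8.0_202", "TLS_RSA_WITH_AES_128_GCM_SHA256"))
```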

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-29 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_Q_36kcRPS_7200cWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-28 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_allocation_Q_21k_cRPS.svg

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-28 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_Q_21600cRPS-7200cWPS.svg

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-28 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_Q_21kcRPS_7200cWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, 
> trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-27 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874410#comment-16874410
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

Alright, I've run the scaling test with both (jdk) TLS and (openssl) TLS (I 
dropped the statically linked boringssl jar).

With jdk TLS after switching the default cipher to 
{{TLS_RSA_WITH_AES_128_CBC_SHA}}:
 !trunk_vs_30x_LQ_jdk_summary.png! 
With openssl TLS again with the default cipher the same as 30x:
 !trunk_vs_30x_LQ_tcnative_summary.png!  

So the summary is, we have a minor regression in average performance for 
LOCAL_QUORUM. Flamegraphs are attached for root cause. The good news is that 
the tail is significantly better.

Action items:
* Make sure that the cipher we use by default isn't GCM
* Determine why writes got slower (still outstanding)

I am now moving on to a QUORUM test.
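
The first action item can be illustrated with a small sketch: given an ordered list of JSSE cipher suite names, drop the AES-GCM ones so that a non-GCM suite such as {{TLS_RSA_WITH_AES_128_CBC_SHA}} ends up preferred. The helper name `drop_gcm_suites` and the example preference list are hypothetical; this is not how Cassandra or Netty selects ciphers, only an illustration of the filtering idea.

```python
def drop_gcm_suites(suites):
    """Return the suites in their original order, excluding AES-GCM ones."""
    return [s for s in suites if "_GCM_" not in s]


# Hypothetical preference list using standard JSSE suite names.
preferred = [
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]

# Only the CBC suite survives the filter.
print(drop_gcm_suites(preferred))
```

Matching on the `_GCM_` substring relies on the standard suite naming convention, where the cipher mode appears between the key size and the hash name.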

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-27 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_LQ_tcnative_summary.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-27 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_LQ_jdk_summary.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-27 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-25 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_LQ_64kcRPS_14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_LQ_21kcRPS_14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, 
> trunk_vs_30x_summary.png
>
>






[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871894#comment-16871894
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

I switched to the same cipher that 3.0 is running and saw on-CPU time drop to 
12.9% (compared to 3.0's 8.5%). This is a significant improvement but still 
not quite equal. Interestingly, with that improvement, average latency is now 
on par with 3.0 in the local quorum test. 

 [^trunk_LQ_21600cRPS-14400cWPS.svg]  [^30x_LQ_21600cRPS-14400cWPS.svg] 

I'm going to finish off this round of jdk TLS testing and then switch to 
tcnative tomorrow and test that.

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: 30x_LQ_21600cRPS-14400cWPS.svg
trunk_LQ_21600cRPS-14400cWPS.svg

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>






[jira] [Comment Edited] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871656#comment-16871656
 ] 

Joseph Lynch edited comment on CASSANDRA-15175 at 6/24/19 11:22 PM:


I have completed the {{LOCAL_ONE}} scaling test. I have summarized the test in 
the following graph:
 !trunk_vs_30x_summary.png!

As we can see, even with the extra TLS CPU requirements, trunk was able to 
significantly outperform the status quo 3.0.x cluster across the load spectrum 
for this consistency level.

I am proceeding with other consistency levels and gathering additional data.

So far I have noticed the following issues during these tests which I will 
gather more data on and follow up with in other tickets (and edit here with 
ticket numbers once I have them):
 # JDK Netty TLS appears significantly more CPU intensive than the previous 
Java Sockets implementation. [~norman] is taking a look from the Netty side and 
we can follow up and make sure we're not creating it improperly (looking at the 
flamegraphs it looks like we may have a buffer sizing issue).
 # When a node was terminated and replaced, the new node appeared to sit for a 
very long time waiting for schema pulls to complete (I think it was waiting on 
the node it was replacing but I haven't fully debugged this).
 # Nodetool netstats doesn't report progress properly for the file count 
(percent, single file, and size still seem right); this is probably 
CASSANDRA-14192.
 # When we re-load NTS keyspaces from disk we throw warnings about "Ignoring 
Unrecognized strategy option" for datacenters that we are not in
 # After a node shuts down there is a burst of re-connections on the urgent 
port prior to actual shutdown (I _think_ this is pre-existing and I'm just 
noticing it because of the new logging)

Also while setting up the {{LOCAL_QUORUM}} test I found the following trying to 
understand why I was seeing a higher number of blocking read repairs on the 
trunk cluster than the 30x cluster:
 # -When I stop and start nodes, it appears that hints may not always play back. 
In particular the high blocking read repairs were coming from neighbors of the 
node I had restarted a few times to test tcnative openssl integration. I 
checked the neighbors' hints directories and sure enough there were pending 
hints there that were not playing at all (they had been there for over 8 hours 
and still not played).- (Edit: This is a bad default. The default 
hinted_handoff_throttle_in_kb is 1024 but it is divided by the number of nodes 
in the cluster. With a cluster size of 192, we were playing hints at a 
rate of ~5 KB/s, which meant that if we were down for even a few minutes we 
would essentially lose those mutations before the 24 hour hint expiry window.)
 # -Repair appears to fail on the default system_traces when run with {{-full}} 
and {{-os}}- (Edit: this is operator error; we shouldn't pass -local to a 
SimpleStrategy keyspace)
{noformat}
cass-perf-trunk-14746--useast1c-i-00a32889835534b75:~$ nodetool repair -os 
-full -local
[2019-06-23 23:29:30,210] Starting repair command #1 
(bfbc7ba0-960e-11e9-b238-77fd1c2e9b1c), repairing keyspace perftest with repair 
options (parallelism: parallel, primary range: false, incremental: false, job 
threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], 
previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false, 
optimise streams: true)
[2019-06-23 23:52:08,248] Repair session c0573500-960e-11e9-b238-77fd1c2e9b1c 
for range [(384307168575030403,384307170010857891], 
(192153585909716729,384307168575030403]] finished (progress: 10%)
[2019-06-23 23:52:26,393] Repair session c0307320-960e-11e9-b238-77fd1c2e9b1c 
for range [(1808575567,192153584473889241], 
(192153584473889241,192153585909716729]] finished (progress: 20%)
[2019-06-23 23:52:28,298] Repair session c059f420-960e-11e9-b238-77fd1c2e9b1c 
for range [(576460752676171565,576460754111999053], 
(384307170010857891,576460752676171565]] finished (progress: 30%)
[2019-06-23 23:52:28,302] Repair completed successfully
[2019-06-23 23:52:28,310] Repair command #1 finished in 22 minutes 58 seconds
[2019-06-23 23:52:28,331] Replication factor is 1. No repair is needed for 
keyspace 'system_auth'
[2019-06-23 23:52:28,350] Starting repair command #2 
(f52c1c70-9611-11e9-b238-77fd1c2e9b1c), repairing keyspace system_traces with 
repair options (parallelism: parallel, primary range: false, incremental: 
false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], 
previewKind: NONE, # of ranges: 2, pull repair: false, force repair: false, 
optimise streams: true)
[2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not 
be empty
[2019-06-23 23:52:28,351] Repair command #2 finished with error
error: Repair job has failed with the error message: [2019-06-23 23:52:28,351] 
Repair command #2 failed with 
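
The hint throttle arithmetic from the edit note in the first {{LOCAL_QUORUM}} item above can be checked with a few lines of Python. This is a back-of-the-envelope sketch, not Cassandra's implementation: it assumes the effective replay rate is simply hinted_handoff_throttle_in_kb divided by the node count, as the comment describes, and the function names and the 1 GiB backlog are hypothetical.

```python
# Toy model of hint replay throughput; not Cassandra's implementation.
# Assumption: effective rate = hinted_handoff_throttle_in_kb / node count,
# as described in the comment above (the real divisor may differ slightly).

SECONDS_PER_HOUR = 3600


def effective_hint_rate_kb_per_s(throttle_kb: float, node_count: int) -> float:
    """Per-target hint replay rate in KB/s after dividing by cluster size."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return throttle_kb / node_count


def replayable_kb(rate_kb_per_s: float, window_hours: float) -> float:
    """Upper bound on hint data that can be replayed within the window."""
    return rate_kb_per_s * window_hours * SECONDS_PER_HOUR


def replay_hours(backlog_kb: float, rate_kb_per_s: float) -> float:
    """Hours needed to drain a hint backlog at the given rate."""
    return backlog_kb / rate_kb_per_s / SECONDS_PER_HOUR


if __name__ == "__main__":
    rate = effective_hint_rate_kb_per_s(1024, 192)
    print(f"effective rate: {rate:.2f} KB/s")
    print(f"replayable in 24h: {replayable_kb(rate, 24) / 1024:.0f} MiB")
    # Hypothetical backlog: 1 GiB of hints queued for one restarted node.
    print(f"hours to drain 1 GiB: {replay_hours(1024 * 1024, rate):.1f}")
```

Under these assumptions the rate works out to about 5.3 KB/s, so only around 450 MiB of hints could be replayed in a 24 hour window. A backlog larger than that would not drain in time.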

[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871774#comment-16871774
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

[~norman] yeah sadly I don't see any exceptions in the C* code that correlate 
with that exception, but I can try enabling more verbose logging on particular 
netty modules if you think it will help? Also fwiw I think that the 3.0 branch 
in this test is using {{TLS_RSA_WITH_AES_128_CBC_SHA}} as the default cipher 
instead of {{TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384}}. I don't really know 
what I'm talking about when it comes to TLS cipher suites but it appears from 
my reading of https://bugs.openjdk.java.net/browse/JDK-8046943 that {{GCM}} is 
very slow in Java 8 (apparently fixed in Java 9). That might explain why we're 
spending so much CPU time in GaloisCounterMode (which I assume is GCM). I can try 
using {{TLS_RSA_WITH_AES_256_CBC_SHA}} on both as a fair test?

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>






[jira] [Comment Edited] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871656#comment-16871656
 ] 

Joseph Lynch edited comment on CASSANDRA-15175 at 6/24/19 7:23 PM:
---

I have completed the {{LOCAL_ONE}} scaling test. I have summarized the test in 
the following graph:
 !trunk_vs_30x_summary.png!

As we can see, even with the extra TLS CPU requirements, trunk was able to 
significantly outperform the status quo 3.0.x cluster across the load spectrum 
for this consistency level.

I am proceeding with other consistency levels and gathering additional data.

So far I have noticed the following issues during these tests which I will 
gather more data on and follow up with in other tickets (and edit here with 
ticket numbers once I have them):
 # JDK Netty TLS appears significantly more CPU intensive than the previous 
Java Sockets implementation. [~norman] is taking a look from the Netty side and 
we can follow up and make sure we're not creating it improperly (looking at the 
flamegraphs it looks like we may have a buffer sizing issue).
 # When a node was terminated and replaced, the new node appeared to sit for a 
very long time waiting for schema pulls to complete (I think it was waiting on 
the node it was replacing but I haven't fully debugged this).
 # Nodetool netstats doesn't report progress properly for the file count 
(percent, single file, and size still seem right); this is probably 
CASSANDRA-14192.
 # When we re-load NTS keyspaces from disk we throw warnings about "Ignoring 
Unrecognized strategy option" for datacenters that we are not in
 # After a node shuts down there is a burst of re-connections on the urgent 
port prior to actual shutdown (I _think_ this is pre-existing and I'm just 
noticing it because of the new logging)

Also while setting up the {{LOCAL_QUORUM}} test I found the following trying to 
understand why I was seeing a higher number of blocking read repairs on the 
trunk cluster than the 30x cluster:
 # When I stop and start nodes, it appears that hints may not always play back. 
In particular the high blocking read repairs were coming from neighbors of the 
node I had restarted a few times to test tcnative openssl integration. I 
checked the neighbors' hints directories and sure enough there were pending 
hints there that were not playing at all (they had been there for over 8 hours 
and still not played).
 # -Repair appears to fail on the default system_traces when run with {{-full}} 
and {{-os}}- (Edit: this is operator error; we shouldn't pass -local to a 
SimpleStrategy keyspace)
{noformat}
cass-perf-trunk-14746--useast1c-i-00a32889835534b75:~$ nodetool repair -os 
-full -local
[2019-06-23 23:29:30,210] Starting repair command #1 
(bfbc7ba0-960e-11e9-b238-77fd1c2e9b1c), repairing keyspace perftest with repair 
options (parallelism: parallel, primary range: false, incremental: false, job 
threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], 
previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false, 
optimise streams: true)
[2019-06-23 23:52:08,248] Repair session c0573500-960e-11e9-b238-77fd1c2e9b1c 
for range [(384307168575030403,384307170010857891], 
(192153585909716729,384307168575030403]] finished (progress: 10%)
[2019-06-23 23:52:26,393] Repair session c0307320-960e-11e9-b238-77fd1c2e9b1c 
for range [(1808575567,192153584473889241], 
(192153584473889241,192153585909716729]] finished (progress: 20%)
[2019-06-23 23:52:28,298] Repair session c059f420-960e-11e9-b238-77fd1c2e9b1c 
for range [(576460752676171565,576460754111999053], 
(384307170010857891,576460752676171565]] finished (progress: 30%)
[2019-06-23 23:52:28,302] Repair completed successfully
[2019-06-23 23:52:28,310] Repair command #1 finished in 22 minutes 58 seconds
[2019-06-23 23:52:28,331] Replication factor is 1. No repair is needed for 
keyspace 'system_auth'
[2019-06-23 23:52:28,350] Starting repair command #2 
(f52c1c70-9611-11e9-b238-77fd1c2e9b1c), repairing keyspace system_traces with 
repair options (parallelism: parallel, primary range: false, incremental: 
false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], 
previewKind: NONE, # of ranges: 2, pull repair: false, force repair: false, 
optimise streams: true)
[2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not 
be empty
[2019-06-23 23:52:28,351] Repair command #2 finished with error
error: Repair job has failed with the error message: [2019-06-23 23:52:28,351] 
Repair command #2 failed with error Endpoints can not be empty. Check the logs 
on the repair participants for further details
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message: 
[2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not 
be empty. Check the logs on the repair participants for further details
at 
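
One plausible reading of the "Endpoints can not be empty" failure above, sketched as a toy model rather than Cassandra's actual repair code: SimpleStrategy places replicas around the ring without regard to datacenter, so restricting repair to the local datacenter can leave a range with no neighbours to repair against. The ring layout, node names, and function names below are all hypothetical.

```python
from typing import List, Optional, Set, Tuple

Node = Tuple[str, str]  # (hypothetical node name, datacenter)


def simple_strategy_replicas(ring: List[Node], start: int, rf: int) -> List[Node]:
    """Toy SimpleStrategy: the next `rf` nodes clockwise, ignoring datacenters."""
    return [ring[(start + i) % len(ring)] for i in range(rf)]


def repair_neighbors(
    ring: List[Node],
    coordinator_index: int,
    rf: int,
    datacenters: Optional[Set[str]] = None,
) -> List[Node]:
    """Replicas other than the coordinator, optionally restricted to some DCs."""
    coordinator = ring[coordinator_index]
    replicas = simple_strategy_replicas(ring, coordinator_index, rf)
    neighbors = [node for node in replicas if node != coordinator]
    if datacenters is not None:
        # Rough stand-in for what a datacenter restriction such as -local does.
        neighbors = [node for node in neighbors if node[1] in datacenters]
    return neighbors


if __name__ == "__main__":
    # Hypothetical four-node ring alternating between the two regions.
    ring = [
        ("a1", "us-east-1"),
        ("b1", "eu-west-1"),
        ("a2", "us-east-1"),
        ("b2", "eu-west-1"),
    ]
    print(repair_neighbors(ring, 0, rf=2))
    print(repair_neighbors(ring, 0, rf=2, datacenters={"us-east-1"}))
```

In this model the only other replica for the coordinator's range lives in the remote datacenter, so the datacenter filter returns an empty list. That would be consistent with the operator-error explanation in the edit note, though the real cause should be confirmed against the repair code.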

[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871714#comment-16871714
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

{quote}
[~jolynch] one question... when using JDK TLS do you see any errors at all or 
do you just see more CPU usage and that's it? 
{quote}

I don't see any errors in our logs but we are spending more CPU handling 
{{ShortBufferExceptions}} internally to Netty than may make sense. I took the 
following screenshots from the [^trunk_LQ_14400cRPS-14400cWPS.svg] flamegraph.
 !odd_netty_jdk_tls_cpu_usage.png!

!ShortbufferExceptions.png!

Other than the flamegraph and degraded latency in {{LOCAL_QUORUM}} mode (where 
C* nodes actually have to talk to each other through the internode messaging 
framework), things appear about the same (no errors that I can see).

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: ShortbufferExceptions.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, ShortbufferExceptions.png, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: odd_netty_jdk_tls_cpu_usage.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, 
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, 
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, 
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_LQ_14400cRPS-14400cWPS.svg

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_LQ_14400cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_LQ_14kcRPS_14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871656#comment-16871656
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

I have completed the {{LOCAL_ONE}} scaling test. I have summarized the test in 
the following graph:
 !trunk_vs_30x_summary.png!

As we can see, even with the extra TLS CPU requirements, trunk was able to 
significantly outperform the status quo 3.0.x cluster across the load spectrum 
for this consistency level.

I am proceeding with other consistency levels and gathering additional data.

So far I have noticed the following issues during these tests which I will 
gather more data on and follow up with in other tickets (and edit here with 
ticket numbers once I have them):
 # JDK Netty TLS appears significantly more CPU intensive than the previous 
Java Sockets implementation. [~norman] is taking a look from the Netty side and 
we can follow up and make sure we're not creating improperly (looking at the 
flamegraphs it looks like we may have a buffer sizing issue)
 # When a node was terminated and replaced, the new node appeared to sit for a 
very long time waiting for schema pulls to complete (I think it was waiting on 
the node it was replacing but I haven't fully debugged this).
 # Nodetool netstats doesn't report progress properly for the file count 
(percent, single file, and size still seem right; this is probably 
CASSANDRA-14192)
 # When we re-load NTS keyspaces from disk we throw warnings about "Ignoring 
Unrecognized strategy option" for datacenters that we are not in
 # After a node shuts down there is a burst of re-connections on the urgent 
port prior to actual shutdown (I _think_ this is pre-existing and I'm just 
noticing it because of the new logging)

Also while setting up the {{LOCAL_QUORUM}} test I found the following trying to 
understand why I was seeing a higher number of blocking read repairs on the 
trunk cluster than the 30x cluster:
 # When I stop and start nodes, it appears that hints may not always playback. 
In particular the high blocking read repairs were coming from neighbors of the 
node I had restarted a few times to test tcnative openssl integration. I 
checked the neighbor's hints directories and sure enough there were pending 
hints there that were not playing at all (they had been there for over 8 hours 
and still not played).
 # Repair appears to fail on the default system_traces when run with {{-full}} 
and {{-os}}:
{noformat}
cass-perf-trunk-14746--useast1c-i-00a32889835534b75:~$ nodetool repair -os 
-full -local
[2019-06-23 23:29:30,210] Starting repair command #1 
(bfbc7ba0-960e-11e9-b238-77fd1c2e9b1c), repairing keyspace perftest with repair 
options (parallelism: parallel, primary range: false, incremental: false, job 
threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], 
previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false, 
optimise streams: true)
[2019-06-23 23:52:08,248] Repair session c0573500-960e-11e9-b238-77fd1c2e9b1c 
for range [(384307168575030403,384307170010857891], 
(192153585909716729,384307168575030403]] finished (progress: 10%)
[2019-06-23 23:52:26,393] Repair session c0307320-960e-11e9-b238-77fd1c2e9b1c 
for range [(1808575567,192153584473889241], 
(192153584473889241,192153585909716729]] finished (progress: 20%)
[2019-06-23 23:52:28,298] Repair session c059f420-960e-11e9-b238-77fd1c2e9b1c 
for range [(576460752676171565,576460754111999053], 
(384307170010857891,576460752676171565]] finished (progress: 30%)
[2019-06-23 23:52:28,302] Repair completed successfully
[2019-06-23 23:52:28,310] Repair command #1 finished in 22 minutes 58 seconds
[2019-06-23 23:52:28,331] Replication factor is 1. No repair is needed for 
keyspace 'system_auth'
[2019-06-23 23:52:28,350] Starting repair command #2 
(f52c1c70-9611-11e9-b238-77fd1c2e9b1c), repairing keyspace system_traces with 
repair options (parallelism: parallel, primary range: false, incremental: 
false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], 
previewKind: NONE, # of ranges: 2, pull repair: false, force repair: false, 
optimise streams: true)
[2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not 
be empty
[2019-06-23 23:52:28,351] Repair command #2 finished with error
error: Repair job has failed with the error message: [2019-06-23 23:52:28,351] 
Repair command #2 failed with error Endpoints can not be empty. Check the logs 
on the repair participants for further details
-- StackTrace --
java.lang.RuntimeException: Repair job has failed with the error message: 
[2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not 
be empty. Check the logs on the repair participants for further details
at 
org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:122)
at 

[jira] [Created] (CASSANDRA-15181) Ensure Nodes can Start and Stop

2019-06-24 Thread Joseph Lynch (JIRA)
Joseph Lynch created CASSANDRA-15181:


 Summary: Ensure Nodes can Start and Stop
 Key: CASSANDRA-15181
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15181
 Project: Cassandra
  Issue Type: Sub-task
  Components: Legacy/Streaming and Messaging, Test/benchmark
Reporter: Joseph Lynch
Assignee: Vinay Chella


Let's load a cluster up with data and start killing nodes. We can do hard 
failures (node terminations) and soft failures (process kills). We plan to 
observe the following:

* Can nodes successfully bootstrap?
* How long does it take to bootstrap?
* What are the effects of TLS on and off (e.g. on stream time)?
* Are hints properly played after a node restart?
* Do nodes properly shut down and start back up?
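
One way to check the "start back up" and bootstrap questions above is to poll {{nodetool status}} and read each node's two-letter state. The following is only a minimal sketch: it assumes the usual {{nodetool status}} row layout (state code first, address second), and the helper names {{parse_node_states}} and {{all_up_normal}} as well as the sample addresses are hypothetical, not part of the test harness.

```python
from typing import Dict


def parse_node_states(status_output: str) -> Dict[str, str]:
    """Map each node address to its two-letter state code (e.g. UN, UJ, DN)."""
    states: Dict[str, str] = {}
    for line in status_output.splitlines():
        fields = line.split()
        # Node rows start with a status letter (U/D) followed by a state
        # letter (N/L/J/M); header and separator rows do not match.
        if (
            len(fields) >= 2
            and len(fields[0]) == 2
            and fields[0][0] in "UD"
            and fields[0][1] in "NLJM"
        ):
            states[fields[1]] = fields[0]
    return states


def all_up_normal(states: Dict[str, str]) -> bool:
    """True only when at least one node was seen and every node is Up/Normal."""
    return bool(states) and all(code == "UN" for code in states.values())


if __name__ == "__main__":
    # Hypothetical sample output; real rows carry load, tokens, host ID, rack.
    sample = (
        "Datacenter: us-east-1\n"
        "--  Address    Load     Tokens  Owns  Host ID  Rack\n"
        "UN  10.0.0.1   110 GiB  1       ?     aaaa     1c\n"
        "UJ  10.0.0.2   12 GiB   1       ?     bbbb     1c\n"
    )
    print(parse_node_states(sample), all_up_normal(parse_node_states(sample)))
```

Timing how long a replaced node stays in {{UJ}} before it reports {{UN}} would give a rough bootstrap duration for the TLS on/off comparison.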






[jira] [Updated] (CASSANDRA-15181) Ensure Nodes can Start and Stop

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15181:
-
 Complexity: Normal
   Priority: High  (was: Normal)
Change Category: Operability
 Status: Open  (was: Triage Needed)

> Ensure Nodes can Start and Stop
> ---
>
> Key: CASSANDRA-15181
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15181
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Legacy/Streaming and Messaging, Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Vinay Chella
>Priority: High
>
> Let's load a cluster up with data and start killing nodes. We can do hard 
> failures (node terminations) and soft failures (process kills). We plan to 
> observe the following:
> * Can nodes successfully bootstrap?
> * How long does it take to bootstrap?
> * What are the effects of TLS on and off (e.g. on stream time)?
> * Are hints properly played after a node restart?
> * Do nodes properly shut down and start back up?






[jira] [Assigned] (CASSANDRA-14764) Evaluate 12 Node Breaking Point, compression=none, encryption=none, coalescing=off

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch reassigned CASSANDRA-14764:


Assignee: Vinay Chella

> Evaluate 12 Node Breaking Point, compression=none, encryption=none, 
> coalescing=off
> --
>
> Key: CASSANDRA-14764
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14764
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Legacy/Streaming and Messaging
>Reporter: Joseph Lynch
>Assignee: Vinay Chella
>Priority: Normal
> Attachments: i-03341e1c52de6ea3e-after-queue-change.svg, 
> i-07cd92e844d66d801-after-queue-bound.svg, i-07cd92e844d66d801-hint-play.svg, 
> i-07cd92e844d66d801-uninlined-with-jvm-methods.svg, ttop.txt
>
>
> *Setup:*
>  * Cassandra: 12 (2*6) node i3.xlarge AWS instance (4 cpu cores, 30GB ram) 
> running cassandra trunk off of jasobrown/14503 jdd7ec5a2 (Jason's patched 
> internode messaging branch) vs the same footprint running 3.0.17
>  * Two datacenters with 100ms latency between them
>  * No compression, encryption, or coalescing turned on
> *Test #1:*
> ndbench sent 1.5k QPS at a coordinator level to one datacenter (RF=3*2 = 6 so 
> 3k global replica QPS) of 4kb single partition BATCH mutations at LOCAL_ONE. 
> This represents about 250 QPS per coordinator in the first datacenter or 60 
> QPS per core. The goal was to observe P99 write and read latencies under 
> various QPS.
> *Result:*
> The good news is since the CASSANDRA-14503 changes, instead of keeping the 
> mutations on heap we put the message into hints instead and don't run out of 
> memory. The bad news is that the {{MessagingService-NettyOutbound-Thread's}} 
> would occasionally enter a degraded state where they would just spin on a 
> core. I've attached flame graphs showing the CPU state as [~jasobrown] 
> applied fixes to the {{OutboundMessagingConnection}} class.
>  *Follow Ups:*
> [~jasobrown] has committed a number of fixes onto his 
> {{jasobrown/14503-collab}} branch including:
> 1. Limiting the amount of time spent dequeuing messages if they are expired 
> (previously if messages entered the queue faster than we could dequeue them 
> we'd just infinite loop on the consumer side)
> 2. Don't call {{dequeueMessages}} from within {{dequeueMessages}} created 
> callbacks.
> We're continuing to use CPU flamegraphs to figure out where we're looping and 
> fixing bugs as we find them.






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-24 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_vs_30x_summary.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, 
> trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-23 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: (was: trunk_252kcRPS-14kcWPS.png)

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-23 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_252kcRPS-14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-23 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_220kcRPS_14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, 
> trunk_93500cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-23 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Description: 
Tracks evaluating a 192 node cluster with compression and encryption on.

Test setup at (reproduced below)

[https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

 
|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |

  was:
Tracks evaluating a 192 node cluster with compression and encryption on.

Test setup at (reproduced below)

[https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

 
|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |


> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_93500cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-23 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_22000cRPS-14400cWPS-openssl.svg
trunk_22000cRPS-14400cWPS-jdk.svg

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/benchmark
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, 
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_93500cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-23 Thread Joseph Lynch (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870615#comment-16870615
 ] 

Joseph Lynch commented on CASSANDRA-15175:
--

We just use the default protocol and cipher suite via netty's 
SslContextBuilder. I believe that means 
{{TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384}} if I understand the following 
logging properly:

{noformat}
$ grep "Default cipher " debug.log -C 2
DEBUG [main] 2019-06-23 18:24:31,158 Slf4JLogger.java:71 - netty-tcnative not 
in the classpath; OpenSslEngine will be unavailable.
DEBUG [main] 2019-06-23 18:24:31,735 Slf4JLogger.java:76 - Default protocols 
(JDK): [TLSv1.2, TLSv1.1, TLSv1] 
DEBUG [main] 2019-06-23 18:24:31,736 Slf4JLogger.java:76 - Default cipher 
suites (JDK): [TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, 
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, 
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, 
TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_128_CBC_SHA, 
TLS_RSA_WITH_AES_256_CBC_SHA]
{noformat}
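
To compare the default suite list across nodes, the debug line above can be parsed mechanically. The following is a minimal sketch that assumes the log wording shown in the excerpt ("Default cipher suites (JDK): [...]") and tolerates the line wrapping introduced by the mail client; the function name {{parse_default_cipher_suites}} is hypothetical. Note that the first entry is only the most preferred default, and the suite actually negotiated still depends on the handshake with the peer.

```python
import re
from typing import List


def parse_default_cipher_suites(log_text: str) -> List[str]:
    """Return the cipher suites listed on the 'Default cipher suites (JDK)' line."""
    # Collapse all whitespace so that wrapped log lines are re-joined.
    flattened = " ".join(log_text.split())
    match = re.search(r"Default cipher suites \(JDK\): \[([^\]]*)\]", flattened)
    if match is None:
        return []
    return [suite.strip() for suite in match.group(1).split(",") if suite.strip()]


if __name__ == "__main__":
    excerpt = (
        "DEBUG [main] 2019-06-23 18:24:31,736 Slf4JLogger.java:76 - Default cipher \n"
        "suites (JDK): [TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, \n"
        "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256]\n"
    )
    suites = parse_default_cipher_suites(excerpt)
    print(suites[0] if suites else "no default cipher line found")
```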

> Evaluate 200 node, compression=on, encryption=all
> -
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg, 
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, 
> trunk_187kcRPS_14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, 
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, 
> trunk_vs_30x_14kcRPS_14kcWPS.png, 
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, 
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, 
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, 
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup (reproduced below) is at
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>  
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |






[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all

2019-06-23 Thread Joseph Lynch (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Lynch updated CASSANDRA-15175:
-
Attachment: trunk_187kcRPS_14kcWPS.png






