[jira] [Created] (CASSANDRA-15311) Fix flakey test_13595 - consistency_test.TestConsistency
Joseph Lynch created CASSANDRA-15311:

             Summary: Fix flakey test_13595 - consistency_test.TestConsistency
                 Key: CASSANDRA-15311
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15311
             Project: Cassandra
          Issue Type: Bug
          Components: Test/dtest
            Reporter: Joseph Lynch


Example failure: [https://circleci.com/gh/jolynch/cassandra/559#tests/containers/29]
{noformat}
Your job ran 1007 tests with 1 failure
test_13595 - consistency_test.TestConsistency
consistency_test.py
AssertionError: assert 9 == 4
 +  where 4 = >('org.apache.cassandra.metrics:type=Table,name=ShortReadProtectionRequests,keyspace=test,scope=test', 'Count')
 +    where > = .read_attribute

self =

    @since('3.0')
    def test_13595(self):
        """
        @jira_ticket CASSANDRA-13595
        """
        cluster = self.cluster

        # disable hinted handoff and set batch commit log so this doesn't interfere with the test
        cluster.set_configuration_options(values={'hinted_handoff_enabled': False})
        cluster.set_batch_commitlog(enabled=True)

        cluster.populate(2)
        node1, node2 = cluster.nodelist()
        remove_perf_disable_shared_mem(node1)  # necessary for jmx
        cluster.start(wait_other_notice=True)

        session = self.patient_cql_connection(node1)

        query = "CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 2};"
        session.execute(query)

        query = 'CREATE TABLE IF NOT EXISTS test.test (id int PRIMARY KEY);'
        session.execute(query)

        # populate the table with 10 partitions,
        # then delete a bunch of them on different nodes
        # until we get the following pattern:

        #                token | k | 1 | 2 |
        # -7509452495886106294 | 5 | n | y |
        # -4069959284402364209 | 1 | y | n |
        # -3799847372828181882 | 8 | n | y |
        # -3485513579396041028 | 0 | y | n |
        # -3248873570005575792 | 2 | n | y |
        # -2729420104000364805 | 4 | y | n |
        #  1634052884888577606 | 7 | n | y |
        #  2705480034054113608 | 6 | y | n |
        #  3728482343045213994 | 9 | n | y |
        #  9010454139840013625 | 3 | y | y |

        stmt = session.prepare('INSERT INTO test.test (id) VALUES (?);')
        for id in range(0, 10):
            session.execute(stmt, [id], ConsistencyLevel.ALL)

        # delete every other partition on node1 while node2 is down
        node2.stop(wait_other_notice=True)
        session.execute('DELETE FROM test.test WHERE id IN (5, 8, 2, 7, 9);')
        node2.start(wait_other_notice=True, wait_for_binary_proto=True)

        session = self.patient_cql_connection(node2)

        # delete every other alternate partition on node2 while node1 is down
        node1.stop(wait_other_notice=True)
        session.execute('DELETE FROM test.test WHERE id IN (1, 0, 4, 6);')
        node1.start(wait_other_notice=True, wait_for_binary_proto=True)

        session = self.patient_exclusive_cql_connection(node1)

        # until #13595 the query would incorrectly return [1]
        assert_all(session,
                   'SELECT id FROM test.test LIMIT 1;',
                   [[3]],
                   cl=ConsistencyLevel.ALL)

        srp = make_mbean('metrics', type='Table', name='ShortReadProtectionRequests', keyspace='test', scope='test')
        with JolokiaAgent(node1) as jmx:
            # 4 srp requests for node1 and 5 for node2, total of 9
>           assert 9 == jmx.read_attribute(srp, 'Count')
E           AssertionError: assert 9 == 4
E            +  where 4 = >('org.apache.cassandra.metrics:type=Table,name=ShortReadProtectionRequests,keyspace=test,scope=test', 'Count')
E            +    where > = .read_attribute

consistency_test.py:1288: AssertionError
{noformat}

--
This message was sent by Atlassian Jira
(v8.3.2#803003)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
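The failing assertion reads a JMX counter once, right after the query returns. The report does not say why the count was 4 rather than 9, so what follows is only a sketch of one general way to make such a check tolerant of a counter that is updated slightly late: poll until the expected value appears or a deadline passes. `poll_until_equal` and `fake_read_count` are hypothetical names introduced here; the fake reader merely stands in for `jmx.read_attribute(srp, 'Count')`.

```python
import time


def poll_until_equal(read_value, expected, timeout_s=10.0, interval_s=0.1):
    """Call read_value() until it returns expected or timeout_s elapses.

    Returns the last value read, so the caller can still assert on it
    and get a meaningful failure message.
    """
    deadline = time.monotonic() + timeout_s
    value = read_value()
    while value != expected and time.monotonic() < deadline:
        time.sleep(interval_s)
        value = read_value()
    return value


# Hypothetical stand-in for jmx.read_attribute(srp, 'Count'): a counter
# that only reaches its final value after a couple of reads.
_readings = iter([4, 4, 9])


def fake_read_count():
    return next(_readings)


final_count = poll_until_equal(fake_read_count, 9, timeout_s=2.0, interval_s=0.01)
assert final_count == 9
```

If the counter never reaches the expected value, the helper returns the last observed value after the timeout, so a plain `assert 9 == value` still reports what was actually seen.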
[jira] [Updated] (CASSANDRA-15311) Fix flakey test_13595 - consistency_test.TestConsistency
[ https://issues.apache.org/jira/browse/CASSANDRA-15311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15311:
    Fix Version/s: 4.0-alpha

> Fix flakey test_13595 - consistency_test.TestConsistency
> -
>
>                 Key: CASSANDRA-15311
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15311
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest
>            Reporter: Joseph Lynch
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> Example failure:
> [https://circleci.com/gh/jolynch/cassandra/559#tests/containers/29]
> {noformat}
> Your job ran 1007 tests with 1 failure
> test_13595 - consistency_test.TestConsistency
> consistency_test.py
> AssertionError: assert 9 == 4
>  +  where 4 = 0x7f9f0775b160>>('org.apache.cassandra.metrics:type=Table,name=ShortReadProtectionRequests,keyspace=test,scope=test', 'Count')
>  +    where > = .read_attribute
>
> self =
>
>     @since('3.0')
>     def test_13595(self):
>         """
>         @jira_ticket CASSANDRA-13595
>         """
>         cluster = self.cluster
>
>         # disable hinted handoff and set batch commit log so this doesn't interfere with the test
>         cluster.set_configuration_options(values={'hinted_handoff_enabled': False})
>         cluster.set_batch_commitlog(enabled=True)
>
>         cluster.populate(2)
>         node1, node2 = cluster.nodelist()
>         remove_perf_disable_shared_mem(node1)  # necessary for jmx
>         cluster.start(wait_other_notice=True)
>
>         session = self.patient_cql_connection(node1)
>
>         query = "CREATE KEYSPACE IF NOT EXISTS test WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 2};"
>         session.execute(query)
>
>         query = 'CREATE TABLE IF NOT EXISTS test.test (id int PRIMARY KEY);'
>         session.execute(query)
>
>         # populate the table with 10 partitions,
>         # then delete a bunch of them on different nodes
>         # until we get the following pattern:
>
>         #                token | k | 1 | 2 |
>         # -7509452495886106294 | 5 | n | y |
>         # -4069959284402364209 | 1 | y | n |
>         # -3799847372828181882 | 8 | n | y |
>         # -3485513579396041028 | 0 | y | n |
>         # -3248873570005575792 | 2 | n | y |
>         # -2729420104000364805 | 4 | y | n |
>         #  1634052884888577606 | 7 | n | y |
>         #  2705480034054113608 | 6 | y | n |
>         #  3728482343045213994 | 9 | n | y |
>         #  9010454139840013625 | 3 | y | y |
>
>         stmt = session.prepare('INSERT INTO test.test (id) VALUES (?);')
>         for id in range(0, 10):
>             session.execute(stmt, [id], ConsistencyLevel.ALL)
>
>         # delete every other partition on node1 while node2 is down
>         node2.stop(wait_other_notice=True)
>         session.execute('DELETE FROM test.test WHERE id IN (5, 8, 2, 7, 9);')
>         node2.start(wait_other_notice=True, wait_for_binary_proto=True)
>
>         session = self.patient_cql_connection(node2)
>
>         # delete every other alternate partition on node2 while node1 is down
>         node1.stop(wait_other_notice=True)
>         session.execute('DELETE FROM test.test WHERE id IN (1, 0, 4, 6);')
>         node1.start(wait_other_notice=True, wait_for_binary_proto=True)
>
>         session = self.patient_exclusive_cql_connection(node1)
>
>         # until #13595 the query would incorrectly return [1]
>         assert_all(session,
>                    'SELECT id FROM test.test LIMIT 1;',
>                    [[3]],
>                    cl=ConsistencyLevel.ALL)
>
>         srp = make_mbean('metrics', type='Table', name='ShortReadProtectionRequests', keyspace='test', scope='test')
>         with JolokiaAgent(node1) as jmx:
>             # 4 srp requests for node1 and 5 for node2, total of 9
> >           assert 9 == jmx.read_attribute(srp, 'Count')
> E           AssertionError: assert 9 == 4
> E            +  where 4 = 0x7f9f0775b160>>('org.apache.cassandra.metrics:type=Table,name=ShortReadProtectionRequests,keyspace=test,scope=test', 'Count')
> E            +    where > = .read_attribute
>
> consistency_test.py:1288: AssertionError
> {noformat}
[jira] [Updated] (CASSANDRA-15310) Fix flakey - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15310:
    Platform: All,Java11  (was: All)

> Fix flakey - testIdleDisconnect -
> org.apache.cassandra.transport.IdleDisconnectTest
> ---
>
>                 Key: CASSANDRA-15310
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15310
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: Joseph Lynch
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> Example run:
> [https://circleci.com/gh/jolynch/cassandra/561#tests/containers/86]
>
> {noformat}
> Your job ran 4428 tests with 1 failure
> - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
> junit.framework.AssertionFailedError
>   at org.apache.cassandra.transport.IdleDisconnectTest.testIdleDisconnect(IdleDisconnectTest.java:56)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
[jira] [Updated] (CASSANDRA-15310) Fix flakey - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15310:
    Fix Version/s: 4.0-alpha

> Fix flakey - testIdleDisconnect -
> org.apache.cassandra.transport.IdleDisconnectTest
> ---
>
>                 Key: CASSANDRA-15310
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15310
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: Joseph Lynch
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> Example run:
> [https://circleci.com/gh/jolynch/cassandra/561#tests/containers/86]
>
> {noformat}
> Your job ran 4428 tests with 1 failure
> - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
> junit.framework.AssertionFailedError
>   at org.apache.cassandra.transport.IdleDisconnectTest.testIdleDisconnect(IdleDisconnectTest.java:56)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}
[jira] [Created] (CASSANDRA-15310) Fix flakey - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
Joseph Lynch created CASSANDRA-15310:

             Summary: Fix flakey - testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
                 Key: CASSANDRA-15310
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15310
             Project: Cassandra
          Issue Type: Bug
          Components: Test/unit
            Reporter: Joseph Lynch


Example run: [https://circleci.com/gh/jolynch/cassandra/561#tests/containers/86]

{noformat}
Your job ran 4428 tests with 1 failure
- testIdleDisconnect - org.apache.cassandra.transport.IdleDisconnectTest
junit.framework.AssertionFailedError
  at org.apache.cassandra.transport.IdleDisconnectTest.testIdleDisconnect(IdleDisconnectTest.java:56)
  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
{noformat}
[jira] [Updated] (CASSANDRA-15309) Make the upgrade tests run on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-15309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15309:
    Fix Version/s: 4.0-alpha

> Make the upgrade tests run on trunk
> ---
>
>                 Key: CASSANDRA-15309
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15309
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest
>            Reporter: Joseph Lynch
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> It appears that the upgrade tests (j8_upgradetests-no-vnodes circleci target)
> don't really work on trunk right now, it appears to be a java home issue
> potentially. Example run: https://circleci.com/gh/jolynch/cassandra/553
> {noformat}
> Your job ran 4412 tests with 3923 failures
> - test_IN_clause_on_last_key - upgrade_tests.cql_tests.TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_2_1_x
> upgrade_tests/cql_tests.py
> major_version_int = 8
>
>     def switch_jdks(major_version_int):
>         """
>         Changes the jdk version globally, by setting JAVA_HOME = JAVA[N]_HOME.
>         This means the environment must have JAVA[N]_HOME set to switch to jdk version N.
>         """
>         new_java_home = 'JAVA{}_HOME'.format(major_version_int)
>
>         try:
> >           os.environ[new_java_home]
>
> upgrade_tests/upgrade_base.py:25:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> self = environ({'PYTHONUNBUFFERED': 'true', 'DEFAULT_DIR': '/home/cassandra/cassandra-dtest', 'CIRCLE_NODE_INDEX': '47', 'CIR...ade_tests/cql_tests.py::TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_2_1_x::()::test_IN_clause_on_last_key (call)'})
> key = 'JAVA8_HOME'
>
>     def __getitem__(self, key):
>         try:
>             value = self._data[self.encodekey(key)]
>         except KeyError:
>             # raise KeyError with the original key value
> >           raise KeyError(key) from None
> E           KeyError: 'JAVA8_HOME'
> {noformat}
[jira] [Created] (CASSANDRA-15309) Make the upgrade tests run on trunk
Joseph Lynch created CASSANDRA-15309:

             Summary: Make the upgrade tests run on trunk
                 Key: CASSANDRA-15309
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15309
             Project: Cassandra
          Issue Type: Bug
          Components: Test/dtest
            Reporter: Joseph Lynch


It appears that the upgrade tests (j8_upgradetests-no-vnodes circleci target) don't really work on trunk right now, it appears to be a java home issue potentially. Example run: https://circleci.com/gh/jolynch/cassandra/553
{noformat}
Your job ran 4412 tests with 3923 failures
- test_IN_clause_on_last_key - upgrade_tests.cql_tests.TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_2_1_x
upgrade_tests/cql_tests.py
major_version_int = 8

    def switch_jdks(major_version_int):
        """
        Changes the jdk version globally, by setting JAVA_HOME = JAVA[N]_HOME.
        This means the environment must have JAVA[N]_HOME set to switch to jdk version N.
        """
        new_java_home = 'JAVA{}_HOME'.format(major_version_int)

        try:
>           os.environ[new_java_home]

upgrade_tests/upgrade_base.py:25:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = environ({'PYTHONUNBUFFERED': 'true', 'DEFAULT_DIR': '/home/cassandra/cassandra-dtest', 'CIRCLE_NODE_INDEX': '47', 'CIR...ade_tests/cql_tests.py::TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_2_1_x::()::test_IN_clause_on_last_key (call)'})
key = 'JAVA8_HOME'

    def __getitem__(self, key):
        try:
            value = self._data[self.encodekey(key)]
        except KeyError:
            # raise KeyError with the original key value
>           raise KeyError(key) from None
E           KeyError: 'JAVA8_HOME'
{noformat}
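The traceback shows the lookup of `JAVA8_HOME` in `os.environ` raising a bare `KeyError`. As an illustration only, and not the actual dtest code (the traceback shows just the first lines of `switch_jdks`), the sketch below shows how a JAVA[N]_HOME switch of this kind could report the missing variable by name. The `environ` parameter, the error message, and the `/opt/jdk11` path are assumptions made for the example.

```python
import os


def switch_jdks(major_version_int, environ=os.environ):
    """Set JAVA_HOME from JAVA<N>_HOME, with a descriptive error if it is unset.

    The environ parameter is a hypothetical addition so the function can be
    exercised against a plain dict instead of the real process environment.
    """
    new_java_home = 'JAVA{}_HOME'.format(major_version_int)
    try:
        java_home = environ[new_java_home]
    except KeyError:
        raise RuntimeError(
            "You need to set {} to run these tests".format(new_java_home)
        ) from None
    environ['JAVA_HOME'] = java_home
    return java_home


# Example with a made-up environment mapping; the path is illustrative only.
env = {'JAVA11_HOME': '/opt/jdk11'}
selected = switch_jdks(11, environ=env)
assert env['JAVA_HOME'] == selected
```

With an environment like the one in the report, where `JAVA8_HOME` is absent, calling `switch_jdks(8, environ=env)` on this sketch raises a `RuntimeError` that names the variable to set.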
[jira] [Updated] (CASSANDRA-15308) Fix flakey testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15308:
    Fix Version/s: 4.0-alpha

> Fix flakey testAcquireReleaseOutbound -
> org.apache.cassandra.net.ConnectionTest
> ---
>
>                 Key: CASSANDRA-15308
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15308
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: Joseph Lynch
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> Example failure:
> https://circleci.com/gh/jolynch/cassandra/554#tests/containers/61
> {noformat}
> Your job ran 4428 tests with 1 failure
> - testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest
> junit.framework.AssertionFailedError
>   at org.apache.cassandra.net.ConnectionTest.lambda$testAcquireReleaseOutbound$53(ConnectionTest.java:770)
>   at org.apache.cassandra.net.ConnectionTest.lambda$doTest$8(ConnectionTest.java:238)
>   at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
>   at org.apache.cassandra.net.ConnectionTest.doTest(ConnectionTest.java:236)
>   at org.apache.cassandra.net.ConnectionTest.test(ConnectionTest.java:225)
>   at org.apache.cassandra.net.ConnectionTest.testAcquireReleaseOutbound(ConnectionTest.java:767)
> {noformat}
[jira] [Created] (CASSANDRA-15308) Fix flakey testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest
Joseph Lynch created CASSANDRA-15308:

             Summary: Fix flakey testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest
                 Key: CASSANDRA-15308
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15308
             Project: Cassandra
          Issue Type: Bug
          Components: Test/unit
            Reporter: Joseph Lynch


Example failure: https://circleci.com/gh/jolynch/cassandra/554#tests/containers/61
{noformat}
Your job ran 4428 tests with 1 failure
- testAcquireReleaseOutbound - org.apache.cassandra.net.ConnectionTest
junit.framework.AssertionFailedError
  at org.apache.cassandra.net.ConnectionTest.lambda$testAcquireReleaseOutbound$53(ConnectionTest.java:770)
  at org.apache.cassandra.net.ConnectionTest.lambda$doTest$8(ConnectionTest.java:238)
  at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
  at org.apache.cassandra.net.ConnectionTest.doTest(ConnectionTest.java:236)
  at org.apache.cassandra.net.ConnectionTest.test(ConnectionTest.java:225)
  at org.apache.cassandra.net.ConnectionTest.testAcquireReleaseOutbound(ConnectionTest.java:767)
{noformat}
[jira] [Created] (CASSANDRA-15307) Fix flakey test_remote_query - cql_test.TestCQLSlowQuery test
Joseph Lynch created CASSANDRA-15307:

             Summary: Fix flakey test_remote_query - cql_test.TestCQLSlowQuery test
                 Key: CASSANDRA-15307
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15307
             Project: Cassandra
          Issue Type: Bug
          Components: Test/dtest
            Reporter: Joseph Lynch


Example failure: [https://circleci.com/gh/jolynch/cassandra/554#tests/containers/61]

{noformat}
Your job ran 959 tests with 1 failure
- test_remote_query cql_test.TestCQLSlowQuery
cql_test.py
ccmlib.node.TimeoutError: 05 Sep 2019 23:05:07 [node2] Missing: ['operations were slow', 'SELECT \\* FROM ks.test2 WHERE id = 1']:
DEBUG [BatchlogTasks:1] 2019-09-05 23:04:24,437 Ba.
See debug.log for remainder

self =

    def test_remote_query(self):
        """
        Check that a query running on a node other than the coordinator is reported as slow:

        - populate the cluster with 2 nodes
        - start one node without having it join the ring
        - start the other one node with slow_query_log_timeout_in_ms set to a small value
          and the read request timeouts set to a large value (to ensure the query is not aborted) and
          read_iteration_delay set to a value big enough for the query to exceed slow_query_log_timeout_in_ms
          (this will cause read queries to take longer than the slow query timeout)
        - CREATE a table
        - INSERT 5000 rows on a session on the node that is not a member of the ring
        - run SELECT statements and check that the slow query messages are present in the debug logs
          (we cannot check the logs at info level because the no spam logger has unpredictable results)

        @jira_ticket CASSANDRA-12403
        """
        cluster = self.cluster
        cluster.set_configuration_options(values={'slow_query_log_timeout_in_ms': 10,
                                                  'request_timeout_in_ms': 12,
                                                  'read_request_timeout_in_ms': 12,
                                                  'range_request_timeout_in_ms': 12})

        cluster.populate(2)
        node1, node2 = cluster.nodelist()

        node1.start(wait_for_binary_proto=True, join_ring=False)  # ensure other node executes queries
        node2.start(wait_for_binary_proto=True,
                    jvm_args=["-Dcassandra.monitoring_report_interval_ms=10",
                              "-Dcassandra.test.read_iteration_delay_ms=1"])  # see above for explanation

        session = self.patient_exclusive_cql_connection(node1)

        create_ks(session, 'ks', 1)
        session.execute("""
            CREATE TABLE test2 (
                id int,
                col int,
                val text,
                PRIMARY KEY(id, col)
            );
        """)

        for i, j in itertools.product(list(range(100)), list(range(10))):
            session.execute("INSERT INTO test2 (id, col, val) VALUES ({}, {}, 'foo')".format(i, j))

        # only check debug logs because at INFO level the no-spam logger has unpredictable results
        mark = node2.mark_log(filename='debug.log')
        session.execute(SimpleStatement("SELECT * from test2",
                                        consistency_level=ConsistencyLevel.ONE,
                                        retry_policy=FallthroughRetryPolicy()))
        node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2"],
                            from_mark=mark, filename='debug.log', timeout=60)

        mark = node2.mark_log(filename='debug.log')
        session.execute(SimpleStatement("SELECT * from test2 where id = 1",
                                        consistency_level=ConsistencyLevel.ONE,
                                        retry_policy=FallthroughRetryPolicy()))
        node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2 WHERE id = 1"],
>                           from_mark=mark, filename='debug.log', timeout=60)

cql_test.py:1150:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self =
exprs = ['operations were slow', 'SELECT \\* FROM ks.test2 WHERE id = 1']
from_mark = 166214, timeout = 60, process = None, verbose = False
filename = 'debug.log'

    def watch_log_for(self, exprs, from_mark=None, timeout=600, process=None, verbose=False, filename='system.log'):
        """
        Watch the log until one or more (regular) expression are found.
        This methods when all the expressions have been found or the method
        timeouts (a TimeoutError is then raised). On successful completion,
        a list of pair (line matched, match object) is returned.
        """
        start = time.time()
        tofind = [exprs] if
[jira] [Updated] (CASSANDRA-15307) Fix flakey test_remote_query - cql_test.TestCQLSlowQuery test
[ https://issues.apache.org/jira/browse/CASSANDRA-15307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15307:
    Fix Version/s: 4.0-alpha

> Fix flakey test_remote_query - cql_test.TestCQLSlowQuery test
> --
>
>                 Key: CASSANDRA-15307
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15307
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/dtest
>            Reporter: Joseph Lynch
>            Priority: Normal
>             Fix For: 4.0-alpha
>
>
> Example failure:
> [https://circleci.com/gh/jolynch/cassandra/554#tests/containers/61]
>
> {noformat}
> Your job ran 959 tests with 1 failure
> - test_remote_query cql_test.TestCQLSlowQuery
> cql_test.py
> ccmlib.node.TimeoutError: 05 Sep 2019 23:05:07 [node2] Missing: ['operations were slow', 'SELECT \\* FROM ks.test2 WHERE id = 1']:
> DEBUG [BatchlogTasks:1] 2019-09-05 23:04:24,437 Ba.
> See debug.log for remainder
>
> self =
>
>     def test_remote_query(self):
>         """
>         Check that a query running on a node other than the coordinator is reported as slow:
>
>         - populate the cluster with 2 nodes
>         - start one node without having it join the ring
>         - start the other one node with slow_query_log_timeout_in_ms set to a small value
>           and the read request timeouts set to a large value (to ensure the query is not aborted) and
>           read_iteration_delay set to a value big enough for the query to exceed slow_query_log_timeout_in_ms
>           (this will cause read queries to take longer than the slow query timeout)
>         - CREATE a table
>         - INSERT 5000 rows on a session on the node that is not a member of the ring
>         - run SELECT statements and check that the slow query messages are present in the debug logs
>           (we cannot check the logs at info level because the no spam logger has unpredictable results)
>
>         @jira_ticket CASSANDRA-12403
>         """
>         cluster = self.cluster
>         cluster.set_configuration_options(values={'slow_query_log_timeout_in_ms': 10,
>                                                   'request_timeout_in_ms': 12,
>                                                   'read_request_timeout_in_ms': 12,
>                                                   'range_request_timeout_in_ms': 12})
>
>         cluster.populate(2)
>         node1, node2 = cluster.nodelist()
>
>         node1.start(wait_for_binary_proto=True, join_ring=False)  # ensure other node executes queries
>         node2.start(wait_for_binary_proto=True,
>                     jvm_args=["-Dcassandra.monitoring_report_interval_ms=10",
>                               "-Dcassandra.test.read_iteration_delay_ms=1"])  # see above for explanation
>
>         session = self.patient_exclusive_cql_connection(node1)
>
>         create_ks(session, 'ks', 1)
>         session.execute("""
>             CREATE TABLE test2 (
>                 id int,
>                 col int,
>                 val text,
>                 PRIMARY KEY(id, col)
>             );
>         """)
>
>         for i, j in itertools.product(list(range(100)), list(range(10))):
>             session.execute("INSERT INTO test2 (id, col, val) VALUES ({}, {}, 'foo')".format(i, j))
>
>         # only check debug logs because at INFO level the no-spam logger has unpredictable results
>         mark = node2.mark_log(filename='debug.log')
>         session.execute(SimpleStatement("SELECT * from test2",
>                                         consistency_level=ConsistencyLevel.ONE,
>                                         retry_policy=FallthroughRetryPolicy()))
>         node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2"],
>                             from_mark=mark, filename='debug.log', timeout=60)
>
>         mark = node2.mark_log(filename='debug.log')
>         session.execute(SimpleStatement("SELECT * from test2 where id = 1",
>                                         consistency_level=ConsistencyLevel.ONE,
>                                         retry_policy=FallthroughRetryPolicy()))
>         node2.watch_log_for(["operations were slow", "SELECT \* FROM ks.test2 WHERE id = 1"],
> >                           from_mark=mark, filename='debug.log', timeout=60)
>
> cql_test.py:1150:
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
>
> self =
> exprs = ['operations were slow', 'SELECT \\* FROM ks.test2 WHERE id = 1']
> from_mark = 166214, timeout = 60, process = None, verbose = False
> filename = 'debug.log'
>
>     def watch_log_for(self,
[jira] [Updated] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size
[ https://issues.apache.org/jira/browse/CASSANDRA-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15306:
    Description:
While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the following in the logs
{noformat}
INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
{noformat}
This was with about 150 WPS against a LCS table containing 4kib partitions. It seemed that compaction proceeded just fine but I don't remember seeing this in previous testing runs and I'd like to make sure it's not a bug (otherwise we may want to reduce the logging).

  was:
While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the following in the logs
{noformat}
INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
{noformat}
This was with about 150 WPS against a LCS table containing 4kib data. It seemed that compaction proceeded just fine but I don't remember seeing this in previous testing runs and I'd like to make sure it's not a bug (otherwise we may want to reduce the logging).

> Investigate why we are allocating 8MiB chunks and reaching the maximum
> BufferPool size
> --
>
>                 Key: CASSANDRA-15306
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15306
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability/Logging, Test/benchmark
>            Reporter: Joseph Lynch
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the
> following in the logs
> {noformat}
> INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> {noformat}
> This was with about 150 WPS against a LCS table containing 4kib partitions.
> It seemed that compaction proceeded just fine but I don't remember seeing
> this in previous testing runs and I'd like to make sure it's not a bug
> (otherwise we may want to reduce the logging).
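The four log lines quoted in the description can be checked with a little arithmetic. The sketch below parses them, computes the gaps between entries, and works out how many 8 MiB chunks fit under the reported 512 MiB limit. The regular expression and variable names are illustrative only; the log text is copied from the report.

```python
import re
from datetime import datetime

LOG_LINES = [
    "INFO [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB",
    "INFO [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB",
    "INFO [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB",
    "INFO [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB",
]

# Captures: timestamp (to the second), pool limit in MiB, chunk size in MiB.
PATTERN = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*"
    r"Maximum memory usage reached \(([\d.]+)MiB\), "
    r"cannot allocate chunk of ([\d.]+)MiB"
)


def parse(line):
    """Return (timestamp, limit_mib, chunk_mib) for one log line."""
    match = PATTERN.search(line)
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return timestamp, float(match.group(2)), float(match.group(3))


parsed = [parse(line) for line in LOG_LINES]

# Minutes between consecutive messages (millisecond part ignored).
gaps_minutes = [
    round((later[0] - earlier[0]).total_seconds() / 60)
    for earlier, later in zip(parsed, parsed[1:])
]

# Number of whole chunks that fit under the reported limit.
chunks_at_limit = int(parsed[0][1] // parsed[0][2])

print(gaps_minutes, chunks_at_limit)
```

The messages are evenly spaced and are emitted through `NoSpamLogger`, so the spacing may reflect the logger's rate limiting rather than how often the allocation actually failed. That is an inference from the timestamps and is not stated in the report.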
[jira] [Updated] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size
[ https://issues.apache.org/jira/browse/CASSANDRA-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15306:
    Component/s: Observability/Logging

> Investigate why we are allocating 8MiB chunks and reaching the maximum
> BufferPool size
> --
>
>                 Key: CASSANDRA-15306
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15306
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability/Logging, Test/benchmark
>            Reporter: Joseph Lynch
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the
> following in the logs
> {noformat}
> INFO  [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> INFO  [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
> {noformat}
> This was with about 150 WPS against a LCS table containing 4kib data. It
> seemed that compaction proceeded just fine but I don't remember seeing this
> in previous testing runs and I'd like to make sure it's not a bug (otherwise
> we may want to reduce the logging).
[jira] [Updated] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size
[ https://issues.apache.org/jira/browse/CASSANDRA-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15306:
------------------------------------
    Fix Version/s: 4.0-alpha
[jira] [Updated] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size
[ https://issues.apache.org/jira/browse/CASSANDRA-15306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15306:
------------------------------------
    Fix Version/s: (was: 4.0-alpha) 4.0-beta
[jira] [Created] (CASSANDRA-15306) Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size
Joseph Lynch created CASSANDRA-15306:
-------------------------------------

Summary: Investigate why we are allocating 8MiB chunks and reaching the maximum BufferPool size
Key: CASSANDRA-15306
URL: https://issues.apache.org/jira/browse/CASSANDRA-15306
Project: Cassandra
Issue Type: Bug
Components: Test/benchmark
Reporter: Joseph Lynch

While throwing some light traffic at {{4.0-alpha1}} I saw a lot of the following in the logs:
{noformat}
INFO [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
INFO [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB
{noformat}
This was with about 150 WPS against an LCS table containing 4 KiB data. It seemed that compaction proceeded just fine, but I don't remember seeing this in previous testing runs and I'd like to make sure it's not a bug (otherwise we may want to reduce the logging).
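The spacing of the quoted log lines can be checked mechanically. The following is a minimal sketch, not part of the ticket: it parses the four lines from the report and prints the gap between consecutive messages. The regular expression and the helper names (`parse`, `gaps_in_seconds`) are illustrative assumptions based only on the format visible above.

```python
import re
from datetime import datetime

# The four log lines quoted in the ticket.
LOG_LINES = [
    "INFO [CompactionExecutor:8] 2019-09-06 11:40:31,419 NoSpamLogger.java:91 - "
    "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB",
    "INFO [CompactionExecutor:8] 2019-09-06 11:55:31,419 NoSpamLogger.java:91 - "
    "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB",
    "INFO [CompactionExecutor:15] 2019-09-06 12:10:31,419 NoSpamLogger.java:91 - "
    "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB",
    "INFO [CompactionExecutor:18] 2019-09-06 12:25:31,421 NoSpamLogger.java:91 - "
    "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 8.000MiB",
]

# Assumed layout: "[thread] timestamp ... Maximum memory usage reached (X MiB), ..."
LINE_RE = re.compile(
    r"\[(?P<thread>[^\]]+)\]\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r".*Maximum memory usage reached \((?P<limit>[\d.]+)MiB\), "
    r"cannot allocate chunk of (?P<chunk>[\d.]+)MiB"
)


def parse(line):
    """Extract thread name, timestamp, pool limit and chunk size from one line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    return {
        "thread": match.group("thread"),
        "time": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f"),
        "limit_mib": float(match.group("limit")),
        "chunk_mib": float(match.group("chunk")),
    }


def gaps_in_seconds(lines):
    """Return the time difference between each pair of consecutive lines."""
    times = [parse(line)["time"] for line in lines]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    for gap in gaps_in_seconds(LOG_LINES):
        print(f"{gap:.3f} s between consecutive messages")
```

The gaps come out at roughly 900 seconds, i.e. 15 minutes apart. That even spacing would be consistent with a rate-limited logger rather than four isolated allocation failures, but this is an inference from the timestamps, not something the ticket states.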
[jira] [Commented] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)
[ https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918975#comment-16918975 ]

Joseph Lynch commented on CASSANDRA-13938:
------------------------------------------

I might have cycles to tackle this shortly; if someone else has cycles first, please take it.

> Default repair is broken, crashes other nodes participating in repair (in trunk)
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-13938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13938
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Repair
> Reporter: Nate McCall
> Assignee: Jason Brown
> Priority: Urgent
> Fix For: 4.0-alpha
> Attachments: 13938.yaml, test.sh
>
> Running through a simple scenario to test some of the new repair features, I was not able to make a repair command work. Further, the exception seemed to trigger a nasty failure state that basically shuts down the netty connections for messaging *and* CQL on the nodes transferring back data to the node being repaired. The following steps reproduce this issue consistently.
>
> Cassandra stress profile (probably not necessary, but this one provides a really simple schema and consistent data shape):
> {noformat}
> keyspace: standard_long
> keyspace_definition: |
>   CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', 'replication_factor':3};
> table: test_data
> table_definition: |
>   CREATE TABLE test_data (
>     key text,
>     ts bigint,
>     val text,
>     PRIMARY KEY (key, ts)
>   ) WITH COMPACT STORAGE AND
>     CLUSTERING ORDER BY (ts DESC) AND
>     bloom_filter_fp_chance=0.01 AND
>     caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND
>     comment='' AND
>     dclocal_read_repair_chance=0.00 AND
>     gc_grace_seconds=864000 AND
>     read_repair_chance=0.00 AND
>     compaction={'class': 'SizeTieredCompactionStrategy'} AND
>     compression={'sstable_compression': 'LZ4Compressor'};
> columnspec:
>   - name: key
>     population: uniform(1..5000) # 50 million records available
>   - name: ts
>     cluster: gaussian(1..50) # Up to 50 inserts per record
>   - name: val
>     population: gaussian(128..1024) # varying size of value data
> insert:
>   partitions: fixed(1) # only one insert per batch for individual partitions
>   select: fixed(1)/1 # each insert comes in one at a time
>   batchtype: UNLOGGED
> queries:
>   single:
>     cql: select * from test_data where key = ? and ts = ? limit 1;
>   series:
>     cql: select key,ts,val from test_data where key = ? limit 10;
> {noformat}
> The commands to build and run:
> {noformat}
> ccm create 4_0_test -v git:trunk -n 3 -s
> ccm stress user profile=./histo-test-schema.yml ops\(insert=20,single=1,series=1\) duration=15s -rate threads=4
> # flush the memtable just to get everything on disk
> ccm node1 nodetool flush
> ccm node2 nodetool flush
> ccm node3 nodetool flush
> # disable hints for nodes 2 and 3
> ccm node2 nodetool disablehandoff
> ccm node3 nodetool disablehandoff
> # stop node1
> ccm node1 stop
> ccm stress user profile=./histo-test-schema.yml ops\(insert=20,single=1,series=1\) duration=45s -rate threads=4
> # wait 10 seconds
> ccm node1 start
> # Note that we are local to ccm's nodetool install 'cause repair preview is not reported yet
> node1/bin/nodetool repair --preview
> node1/bin/nodetool repair standard_long test_data
> {noformat}
> The error outputs from the last repair command follow. First, this is stdout from node1:
> {noformat}
> $ node1/bin/nodetool repair standard_long test_data
> objc[47876]: Class JavaLaunchHelper is implemented in both /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java (0x10274d4c0) and /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib (0x1047b64e0). One of the two will be used. Which one is undefined.
> [2017-10-05 14:31:52,425] Starting repair command #4 (7e1a9150-a98e-11e7-ad86-cbd2801b8de2), repairing keyspace standard_long with repair options (parallelism: parallel, primary range: false, incremental: true, job threads: 1, ColumnFamilies: [test_data], dataCenters: [], hosts: [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair: false)
> [2017-10-05 14:32:07,045] Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2 for range [(3074457345618258602,-9223372036854775808], (-9223372036854775808,-3074457345618258603], (-3074457345618258603,3074457345618258602]] failed with error Stream failed
> [2017-10-05 14:32:07,048] null
> [2017-10-05 14:32:07,050] Repair command #4 finished in 14 seconds
> error: Repair job has failed with the error message: [2017-10-05 14:32:07,048] null
> -- StackTrace --
> java.lang.RuntimeException: Repair
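The three ranges in the failed repair session quoted above are bounded by the tokens -9223372036854775808, -3074457345618258603 and 3074457345618258602. These look like evenly spaced tokens on a signed 64-bit ring for a three-node cluster, which is what one would expect from `ccm create ... -n 3`. A small sketch to check the arithmetic follows; the even-spacing formula and the function name are assumptions for illustration, not taken from the ticket.

```python
from typing import List


def evenly_spaced_tokens(node_count: int) -> List[int]:
    """Return node_count tokens spread evenly over the signed 64-bit range.

    Assumption: spacing starts at the minimum token (-2**63) and each node
    is offset by floor(2**64 / node_count).
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = 2**64 // node_count
    return [i * step - 2**63 for i in range(node_count)]


if __name__ == "__main__":
    for token in evenly_spaced_tokens(3):
        print(token)
```

For three nodes this reproduces exactly the three boundaries in the repair log, which suggests the session covered the whole ring (`# of ranges: 3`) rather than a subset.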
[jira] [Commented] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11
[ https://issues.apache.org/jira/browse/CASSANDRA-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918959#comment-16918959 ]

Joseph Lynch commented on CASSANDRA-15262:
------------------------------------------

This could slip to 4.0-beta if we had to, but it is going to be annoying for folks testing with TLS (it was for us).

> server_encryption_options is not backwards compatible with 3.11
> ---------------------------------------------------------------
>
> Key: CASSANDRA-15262
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15262
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Config
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Fix For: 4.0, 4.0-alpha
>
> The current `server_encryption_options` configuration options are as follows:
> {noformat}
> server_encryption_options:
>     # set to true for allowing secure incoming connections
>     enabled: false
>     # If enabled and optional are both set to true, encrypted and unencrypted connections are handled on the storage_port
>     optional: false
>     # if enabled, will open up an encrypted listening socket on ssl_storage_port. Should be used
>     # during upgrade to 4.0; otherwise, set to false.
>     enable_legacy_ssl_storage_port: false
>     # on outbound connections, determine which type of peers to securely connect to. 'enabled' must be set to true.
>     internode_encryption: none
>     keystore: conf/.keystore
>     keystore_password: cassandra
>     truststore: conf/.truststore
>     truststore_password: cassandra
>     # More advanced defaults below:
>     # protocol: TLS
>     # store_type: JKS
>     # cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
>     # require_client_auth: false
>     # require_endpoint_verification: false
> {noformat}
> A couple of issues here:
> 1. optional defaults to false, which will break existing TLS configurations for (from what I can tell) no particularly good reason
> 2. The provided protocol and cipher suites are not good ideas (in particular, encouraging anyone to use CBC ciphers is a bad plan)
>
> I propose that before the 4.0 cut we fix up server_encryption_options and even client_encryption_options:
> # Change the default {{optional}} setting to true. As the new Netty code intelligently decides to open a TLS connection or not, this is the more sensible default (saves operators a step while transitioning to TLS as well)
> # Update the defaults to what netty actually defaults to
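The concern about the commented-out `cipher_suites` default in the ticket above can be checked mechanically. A minimal sketch, assuming the cipher mode can be read off the `_CBC_` substring of the suite name; the function name is illustrative and the GCM suite is included only as a contrasting example.

```python
# Default cipher suites quoted in the ticket's yaml excerpt.
DEFAULT_CIPHER_SUITES = [
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
]


def cbc_suites(suites):
    """Return the suites whose name indicates CBC mode (name-based heuristic)."""
    return [suite for suite in suites if "_CBC_" in suite]


if __name__ == "__main__":
    flagged = cbc_suites(DEFAULT_CIPHER_SUITES)
    print(f"{len(flagged)} of {len(DEFAULT_CIPHER_SUITES)} default suites use CBC")
    # A GCM suite, for contrast, is not flagged.
    print(cbc_suites(["TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256"]))
```

Every suite in the quoted default is flagged, which matches the ticket's point that the defaults steer operators towards CBC.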
[jira] [Commented] (CASSANDRA-15294) Allow easy use of custom security providers
[ https://issues.apache.org/jira/browse/CASSANDRA-15294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918955#comment-16918955 ]

Joseph Lynch commented on CASSANDRA-15294:
------------------------------------------

Yes, I think after the alpha cuts I should have cycles to add this in; since it doesn't involve any backwards-incompatible API changes I can do it before beta. I'd like to add the configuration capability to 3.0/3.11/trunk if possible, but I think people might object to it being in 3.0 ... If no-one objects I'll just make patches for all three.

> Allow easy use of custom security providers
> -------------------------------------------
>
> Key: CASSANDRA-15294
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15294
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: Joseph Lynch
> Priority: Normal
>
> As more users are switching to using {{AES-GCM}} TLS they are increasingly running into extremely poor performance with the JDK implementations (e.g. [JDK-8046943|https://bugs.openjdk.java.net/browse/JDK-8046943]). It's not just TLS either; generally speaking, Java crypto can be really slow, including for example MD5 hashing, which powers our digests (CASSANDRA-14611).
>
> There have been a few community attempts to fix this via custom Java security providers, for example Google's [conscrypt|https://github.com/google/conscrypt] and recently Amazon's [ACCP|https://github.com/corretto/amazon-corretto-crypto-provider], which are basically portions of OpenSSL/BoringSSL that are statically linked in and exposed via JNI. These approaches are similar in spirit to what [netty-tcnative|https://github.com/netty/netty-tcnative] is doing for TLS in C* trunk.
>
> Since there may be tradeoffs to using various providers for various functions (e.g. {{conscrypt}} may be faster or slower than {{accp}} in certain use cases, and in other cases you may want to use JDK providers for ease of upgrading), it would be useful if Cassandra supported pluggable providers per use case. For example we could use {{conscrypt}} for TLS, {{accp}} for MD5 digesting, and the {{SUN}} provider for everything else. There is a small amount of JVM wiring that needs to be done for this and it could unlock 10-25% CPU capacity improvements.
>
> We can then use this pluggability to test different providers, and if one is strictly dominant we can just check that one in to libs and default to it.
[jira] [Updated] (CASSANDRA-15146) Transitional TLS server configuration options are overly complex
[ https://issues.apache.org/jira/browse/CASSANDRA-15146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15146:
------------------------------------
    Fix Version/s: 4.0-beta

> Transitional TLS server configuration options are overly complex
> ----------------------------------------------------------------
>
> Key: CASSANDRA-15146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15146
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Encryption, Local/Config
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Fix For: 4.0, 4.0-beta
>
> It appears as part of the port from transitional client TLS to transitional server TLS in CASSANDRA-10404 (the ability to switch a cluster to using {{internode_encryption}} without listening on two ports and without downtime) we carried the {{enabled}} setting over from the client implementation. I believe that the {{enabled}} option is redundant to {{internode_encryption}} and {{optional}} and it should therefore be removed prior to the 4.0 release where we will have to start respecting that interface.
>
> Current trunk yaml:
> {noformat}
> server_encryption_options:
>     # set to true for allowing secure incoming connections
>     enabled: false
>     # If enabled and optional are both set to true, encrypted and unencrypted connections are handled on the storage_port
>     optional: false
>     # if enabled, will open up an encrypted listening socket on ssl_storage_port. Should be used
>     # during upgrade to 4.0; otherwise, set to false.
>     enable_legacy_ssl_storage_port: false
>     # on outbound connections, determine which type of peers to securely connect to. 'enabled' must be set to true.
>     internode_encryption: none
>     keystore: conf/.keystore
>     keystore_password: cassandra
>     truststore: conf/.truststore
>     truststore_password: cassandra
> {noformat}
> I propose we eliminate {{enabled}} and just use {{optional}} and {{internode_encryption}} to determine the listener setup. I also propose we change the default of {{optional}} to true. We could also re-name {{optional}} since it's a new option but I think it's good to stay consistent with the client and use {{optional}}.
>
> ||optional||internode_encryption||description||
> |true|none|(default) No encryption is used but if a server reaches out with it we'll use it|
> |false|dc|Encryption is required for inter-dc communication, but not intra-dc|
> |false|all|Encryption is required for all communication|
> |false|none|We only listen for unencrypted connections|
> |true|dc|Encryption is used for inter-dc communication but is not required|
> |true|all|Encryption is used for all communication but is not required|
>
> From these states it is clear when we should be accepting TLS connections (all except for false and none) as well as when we must enforce it.
>
> To transition without downtime from an un-encrypted cluster to an encrypted cluster the user would do the following:
> 1. After adding valid truststores, change {{internode_encryption}} to the desired level of encryption (recommended {{all}}) and restart Cassandra
> 2. Change {{optional=false}} and restart Cassandra to enforce #1
>
> If {{optional}} defaulted to {{false}} as it does right now we'd need a third restart to first change {{optional}} to {{true}}, which given my understanding of the OptionalSslHandler isn't really relevant.
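The state table in the proposal above can be written down as a small decision function. This is only a sketch of the table as proposed in the ticket, not Cassandra's actual implementation; the function name and return shape are hypothetical, and only the three `internode_encryption` values that appear in the table are handled.

```python
def listener_policy(optional: bool, internode_encryption: str) -> dict:
    """Map the two proposed settings to listener behaviour, per the ticket's table.

    Returns whether inbound TLS connections are accepted and for which
    traffic TLS is mandatory ("nothing", "inter-dc" or "all").
    """
    if internode_encryption not in ("none", "dc", "all"):
        raise ValueError(f"unsupported internode_encryption: {internode_encryption!r}")

    # TLS is accepted in every state except optional=false with encryption=none.
    accepts_tls = optional or internode_encryption != "none"

    # TLS is only enforced when optional is false and some encryption level is set.
    if optional or internode_encryption == "none":
        required_for = "nothing"
    elif internode_encryption == "dc":
        required_for = "inter-dc"
    else:
        required_for = "all"

    return {"accepts_tls": accepts_tls, "tls_required_for": required_for}


if __name__ == "__main__":
    for optional in (True, False):
        for level in ("none", "dc", "all"):
            print(optional, level, listener_policy(optional, level))
```

Read against the two-step transition in the ticket: step 1 moves a node from `(True, "none")` to `(True, "all")`, where TLS is used but not yet required, and step 2 moves it to `(False, "all")`, where it is enforced.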
[jira] [Updated] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11
[ https://issues.apache.org/jira/browse/CASSANDRA-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15262:
------------------------------------
    Fix Version/s: 4.0-alpha
[jira] [Updated] (CASSANDRA-14764) Evaluate 12 Node Breaking Point, compression=none, encryption=none, coalescing=off
[ https://issues.apache.org/jira/browse/CASSANDRA-14764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-14764:
------------------------------------
    Fix Version/s: 4.0-beta

> Evaluate 12 Node Breaking Point, compression=none, encryption=none, coalescing=off
> ----------------------------------------------------------------------------------
>
> Key: CASSANDRA-14764
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14764
> Project: Cassandra
> Issue Type: Sub-task
> Components: Legacy/Streaming and Messaging
> Reporter: Joseph Lynch
> Assignee: Vinay Chella
> Priority: Normal
> Fix For: 4.0-beta
> Attachments: i-03341e1c52de6ea3e-after-queue-change.svg, i-07cd92e844d66d801-after-queue-bound.svg, i-07cd92e844d66d801-hint-play.svg, i-07cd92e844d66d801-uninlined-with-jvm-methods.svg, ttop.txt
>
> *Setup:*
> * Cassandra: 12 (2*6) node i3.xlarge AWS instance (4 cpu cores, 30GB ram) running cassandra trunk off of jasobrown/14503 jdd7ec5a2 (Jason's patched internode messaging branch) vs the same footprint running 3.0.17
> * Two datacenters with 100ms latency between them
> * No compression, encryption, or coalescing turned on
>
> *Test #1:*
> ndbench sent 1.5k QPS at a coordinator level to one datacenter (RF=3*2 = 6 so 3k global replica QPS) of 4kb single partition BATCH mutations at LOCAL_ONE. This represents about 250 QPS per coordinator in the first datacenter or 60 QPS per core. The goal was to observe P99 write and read latencies under various QPS.
>
> *Result:*
> The good news is since the CASSANDRA-14503 changes, instead of keeping the mutations on heap we put the message into hints instead and don't run out of memory. The bad news is that the {{MessagingService-NettyOutbound-Thread's}} would occasionally enter a degraded state where they would just spin on a core. I've attached flame graphs showing the CPU state as [~jasobrown] applied fixes to the {{OutboundMessagingConnection}} class.
>
> *Follow Ups:*
> [~jasobrown] has committed a number of fixes onto his {{jasobrown/14503-collab}} branch including:
> 1. Limiting the amount of time spent dequeuing messages if they are expired (previously if messages entered the queue faster than we could dequeue them we'd just infinite loop on the consumer side)
> 2. Don't call {{dequeueMessages}} from within {{dequeueMessages}} created callbacks.
>
> We're continuing to use CPU flamegraphs to figure out where we're looping and fixing bugs as we find them.
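The per-coordinator and per-core load figures quoted in Test #1 above follow from the setup numbers. A short worked example, assuming the 1.5k QPS is spread evenly over the six nodes of the targeted datacenter; the variable and function names are illustrative.

```python
# Figures taken from the ticket's setup and Test #1 description.
TOTAL_COORDINATOR_QPS = 1500   # "1.5k QPS at a coordinator level to one datacenter"
TOTAL_NODES = 12               # "12 (2*6) node"
DATACENTERS = 2
CORES_PER_NODE = 4             # i3.xlarge: "4 cpu cores"


def per_node_and_core_qps(total_qps, total_nodes, datacenters, cores_per_node):
    """Return (QPS per coordinator, QPS per core), assuming an even spread
    of the load across the nodes of a single datacenter."""
    nodes_in_dc = total_nodes // datacenters
    per_coordinator = total_qps / nodes_in_dc
    per_core = per_coordinator / cores_per_node
    return per_coordinator, per_core


if __name__ == "__main__":
    per_coordinator, per_core = per_node_and_core_qps(
        TOTAL_COORDINATOR_QPS, TOTAL_NODES, DATACENTERS, CORES_PER_NODE
    )
    print(f"{per_coordinator:.1f} QPS per coordinator, {per_core:.1f} QPS per core")
```

This gives 250 QPS per coordinator and 62.5 QPS per core, in line with the "about 250" and "60" figures stated in the ticket.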
[jira] [Updated] (CASSANDRA-14747) Evaluate 200 node, compression=none, encryption=none, coalescing=off
[ https://issues.apache.org/jira/browse/CASSANDRA-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-14747:
------------------------------------
    Fix Version/s: 4.0-beta

> Evaluate 200 node, compression=none, encryption=none, coalescing=off
> --------------------------------------------------------------------
>
> Key: CASSANDRA-14747
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14747
> Project: Cassandra
> Issue Type: Sub-task
> Components: Legacy/Testing
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Fix For: 4.0-beta
> Attachments: 3.0.17-QPS.png, 4.0.1-QPS.png, 4.0.11-after-jolynch-tweaks.svg, 4.0.12-after-unconditional-flush.svg, 4.0.15-after-sndbuf-fix.svg, 4.0.7-before-my-changes.svg, 4.0_errors_showing_heap_pressure.txt, 4.0_heap_histogram_showing_many_MessageOuts.txt, i-0ed2acd2dfacab7c1-after-looping-fixes.svg, trunk_14503_v2_cpuflamegraph.svg, trunk_vs_3.0.17_latency_under_load.png, ttop_NettyOutbound-Thread_spinning.txt, useast1c-i-0e1ddfe8b2f769060-mutation-flame.svg, useast1e-i-08635fa1631601538_flamegraph_96node.svg, useast1e-i-08635fa1631601538_ttop_netty_outbound_threads_96nodes, useast1e-i-08635fa1631601538_uninlinedcpuflamegraph.0_96node_60sec_profile.svg
>
> Tracks evaluating a 200 node cluster with all internode settings off (no compression, no encryption, no coalescing).
[jira] [Updated] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid
[ https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-14746:
------------------------------------
    Fix Version/s: 4.0-beta

> Ensure Netty Internode Messaging Refactor is Solid
> --------------------------------------------------
>
> Key: CASSANDRA-14746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Streaming and Messaging
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Labels: 4.0-QA
> Fix For: 4.0, 4.0-beta
>
> Before we release 4.0 let's ensure that the internode messaging refactor is 100% solid. As internode messaging is naturally used in many code paths and widely configurable we have a large number of cluster configurations and test configurations that must be vetted.
>
> We plan to vary the following:
> * Version of Cassandra 3.0.17 vs 4.0-alpha
> * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes
> * Client request rates varying between 1k QPS and 100k QPS of varying sizes and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...)
> * Internode compression
> * Internode SSL (as well as openssl vs jdk)
> * Internode Coalescing options
>
> We are looking to measure the following as appropriate:
> * Latency distributions of reads and writes (lower is better)
> * Scaling limit, aka maximum throughput before violating p99 latency deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% writes, 100% reads and 50-50 writes+reads (higher is better)
> * Thread counts (lower is better)
> * Context switches (lower is better)
> * On-CPU time of tasks (higher periods without context switch is better)
> * GC allocation rates / throughput for a fixed size heap (lower allocation better)
> * Streaming recovery time for a single node failure, i.e. can Cassandra saturate the NIC
>
> The goal is that 4.0 should have better latency, more throughput, fewer threads, fewer context switches, less GC allocation, and faster recovery time. I'm putting Jason Brown as the reviewer since he implemented most of the internode refactor.
>
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown
[jira] [Updated] (CASSANDRA-15181) Ensure Nodes can Start and Stop
[ https://issues.apache.org/jira/browse/CASSANDRA-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15181:
------------------------------------
    Fix Version/s: 4.0-beta

> Ensure Nodes can Start and Stop
> -------------------------------
>
> Key: CASSANDRA-15181
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15181
> Project: Cassandra
> Issue Type: Sub-task
> Components: Legacy/Streaming and Messaging, Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Vinay Chella
> Priority: High
> Fix For: 4.0-beta
>
> Let's load a cluster up with data and start killing nodes. We can do hard failures (node terminations) and soft failures (process kills). We plan to observe the following:
> * Can nodes successfully bootstrap?
> * How long does it take to bootstrap?
> * What are the effects of TLS on and off (e.g. on stream time)?
> * Are hints properly played after a node restart?
> * Do nodes properly shut down and start back up?
[jira] [Updated] (CASSANDRA-14688) Update protocol spec and class level doc with protocol checksumming details
[ https://issues.apache.org/jira/browse/CASSANDRA-14688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-14688:
------------------------------------
    Fix Version/s: 4.0-beta

> Update protocol spec and class level doc with protocol checksumming details
> ---------------------------------------------------------------------------
>
> Key: CASSANDRA-14688
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14688
> Project: Cassandra
> Issue Type: Task
> Components: Legacy/Documentation and Website
> Reporter: Sam Tunnicliffe
> Assignee: Sam Tunnicliffe
> Priority: Normal
> Fix For: 4.0, 4.0-beta
>
> CASSANDRA-13304 provides an option to add checksumming to the frame body of native protocol messages. The native protocol spec needs to be updated to reflect this ASAP. We should also verify that the javadoc comments describing the on-wire format in {{o.a.c.transport.frame.checksum.ChecksummingTransformer}} are up to date.
[jira] [Updated] (CASSANDRA-15228) Commit Log should not use sync markers
[ https://issues.apache.org/jira/browse/CASSANDRA-15228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15228:
------------------------------------
    Fix Version/s: 4.0-alpha

> Commit Log should not use sync markers
> --------------------------------------
>
> Key: CASSANDRA-15228
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15228
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Commit Log
> Reporter: Benedict
> Priority: Normal
> Fix For: 4.0, 4.0-alpha
>
> The sync markers existed to permit file re-use. Since we no longer re-use files, they no longer provide any value. However, they _can_ corrupt the commit log for replay in the event of a process crash. Before we release 4.0, we should ideally remove the sync markers entirely.
[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements
[ https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16918949#comment-16918949 ] Joseph Lynch commented on CASSANDRA-14801: -- [~benedict] do you think this should block the first alpha or it can wait for beta? > calculatePendingRanges no longer safe for multiple adjacent range movements > --- > > Key: CASSANDRA-14801 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14801 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Coordination, Legacy/Distributed Metadata >Reporter: Benedict >Priority: Normal > Fix For: 4.0 > > > Correctness depended upon the narrowing to a {{Set}}, > which we no longer do - we maintain a collection of all {{Replica}}. Our > {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result > contain the same endpoint multiple times; and our {{EndpointsForToken}} > obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, > resulting in cluster-wide failures for writes to the affected token ranges > for the duration of the range movement. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-10190) Python 3 support for cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-10190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-10190: - Fix Version/s: 4.0-alpha > Python 3 support for cqlsh > -- > > Key: CASSANDRA-10190 > URL: https://issues.apache.org/jira/browse/CASSANDRA-10190 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Tools >Reporter: Andrew Pennebaker >Assignee: Patrick Bannister >Priority: Normal > Labels: cqlsh > Fix For: 4.0-alpha > > Attachments: coverage_notes.txt > > > Users who operate in a Python 3 environment may have trouble launching cqlsh. > Could we please update cqlsh's syntax to run in Python 3? > As a workaround, users can set up pyenv and cd to a directory with a > .python-version containing "2.7". But it would be nice if cqlsh supported > modern Python versions out of the box. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)
[ https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-13938: - Fix Version/s: (was: 4.0) > Default repair is broken, crashes other nodes participating in repair (in > trunk) > > > Key: CASSANDRA-13938 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13938 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Nate McCall >Assignee: Jason Brown >Priority: Urgent > Attachments: 13938.yaml, test.sh > > > Running through a simple scenario to test some of the new repair features, I > was not able to make a repair command work. Further, the exception seemed to > trigger a nasty failure state that basically shuts down the netty connections > for messaging *and* CQL on the nodes transferring back data to the node being > repaired. The following steps reproduce this issue consistently. > Cassandra stress profile (probably not necessary, but this one provides a > really simple schema and consistent data shape): > {noformat} > keyspace: standard_long > keyspace_definition: | > CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', > 'replication_factor':3}; > table: test_data > table_definition: | > CREATE TABLE test_data ( > key text, > ts bigint, > val text, > PRIMARY KEY (key, ts) > ) WITH COMPACT STORAGE AND > CLUSTERING ORDER BY (ts DESC) AND > bloom_filter_fp_chance=0.01 AND > caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND > comment='' AND > dclocal_read_repair_chance=0.00 AND > gc_grace_seconds=864000 AND > read_repair_chance=0.00 AND > compaction={'class': 'SizeTieredCompactionStrategy'} AND > compression={'sstable_compression': 'LZ4Compressor'}; > columnspec: > - name: key > population: uniform(1..5000) # 50 million records available > - name: ts > cluster: gaussian(1..50) # Up to 50 inserts per record > - name: val > population: gaussian(128..1024) # varrying size of value data > insert: > partitions: 
fixed(1) # only one insert per batch for individual partitions > select: fixed(1)/1 # each insert comes in one at a time > batchtype: UNLOGGED > queries: > single: > cql: select * from test_data where key = ? and ts = ? limit 1; > series: > cql: select key,ts,val from test_data where key = ? limit 10; > {noformat} > The commands to build and run: > {noformat} > ccm create 4_0_test -v git:trunk -n 3 -s > ccm stress user profile=./histo-test-schema.yml > ops\(insert=20,single=1,series=1\) duration=15s -rate threads=4 > # flush the memtable just to get everything on disk > ccm node1 nodetool flush > ccm node2 nodetool flush > ccm node3 nodetool flush > # disable hints for nodes 2 and 3 > ccm node2 nodetool disablehandoff > ccm node3 nodetool disablehandoff > # stop node1 > ccm node1 stop > ccm stress user profile=./histo-test-schema.yml > ops\(insert=20,single=1,series=1\) duration=45s -rate threads=4 > # wait 10 seconds > ccm node1 start > # Note that we are local to ccm's nodetool install 'cause repair preview is > not reported yet > node1/bin/nodetool repair --preview > node1/bin/nodetool repair standard_long test_data > {noformat} > The error outputs from the last repair command follow. First, this is stdout > from node1: > {noformat} > $ node1/bin/nodetool repair standard_long test_data > objc[47876]: Class JavaLaunchHelper is implemented in both > /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java > (0x10274d4c0) and > /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib > (0x1047b64e0). One of the two will be used. Which one is undefined. 
> [2017-10-05 14:31:52,425] Starting repair command #4 > (7e1a9150-a98e-11e7-ad86-cbd2801b8de2), repairing keyspace standard_long with > repair options (parallelism: parallel, primary range: false, incremental: > true, job threads: 1, ColumnFamilies: [test_data], dataCenters: [], hosts: > [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair: > false) > [2017-10-05 14:32:07,045] Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2 > for range [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]] failed with error Stream failed > [2017-10-05 14:32:07,048] null > [2017-10-05 14:32:07,050] Repair command #4 finished in 14 seconds > error: Repair job has failed with the error message: [2017-10-05 > 14:32:07,048] null > -- StackTrace -- > java.lang.RuntimeException: Repair job has failed with the error message: > [2017-10-05 14:32:07,048] null > at
[jira] [Updated] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)
[ https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-13938: - Fix Version/s: 4.0 > Default repair is broken, crashes other nodes participating in repair (in > trunk) > > > Key: CASSANDRA-13938 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13938 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Nate McCall >Assignee: Jason Brown >Priority: Urgent > Fix For: 4.0 > > Attachments: 13938.yaml, test.sh > > > Running through a simple scenario to test some of the new repair features, I > was not able to make a repair command work. Further, the exception seemed to > trigger a nasty failure state that basically shuts down the netty connections > for messaging *and* CQL on the nodes transferring back data to the node being > repaired. The following steps reproduce this issue consistently. > Cassandra stress profile (probably not necessary, but this one provides a > really simple schema and consistent data shape): > {noformat} > keyspace: standard_long > keyspace_definition: | > CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', > 'replication_factor':3}; > table: test_data > table_definition: | > CREATE TABLE test_data ( > key text, > ts bigint, > val text, > PRIMARY KEY (key, ts) > ) WITH COMPACT STORAGE AND > CLUSTERING ORDER BY (ts DESC) AND > bloom_filter_fp_chance=0.01 AND > caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND > comment='' AND > dclocal_read_repair_chance=0.00 AND > gc_grace_seconds=864000 AND > read_repair_chance=0.00 AND > compaction={'class': 'SizeTieredCompactionStrategy'} AND > compression={'sstable_compression': 'LZ4Compressor'}; > columnspec: > - name: key > population: uniform(1..5000) # 50 million records available > - name: ts > cluster: gaussian(1..50) # Up to 50 inserts per record > - name: val > population: gaussian(128..1024) # varrying size of value data > insert: > 
partitions: fixed(1) # only one insert per batch for individual partitions > select: fixed(1)/1 # each insert comes in one at a time > batchtype: UNLOGGED > queries: > single: > cql: select * from test_data where key = ? and ts = ? limit 1; > series: > cql: select key,ts,val from test_data where key = ? limit 10; > {noformat} > The commands to build and run: > {noformat} > ccm create 4_0_test -v git:trunk -n 3 -s > ccm stress user profile=./histo-test-schema.yml > ops\(insert=20,single=1,series=1\) duration=15s -rate threads=4 > # flush the memtable just to get everything on disk > ccm node1 nodetool flush > ccm node2 nodetool flush > ccm node3 nodetool flush > # disable hints for nodes 2 and 3 > ccm node2 nodetool disablehandoff > ccm node3 nodetool disablehandoff > # stop node1 > ccm node1 stop > ccm stress user profile=./histo-test-schema.yml > ops\(insert=20,single=1,series=1\) duration=45s -rate threads=4 > # wait 10 seconds > ccm node1 start > # Note that we are local to ccm's nodetool install 'cause repair preview is > not reported yet > node1/bin/nodetool repair --preview > node1/bin/nodetool repair standard_long test_data > {noformat} > The error outputs from the last repair command follow. First, this is stdout > from node1: > {noformat} > $ node1/bin/nodetool repair standard_long test_data > objc[47876]: Class JavaLaunchHelper is implemented in both > /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java > (0x10274d4c0) and > /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib > (0x1047b64e0). One of the two will be used. Which one is undefined. 
> [2017-10-05 14:31:52,425] Starting repair command #4 > (7e1a9150-a98e-11e7-ad86-cbd2801b8de2), repairing keyspace standard_long with > repair options (parallelism: parallel, primary range: false, incremental: > true, job threads: 1, ColumnFamilies: [test_data], dataCenters: [], hosts: > [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair: > false) > [2017-10-05 14:32:07,045] Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2 > for range [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]] failed with error Stream failed > [2017-10-05 14:32:07,048] null > [2017-10-05 14:32:07,050] Repair command #4 finished in 14 seconds > error: Repair job has failed with the error message: [2017-10-05 > 14:32:07,048] null > -- StackTrace -- > java.lang.RuntimeException: Repair job has failed with the error message: > [2017-10-05 14:32:07,048] null > at
[jira] [Updated] (CASSANDRA-13938) Default repair is broken, crashes other nodes participating in repair (in trunk)
[ https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-13938: - Fix Version/s: (was: 4.x) 4.0 > Default repair is broken, crashes other nodes participating in repair (in > trunk) > > > Key: CASSANDRA-13938 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13938 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Nate McCall >Assignee: Jason Brown >Priority: Urgent > Fix For: 4.0 > > Attachments: 13938.yaml, test.sh > > > Running through a simple scenario to test some of the new repair features, I > was not able to make a repair command work. Further, the exception seemed to > trigger a nasty failure state that basically shuts down the netty connections > for messaging *and* CQL on the nodes transferring back data to the node being > repaired. The following steps reproduce this issue consistently. > Cassandra stress profile (probably not necessary, but this one provides a > really simple schema and consistent data shape): > {noformat} > keyspace: standard_long > keyspace_definition: | > CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy', > 'replication_factor':3}; > table: test_data > table_definition: | > CREATE TABLE test_data ( > key text, > ts bigint, > val text, > PRIMARY KEY (key, ts) > ) WITH COMPACT STORAGE AND > CLUSTERING ORDER BY (ts DESC) AND > bloom_filter_fp_chance=0.01 AND > caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND > comment='' AND > dclocal_read_repair_chance=0.00 AND > gc_grace_seconds=864000 AND > read_repair_chance=0.00 AND > compaction={'class': 'SizeTieredCompactionStrategy'} AND > compression={'sstable_compression': 'LZ4Compressor'}; > columnspec: > - name: key > population: uniform(1..5000) # 50 million records available > - name: ts > cluster: gaussian(1..50) # Up to 50 inserts per record > - name: val > population: gaussian(128..1024) # varrying size of value data > 
insert: > partitions: fixed(1) # only one insert per batch for individual partitions > select: fixed(1)/1 # each insert comes in one at a time > batchtype: UNLOGGED > queries: > single: > cql: select * from test_data where key = ? and ts = ? limit 1; > series: > cql: select key,ts,val from test_data where key = ? limit 10; > {noformat} > The commands to build and run: > {noformat} > ccm create 4_0_test -v git:trunk -n 3 -s > ccm stress user profile=./histo-test-schema.yml > ops\(insert=20,single=1,series=1\) duration=15s -rate threads=4 > # flush the memtable just to get everything on disk > ccm node1 nodetool flush > ccm node2 nodetool flush > ccm node3 nodetool flush > # disable hints for nodes 2 and 3 > ccm node2 nodetool disablehandoff > ccm node3 nodetool disablehandoff > # stop node1 > ccm node1 stop > ccm stress user profile=./histo-test-schema.yml > ops\(insert=20,single=1,series=1\) duration=45s -rate threads=4 > # wait 10 seconds > ccm node1 start > # Note that we are local to ccm's nodetool install 'cause repair preview is > not reported yet > node1/bin/nodetool repair --preview > node1/bin/nodetool repair standard_long test_data > {noformat} > The error outputs from the last repair command follow. First, this is stdout > from node1: > {noformat} > $ node1/bin/nodetool repair standard_long test_data > objc[47876]: Class JavaLaunchHelper is implemented in both > /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java > (0x10274d4c0) and > /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib > (0x1047b64e0). One of the two will be used. Which one is undefined. 
> [2017-10-05 14:31:52,425] Starting repair command #4 > (7e1a9150-a98e-11e7-ad86-cbd2801b8de2), repairing keyspace standard_long with > repair options (parallelism: parallel, primary range: false, incremental: > true, job threads: 1, ColumnFamilies: [test_data], dataCenters: [], hosts: > [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair: > false) > [2017-10-05 14:32:07,045] Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2 > for range [(3074457345618258602,-9223372036854775808], > (-9223372036854775808,-3074457345618258603], > (-3074457345618258603,3074457345618258602]] failed with error Stream failed > [2017-10-05 14:32:07,048] null > [2017-10-05 14:32:07,050] Repair command #4 finished in 14 seconds > error: Repair job has failed with the error message: [2017-10-05 > 14:32:07,048] null > -- StackTrace -- > java.lang.RuntimeException: Repair job has failed with the error message: > [2017-10-05 14:32:07,048] null > at
[jira] [Updated] (CASSANDRA-15146) Transitional TLS server configuration options are overly complex
[ https://issues.apache.org/jira/browse/CASSANDRA-15146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15146: - Fix Version/s: 4.0 > Transitional TLS server configuration options are overly complex > > > Key: CASSANDRA-15146 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15146 > Project: Cassandra > Issue Type: Bug > Components: Feature/Encryption, Local/Config >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Fix For: 4.0 > > > It appears as part of the port from transitional client TLS to transitional > server TLS in CASSANDRA-10404 (the ability to switch a cluster to using > {{internode_encryption}} without listening on two ports and without downtime) > we carried the {{enabled}} setting over from the client implementation. I > believe that the {{enabled}} option is redundant to {{internode_encryption}} > and {{optional}} and it should therefore be removed prior to the 4.0 release > where we will have to start respecting that interface. > Current trunk yaml: > {noformat} > server_encryption_options: > > # set to true for allowing secure incoming connections > > enabled: false > > # If enabled and optional are both set to true, encrypted and unencrypted > connections are handled on the storage_port > optional: false > > > > > # if enabled, will open up an encrypted listening socket on > ssl_storage_port. Should be used > # during upgrade to 4.0; otherwise, set to false. > > enable_legacy_ssl_storage_port: false > > # on outbound connections, determine which type of peers to securely > connect to. 'enabled' must be set to true. > internode_encryption: none > > keystore: conf/.keystore > > keystore_password: cassandra > > truststore: conf/.truststore > > truststore_password: cassandra > {noformat} > I propose we eliminate {{enabled}} and just use {{optional}} and > {{internode_encryption}} to determine the listener setup. I also propose we > change the default of {{optional}} to true. 
We could also rename {{optional}} since it's a new option, but I think it's good to stay consistent with the client and use {{optional}}.
>
> ||optional||internode_encryption||description||
> |true|none|(default) No encryption is used but if a server reaches out with it we'll use it|
> |false|dc|Encryption is required for inter-dc communication, but not intra-dc|
> |false|all|Encryption is required for all communication|
> |false|none|We only listen for unencrypted connections|
> |true|dc|Encryption is used for inter-dc communication but is not required|
> |true|all|Encryption is used for all communication but is not required|
>
> From these states it is clear when we should be accepting TLS connections (all except for false and none) as well as when we must enforce it.
>
> To transition without downtime from an unencrypted cluster to an encrypted cluster the user would do the following:
> 1. After adding valid truststores, change {{internode_encryption}} to the desired level of encryption (recommended {{all}}) and restart Cassandra
> 2. Change {{optional=false}} and restart Cassandra to enforce #1
>
> If {{optional}} defaulted to {{false}} as it does right now we'd need a third restart to first change {{optional}} to {{true}}, which given my understanding of the OptionalSslHandler isn't really relevant.
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
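The state table proposed in CASSANDRA-15146 can be read as a small decision function. The sketch below is only an illustration of that table, not Cassandra code: the names `ListenerPolicy` and `listener_policy` are hypothetical, and it covers only the three {{internode_encryption}} values that appear in the table ({{none}}, {{dc}}, {{all}}).

```python
from typing import NamedTuple


class ListenerPolicy(NamedTuple):
    """Hypothetical summary of one row of the proposed state table."""
    accept_tls: bool      # should the listener accept encrypted connections?
    required_scope: str   # where TLS must be enforced: "none", "dc" or "all"


def listener_policy(optional: bool, internode_encryption: str) -> ListenerPolicy:
    """Map (optional, internode_encryption) to the behaviour in the table."""
    if internode_encryption not in ("none", "dc", "all"):
        raise ValueError(f"unsupported value: {internode_encryption!r}")
    # TLS is accepted in every state except optional=false + none.
    accept_tls = optional or internode_encryption != "none"
    # With optional=true nothing is enforced; otherwise the configured
    # level of internode_encryption is the scope that is required.
    required_scope = "none" if optional else internode_encryption
    return ListenerPolicy(accept_tls, required_scope)


if __name__ == "__main__":
    for optional in (True, False):
        for level in ("none", "dc", "all"):
            print(optional, level, listener_policy(optional, level))
```

Under this reading, the two-restart transition in the ticket moves a node from `listener_policy(True, "none")` to `listener_policy(True, "all")` and finally to `listener_policy(False, "all")`.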
[jira] [Updated] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11
[ https://issues.apache.org/jira/browse/CASSANDRA-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15262: - Fix Version/s: 4.0 > server_encryption_options is not backwards compatible with 3.11 > --- > > Key: CASSANDRA-15262 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15262 > Project: Cassandra > Issue Type: Bug > Components: Local/Config >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Fix For: 4.0 > > > The current `server_encryption_options` configuration options are as follows: > {noformat} > server_encryption_options: > # set to true for allowing secure incoming connections > enabled: false > # If enabled and optional are both set to true, encrypted and unencrypted > connections are handled on the storage_port > optional: false > # if enabled, will open up an encrypted listening socket on > ssl_storage_port. Should be used > # during upgrade to 4.0; otherwise, set to false. > enable_legacy_ssl_storage_port: false > # on outbound connections, determine which type of peers to securely > connect to. 'enabled' must be set to true. > internode_encryption: none > keystore: conf/.keystore > keystore_password: cassandra > truststore: conf/.truststore > truststore_password: cassandra > # More advanced defaults below: > # protocol: TLS > # store_type: JKS > # cipher_suites: > [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA] > # require_client_auth: false > # require_endpoint_verification: false > {noformat} > A couple of issues here: > 1. optional defaults to false, which will break existing TLS configurations > for (from what I can tell) no particularly good reason > 2. 
The provided protocol and cipher suites are not good ideas (in particular, encouraging anyone to use CBC ciphers is a bad plan).
>
> I propose that before the 4.0 cut we fix up server_encryption_options and even client_encryption_options:
> # Change the default {{optional}} setting to true. As the new Netty code intelligently decides whether to open a TLS connection, this is the more sensible default (it also saves operators a step while transitioning to TLS).
> # Update the defaults to what Netty actually defaults to.
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
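To make the CBC concern in CASSANDRA-15262 concrete, here is a minimal sketch that flags CBC-mode suites by name in the commented-out `cipher_suites` default quoted in the ticket. The helper `cbc_suites` is a hypothetical name, and matching on the `_CBC_` substring is an assumption based on the standard TLS suite naming scheme, not something Cassandra does itself.

```python
# Commented-out cipher_suites default quoted in the ticket.
DEFAULT_CIPHER_SUITES = [
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
]


def cbc_suites(suites):
    """Return the suites whose name marks them as CBC mode (hypothetical helper)."""
    return [name for name in suites if "_CBC_" in name]


if __name__ == "__main__":
    flagged = cbc_suites(DEFAULT_CIPHER_SUITES)
    print(f"{len(flagged)} of {len(DEFAULT_CIPHER_SUITES)} listed suites use CBC")
```

Every suite in the quoted default is flagged, which is the point the ticket makes about the documented defaults.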
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Fix Version/s: 4.0 > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Labels: 4.0-QA > Fix For: 4.0 > > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > cassandra_comparative_performance_all_flamegraphs.html, > image-2019-08-06-14-20-25-140.png, odd_netty_jdk_tls_cpu_usage.png, > trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, > trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, > trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, > trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, > trunk_LQ_14400cRPS-14400cWPS.svg, trunk_LQ_21600cRPS-14400cWPS.svg, > trunk_Q_21600cRPS-7200cWPS.svg, trunk_allocation_Q_21k_cRPS.svg, > trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, > trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, > trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, > 
trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png, write_scaling_local_one_summary.png, write_scaling_lq_eq_summary.png
>
> Tracks evaluating a 192-node cluster with compression and encryption on.
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
>
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15294) Allow easy use of custom security providers
[ https://issues.apache.org/jira/browse/CASSANDRA-15294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph Lynch updated CASSANDRA-15294:
Impacts: Security (was: None)
> Allow easy use of custom security providers
>
> Key: CASSANDRA-15294
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15294
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: Joseph Lynch
> Priority: Normal
>
> As more users are switching to using {{AES-GCM}} TLS they are increasingly running into extremely poor performance with the JDK implementations (e.g. [JDK-8046943|https://bugs.openjdk.java.net/browse/JDK-8046943]). It's not just TLS either: generally speaking, Java crypto can be really slow, including, for example, the MD5 hashing that powers our digests (CASSANDRA-14611).
> There have been a few community attempts to fix this via custom Java security providers, for example Google's [conscrypt|https://github.com/google/conscrypt] and recently Amazon's [ACCP|https://github.com/corretto/amazon-corretto-crypto-provider], which are basically portions of OpenSSL/BoringSSL that are statically linked in and exposed via JNI. These approaches are similar in spirit to what [netty-tcnative|https://github.com/netty/netty-tcnative] is doing for TLS in C* trunk.
> Since there may be tradeoffs to using various providers for various functions (e.g. {{conscrypt}} may be faster or slower than {{accp}} in certain use cases, and in other cases you may want to use JDK providers for ease of upgrading), it would be useful if Cassandra supported pluggable providers per use case. For example, we could use {{conscrypt}} for TLS, {{accp}} for MD5 digesting, and the {{SUN}} provider for everything else. There is a small amount of JVM wiring that needs to be done for this and it could unlock 10-25% CPU capacity improvements.
> We can then use this pluggability to test different providers, and if one is strictly dominant we can just check that one into libs and default to it.
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15294) Allow easy use of custom security providers
Joseph Lynch created CASSANDRA-15294:
Summary: Allow easy use of custom security providers
Key: CASSANDRA-15294
URL: https://issues.apache.org/jira/browse/CASSANDRA-15294
Project: Cassandra
Issue Type: Improvement
Components: Local/Config
Reporter: Joseph Lynch

As more users are switching to using {{AES-GCM}} TLS they are increasingly running into extremely poor performance with the JDK implementations (e.g. [JDK-8046943|https://bugs.openjdk.java.net/browse/JDK-8046943]). It's not just TLS either: generally speaking, Java crypto can be really slow, including, for example, the MD5 hashing that powers our digests (CASSANDRA-14611).

There have been a few community attempts to fix this via custom Java security providers, for example Google's [conscrypt|https://github.com/google/conscrypt] and recently Amazon's [ACCP|https://github.com/corretto/amazon-corretto-crypto-provider], which are basically portions of OpenSSL/BoringSSL that are statically linked in and exposed via JNI. These approaches are similar in spirit to what [netty-tcnative|https://github.com/netty/netty-tcnative] is doing for TLS in C* trunk.

Since there may be tradeoffs to using various providers for various functions (e.g. {{conscrypt}} may be faster or slower than {{accp}} in certain use cases, and in other cases you may want to use JDK providers for ease of upgrading), it would be useful if Cassandra supported pluggable providers per use case. For example, we could use {{conscrypt}} for TLS, {{accp}} for MD5 digesting, and the {{SUN}} provider for everything else. There is a small amount of JVM wiring that needs to be done for this and it could unlock 10-25% CPU capacity improvements.

We can then use this pluggability to test different providers, and if one is strictly dominant we can just check that one into libs and default to it.
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
-------------------------------------
    Labels: 4.0-QA  (was: )

> Evaluate 200 node, compression=on, encryption=all
> -------------------------------------------------
>
>                 Key: CASSANDRA-15175
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Test/benchmark
>            Reporter: Joseph Lynch
>            Assignee: Joseph Lynch
>            Priority: Normal
>              Labels: 4.0-QA
>         Attachments: 30x_14400cRPS-14400cWPS.svg,
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png,
> cassandra_comparative_performance_all_flamegraphs.html,
> image-2019-08-06-14-20-25-140.png, odd_netty_jdk_tls_cpu_usage.png,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg,
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png,
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg,
> trunk_LQ_14400cRPS-14400cWPS.svg, trunk_LQ_21600cRPS-14400cWPS.svg,
> trunk_Q_21600cRPS-7200cWPS.svg, trunk_allocation_Q_21k_cRPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png,
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png,
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png,
> trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png,
> trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png,
> trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png,
> trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png,
> trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png,
> trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png,
> trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png, write_scaling_local_one_summary.png,
> write_scaling_lq_eq_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:

--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
-------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Open)

I posted our final analysis. We still need to do more testing, but I think that can be done at smaller scales, and we have enough follow-ups from this test as it is.
[jira] [Updated] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11
[ https://issues.apache.org/jira/browse/CASSANDRA-15262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15262:
-------------------------------------
         Severity: Low
       Complexity: Low Hanging Fruit
    Discovered By: Performance Regression Test
     Bug Category: Parent values: Correctness(12982)Level 1 values: Semantic Failure(12988)
           Status: Open  (was: Triage Needed)

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15214) OOMs caught and not rethrown
[ https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901555#comment-16901555 ]

Joseph Lynch commented on CASSANDRA-15214:
------------------------------------------

[~yifanc] If you are OK with it, I can add your test cases to [jvmquake|https://github.com/Netflix-Skunkworks/jvmquake/tree/master/tests] to ensure it handles all edge cases. For what it's worth, jvmquake is a strict superset of jvmkill, and I wouldn't advocate using jvmkill (I'm biased, though). In my production experience, jvmquake actually detects the GC death spirals that C* runs into, while jvmkill simply doesn't work because C* doesn't actually go OOM; it just death spirals. See the "hard oom" [test cases|https://github.com/Netflix-Skunkworks/jvmquake/blob/master/tests/test_hard_ooms.py] for an example where jvmkill won't work while jvmquake will.

> OOMs caught and not rethrown
> ----------------------------
>
>                 Key: CASSANDRA-15214
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client, Messaging/Internode
>            Reporter: Benedict
>            Priority: Normal
>             Fix For: 4.0
>
>         Attachments: oom-experiments.zip
>
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, so presently there is no way to ensure that an OOM reaches the JVM handler to trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a single thread spawned at startup that waits for any exceptions we must propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed future proof approach, it may be worth paying the cost of a single thread.
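The "single thread spawned at startup that waits for any exceptions we must propagate" idea from the CASSANDRA-15214 description can be sketched roughly as follows. This is only a language-neutral illustration of the hand-off pattern written in Python, not JVM or Cassandra code: `FatalErrorPropagator`, `guarded`, and `demo` are hypothetical names, and `MemoryError` merely stands in for a JVM `OutOfMemoryError`.

```python
import queue
import threading


class FatalErrorPropagator:
    """Hypothetical helper: one dedicated thread that waits for a fatal error."""

    def __init__(self, on_fatal):
        self._errors = queue.Queue()
        self._on_fatal = on_fatal
        # Spawned once at startup; it blocks until something is reported.
        self._thread = threading.Thread(
            target=self._run, name="fatal-error-propagator", daemon=True
        )
        self._thread.start()

    def report(self, exc):
        """Called from code that caught an error it must not swallow."""
        self._errors.put(exc)

    def _run(self):
        exc = self._errors.get()  # blocks until the first fatal error arrives
        self._on_fatal(exc)       # a real process would crash or dump state here

    def join(self, timeout=None):
        self._thread.join(timeout)


def guarded(task, propagator):
    """Mimics an executor that swallows exceptions but forwards fatal ones."""
    try:
        task()
    except MemoryError as exc:
        propagator.report(exc)
    except Exception:
        pass  # ordinary failures stay swallowed, as in the scenario described


def demo():
    seen = []
    propagator = FatalErrorPropagator(seen.append)

    def exhausted():
        raise MemoryError("simulated out-of-memory")

    guarded(exhausted, propagator)
    propagator.join(timeout=5)
    return seen
```

In this sketch the handler just records the error; the ticket's intent is that the equivalent handler lets the failure reach the runtime so a crash or heap dump is triggered.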
[jira] [Created] (CASSANDRA-15262) server_encryption_options is not backwards compatible with 3.11
Joseph Lynch created CASSANDRA-15262:
----------------------------------------

             Summary: server_encryption_options is not backwards compatible with 3.11
                 Key: CASSANDRA-15262
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15262
             Project: Cassandra
          Issue Type: Bug
          Components: Local/Config
            Reporter: Joseph Lynch
            Assignee: Joseph Lynch


The current `server_encryption_options` configuration options are as follows:
{noformat}
server_encryption_options:
    # set to true for allowing secure incoming connections
    enabled: false
    # If enabled and optional are both set to true, encrypted and unencrypted connections are handled on the storage_port
    optional: false
    # if enabled, will open up an encrypted listening socket on ssl_storage_port. Should be used
    # during upgrade to 4.0; otherwise, set to false.
    enable_legacy_ssl_storage_port: false
    # on outbound connections, determine which type of peers to securely connect to. 'enabled' must be set to true.
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra
    # More advanced defaults below:
    # protocol: TLS
    # store_type: JKS
    # cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
    # require_client_auth: false
    # require_endpoint_verification: false
{noformat}
A couple of issues here:
1. optional defaults to false, which will break existing TLS configurations for (from what I can tell) no particularly good reason
2. The provided protocol and cipher suites are not good ideas (in particular, encouraging anyone to use CBC ciphers is a bad plan)

I propose that before the 4.0 cut we fix up server_encryption_options and even client_encryption_options:
# Change the default {{optional}} setting to true. As the new Netty code intelligently decides to open a TLS connection or not, this is the more sensible default (saves operators a step while transitioning to TLS as well)
# Update the defaults to what netty actually defaults to
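To make the second concern in CASSANDRA-15262 concrete, here is a minimal sketch of how one might flag CBC-mode suites in a configured cipher list. It is an illustration only: `find_cbc_suites` is a hypothetical helper, not part of Cassandra or Netty, and the suite names are copied from the commented-out `cipher_suites` default quoted in the ticket.

```python
# Cipher suites copied from the commented-out default in the ticket's config.
DEFAULT_CIPHER_SUITES = [
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA",
]


def find_cbc_suites(cipher_suites):
    """Return the suites whose name marks them as CBC mode (hypothetical helper)."""
    return [suite for suite in cipher_suites if "_CBC_" in suite]


if __name__ == "__main__":
    flagged = find_cbc_suites(DEFAULT_CIPHER_SUITES)
    print(f"{len(flagged)} of {len(DEFAULT_CIPHER_SUITES)} default suites use CBC")
```

A name-based check like this is deliberately crude; it only shows that every suite in the quoted default list is a CBC suite, which is the point the ticket raises.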
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
-------------------------------------
    Attachment: cassandra_comparative_performance_all_flamegraphs.html
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901482#comment-16901482 ]

Joseph Lynch commented on CASSANDRA-15175:
------------------------------------------

The analysis so far: [^cassandra_comparative_performance_all_flamegraphs.html]
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
-------------------------------------
    Attachment: write_scaling_local_one_summary.png
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
-------------------------------------
    Attachment: write_scaling_lq_eq_summary.png
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16901476#comment-16901476 ]

Joseph Lynch commented on CASSANDRA-15175:
------------------------------------------

Write scaling test at LOCAL_ONE looks good:

!write_scaling_local_one_summary.png!

Write scaling test at LQ (LOCAL_QUORUM) reads + EQ (EACH_QUORUM) writes looks OK, but not great:

!write_scaling_lq_eq_summary.png!
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
-------------------------------------
    Attachment: image-2019-08-06-14-20-25-140.png
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: (was: trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png) > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, > trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, > trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, > 
trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, > trunk_vs_30x_wEQ_rLQ_7kcRPS_58kcWPS.png, > trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, > 
trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_22kcWPS.png, > trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, > 
trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_wEQ_rLQ_7kcRPS_7kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > 
trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_162kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > > > Tracks evaluating a 192 node 
cluster with compression and encryption on.
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:
[jira] [Comment Edited] (CASSANDRA-15214) OOMs caught and not rethrown
[ https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886475#comment-16886475 ]

Joseph Lynch edited comment on CASSANDRA-15214 at 7/16/19 9:36 PM:
---

We've (Netflix) found handling OOMs to be generally hard to do correctly in all the various Java codebases we have so we built an agent solution which attaches to the JVM in [https://github.com/Netflix-Skunkworks/jvmquake]. I think the only reason that we couldn't just directly include that in C* is because it's a C JVMTI agent instead of a Java one, but perhaps we could just solve this with some documentation and making it really easy to include agents (which is useful regardless)? I can also spend some time and see if I can make it a Java agent instead of a C one.

The following is the patch for supporting easy pluggable agents for C*:
{noformat}
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index d6c48be0a3..92061db3ab 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -134,6 +134,29 @@ do
     JVM_OPTS="$JVM_OPTS $opt"
 done
+# Pull in any agents present in CASSANDRA_HOME
+for agent_file in ${CASSANDRA_HOME}/agents/*.jar; do
+    if [ -e "${agent_file}" ]; then
+        base_file="${agent_file%.jar}"
+        if [ -s "${base_file}.options" ]; then
+            options=`cat ${base_file}.options`
+            agent_file="${agent_file}=${options}"
+        fi
+        JVM_OPTS="$JVM_OPTS -javaagent:${agent_file}"
+    fi
+done
+
+for agent_file in ${CASSANDRA_HOME}/agents/*.so; do
+    if [ -e "${agent_file}" ]; then
+        base_file="${agent_file%.so}"
+        if [ -s "${base_file}.options" ]; then
+            options=`cat ${base_file}.options`
+            agent_file="${agent_file}=${options}"
+        fi
+        JVM_OPTS="$JVM_OPTS -agentpath:${agent_file}"
+    fi
+done
{noformat}
Then we can just drop agents into the {{CASSANDRA_HOME/agents}} folder and they are loaded automatically by Cassandra. From a security perspective this is identical to "drop a jar".
was (Author: jolynch):
We've (Netflix) found handling OOMs to be generally hard to do correctly in all the various Java codebases we have so we built an agent solution which attaches to the JVM in [https://github.com/Netflix-Skunkworks/jvmquake]. I think the only reason that we couldn't just directly include that in C* is because it's a C JVMTI agent instead of a Java one, but perhaps we could just solve this with some documentation and making it really easy to include agents (which is useful regardless)?

The following is the patch for supporting easy pluggable agents for C*:
{noformat}
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index d6c48be0a3..92061db3ab 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -134,6 +134,29 @@ do
     JVM_OPTS="$JVM_OPTS $opt"
 done
+# Pull in any agents present in CASSANDRA_HOME
+for agent_file in ${CASSANDRA_HOME}/agents/*.jar; do
+    if [ -e "${agent_file}" ]; then
+        base_file="${agent_file%.jar}"
+        if [ -s "${base_file}.options" ]; then
+            options=`cat ${base_file}.options`
+            agent_file="${agent_file}=${options}"
+        fi
+        JVM_OPTS="$JVM_OPTS -javaagent:${agent_file}"
+    fi
+done
+
+for agent_file in ${CASSANDRA_HOME}/agents/*.so; do
+    if [ -e "${agent_file}" ]; then
+        base_file="${agent_file%.so}"
+        if [ -s "${base_file}.options" ]; then
+            options=`cat ${base_file}.options`
+            agent_file="${agent_file}=${options}"
+        fi
+        JVM_OPTS="$JVM_OPTS -agentpath:${agent_file}"
+    fi
+done
{noformat}
Then we can just drop agents into the {{CASSANDRA_HOME/agents}} folder and they are loaded automatically by Cassandra. From a security perspective this is identical to "drop a jar".
> OOMs caught and not rethrown
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
> Issue Type: Bug
> Components: Messaging/Client, Messaging/Internode
> Reporter: Benedict
> Priority: Normal
> Fix For: 4.0
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, so presently there is no way to ensure that an OOM reaches the JVM handler to trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a single thread spawned at startup that waits for any exceptions we must propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed future proof approach, it may be worth paying the cost of a single thread.
[jira] [Commented] (CASSANDRA-15214) OOMs caught and not rethrown
[ https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886475#comment-16886475 ]

Joseph Lynch commented on CASSANDRA-15214:
--

We've (Netflix) found handling OOMs to be generally hard to do correctly in all the various Java codebases we have so we built an agent solution which attaches to the JVM in [https://github.com/Netflix-Skunkworks/jvmquake]. I think the only reason that we couldn't just directly include that in C* is because it's a C JVMTI agent instead of a Java one, but perhaps we could just solve this with some documentation and making it really easy to include agents (which is useful regardless)?

The following is the patch for supporting easy pluggable agents for C*:
{noformat}
diff --git a/conf/cassandra-env.sh b/conf/cassandra-env.sh
index d6c48be0a3..92061db3ab 100644
--- a/conf/cassandra-env.sh
+++ b/conf/cassandra-env.sh
@@ -134,6 +134,29 @@ do
     JVM_OPTS="$JVM_OPTS $opt"
 done
+# Pull in any agents present in CASSANDRA_HOME
+for agent_file in ${CASSANDRA_HOME}/agents/*.jar; do
+    if [ -e "${agent_file}" ]; then
+        base_file="${agent_file%.jar}"
+        if [ -s "${base_file}.options" ]; then
+            options=`cat ${base_file}.options`
+            agent_file="${agent_file}=${options}"
+        fi
+        JVM_OPTS="$JVM_OPTS -javaagent:${agent_file}"
+    fi
+done
+
+for agent_file in ${CASSANDRA_HOME}/agents/*.so; do
+    if [ -e "${agent_file}" ]; then
+        base_file="${agent_file%.so}"
+        if [ -s "${base_file}.options" ]; then
+            options=`cat ${base_file}.options`
+            agent_file="${agent_file}=${options}"
+        fi
+        JVM_OPTS="$JVM_OPTS -agentpath:${agent_file}"
+    fi
+done
{noformat}
Then we can just drop agents into the {{CASSANDRA_HOME/agents}} folder and they are loaded automatically by Cassandra. From a security perspective this is identical to "drop a jar".
> OOMs caught and not rethrown
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
> Issue Type: Bug
> Components: Messaging/Client, Messaging/Internode
> Reporter: Benedict
> Priority: Normal
> Fix For: 4.0
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, so presently there is no way to ensure that an OOM reaches the JVM handler to trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a single thread spawned at startup that waits for any exceptions we must propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed future proof approach, it may be worth paying the cost of a single thread.
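The "single thread spawned at startup" idea from the CASSANDRA-15214 description could look roughly like the sketch below. This is only an illustration under stated assumptions, not the Cassandra implementation: the class name `FatalErrorPropagator` and the method `report` are hypothetical, and the sketch forwards any `java.lang.Error` (which includes `OutOfMemoryError`) rather than a curated list.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper (not part of Cassandra): one dedicated thread waits for
// fatal errors that were caught elsewhere and rethrows them, so they reach
// this thread's uncaught-exception handler instead of being swallowed.
final class FatalErrorPropagator {
    private final BlockingQueue<Error> pending = new LinkedBlockingQueue<>();

    FatalErrorPropagator(Thread.UncaughtExceptionHandler handler) {
        Thread thread = new Thread(this::awaitAndRethrow, "fatal-error-propagator");
        thread.setDaemon(true);
        thread.setUncaughtExceptionHandler(handler);
        thread.start();
    }

    // Intended to be called from catch-all blocks (for example around executor
    // tasks). Ordinary exceptions are ignored; Errors are handed to the thread.
    void report(Throwable t) {
        if (t instanceof Error) {
            pending.offer((Error) t);
        }
    }

    private void awaitAndRethrow() {
        try {
            // Blocks until an Error is reported, then rethrows it on this thread.
            throw pending.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The cost is one mostly idle thread. In a real deployment the handler passed to the constructor would be whatever the process uses to crash or dump state; that choice is left open here.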
[jira] [Commented] (CASSANDRA-15224) DynamicSnitch.applyConfigChanges can corrupt snitch state
[ https://issues.apache.org/jira/browse/CASSANDRA-15224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886472#comment-16886472 ]

Joseph Lynch commented on CASSANDRA-15224:
--

This is at least fixed in the trunk patch for CASSANDRA-14459. I may be able to isolate those changes for backport to the 3.x series.

> DynamicSnitch.applyConfigChanges can corrupt snitch state
>
> Key: CASSANDRA-15224
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15224
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Other
> Reporter: Benedict
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
> This method is not synchronised, and doesn’t wait for the cancelled task to complete (which could already be running), so we could have two updates in flight simultaneously and corrupt the internal state of the collection
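One generic way to avoid the race described in CASSANDRA-15224 is to make the periodic update and the reconfiguration take the same lock, so a reconfiguration can never overlap an update that is already running. The sketch below assumes a hypothetical class `ReconfigurableUpdater`; it is not the `DynamicEndpointSnitch` code and not the CASSANDRA-14459 patch.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: updates and reconfiguration share one monitor.
final class ReconfigurableUpdater {
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-updater");
            t.setDaemon(true);
            return t;
        });
    private final Runnable update;
    private ScheduledFuture<?> task;   // guarded by "this"
    private long intervalMillis;       // guarded by "this"

    ReconfigurableUpdater(Runnable update) {
        this.update = update;
    }

    // Synchronized: if an update is running, this blocks until it finishes,
    // then cancels the old schedule and installs the new one.
    synchronized void applyConfigChanges(long newIntervalMillis) {
        if (task != null) {
            task.cancel(false);
        }
        intervalMillis = newIntervalMillis;
        task = executor.scheduleWithFixedDelay(
            this::runUpdate, newIntervalMillis, newIntervalMillis, TimeUnit.MILLISECONDS);
    }

    // Synchronized on the same monitor, so it never overlaps applyConfigChanges.
    private synchronized void runUpdate() {
        update.run();
    }

    synchronized long intervalMillis() {
        return intervalMillis;
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

A cancelled run that had already been dequeued may still execute once after the reconfiguration returns, but it does so under the lock and on the single executor thread, so two updates are never in flight at the same time.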
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: (was: trunk_vs_30x_write_LO_108kcWPS_7kcRPS.png) > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > > > Tracks evaluating a 192 node cluster with compression and 
encryption on.
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_write_LO_108kcWPS_7kcRPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
> First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
>
> Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
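The test setup table above can also be written down as data, which makes the node arithmetic explicit: 96 nodes in each of two regions gives the 192 nodes the description mentions. The sketch below is illustrative only. The dictionary layout, the key names, and the total_nodes helper are assumptions made for this example and are not part of the ticket or of any Cassandra tooling.

```python
# Hypothetical encoding of the test-setup table; key names are illustrative only.
TEST_SETUP = {
    "baseline": "3.0.19 @d7d00036",
    "candidate": "trunk @abb0e177",
    "workload": {
        "write_size": "4kb random",
        "read_size": "4kb random",
        "per_node_data": "110GiB",
        "generator": "ndbench",
        "internode_tls": "On (jdk)",
        "internode_compression": "On",
    },
    "hardware": {
        "instance_type": "i3.xlarge",
        # Taken from the "Deployment" row: 96 nodes per region.
        "nodes_per_region": {"us-east-1": 96, "eu-west-1": 96},
    },
}


def total_nodes(setup):
    """Sum the per-region node counts of a setup dictionary."""
    return sum(setup["hardware"]["nodes_per_region"].values())


if __name__ == "__main__":
    print(total_nodes(TEST_SETUP))  # 192
```

The ticket title rounds this to "200 node", while the description uses the exact figure of 192.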
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_108kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_72kcWPS.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Description:
Tracks evaluating a 192 node cluster with compression and encryption on.

First test is a read scaling test [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |

Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:

  was:
Tracks evaluating a 192 node cluster with compression and encryption on.

Test setup at (reproduced below) [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png,
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg,
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png,
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg,
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png,
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg,
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg,
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png,
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png,
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png,
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png,
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Description:
Tracks evaluating a 192 node cluster with compression and encryption on.

First test is a [read scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |

Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:

  was:
Tracks evaluating a 192 node cluster with compression and encryption on.

First test is a read scaling test [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |

Second test is a [write scaling test|https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=428858608]:

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png,
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg,
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png,
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg,
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png,
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg,
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg,
> trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png,
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png,
> trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png,
> trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png,
> trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16878164#comment-16878164 ] Joseph Lynch commented on CASSANDRA-15175: -- The QUORUM test is completed, results look pretty good on the read side: !trunk_vs_30x_Q_tcnative_summary.png! > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > > > Tracks evaluating a 192 node cluster with compression and encryption 
on.
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_Q_tcnative_summary.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_Q_tcnative_summary.png, > trunk_vs_30x_summary.png, trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_summary.png, > trunk_vs_30x_write_LO_7kcRPS_7kcWPS.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875587#comment-16875587 ] Joseph Lynch commented on CASSANDRA-15175: -- [~norman] Yes, I think we have enough evidence that the latency regression was probably not Netty TLS (although I am somewhat surprised CPU time with tcnative is about the same as pre-netty jdk TLS) and probably in how we're using it instead. It is possible that the default cipher choice for netty on Java 8 may want to be revised, or at least noted somewhere in the documentation that Java 8 has a performance limitation with GCM ciphers? > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > 
trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on.
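The comment above suggests that GCM cipher suites carry a performance penalty on Java 8. One rough way to see which suites in a configured list would be affected is to split the list on the GCM marker in the suite name. This is a minimal sketch, not Netty or Cassandra code: split_by_gcm and SAMPLE_SUITES are hypothetical names, and the sample list is a hand-picked set of standard TLS suite names, not the defaults of any particular JVM or Netty release.

```python
def split_by_gcm(suites):
    """Partition cipher suite names into (gcm, non_gcm), preserving order.

    The check is purely name based: standard TLS 1.2 suite names carry
    "_GCM_" when they use AES in Galois/Counter Mode.
    """
    gcm = [name for name in suites if "_GCM_" in name]
    non_gcm = [name for name in suites if "_GCM_" not in name]
    return gcm, non_gcm


# Illustrative input only; not the default list of any JVM or Netty version.
SAMPLE_SUITES = [
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]

if __name__ == "__main__":
    gcm, non_gcm = split_by_gcm(SAMPLE_SUITES)
    print("GCM suites:", gcm)
    print("Other suites:", non_gcm)
```

Whether a deployment should prefer the non-GCM suites is a separate security and performance judgment that the sketch does not attempt to make.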
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_Q_36kcRPS_7200cWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_Q_36kcRPS_7200cWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_allocation_Q_21k_cRPS.svg > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_allocation_Q_21k_cRPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_Q_21600cRPS-7200cWPS.svg > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_Q_21600cRPS-7200cWPS.svg, > trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, > trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_Q_21kcRPS_7200cWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_Q_21kcRPS_7200cWPS.png, > trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874410#comment-16874410 ] Joseph Lynch commented on CASSANDRA-15175: -- Alright, I've run the scaling test with both (jdk) TLS and (openssl) TLS (I dropped the statically linked boringssl jar). With jdk TLS after switching the default cipher to {{TLS_RSA_WITH_AES_128_CBC_SHA}}: !trunk_vs_30x_LQ_jdk_summary.png! With openssl TLS again with the default cipher the same as 30x: !trunk_vs_30x_LQ_tcnative_summary.png! So the summary is, we have a minor regression in average performance for LOCAL_QUORUM. Flamegraphs are attached for root cause. The good news is that the tail is significantly better. Action items: * Make sure that cipher we use isn't GCM by default * Determine why writes got slower, still outstanding I am now moving on to a QUORUM test. > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, 
trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on.
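As a small illustration of the action item in the comment above (making sure the default cipher is not GCM), here is a minimal Python sketch that classifies JSSE-style cipher suite names by mode and picks the first non-GCM entry from a preference list. The helper names {{cipher_mode}} and {{first_non_gcm}} are hypothetical and are not part of Cassandra or Netty; the sketch only inspects the suite name string and does not query any TLS library.

```python
def cipher_mode(suite: str) -> str:
    """Return the mode token ('GCM', 'CBC' or 'CCM') found in a JSSE-style
    suite name such as TLS_RSA_WITH_AES_128_CBC_SHA, else 'UNKNOWN'."""
    tokens = suite.upper().split("_")
    for mode in ("GCM", "CBC", "CCM"):
        if mode in tokens:
            return mode
    return "UNKNOWN"


def first_non_gcm(suites):
    """Return the first suite, in the given preference order, whose name
    does not indicate GCM; None if every suite is GCM."""
    for suite in suites:
        if cipher_mode(suite) != "GCM":
            return suite
    return None


# The two suite names mentioned in this ticket's comments.
candidates = [
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
]
print(first_non_gcm(candidates))
```

A name-based check like this is only a sanity test for a configured suite list; which suite actually gets negotiated still depends on the TLS provider and the peer.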
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_LQ_tcnative_summary.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_LQ_jdk_summary.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_LQ_jdk_summary.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_tcnative_summary.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_openssl_21kcRPS_14kcWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_LQ_64kcRPS_14kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_64kcRPS_14kcWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_LQ_21kcRPS_14kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_LQ_21kcRPS_14kcWPS.png, > trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871894#comment-16871894 ]

Joseph Lynch commented on CASSANDRA-15175:
------------------------------------------

I switched to the same cipher that 3.0 is running and saw on-CPU time drop to 12.9% (compared to 3.0's 8.5%). This is a significant improvement, but trunk is still not quite equal to 3.0. Interestingly, with that improvement, average latency is now on par with 3.0 in the local quorum test.

[^trunk_LQ_21600cRPS-14400cWPS.svg]
[^30x_LQ_21600cRPS-14400cWPS.svg]

I'm going to finish off this round of jdk TLS testing and then switch to tcnative tomorrow and test that.

> Evaluate 200 node, compression=on, encryption=all
> -------------------------------------------------
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png,
> odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg,
> trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png,
> trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg,
> trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png,
> trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg,
> trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png,
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png,
> trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: 30x_LQ_21600cRPS-14400cWPS.svg trunk_LQ_21600cRPS-14400cWPS.svg > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > 30x_LQ_21600cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_LQ_21600cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Comment Edited] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871656#comment-16871656 ] Joseph Lynch edited comment on CASSANDRA-15175 at 6/24/19 11:22 PM: I have completed the {{LOCAL_ONE}} scaling test. I have summarized the test in the following graph: !trunk_vs_30x_summary.png! As we can see, even with the extra TLS CPU requirements, trunk was able to significantly outperform the status quo 3.0.x cluster across the load spectrum for this consistency level I am proceeding with other consistency levels and gathering additional data. So far I have noticed the following issues during these tests which I will gather more data on and follow up with in other tickets (and edit here with ticket numbers once I have them): # JDK Netty TLS appears significantly more CPU intensive than the previous Java Sockets implementation. [~norman] is taking a look from the Netty side and we can follow up and make sure we're not creating improperly (looking at the flamegraphs it looks like we may have a buffer sizing issue) # When a node was terminated and replaced, the new node appeared to sit for a very long time waiting for schema pulls to complete (I think it was waiting on the node it was replacing but I haven't fully debugged this). 
# Nodetool netstats doesn't report progress properly for the file count (percent, single file, and size still seem right); this is probably CASSANDRA-14192
# When we re-load NTS keyspaces from disk we throw warnings about "Ignoring Unrecognized strategy option" for datacenters that we are not in
# After a node shuts down there is a burst of re-connections on the urgent port prior to actual shutdown (I _think_ this is pre-existing and I'm just noticing it because of the new logging)

Also, while setting up the {{LOCAL_QUORUM}} test, I found the following while trying to understand why I was seeing a higher number of blocking read repairs on the trunk cluster than on the 30x cluster:
# -When I stop and start nodes, it appears that hints may not always play back. In particular the high blocking read repairs were coming from neighbors of the node I had restarted a few times to test tcnative openssl integration. I checked the neighbor's hints directories and sure enough there were pending hints there that were not playing at all (they had been there for over 8 hours and still not played).- (Edit: This is a bad default. The default {{hinted_handoff_throttle_in_kb}} is 1024, but it is divided by the number of nodes in the cluster. In this case the cluster size of 192 meant we were playing hints at a rate of ~5 KB/s, which meant that if we were down for even a few minutes we would essentially lose those mutations before the 24 hour hint expiry window.)
# -Repair appears to fail on the default system_traces when run with {{-full}} and {{-os}}- (Edit: this is operator error; we shouldn't pass -local to a SimpleStrategy keyspace)
{noformat}
cass-perf-trunk-14746--useast1c-i-00a32889835534b75:~$ nodetool repair -os -full -local
[2019-06-23 23:29:30,210] Starting repair command #1 (bfbc7ba0-960e-11e9-b238-77fd1c2e9b1c), repairing keyspace perftest with repair options (parallelism: parallel, primary range: false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false, optimise streams: true)
[2019-06-23 23:52:08,248] Repair session c0573500-960e-11e9-b238-77fd1c2e9b1c for range [(384307168575030403,384307170010857891], (192153585909716729,384307168575030403]] finished (progress: 10%)
[2019-06-23 23:52:26,393] Repair session c0307320-960e-11e9-b238-77fd1c2e9b1c for range [(1808575567,192153584473889241], (192153584473889241,192153585909716729]] finished (progress: 20%)
[2019-06-23 23:52:28,298] Repair session c059f420-960e-11e9-b238-77fd1c2e9b1c for range [(576460752676171565,576460754111999053], (384307170010857891,576460752676171565]] finished (progress: 30%)
[2019-06-23 23:52:28,302] Repair completed successfully
[2019-06-23 23:52:28,310] Repair command #1 finished in 22 minutes 58 seconds
[2019-06-23 23:52:28,331] Replication factor is 1. No repair is needed for keyspace 'system_auth'
[2019-06-23 23:52:28,350] Starting repair command #2 (f52c1c70-9611-11e9-b238-77fd1c2e9b1c), repairing keyspace system_traces with repair options (parallelism: parallel, primary range: false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], previewKind: NONE, # of ranges: 2, pull repair: false, force repair: false, optimise streams: true)
[2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not be empty
[2019-06-23 23:52:28,351] Repair command #2 finished with error
error: Repair job has failed with the error message: [2019-06-23 23:52:28,351] Repair command #2 failed with
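The hint throttle arithmetic from the edit note in this comment can be sketched roughly as follows. This is only an illustration under the assumption stated in the comment, namely that the configured {{hinted_handoff_throttle_in_kb}} is split evenly by the number of nodes in the cluster. The function names are hypothetical and are not Cassandra APIs, and the 1 GiB backlog is an illustrative figure, not a measurement from this test.

```python
def per_node_hint_throttle_kb(throttle_kb: float, cluster_size: int) -> float:
    """Effective hint replay rate in KB/s.

    Assumption (taken from the comment, not from the Cassandra source):
    the configured throttle is divided by the number of nodes in the cluster.
    """
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return throttle_kb / cluster_size


def max_replayable_kb(throttle_kb: float, cluster_size: int, window_hours: float) -> float:
    """Upper bound on the hint data (KB) that can be replayed within a time window."""
    return per_node_hint_throttle_kb(throttle_kb, cluster_size) * window_hours * 3600


def replay_hours(backlog_kb: float, throttle_kb: float, cluster_size: int) -> float:
    """Hours needed to replay a hint backlog of the given size at the throttled rate."""
    return backlog_kb / per_node_hint_throttle_kb(throttle_kb, cluster_size) / 3600


if __name__ == "__main__":
    # Values from the comment: default throttle of 1024 KB/s, 192 nodes, 24 hour window.
    rate = per_node_hint_throttle_kb(1024, 192)
    print(f"effective replay rate: {rate:.2f} KB/s")
    print(f"replayable in 24h: {max_replayable_kb(1024, 192, 24) / 1024:.0f} MiB")
    # Hypothetical backlog of 1 GiB of hints (illustrative only).
    print(f"hours to replay 1 GiB: {replay_hours(1024 * 1024, 1024, 192):.1f}")
```

Under that assumption the effective rate is about 5.33 KB/s, so roughly 450 MiB of hints is the most that could be replayed inside a 24 hour window. Any larger backlog would not finish replaying before the hints expire.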
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871774#comment-16871774 ] Joseph Lynch commented on CASSANDRA-15175: -- [~norman] yeah sadly I don't see any exceptions in the C* code that correlate with that exception, but I can try enabling more verbose logging on particular netty modules if you think it will help? Also fwiw I think that the 3.0 branch in this test is using {{TLS_RSA_WITH_AES_128_CBC_SHA}} as the default cipher instead of {{TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384}}. I don't really know what I'm talking about when it comes to TLS cipher suites but it appears from my reading of https://bugs.openjdk.java.net/browse/JDK-8046943 that {{GCM}} is very slow in Java 8 (apparently fixed in Java 9). That might explain why we're spending so much CPU time in GaloisCountMode (which I assume is GCM). I can try using {{TLS_RSA_WITH_AES_256_CBC_SHA}} with both as a fair test? > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, > trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, 
trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |

-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871656#comment-16871656 ] Joseph Lynch edited comment on CASSANDRA-15175 at 6/24/19 7:23 PM: ---

I have completed the {{LOCAL_ONE}} scaling test. I have summarized the test in the following graph: !trunk_vs_30x_summary.png!

As we can see, even with the extra TLS CPU requirements, trunk was able to significantly outperform the status quo 3.0.x cluster across the load spectrum for this consistency level. I am proceeding with other consistency levels and gathering additional data.

So far I have noticed the following issues during these tests which I will gather more data on and follow up with in other tickets (and edit here with ticket numbers once I have them):
# JDK Netty TLS appears significantly more CPU intensive than the previous Java Sockets implementation. [~norman] is taking a look from the Netty side and we can follow up and make sure we're not creating improperly (looking at the flamegraphs it looks like we may have a buffer sizing issue)
# When a node was terminated and replaced, the new node appeared to sit for a very long time waiting for schema pulls to complete (I think it was waiting on the node it was replacing but I haven't fully debugged this).
# Nodetool netstats doesn't report progress properly for the file count (percent, single file, and size still seem right); this is probably CASSANDRA-14192
# When we re-load NTS keyspaces from disk we throw warnings about "Ignoring Unrecognized strategy option" for datacenters that we are not in
# After a node shuts down there is a burst of re-connections on the urgent port prior to actual shutdown (I _think_ this is pre-existing and I'm just noticing it because of the new logging)

Also, while setting up the {{LOCAL_QUORUM}} test, I found the following while trying to understand why I was seeing a higher number of blocking read repairs on the trunk cluster than on the 30x cluster:
# When I stop and start nodes, it appears that hints may not always play back. In particular the high blocking read repairs were coming from neighbors of the node I had restarted a few times to test tcnative openssl integration. I checked the neighbor's hints directories and sure enough there were pending hints there that were not playing at all (they had been there for over 8 hours and still not played).
# -Repair appears to fail on the default system_traces when run with {{-full}} and {{-os}- (Edit: this is operator error, we shouldn't pass -local to a SimpleStrategy keyspace) {noformat} cass-perf-trunk-14746--useast1c-i-00a32889835534b75:~$ nodetool repair -os -full -local [2019-06-23 23:29:30,210] Starting repair command #1 (bfbc7ba0-960e-11e9-b238-77fd1c2e9b1c), repairing keyspace perftest with repair options (parallelism: parallel, primary range: false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false, optimise streams: true) [2019-06-23 23:52:08,248] Repair session c0573500-960e-11e9-b238-77fd1c2e9b1c for range [(384307168575030403,384307170010857891], (192153585909716729,384307168575030403]] finished (progress: 10%) [2019-06-23 23:52:26,393] Repair session c0307320-960e-11e9-b238-77fd1c2e9b1c for range [(1808575567,192153584473889241], (192153584473889241,192153585909716729]] finished (progress: 20%) [2019-06-23 23:52:28,298] Repair session c059f420-960e-11e9-b238-77fd1c2e9b1c for range [(576460752676171565,576460754111999053], (384307170010857891,576460752676171565]] finished (progress: 30%) [2019-06-23 23:52:28,302] Repair completed successfully [2019-06-23 23:52:28,310] Repair command #1 finished in 22 minutes 58 seconds [2019-06-23 23:52:28,331] Replication factor is 1. 
No repair is needed for keyspace 'system_auth' [2019-06-23 23:52:28,350] Starting repair command #2 (f52c1c70-9611-11e9-b238-77fd1c2e9b1c), repairing keyspace system_traces with repair options (parallelism: parallel, primary range: false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], previewKind: NONE, # of ranges: 2, pull repair: false, force repair: false, optimise streams: true) [2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not be empty [2019-06-23 23:52:28,351] Repair command #2 finished with error error: Repair job has failed with the error message: [2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not be empty. Check the logs on the repair participants for further details -- StackTrace -- java.lang.RuntimeException: Repair job has failed with the error message: [2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not be empty. Check the logs on the repair participants for further details at
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871714#comment-16871714 ] Joseph Lynch commented on CASSANDRA-15175: -- {quote} [~jolynch] one question... when using JDK TLS do you see any errors at all or you just see more CPU usage and thats it ? {quote} I don't see any errors in our logs, but we are spending more CPU handling {{ShortBufferExceptions}} internally to Netty than may make sense. I took the following screenshots from the [^trunk_LQ_14400cRPS-14400cWPS.svg] flamegraph. !odd_netty_jdk_tls_cpu_usage.png! !ShortbufferExceptions.png! Other than the flamegraph and degraded latency in {{LOCAL_QUORUM}} mode (where C* nodes actually have to talk to each other through the internode messaging framework), things appear about the same (no errors that I can see).
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: ShortbufferExceptions.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, ShortbufferExceptions.png, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, > trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: odd_netty_jdk_tls_cpu_usage.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > odd_netty_jdk_tls_cpu_usage.png, trunk_14400cRPS-14400cWPS.svg, > trunk_187000cRPS-14400cWPS.svg, trunk_187kcRPS_14kcWPS.png, > trunk_22000cRPS-14400cWPS-jdk.svg, trunk_22000cRPS-14400cWPS-openssl.svg, > trunk_220kcRPS_14kcWPS.png, trunk_252kcRPS-14kcWPS.png, > trunk_93500cRPS-14400cWPS.svg, trunk_LQ_14400cRPS-14400cWPS.svg, > trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, > trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_LQ_14400cRPS-14400cWPS.svg > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, > trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, > trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, > trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, > trunk_LQ_14400cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png, > trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15175: - Attachment: trunk_vs_30x_LQ_14kcRPS_14kcWPS.png > Evaluate 200 node, compression=on, encryption=all > - > > Key: CASSANDRA-15175 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15175 > Project: Cassandra > Issue Type: Sub-task > Components: Test/benchmark >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Normal > Attachments: 30x_14400cRPS-14400cWPS.svg, > trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg, > trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg, > trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png, > trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg, > trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png, > trunk_vs_30x_14kcRPS_14kcWPS.png, > trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png, > trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png, > trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png, > trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png, > trunk_vs_30x_LQ_14kcRPS_14kcWPS.png, trunk_vs_30x_summary.png > > > Tracks evaluating a 192 node cluster with compression and encryption on. 
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16871656#comment-16871656 ] Joseph Lynch commented on CASSANDRA-15175: --

I have completed the {{LOCAL_ONE}} scaling test. I have summarized the test in the following graph: !trunk_vs_30x_summary.png!

As we can see, even with the extra TLS CPU requirements, trunk was able to significantly outperform the status quo 3.0.x cluster across the load spectrum for this consistency level. I am proceeding with other consistency levels and gathering additional data.

So far I have noticed the following issues during these tests which I will gather more data on and follow up with in other tickets (and edit here with ticket numbers once I have them):
# JDK Netty TLS appears significantly more CPU intensive than the previous Java Sockets implementation. [~norman] is taking a look from the Netty side and we can follow up and make sure we're not creating improperly (looking at the flamegraphs it looks like we may have a buffer sizing issue)
# When a node was terminated and replaced, the new node appeared to sit for a very long time waiting for schema pulls to complete (I think it was waiting on the node it was replacing but I haven't fully debugged this).
# Nodetool netstats doesn't report progress properly for the file count (percent, single file, and size still seem right); this is probably CASSANDRA-14192
# When we re-load NTS keyspaces from disk we throw warnings about "Ignoring Unrecognized strategy option" for datacenters that we are not in
# After a node shuts down there is a burst of re-connections on the urgent port prior to actual shutdown (I _think_ this is pre-existing and I'm just noticing it because of the new logging)

Also, while setting up the {{LOCAL_QUORUM}} test, I found the following while trying to understand why I was seeing a higher number of blocking read repairs on the trunk cluster than on the 30x cluster:
# When I stop and start nodes, it appears that hints may not always play back. In particular the high blocking read repairs were coming from neighbors of the node I had restarted a few times to test tcnative openssl integration. I checked the neighbor's hints directories and sure enough there were pending hints there that were not playing at all (they had been there for over 8 hours and still not played).
# Repair appears to fail on the default system_traces when run with {{-full}} and \{{-os} {noformat} cass-perf-trunk-14746--useast1c-i-00a32889835534b75:~$ nodetool repair -os -full -local [2019-06-23 23:29:30,210] Starting repair command #1 (bfbc7ba0-960e-11e9-b238-77fd1c2e9b1c), repairing keyspace perftest with repair options (parallelism: parallel, primary range: false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], previewKind: NONE, # of ranges: 6, pull repair: false, force repair: false, optimise streams: true) [2019-06-23 23:52:08,248] Repair session c0573500-960e-11e9-b238-77fd1c2e9b1c for range [(384307168575030403,384307170010857891], (192153585909716729,384307168575030403]] finished (progress: 10%) [2019-06-23 23:52:26,393] Repair session c0307320-960e-11e9-b238-77fd1c2e9b1c for range [(1808575567,192153584473889241], (192153584473889241,192153585909716729]] finished (progress: 20%) [2019-06-23 23:52:28,298] Repair session c059f420-960e-11e9-b238-77fd1c2e9b1c for range [(576460752676171565,576460754111999053], (384307170010857891,576460752676171565]] finished (progress: 30%) [2019-06-23 23:52:28,302] Repair completed successfully [2019-06-23 23:52:28,310] Repair command #1 finished in 22 minutes 58 seconds [2019-06-23 23:52:28,331] Replication factor is 1. 
No repair is needed for keyspace 'system_auth' [2019-06-23 23:52:28,350] Starting repair command #2 (f52c1c70-9611-11e9-b238-77fd1c2e9b1c), repairing keyspace system_traces with repair options (parallelism: parallel, primary range: false, incremental: false, job threads: 1, ColumnFamilies: [], dataCenters: [us-east-1], hosts: [], previewKind: NONE, # of ranges: 2, pull repair: false, force repair: false, optimise streams: true) [2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not be empty [2019-06-23 23:52:28,351] Repair command #2 finished with error error: Repair job has failed with the error message: [2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not be empty. Check the logs on the repair participants for further details -- StackTrace -- java.lang.RuntimeException: Repair job has failed with the error message: [2019-06-23 23:52:28,351] Repair command #2 failed with error Endpoints can not be empty. Check the logs on the repair participants for further details at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:122) at
[jira] [Created] (CASSANDRA-15181) Ensure Nodes can Start and Stop
Joseph Lynch created CASSANDRA-15181:
Summary: Ensure Nodes can Start and Stop
Key: CASSANDRA-15181
URL: https://issues.apache.org/jira/browse/CASSANDRA-15181
Project: Cassandra
Issue Type: Sub-task
Components: Legacy/Streaming and Messaging, Test/benchmark
Reporter: Joseph Lynch
Assignee: Vinay Chella

Let's load a cluster up with data and start killing nodes. We can do hard failures (node terminations) and soft failures (process kills). We plan to observe the following:
* Can nodes successfully bootstrap?
* How long does it take to bootstrap?
* What are the effects of TLS on and off (e.g. on stream time)?
* Are hints properly played after a node restart?
* Do nodes properly shut down and start back up?
[jira] [Updated] (CASSANDRA-15181) Ensure Nodes can Start and Stop
[ https://issues.apache.org/jira/browse/CASSANDRA-15181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-15181: - Complexity: Normal Priority: High (was: Normal) Change Category: Operability Status: Open (was: Triage Needed)
[jira] [Assigned] (CASSANDRA-14764) Evaluate 12 Node Breaking Point, compression=none, encryption=none, coalescing=off
[ https://issues.apache.org/jira/browse/CASSANDRA-14764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch reassigned CASSANDRA-14764: Assignee: Vinay Chella > Evaluate 12 Node Breaking Point, compression=none, encryption=none, > coalescing=off > -- > > Key: CASSANDRA-14764 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14764 > Project: Cassandra > Issue Type: Sub-task > Components: Legacy/Streaming and Messaging >Reporter: Joseph Lynch >Assignee: Vinay Chella >Priority: Normal > Attachments: i-03341e1c52de6ea3e-after-queue-change.svg, > i-07cd92e844d66d801-after-queue-bound.svg, i-07cd92e844d66d801-hint-play.svg, > i-07cd92e844d66d801-uninlined-with-jvm-methods.svg, ttop.txt > > > *Setup:* > * Cassandra: 12 (2*6) node i3.xlarge AWS instance (4 cpu cores, 30GB ram) > running cassandra trunk off of jasobrown/14503 jdd7ec5a2 (Jasons patched > internode messaging branch) vs the same footprint running 3.0.17 > * Two datacenters with 100ms latency between them > * No compression, encryption, or coalescing turned on > *Test #1:* > ndbench sent 1.5k QPS at a coordinator level to one datacenter (RF=3*2 = 6 so > 3k global replica QPS) of 4kb single partition BATCH mutations at LOCAL_ONE. > This represents about 250 QPS per coordinator in the first datacenter or 60 > QPS per core. The goal was to observe P99 write and read latencies under > various QPS. > *Result:* > The good news is since the CASSANDRA-14503 changes, instead of keeping the > mutations on heap we put the message into hints instead and don't run out of > memory. The bad news is that the {{MessagingService-NettyOutbound-Thread's}} > would occasionally enter a degraded state where they would just spin on a > core. I've attached flame graphs showing the CPU state as [~jasobrown] > applied fixes to the {{OutboundMessagingConnection}} class. > *Follow Ups:* > [~jasobrown] has committed a number of fixes onto his > {{jasobrown/14503-collab}} branch including: > 1. 
Limiting the amount of time spent dequeuing messages if they are expired > (previously, if messages entered the queue faster than we could dequeue them, > we'd just loop infinitely on the consumer side) > 2. Don't call {{dequeueMessages}} from within {{dequeueMessages}} created > callbacks. > We're continuing to use CPU flamegraphs to figure out where we're looping and > fixing bugs as we find them.
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Attachment: trunk_vs_30x_summary.png

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg,
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png,
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png,
> trunk_vs_30x_summary.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Attachment: (was: trunk_252kcRPS-14kcWPS.png)

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg,
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png,
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Attachment: trunk_252kcRPS-14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg,
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png,
> trunk_252kcRPS-14kcWPS.png, trunk_93500cRPS-14400cWPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Attachment: trunk_220kcRPS_14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg,
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_220kcRPS_14kcWPS.png,
> trunk_93500cRPS-14400cWPS.svg, trunk_vs_30x_125kcRPS_14kcWPS.png,
> trunk_vs_30x_14kRPS_14kcWPS_load.png, trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Description:
Tracks evaluating a 192 node cluster with compression and encryption on.

Test setup at (reproduced below)
[https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On (jdk)|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |

  was:
Tracks evaluating a 192 node cluster with compression and encryption on.

Test setup at (reproduced below)
[https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]

|Test Setup| |
|Baseline|3.0.19 @d7d00036|
|Candidate|trunk @abb0e177|
| | |
|Workload| |
|Write size|4kb random|
|Read size|4kb random|
|Per Node Data|110GiB|
|Generator|ndbench|
|Key Distribution|Uniform|
|SSTable Compr|Off|
|Internode TLS|On|
|Internode Compr|On|
|Compaction|LCS (320 MiB)|
|Repair|Off|
| | |
|Hardware| |
|Instance Type|i3.xlarge|
|Deployment|96 us-east-1, 96 eu-west-1|
|Region node count|96|
| | |
|OS Settings| |
|IO scheduler|kyber|
|Net qdisc|tc-fq|
|readahead|32kb|
|Java Version|OpenJDK 1.8.0_202 (Zulu)|
| | |

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg,
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_93500cRPS-14400cWPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On (jdk)|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Attachment: trunk_22000cRPS-14400cWPS-openssl.svg
                trunk_22000cRPS-14400cWPS-jdk.svg

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_22000cRPS-14400cWPS-jdk.svg,
> trunk_22000cRPS-14400cWPS-openssl.svg, trunk_93500cRPS-14400cWPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
[jira] [Commented] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870615#comment-16870615 ]

Joseph Lynch commented on CASSANDRA-15175:

We just use the default protocol and cipher suite via Netty's SslContextBuilder. I believe that means {{TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384}}, if I understand the following logging properly:
{noformat}
$ grep "Default cipher " debug.log -C 2
DEBUG [main] 2019-06-23 18:24:31,158 Slf4JLogger.java:71 - netty-tcnative not in the classpath; OpenSslEngine will be unavailable.
DEBUG [main] 2019-06-23 18:24:31,735 Slf4JLogger.java:76 - Default protocols (JDK): [TLSv1.2, TLSv1.1, TLSv1]
DEBUG [main] 2019-06-23 18:24:31,736 Slf4JLogger.java:76 - Default cipher suites (JDK): [TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_RSA_WITH_AES_128_GCM_SHA256, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_256_CBC_SHA]
{noformat}

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_93500cRPS-14400cWPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |
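The comment in this message infers the likely suite from the first entry of the logged default list. A minimal sketch of pulling that list out of a `debug.log` line follows; `parse_default_cipher_suites` is a hypothetical helper, and the regular expression only assumes the line shape quoted in the comment. Note that the first entry is only the top local preference: the suite actually negotiated also depends on what the peer offers and on the certificate's key type (ECDSA suites require an ECDSA certificate).

```python
import re

# Matches lines shaped like the one quoted in the comment, e.g.
# "... - Default cipher suites (JDK): [SUITE_A, SUITE_B]"
_CIPHER_LINE = re.compile(
    r"Default cipher suites \((?P<provider>[^)]+)\): \[(?P<suites>[^\]]*)\]"
)


def parse_default_cipher_suites(line):
    """Return (provider, [suite, ...]) for a matching log line, else None."""
    match = _CIPHER_LINE.search(line)
    if match is None:
        return None
    suites = [s.strip() for s in match.group("suites").split(",") if s.strip()]
    return match.group("provider"), suites


if __name__ == "__main__":
    sample = (
        "DEBUG [main] 2019-06-23 18:24:31,736 Slf4JLogger.java:76 - "
        "Default cipher suites (JDK): [TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, "
        "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256]"
    )
    provider, suites = parse_default_cipher_suites(sample)
    print(provider, suites[0])
```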
[jira] [Updated] (CASSANDRA-15175) Evaluate 200 node, compression=on, encryption=all
[ https://issues.apache.org/jira/browse/CASSANDRA-15175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph Lynch updated CASSANDRA-15175:
    Attachment: trunk_187kcRPS_14kcWPS.png

> Evaluate 200 node, compression=on, encryption=all
>
> Key: CASSANDRA-15175
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15175
> Project: Cassandra
> Issue Type: Sub-task
> Components: Test/benchmark
> Reporter: Joseph Lynch
> Assignee: Joseph Lynch
> Priority: Normal
> Attachments: 30x_14400cRPS-14400cWPS.svg,
> trunk_14400cRPS-14400cWPS.svg, trunk_187000cRPS-14400cWPS.svg,
> trunk_187kcRPS_14kcWPS.png, trunk_93500cRPS-14400cWPS.svg,
> trunk_vs_30x_125kcRPS_14kcWPS.png, trunk_vs_30x_14kRPS_14kcWPS_load.png,
> trunk_vs_30x_14kcRPS_14kcWPS.png,
> trunk_vs_30x_14kcRPS_14kcWPS_schedstat_delays.png,
> trunk_vs_30x_156kcRPS_14kcWPS.png, trunk_vs_30x_24kcRPS_14kcWPS.png,
> trunk_vs_30x_24kcRPS_14kcWPS_load.png, trunk_vs_30x_31kcRPS_14kcWPS.png,
> trunk_vs_30x_62kcRPS_14kcWPS.png, trunk_vs_30x_93kcRPS_14kcWPS.png
>
> Tracks evaluating a 192 node cluster with compression and encryption on.
> Test setup at (reproduced below)
> [https://docs.google.com/spreadsheets/d/1Vq_wC2q-rcG7UWim-t2leZZ4GgcuAjSREMFbG0QGy20/edit#gid=1336583053]
>
> |Test Setup| |
> |Baseline|3.0.19 @d7d00036|
> |Candidate|trunk @abb0e177|
> | | |
> |Workload| |
> |Write size|4kb random|
> |Read size|4kb random|
> |Per Node Data|110GiB|
> |Generator|ndbench|
> |Key Distribution|Uniform|
> |SSTable Compr|Off|
> |Internode TLS|On|
> |Internode Compr|On|
> |Compaction|LCS (320 MiB)|
> |Repair|Off|
> | | |
> |Hardware| |
> |Instance Type|i3.xlarge|
> |Deployment|96 us-east-1, 96 eu-west-1|
> |Region node count|96|
> | | |
> |OS Settings| |
> |IO scheduler|kyber|
> |Net qdisc|tc-fq|
> |readahead|32kb|
> |Java Version|OpenJDK 1.8.0_202 (Zulu)|
> | | |