[ https://issues.apache.org/jira/browse/CASSANDRA-19477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17829945#comment-17829945 ]
Stefan Miklosovic commented on CASSANDRA-19477:
-----------------------------------------------

[CASSANDRA-19477-5.0|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19477-5.0]

{noformat}
java17_pre-commit_tests
  ✓ j17_build                                        3m 49s
  ✓ j17_cqlsh_dtests_py311                            6m 4s
  ✓ j17_cqlsh_dtests_py311_vnode                     6m 13s
  ✓ j17_cqlsh_dtests_py38                             6m 3s
  ✓ j17_cqlsh_dtests_py38_vnode                      6m 20s
  ✓ j17_cqlshlib_cython_tests                        7m 25s
  ✓ j17_cqlshlib_tests                               6m 27s
  ✓ j17_dtests                                       32m 3s
  ✓ j17_jvm_dtests                                  23m 21s
  ✓ j17_jvm_dtests_latest_vnode                      14m 2s
  ✓ j17_jvm_dtests_latest_vnode_repeat              40m 42s
  ✓ j17_jvm_dtests_repeat                           39m 52s
  ✓ j17_unit_tests                                  17m 44s
  ✓ j17_unit_tests_repeat                            2m 28s
  ✓ j17_utests_latest                               15m 45s
  ✓ j17_utests_latest_repeat                         2m 34s
  ✓ j17_utests_oa_repeat                             0m 13s
  ✕ j17_dtests_latest                               34m 34s
      configuration_test.TestConfiguration test_change_durable_writes
  ✕ j17_dtests_vnode                                 32m 7s
  ✕ j17_utests_oa                                   15m 57s
      org.apache.cassandra.net.ConnectionTest testTimeout
java17_separate_tests
java11_pre-commit_tests
  ✓ j11_build                                        6m 59s
  ✓ j11_cqlsh_dtests_py311                           9m 42s
  ✓ j11_cqlsh_dtests_py311_vnode                     7m 41s
  ✓ j11_cqlsh_dtests_py38                            8m 20s
  ✓ j11_cqlsh_dtests_py38_vnode                      7m 59s
  ✓ j11_cqlshlib_cython_tests                       11m 40s
  ✓ j11_cqlshlib_tests                               9m 16s
  ✓ j11_dtests                                      38m 39s
  ✓ j11_dtests_vnode                                35m 20s
  ✓ j11_jvm_dtests                                  23m 29s
  ✓ j11_jvm_dtests_latest_vnode                     14m 40s
  ✓ j11_jvm_dtests_latest_vnode_repeat              47m 40s
  ✓ j11_jvm_dtests_repeat                           41m 59s
  ✓ j11_simulator_dtests                             5m 54s
  ✓ j11_unit_tests                                  19m 38s
  ✓ j11_unit_tests_repeat                            3m 26s
  ✓ j11_utests_latest                               21m 24s
  ✓ j11_utests_latest_repeat                         3m 46s
  ✓ j11_utests_oa                                   21m 15s
  ✓ j11_utests_oa_repeat                             8m 28s
  ✓ j11_utests_system_keyspace_directory            16m 17s
  ✓ j11_utests_system_keyspace_directory_repeat      3m 58s
  ✓ j17_cqlsh_dtests_py311                           5m 56s
  ✓ j17_cqlsh_dtests_py311_vnode                     6m 46s
  ✓ j17_cqlsh_dtests_py38                             6m 9s
  ✓ j17_cqlsh_dtests_py38_vnode                      6m 55s
  ✓ j17_cqlshlib_cython_tests                        7m 32s
  ✓ j17_cqlshlib_tests                               6m 27s
  ✓ j17_dtests                                      33m 37s
  ✓ j17_dtests_vnode                                32m 31s
  ✓ j17_jvm_dtests                                  23m 16s
  ✓ j17_jvm_dtests_latest_vnode                     13m 28s
  ✓ j17_jvm_dtests_latest_vnode_repeat              40m 42s
  ✓ j17_jvm_dtests_repeat                           41m 28s
  ✓ j17_unit_tests                                  14m 40s
  ✓ j17_unit_tests_repeat                            0m 16s
  ✓ j17_utests_latest_repeat                         0m 14s
  ✓ j17_utests_oa                                   15m 51s
  ✓ j17_utests_oa_repeat                             7m 58s
  ✕ j11_dtests_latest                                35m 2s
      configuration_test.TestConfiguration test_change_durable_writes
  ✕ j17_dtests_latest                               33m 55s
      configuration_test.TestConfiguration test_change_durable_writes
  ✕ j17_utests_latest                               16m 38s
      org.apache.cassandra.cql3.validation.operations.SelectTest testCreatingUDFWithSameNameAsBuiltin_PrefersCompatibleArgs
      org.apache.cassandra.cql3.validation.operations.SelectTest testCreatingUDFWithSameNameAsBuiltin_FullyQualifiedFunctionNameWorks
java11_separate_tests
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4063/workflows/15b9eab1-70d5-4490-836a-49cc9169c2aa]
[java17_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4063/workflows/dfe8ff72-6036-497d-a520-7f68ef35ce43]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4063/workflows/8ad3afe0-612f-4911-a07e-c874cfaed3b9]
[java11_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4063/workflows/ddfc2765-f3d6-4244-8ee4-c07f2b413db6]

> Do not go to disk to get HintsStore.getTotalFileSize
> ----------------------------------------------------
>
>                 Key: CASSANDRA-19477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19477
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Hints
>            Reporter: Jon Haddad
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.1.x, 5.0-rc, 5.x
>
>         Attachments: flamegraph.cpu.html
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> When testing a cluster with more requests than it could handle, I noticed
> significant CPU time (25%) spent in HintsStore.getTotalFileSize. Here's what
> I'm seeing from profiling:
> 10% of CPU time spent in HintsDescriptor.fileName which only does this:
>
> {noformat}
> return String.format("%s-%s-%s.hints", hostId, timestamp, version);{noformat}
> At a bare minimum here we should create this string up front with the host
> and version and eliminate 2 of the 3 substitutions, but I think it's probably
> faster to use a StringBuilder and avoid the underlying regular expression
> altogether.
> 12% of the time is spent in org.apache.cassandra.io.util.File.length. It
> looks like this is called once for each hint file on disk for each host we're
> hinting to. In the case of an overloaded cluster, this is significant. It
> would be better if we were to track the file size in memory for each hint
> file and reference that rather than go to the filesystem.
> These fairly small changes should make Cassandra more reliable when under
> load spikes.
> CPU Flame graph attached.
> I only tested this in 4.1 but it looks like this is present up to trunk.
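
The two changes proposed in the issue description can be illustrated with a minimal Java sketch. This is not the patch on the linked branch: the class HintsSketch, its fileName helper, and the SizeTracker type with its onAppend/onDelete hooks are hypothetical names invented here for illustration. It assumes only the file name pattern quoted in the description and that the writer knows how many bytes it appends.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch; none of these names come from the Cassandra code base.
public class HintsSketch {

    // Builds "<hostId>-<timestamp>-<version>.hints" with a StringBuilder
    // instead of String.format, so no format string is parsed per call.
    // A descriptor could call this once and cache the result in a final field.
    public static String fileName(UUID hostId, long timestamp, int version) {
        return new StringBuilder(64)
                .append(hostId).append('-')
                .append(timestamp).append('-')
                .append(version).append(".hints")
                .toString();
    }

    // Tracks per-file sizes in memory so the total can be read
    // without calling File.length() on every hint file.
    public static final class SizeTracker {
        private final Map<String, Long> sizes = new ConcurrentHashMap<>();
        private final AtomicLong total = new AtomicLong();

        // Assumed to be called by the writer after appending `bytes` to a file.
        public void onAppend(String fileName, long bytes) {
            sizes.merge(fileName, bytes, Long::sum);
            total.addAndGet(bytes);
        }

        // Assumed to be called when a hint file is deleted after dispatch.
        public void onDelete(String fileName) {
            Long removed = sizes.remove(fileName);
            if (removed != null) {
                total.addAndGet(-removed);
            }
        }

        // Constant-time read; the map and the counter are updated in two steps,
        // so a concurrent reader may briefly see a slightly stale total.
        public long totalSize() {
            return total.get();
        }
    }

    public static void main(String[] args) {
        UUID hostId = UUID.randomUUID();
        String name = fileName(hostId, System.currentTimeMillis(), 2);

        SizeTracker tracker = new SizeTracker();
        tracker.onAppend(name, 4096);
        tracker.onAppend(name, 1024);
        System.out.println(name + " total=" + tracker.totalSize());

        tracker.onDelete(name);
        System.out.println("after delete total=" + tracker.totalSize());
    }
}
```

The trade-off in this sketch is that the in-memory total is only as accurate as the append and delete hooks; files that already exist at startup would still need a one-time scan to seed the tracker.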