[ 
https://issues.apache.org/jira/browse/CASSANDRA-19477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17830401#comment-17830401
 ] 

Stefan Miklosovic commented on CASSANDRA-19477:
-----------------------------------------------

[CASSANDRA-19477-trunk|https://github.com/instaclustr/cassandra/tree/CASSANDRA-19477-trunk]
{noformat}
java17_pre-commit_tests                         
  ✓ j17_build                                        3m 57s
  ✓ j17_cqlsh_dtests_py311                            7m 2s
  ✓ j17_cqlsh_dtests_py311_vnode                     7m 32s
  ✓ j17_cqlsh_dtests_py38                            6m 50s
  ✓ j17_cqlsh_dtests_py38_vnode                      7m 16s
  ✓ j17_cqlshlib_cython_tests                        7m 39s
  ✓ j17_cqlshlib_tests                               6m 31s
  ✓ j17_dtests                                      34m 33s
  ✓ j17_dtests_vnode                                35m 10s
  ✓ j17_jvm_dtests_latest_vnode_repeat              26m 31s
  ✓ j17_jvm_dtests_repeat                            28m 7s
  ✓ j17_unit_tests                                  16m 26s
  ✓ j17_unit_tests_repeat                            0m 18s
  ✓ j17_utests_latest                               13m 59s
  ✓ j17_utests_latest_repeat                         0m 13s
  ✓ j17_utests_oa_repeat                             0m 29s
  ✕ j17_dtests_latest                               34m 36s
      offline_tools_test.TestOfflineTools test_sstablelevelreset
      offline_tools_test.TestOfflineTools test_sstableofflinerelevel
      configuration_test.TestConfiguration test_change_durable_writes
      configuration_test.TestConfiguration test_change_durable_writes
  ✕ j17_jvm_dtests                                  27m 59s
      org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest testEndpointVerificationEnabledIpNotInSAN TIMEOUTED
  ✕ j17_jvm_dtests_latest_vnode                     22m 44s
      junit.framework.TestSuite org.apache.cassandra.fuzz.harry.integration.model.InJVMTokenAwareExecutorTest TIMEOUTED
  ✕ j17_utests_oa                                   13m 58s
      org.apache.cassandra.db.compaction.CompactionsBytemanTest testSSTableNotEnoughDiskSpaceForCompactionGetsDropped
java17_separate_tests                            
java11_pre-commit_tests                         
  ✓ j11_build                                        7m 57s
  ✓ j11_cqlsh_dtests_py311                            7m 7s
  ✓ j11_cqlsh_dtests_py311_vnode                    10m 13s
  ✓ j11_cqlsh_dtests_py38                             8m 1s
  ✓ j11_cqlsh_dtests_py38_vnode                     10m 25s
  ✓ j11_cqlshlib_cython_tests                        7m 28s
  ✓ j11_cqlshlib_tests                               9m 40s
  ✓ j11_dtests_vnode                                36m 58s
  ✓ j11_jvm_dtests_latest_vnode                     25m 28s
  ✓ j11_jvm_dtests_latest_vnode_repeat              29m 22s
  ✓ j11_jvm_dtests_repeat                            28m 7s
  ✓ j11_unit_tests                                  15m 17s
  ✓ j11_unit_tests_repeat                            0m 30s
  ✓ j11_utests_latest                               16m 56s
  ✓ j11_utests_latest_repeat                         0m 34s
  ✓ j11_utests_oa                                   13m 58s
  ✓ j11_utests_oa_repeat                              1m 0s
  ✓ j11_utests_system_keyspace_directory             18m 1s
  ✓ j11_utests_system_keyspace_directory_repeat      3m 39s
  ✓ j17_cqlsh_dtests_py311                            7m 6s
  ✓ j17_cqlsh_dtests_py311_vnode                     7m 27s
  ✓ j17_cqlsh_dtests_py38                            6m 51s
  ✓ j17_cqlsh_dtests_py38_vnode                      7m 14s
  ✓ j17_cqlshlib_cython_tests                        7m 38s
  ✓ j17_cqlshlib_tests                               6m 57s
  ✓ j17_dtests                                      32m 21s
  ✓ j17_dtests_vnode                                34m 24s
  ✓ j17_jvm_dtests_latest_vnode                     22m 45s
  ✓ j17_jvm_dtests_latest_vnode_repeat              26m 32s
  ✓ j17_jvm_dtests_repeat                           28m 21s
  ✓ j17_unit_tests_repeat                            0m 16s
  ✓ j17_utests_latest                               15m 34s
  ✓ j17_utests_latest_repeat                         0m 36s
  ✓ j17_utests_oa                                   13m 43s
  ✓ j17_utests_oa_repeat                             0m 17s
  ✕ j11_dtests                                      37m 26s
      pushed_notifications_test.TestPushedNotifications test_move_single_node_localhost
  ✕ j11_dtests_latest                               40m 40s
      bootstrap_test.TestBootstrap test_bootstrap_with_reset_bootstrap_state
      offline_tools_test.TestOfflineTools test_sstablelevelreset
      offline_tools_test.TestOfflineTools test_sstableofflinerelevel
      configuration_test.TestConfiguration test_change_durable_writes
  ✕ j11_jvm_dtests                                  27m 33s
      org.apache.cassandra.fuzz.ring.ConsistentBootstrapTest coordinatorIsBehindTest
  ✕ j11_simulator_dtests                            10m 37s
      org.apache.cassandra.simulator.test.HarrySimulatorTest test
      org.apache.cassandra.simulator.test.ShortPaxosSimulationTest simulationTest
  ✕ j17_dtests_latest                               36m 31s
      bootstrap_test.TestBootstrap test_bootstrap_with_reset_bootstrap_state
      offline_tools_test.TestOfflineTools test_sstablelevelreset
      offline_tools_test.TestOfflineTools test_sstableofflinerelevel
      configuration_test.TestConfiguration test_change_durable_writes
  ✕ j17_jvm_dtests                                  26m 11s
      org.apache.cassandra.fuzz.ring.ConsistentBootstrapTest coordinatorIsBehindTest
      org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest testOptionalMtlsModeDoNotAllowNonSSLConnections
      org.apache.cassandra.distributed.test.NativeTransportEncryptionOptionsTest testEndpointVerificationEnabledIpNotInSAN
  ✕ j17_unit_tests                                  14m 20s
      org.apache.cassandra.db.guardrails.GuardrailMaximumTimestampTest testEnabledWarn
java11_separate_tests                            
{noformat}

[java17_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4067/workflows/467f088d-2bb6-4e61-878b-5931043bc654]
[java17_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4067/workflows/5f9cf2c6-369d-4257-8958-2288a70c7ed7]
[java11_pre-commit_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4067/workflows/78bfce10-f3d2-433c-bdc0-7841a7e46244]
[java11_separate_tests|https://app.circleci.com/pipelines/github/instaclustr/cassandra/4067/workflows/66a60733-9c8a-4c58-893f-e699bae36cda]


> Do not go to disk to get HintsStore.getTotalFileSize
> ----------------------------------------------------
>
>                 Key: CASSANDRA-19477
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19477
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Hints
>            Reporter: Jon Haddad
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 4.1.x, 5.0-rc, 5.x
>
>         Attachments: flame-cassandra0-patched-2024-03-25_00-40-47.html, 
> flame-cassandra0-release-2024-03-25_00-16-44.html, flamegraph.cpu.html, 
> image-2024-03-24-17-57-32-560.png, image-2024-03-24-18-08-36-918.png, 
> image-2024-03-24-18-16-50-370.png, image-2024-03-24-18-17-48-334.png, 
> image-2024-03-24-18-20-07-734.png
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> When testing a cluster with more requests than it could handle, I noticed 
> significant CPU time (25%) spent in HintsStore.getTotalFileSize.  Here's what 
> I'm seeing from profiling:
> 10% of CPU time spent in HintsDescriptor.fileName which only does this:
>  
> {noformat}
> return String.format("%s-%s-%s.hints", hostId, timestamp, version);{noformat}
> At a bare minimum here we should create this string up front with the host 
> and version and eliminate 2 of the 3 substitutions, but I think it's probably 
> faster to use a StringBuilder and avoid the underlying regular expression 
> altogether.
> 12% of the time is spent in org.apache.cassandra.io.util.File.length.  It 
> looks like this is called once for each hint file on disk for each host we're 
> hinting to.  In the case of an overloaded cluster, this is significant.  It 
> would be better if we were to track the file size in memory for each hint 
> file and reference that rather than go to the filesystem.
> These fairly small changes should make Cassandra more reliable when under 
> load spikes.
> CPU Flame graph attached.
> I only tested this in 4.1 but it looks like this is present up to trunk.
>  
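
The in-memory size tracking suggested in the description could look roughly like the sketch below. This is an illustration only, not the actual patch and not Cassandra's HintsStore API: the class name TrackedHintSizes and the methods onFileClosed and onFileDeleted are hypothetical, and the sketch assumes a file's size is known at the moment it is closed. Because the map update and the counter update are two separate steps, a concurrent reader may briefly see a slightly stale total.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: remembers the size of each hint file in memory so that
// computing the total never has to call File.length() on the filesystem.
final class TrackedHintSizes
{
    // Last known size of each hint file, keyed by file name.
    private final Map<String, Long> sizeByFile = new ConcurrentHashMap<>();
    // Running sum of all values currently held in sizeByFile.
    private final AtomicLong totalBytes = new AtomicLong();

    // Record (or update) the size of a file, for example when it is closed.
    void onFileClosed(String fileName, long sizeInBytes)
    {
        Long previous = sizeByFile.put(fileName, sizeInBytes);
        long delta = sizeInBytes - (previous == null ? 0L : previous);
        totalBytes.addAndGet(delta);
    }

    // Forget a file once it has been dispatched and deleted.
    void onFileDeleted(String fileName)
    {
        Long removed = sizeByFile.remove(fileName);
        if (removed != null)
            totalBytes.addAndGet(-removed);
    }

    // Constant-time read with no disk access.
    long getTotalFileSize()
    {
        return totalBytes.get();
    }
}
```

With this shape, the cost of getTotalFileSize no longer grows with the number of hint files on disk, at the price of having to update the tracker on every close and delete.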

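The String.format concern raised in the description can be illustrated in a similar way. The sketch below is hypothetical (HintFileName is not a Cassandra class) and assumes that host id, timestamp and version are all fixed for the lifetime of a descriptor, so the name can be built once with a StringBuilder and then reused.

```java
import java.util.UUID;

// Hypothetical sketch: builds the "<hostId>-<timestamp>-<version>.hints" name
// once in the constructor instead of calling String.format on every access.
final class HintFileName
{
    private final String fileName;

    HintFileName(UUID hostId, long timestamp, int version)
    {
        // Assumption: all three inputs are immutable for this descriptor,
        // so the result can be cached safely.
        this.fileName = new StringBuilder(64)
                        .append(hostId)
                        .append('-')
                        .append(timestamp)
                        .append('-')
                        .append(version)
                        .append(".hints")
                        .toString();
    }

    // Returns the cached name; no formatting work happens here.
    String fileName()
    {
        return fileName;
    }
}
```

The cached value is intended to match what the String.format call quoted in the description produces for the same inputs, so callers that parse the file name would not need to change.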


