Jon Haddad created CASSANDRA-19477:
--------------------------------------
Summary: Significant CPU overhead in HintsStore.getTotalFileSize
Key: CASSANDRA-19477
URL: https://issues.apache.org/jira/browse/CASSANDRA-19477
Project: Cassandra
Issue Type: Bug
Components: Consistency/Hints
Reporter: Jon Haddad
Attachments: flamegraph.cpu.html
When testing a cluster with more requests than it could handle, I noticed
significant CPU time (25%) spent in HintsStore.getTotalFileSize. Here's what
I'm seeing from profiling:
10% of CPU time is spent in HintsDescriptor.fileName, which only does this:
{noformat}
return String.format("%s-%s-%s.hints", hostId, timestamp, version);
{noformat}
At a bare minimum, we should precompute the host and version parts of this
string and eliminate 2 of the 3 substitutions, but I think it's probably
faster to use a StringBuilder and avoid the underlying regular expression
altogether.
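A minimal sketch of the StringBuilder idea, not the actual HintsDescriptor code: the class name HintFileNameSketch and the field types (UUID host ID, int version, long timestamp) are assumptions made for illustration. The host and version parts are built once, so each call only appends the timestamp.

```java
import java.util.UUID;

// Hypothetical stand-in for the file-name logic; not the real HintsDescriptor.
final class HintFileNameSketch {
    private final String prefix; // "<hostId>-", built once
    private final String suffix; // "-<version>.hints", built once

    HintFileNameSketch(UUID hostId, int version) {
        this.prefix = hostId + "-";
        this.suffix = "-" + version + ".hints";
    }

    // Produces the same text as String.format("%s-%s-%s.hints", ...)
    // without parsing a format string on every call.
    String fileName(long timestamp) {
        // 20 chars is enough for any long, including the sign.
        return new StringBuilder(prefix.length() + 20 + suffix.length())
            .append(prefix)
            .append(timestamp)
            .append(suffix)
            .toString();
    }
}
```

If all three values are fixed for the lifetime of the object, the whole name could instead be computed once and stored in a field.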
12% of the time is spent in org.apache.cassandra.io.util.File.length. It looks
like this is called once for each hint file on disk for each host we're hinting
to. In the case of an overloaded cluster, this is significant. It would be
better if we were to track the file size in memory for each hint file and
reference that rather than go to the filesystem.
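One way the in-memory tracking could look, as a rough sketch: HintSizeTrackerSketch and its method names are hypothetical and do not exist in Cassandra. It assumes the writer reports the bytes it appends and that file deletion goes through the same object, so the total never requires a filesystem call.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-host tracker of hint file sizes; not Cassandra's HintsStore.
final class HintSizeTrackerSketch {
    private final ConcurrentHashMap<String, Long> sizes = new ConcurrentHashMap<>();
    private final AtomicLong total = new AtomicLong();

    // Called by the writer after appending `bytes` to the named hint file.
    void recordWrite(String fileName, long bytes) {
        sizes.merge(fileName, bytes, Long::sum);
        total.addAndGet(bytes);
    }

    // Called when a hint file is deleted, e.g. after it has been dispatched.
    void recordDelete(String fileName) {
        Long removed = sizes.remove(fileName);
        if (removed != null) {
            total.addAndGet(-removed);
        }
    }

    // Constant-time replacement for summing File.length() over every file.
    long totalFileSize() {
        return total.get();
    }
}
```

The map update and the counter update are two separate steps, so a concurrent reader may briefly see a total that lags a write. That seems acceptable if the total is only used as an approximate limit check, but it is an assumption about how the value is consumed.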
These fairly small changes should make Cassandra more reliable during load
spikes.
CPU Flame graph attached.
I only tested this in 4.1 but it looks like this is present up to trunk.