[jira] [Commented] (CASSANDRA-20465) Reduce runtime overhead of org.apache.cassandra.schema.TableMetadataRef#get usage

Dmitry Konstantinov (Jira) Wed, 09 Jul 2025 03:21:48 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-20465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18004082#comment-18004082
 ]


Dmitry Konstantinov commented on CASSANDRA-20465:
-------------------------------------------------

[https://github.com/apache/cassandra/pull/4232] contains an initial draft of 
the changes:
 # I moved table metadata retrieval to the initialization part of flush and 
compaction, to avoid repeating the retrieval inside a loop
 # Regarding ColumnFamilyStore#isRowCacheEnabled and 
KeyCacheSupport.getCacheKey - I've decided not to touch this part for now 
because there is potential lock contention related to getting the capacity, 
which is better resolved separately (CASSANDRA-19429)
 # I've implemented simple version-based caching logic in 
TableMetadataRef/KeyspaceMetadataRef to avoid the complexity related to 
update order issues
 # The merge of the Accord logic has introduced one more similar place 
(ConsensusMigrationMutationHelper.validateSafeToExecuteNonTransactionally):
!image-2025-07-09-11-04-45-311.png|width=500!  
For now I have moved the ColumnFamilyStore.getIfExists(tableId) invocation 
into the if block so that it is skipped when it is not actually needed (plain 
write scenarios)
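
For item 3, a minimal sketch of what version-based caching could look like. 
This is not the code from the PR: SketchSchema, SketchTableRef and the String 
stand-in for table metadata are hypothetical names used only for illustration, 
and the sketch assumes that every schema change bumps a version counter after 
the change itself has been published.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the schema: a map plus a version bumped on every change.
final class SketchSchema
{
    private final Map<String, String> tables = new ConcurrentHashMap<>();
    private volatile long version = 0;
    // Counts slow-path lookups; present only to make the effect observable.
    final AtomicInteger lookups = new AtomicInteger();

    synchronized void put(String tableId, String metadata)
    {
        tables.put(tableId, metadata);
        version++; // bump the version only after the map has been updated
    }

    long version()
    {
        return version;
    }

    String lookup(String tableId)
    {
        lookups.incrementAndGet();
        return tables.get(tableId);
    }
}

// Hypothetical reference that caches the resolved value together with its version.
final class SketchTableRef
{
    // Immutable (version, value) pair, so readers never observe a half-updated cache.
    private static final class Cached
    {
        final long version;
        final String metadata;

        Cached(long version, String metadata)
        {
            this.version = version;
            this.metadata = metadata;
        }
    }

    private final SketchSchema schema;
    private final String tableId;
    private volatile Cached cached;

    SketchTableRef(SketchSchema schema, String tableId)
    {
        this.schema = schema;
        this.tableId = tableId;
    }

    String get()
    {
        // Read the version before the lookup: a concurrent change can then only
        // make the cached entry look stale, never make a stale entry look fresh.
        long current = schema.version();
        Cached c = cached;
        if (c != null && c.version == current)
            return c.metadata; // fast path: a volatile read and a comparison

        String fresh = schema.lookup(tableId);
        cached = new Cached(current, fresh);
        return fresh;
    }
}
```

In this sketch repeated get() calls between schema changes never reach 
lookup(), and the first get() after a put() re-resolves the value.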

 # I have an open question regarding the logic of the 
org.apache.cassandra.schema.Schema#getKeyspaceInstance/getKeyspaceMetadata 
methods. They call SchemaConstants.isLocalSystemKeyspace(keyspaceName) and 
SchemaConstants.isVirtualSystemKeyspace(keyspaceName), which use toLowerCase 
to be case-insensitive, but right after these checks we retrieve information 
from maps using plain String equals logic. So we effectively assume that the 
values are already lowercase, and toLowerCase looks like unnecessary overhead 
here.
So, I am thinking about introducing simplified versions of the 
SchemaConstants.isVirtualSystemKeyspace/isLocalSystemKeyspace methods that do 
not use toLowerCase.
An alternative, more complicated approach would be to introduce a KeyspaceName 
flyweight object which stores the name explicitly in lowercase; it could also 
be interned for faster comparison.
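
To illustrate the simplified check, a small sketch under stated assumptions: 
KeyspaceNameChecks and both method names are hypothetical, and the set holds 
only an illustrative subset of names, not the real contents of 
SchemaConstants. The exact variant is only valid if callers already pass 
lowercase names, which is the assumption in question.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; the real checks live in SchemaConstants.
final class KeyspaceNameChecks
{
    // Illustrative subset of local system keyspace names.
    private static final Set<String> LOCAL_SYSTEM = Set.of("system", "system_schema");

    // Case-insensitive variant: normalizes the name on every call.
    static boolean isLocalSystemKeyspace(String name)
    {
        return LOCAL_SYSTEM.contains(name.toLowerCase(Locale.ROOT));
    }

    // Simplified variant: assumes the caller already holds a lowercase name,
    // so the toLowerCase call is skipped entirely.
    static boolean isLocalSystemKeyspaceExact(String lowerCaseName)
    {
        return LOCAL_SYSTEM.contains(lowerCaseName);
    }
}
```

The two variants agree on lowercase input and differ only for names such as 
"SYSTEM", which the exact variant rejects.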

Async-profiler CPU profiles collected before and after the changes for a 
single-node test:
{code:java}
cassandra-stress "write n=10m" -rate threads=100
{code}
[^write_before.html] vs [^write_after.html]

> Reduce runtime overhead of org.apache.cassandra.schema.TableMetadataRef#get 
> usage 
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-20465
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20465
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Schema, Transactional Cluster Metadata
>            Reporter: Dmitry Konstantinov
>            Assignee: Dmitry Konstantinov
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: 5.1_cpu.html, image-2025-03-20-22-43-09-044.png, 
> image-2025-03-20-22-46-34-571.png, image-2025-03-20-22-52-40-818.png, 
> image-2025-03-20-22-53-25-001.png, image-2025-03-20-22-56-31-298.png, 
> image-2025-03-20-22-58-00-837.png, image-2025-07-09-11-04-45-311.png, 
> write_after.html, write_before.html
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Before the TCM changes, an org.apache.cassandra.schema.TableMetadataRef#get 
> invocation was cheap (it just returned a field value); now it does a lookup 
> from Schema every time, with a search in a BTree plus a not very cheap check 
> of whether it is a system keyspace.
> Diff:
> !image-2025-03-20-22-43-09-044.png|width=300!
> We have several places in the code which use TableMetadataRef#get and assume 
> a low cost for it.
> Currently about 0.93% of CPU in total is spent on this operation. If we 
> check the percentage for (compaction + flush) threads, it is 5.4%, and 9.4% 
> for compaction only ( [^5.1_cpu.html] ).
> Not sure if it is easy to reduce the overhead in TableMetadataRef#get itself, 
> but in many cases we can also avoid it by a small adjustment of the logic on 
> the caller side, so that TableMetadataRef#get is not invoked too frequently:
> 1) org.apache.cassandra.db.ColumnFamilyStore#isRowCacheEnabled - by default 
> row cache is fully disabled - probably it is better to check if it is enabled 
> as a first condition:
> !image-2025-03-20-22-46-34-571.png|width=300!
> 2) org.apache.cassandra.db.memtable.TrieMemtable#getFlushSet - we can look 
> up the metadata once at the beginning of the getFlushSet logic
> !image-2025-03-20-22-52-40-818.png|width=300! 
> !image-2025-03-20-22-53-25-001.png|width=300!
> 3) org.apache.cassandra.io.sstable.SSTableIdentityIterator.create - consider 
> whether we can retrieve TableMetadata at the beginning of a compaction and 
> reuse it throughout the compaction.
> !image-2025-03-20-22-56-31-298.png|width=300!
> 4) org.apache.cassandra.io.sstable.keycache.KeyCacheSupport.getCacheKey - 
> consider whether we can retrieve only the needed id/indexName fields once (at 
> least id does not look like a dynamically changed parameter)
> !image-2025-03-20-22-58-00-837.png|width=300!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

