[
https://issues.apache.org/jira/browse/CASSANDRA-20465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitry Konstantinov updated CASSANDRA-20465:
--------------------------------------------
Description:
Before the TCM changes, an org.apache.cassandra.schema.TableMetadataRef#get
invocation was cheap (it just returned a field value). Now it does a lookup
from Schema every time, which involves a BTree search plus a not very cheap
check of whether the keyspace is a system keyspace.
Diff:
!image-2025-03-20-22-43-09-044.png|width=300!
We have several places in the code which use TableMetadataRef#get and assume a
low cost for it.
Currently about 0.93% of total CPU is spent on this operation. If we look only
at the (compaction + flush) threads, it is 5.4% ( [^5.1_cpu.html] ).
I am not sure whether it is easy to reduce the overhead in TableMetadataRef#get
itself, but we can also avoid it in many cases with a small adjustment of the
logic on the caller side:
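To make the cost difference concrete, here is a minimal sketch. All class names in it (Table, SchemaSnapshot, FieldRef, LookupRef) are hypothetical stand-ins, not Cassandra classes, and a TreeMap is assumed as a rough substitute for the BTree search. It only counts how many schema searches a caller triggers when it treats get() as free.

```java
import java.util.TreeMap;

// Hypothetical stand-ins for illustration; none of these are Cassandra classes.
final class TableRefCostSketch {
    static final class Table {
        final String keyspace;
        final String name;
        Table(String keyspace, String name) {
            this.keyspace = keyspace;
            this.name = name;
        }
    }

    /** Sorted map standing in for the BTree-backed schema; counts every search. */
    static final class SchemaSnapshot {
        private final TreeMap<String, Table> tables = new TreeMap<>();
        int lookups;

        void add(Table table) {
            tables.put(table.keyspace + "." + table.name, table);
        }

        Table find(String keyspace, String name) {
            lookups++;
            return tables.get(keyspace + "." + name);
        }
    }

    /** Old shape: get() only returns a field. */
    static final class FieldRef {
        private final Table table;
        FieldRef(Table table) { this.table = table; }
        Table get() { return table; }
    }

    /** New shape: get() searches the schema on every call. */
    static final class LookupRef {
        private final SchemaSnapshot schema;
        private final String keyspace;
        private final String name;

        LookupRef(SchemaSnapshot schema, String keyspace, String name) {
            this.schema = schema;
            this.keyspace = keyspace;
            this.name = name;
        }

        Table get() { return schema.find(keyspace, name); }
    }

    public static void main(String[] args) {
        SchemaSnapshot schema = new SchemaSnapshot();
        schema.add(new Table("ks", "events"));
        LookupRef ref = new LookupRef(schema, "ks", "events");
        // A caller that assumes get() is free and calls it once per row.
        for (int row = 0; row < 1000; row++)
            ref.get();
        System.out.println("schema searches: " + schema.lookups);
    }
}
```

In this sketch every get() on LookupRef costs one sorted-map search, while FieldRef never touches the schema, which is why per-row call sites that were harmless before can show up in a CPU profile.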
1) org.apache.cassandra.db.ColumnFamilyStore#isRowCacheEnabled - by default the
row cache is fully disabled, so it is probably better to check whether it is
enabled as the first condition:
!image-2025-03-20-22-46-34-571.png|width=300!
2) org.apache.cassandra.db.memtable.TrieMemtable#getFlushSet - we can look up
the metadata once at the beginning of the getFlushSet logic:
!image-2025-03-20-22-52-40-818.png|width=300!
!image-2025-03-20-22-53-25-001.png|width=300!
3) org.apache.cassandra.io.sstable.SSTableIdentityIterator.create - to be
considered: whether we can keep the TableMetadata for the duration of a
compaction.
!image-2025-03-20-22-56-31-298.png|width=300!
4) org.apache.cassandra.io.sstable.keycache.KeyCacheSupport.getCacheKey - to be
considered: whether we can retrieve only the needed id/indexName fields once.
!image-2025-03-20-22-58-00-837.png|width=300!
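The caller-side adjustments from items 1 and 2 can be sketched roughly as follows. This is not the actual Cassandra code: Table, CountingRef, the rowCacheCapacity parameter and the flushedBytes helper are hypothetical names introduced only to show the two patterns (test the cheap node-wide condition first, and resolve the metadata once before a loop).

```java
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration; none of these are Cassandra classes.
final class CallerSideSketch {
    static final class Table {
        final String name;
        final boolean cachesRows;
        Table(String name, boolean cachesRows) {
            this.name = name;
            this.cachesRows = cachesRows;
        }
    }

    /** Reference that counts how often the (assumed expensive) get() runs. */
    static final class CountingRef implements Supplier<Table> {
        private final Table table;
        int calls;
        CountingRef(Table table) { this.table = table; }

        @Override
        public Table get() {
            calls++;
            return table;
        }
    }

    /**
     * Item 1 pattern: test the cheap node-wide setting first. When the row
     * cache is disabled (capacity 0), && short-circuits and get() never runs.
     */
    static boolean isRowCacheEnabled(long rowCacheCapacity, Supplier<Table> ref) {
        return rowCacheCapacity > 0 && ref.get().cachesRows;
    }

    /**
     * Item 2 pattern: resolve the metadata once before the loop and reuse the
     * local variable instead of calling get() for every partition.
     */
    static int flushedBytes(Supplier<Table> ref, int[] partitionSizes) {
        Table table = ref.get(); // single lookup
        int total = 0;
        for (int size : partitionSizes)
            total += size + table.name.length();
        return total;
    }

    public static void main(String[] args) {
        CountingRef ref = new CountingRef(new Table("t", true));
        boolean enabled = isRowCacheEnabled(0, ref);
        int bytes = flushedBytes(ref, new int[] { 1, 2, 3 });
        System.out.println(enabled + " " + bytes + " lookups=" + ref.calls);
    }
}
```

Hoisting the lookup also means the loop sees one consistent metadata instance, which may or may not be the desired semantics if the schema changes mid-operation; that trade-off would need to be checked per call site.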
was:
Before the TCM changes, an org.apache.cassandra.schema.TableMetadataRef#get
invocation was cheap (it just returned a field value). Now it does a lookup
from Schema every time, which involves a BTree search plus a not very cheap
check of whether the keyspace is a system keyspace.
Diff:
!image-2025-03-20-22-43-09-044.png|width=300!
We have several places in the code which use TableMetadataRef#get and assume a
low cost for it.
Currently about 0.93% of total CPU is spent on this operation. If we look only
at the compaction + flush scope, it is 5.4% ( [^5.1_cpu.html] ).
I am not sure whether it is easy to reduce the overhead in TableMetadataRef#get
itself, but we can also avoid it in many cases with a small adjustment of the
logic on the caller side:
1) org.apache.cassandra.db.ColumnFamilyStore#isRowCacheEnabled - by default the
row cache is fully disabled, so it is probably better to check whether it is
enabled as the first condition:
!image-2025-03-20-22-46-34-571.png|width=300!
2) org.apache.cassandra.db.memtable.TrieMemtable#getFlushSet - we can look up
the metadata once at the beginning of the getFlushSet logic:
!image-2025-03-20-22-52-40-818.png|width=300!
!image-2025-03-20-22-53-25-001.png|width=300!
3) org.apache.cassandra.io.sstable.SSTableIdentityIterator.create - to be
considered: whether we can keep the TableMetadata for the duration of a
compaction.
!image-2025-03-20-22-56-31-298.png|width=300!
4) org.apache.cassandra.io.sstable.keycache.KeyCacheSupport.getCacheKey - to be
considered: whether we can retrieve only the needed id/indexName fields once.
!image-2025-03-20-22-58-00-837.png|width=300!
> Reduce runtime overhead of org.apache.cassandra.schema.TableMetadataRef#get
> usage
> ----------------------------------------------------------------------------------
>
> Key: CASSANDRA-20465
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20465
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Cluster/Schema, Transactional Cluster Metadata
> Reporter: Dmitry Konstantinov
> Assignee: Dmitry Konstantinov
> Priority: Normal
> Fix For: 5.x
>
> Attachments: 5.1_cpu.html, image-2025-03-20-22-43-09-044.png,
> image-2025-03-20-22-46-34-571.png, image-2025-03-20-22-52-40-818.png,
> image-2025-03-20-22-53-25-001.png, image-2025-03-20-22-56-31-298.png,
> image-2025-03-20-22-58-00-837.png
>