[
https://issues.apache.org/jira/browse/CASSANDRA-21459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jon Haddad reassigned CASSANDRA-21459:
--------------------------------------
Assignee: Jon Haddad
> Eliminate per-row enum array allocation in cursor compaction clustering
> deserialization
> ---------------------------------------------------------------------------------------
>
> Key: CASSANDRA-21459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21459
> Project: Apache Cassandra
> Issue Type: Sub-task
> Components: Local/Compaction
> Reporter: Jon Haddad
> Assignee: Jon Haddad
> Priority: Normal
>
> {{ClusteringPrefix.Kind.values()}} is called once per row read
> ({{ClusteringDescriptor.loadClustering}}, both call sites) and once per range
> tombstone marker written ({{SSTableCursorWriter.writeRangeTombstone}}). Java
> clones the enum constants array on every {{values()}} call, so the cursor
> compaction path, which is intended to be allocation-free per row, allocates
> a fresh ~40-byte {{Kind[]}} for every row read from every source sstable and
> for every range tombstone marker written.
> The fix caches the array once in a {{static final}} field
> ({{Kind.ALL_KINDS}}) and indexes into the shared copy at the three cursor
> hot-path sites.
> Found via JFR allocation profiling
> ({{jdk.ObjectAllocationInNewTLAB}}/{{OutsideTLAB}} with stack traces) during
> cursor compaction: with the patch, the {{ClusteringPrefix$Kind[]}} allocation
> site disappears from the profile entirely. In an allocation-scaling
> measurement comparing a 1,200-row compaction against a 12,000-row compaction,
> allocation growth drops from 1,487,448 to 449,488 bytes; the remainder is
> attributable to test-environment {{Ref}} debug tracking and chunk-cache
> machinery rather than cursor code.
> The same {{values()}} pattern exists on the iterator deserialization path
> ({{ClusteringPrefix.serializer}}, three sites). Those are left unchanged here
> to keep this patch minimal and scoped to the cursor path; they can be
> addressed separately if desired.
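
The pattern described in the issue can be illustrated with a minimal, self-contained sketch. It is not the actual patch: the enum below is a hypothetical stand-in for ClusteringPrefix.Kind with made-up constants, and the two lookup helpers are invented for illustration. Only the ALL_KINDS field name is taken from the description above.

```java
public class CachedEnumValues {
    // Hypothetical stand-in for ClusteringPrefix.Kind; the constants are illustrative only.
    enum Kind {
        START_BOUND, ROW, END_BOUND;

        // Assumption: mirrors the Kind.ALL_KINDS field named in the issue.
        // values() is called once here, at class initialization, and the
        // resulting array is shared by all readers. Callers must not mutate it.
        static final Kind[] ALL_KINDS = values();

        // Pre-patch shape: every call clones the constants array, then indexes it.
        static Kind fromOrdinalCloning(int ordinal) {
            return values()[ordinal];
        }

        // Post-patch shape: index into the shared array, no allocation per call.
        static Kind fromOrdinalCached(int ordinal) {
            return ALL_KINDS[ordinal];
        }
    }

    public static void main(String[] args) {
        // Two calls to values() yield two distinct array objects.
        System.out.println("values() returns a fresh array: "
                           + (Kind.values() != Kind.values()));

        // Both lookups resolve to the same enum constant.
        System.out.println("lookups agree: "
                           + (Kind.fromOrdinalCloning(1) == Kind.fromOrdinalCached(1)));
    }
}
```

Sharing the array is safe only as long as nothing writes to it, which is the usual trade-off of this idiom: values() clones defensively, while a cached copy relies on callers treating it as read-only.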
--
This message was sent by Atlassian Jira
(v8.20.10#820010)