Goodness Ayinmode created CASSANDRA-19959:
---------------------------------------------
Summary: Out of memory (OOM) risks due to unbound growth in
collections
Key: CASSANDRA-19959
URL: https://issues.apache.org/jira/browse/CASSANDRA-19959
Project: Cassandra
Issue Type: Improvement
Reporter: Goodness Ayinmode
I noticed some methods that build collections in ways that could cause OOM
issues. For example,
[Keyspace.getValidColumnFamilies|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/db/Keyspace.java#L707]
retrieves a set of valid ColumnFamilyStore objects based on the provided column
family names. When cfNames.length == 0, it iterates over all the column family
stores returned by getColumnFamilyStores() and adds each one to the valid set.
For each cfStore, if autoAddIndexes is true,
getIndexColumnFamilyStores(cfStore) is called and its index column family
stores are added to the same set. Because the set holds every column family and
index store at once, a keyspace with a large number of column families or
indexes could cause significant memory consumption, increasing the risk of OOM
errors.
This risk also appears in
[Sets$Literal.prepare|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/cql3/Sets.java#L136],
[PendingAntiCompaction$AcquisitionCallback.apply|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/db/repair/PendingAntiCompaction.java#L291],
[RepairSession.start|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/repair/RepairSession.java#L272],
[RepairedState.addAll|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/repair/consistent/RepairedState.java#L208],
[SEPExecutor.addTask|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/concurrent/SEPExecutor.java#L119],
[SystemDistributedKeyspace.startRepairs|https://github.com/apache/cassandra/blob/662ce36a7be5a03560bb0395a4bced09d3c34a0c/src/java/org/apache/cassandra/schema/SystemDistributedKeyspace.java#L226],
[SingleTableUpdatesCollector.toMutations|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/cql3/statements/SingleTableUpdatesCollector.java#L95],
[AbstractReplicaCollection.filter|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/locator/AbstractReplicaCollection.java#L504],
[BatchMessage.execute|https://github.com/apache/cassandra/blob/02f38208b15b119b3038482c5e36f05c14e2a4cf/src/java/org/apache/cassandra/transport/messages/BatchMessage.java#L173]
and
[SystemKeyspace.tokensAsSet|https://github.com/apache/cassandra/blob/ea801625f64bdebf78cf03634e30a1fde037f965/src/java/org/apache/cassandra/db/SystemKeyspace.java#L887].
Each of these methods populates a collection that shows potential unbounded
growth and could therefore cause OOM issues.
If processing all elements at once is not essential, one optimization could be
to process the elements in batches: split the input into smaller chunks and
accumulate the results per batch. Another option is to assign a fixed size to
these collections when they are initialized.
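To make the batching idea concrete, here is a minimal sketch in plain Java. It
is not Cassandra code: the class BatchedProcessing, the method
processInBatches, and the batch size are hypothetical names and values chosen
only for illustration. It shows one way to hand elements to a consumer in
fixed-size chunks, so that only one pre-sized chunk is buffered at a time
instead of one collection that grows with the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of Cassandra.
final class BatchedProcessing {

    /**
     * Passes the items to the handler in chunks of at most batchSize elements
     * and returns the number of chunks handed over.
     */
    public static <T> int processInBatches(Iterable<T> items, int batchSize,
                                           Consumer<List<T>> handler) {
        if (batchSize <= 0)
            throw new IllegalArgumentException("batchSize must be positive");

        // Pre-sized buffer: it never holds more than batchSize elements.
        List<T> batch = new ArrayList<>(batchSize);
        int batches = 0;
        for (T item : items) {
            batch.add(item);
            if (batch.size() == batchSize) {
                handler.accept(batch);
                // Start a fresh buffer so the handler may keep the old one.
                batch = new ArrayList<>(batchSize);
                batches++;
            }
        }
        // Flush the final, possibly smaller, chunk.
        if (!batch.isEmpty()) {
            handler.accept(batch);
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> input = new ArrayList<>();
        for (int i = 0; i < 10; i++)
            input.add(i);

        int[] sum = {0};
        int batches = processInBatches(input, 4, chunk -> {
            for (int v : chunk)
                sum[0] += v;
        });
        System.out.println(batches + " batches, sum=" + sum[0]); // 3 batches, sum=45
    }
}
```

Whether this pattern fits any of the methods listed above depends on whether
their callers need the complete collection at once, which I have not verified.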
Please let me know if my analysis is wrong, or if you have any comments
regarding the optimization suggestion.
Thank you