[
https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708330#comment-14708330
]
Benedict commented on CASSANDRA-9459:
-------------------------------------
A nit question relating to this ticket: In each of the main Transaction
implementations we seem to be making unnecessary use of streams. These are a
"hot" code paths on systems using indexes, and I'm not convinced we should be
incurring the GC burden, especially when clarity isn't meaningfully improved.
Compare
{code}
try (OpOrder.Group opGroup = Keyspace.writeOrder.start())
{
Index.Indexer[] indexers = Arrays.stream(indexes)
.map(i -> i.indexerFor(key,
nowInSec, opGroup, Type.CLEANUP))
.toArray(Index.Indexer[]::new);
Arrays.stream(indexers).forEach(Index.Indexer::begin);
if (partitionDelete != null)
Arrays.stream(indexers).forEach(indexer ->
indexer.partitionDelete(partitionDelete));
if (row != null)
Arrays.stream(indexers).forEach(indexer ->
indexer.removeRow(row));
Arrays.stream(indexers).forEach(Index.Indexer::finish);
}
{code}
with
{code}
try (OpOrder.Group opGroup = Keyspace.writeOrder.start())
{
for (Index index : indexes)
{
Index.Indexer indexer = index.indexerFor(key, nowInSec,
opGroup, Type.CLEANUP);
indexer.begin();
if (partitionDelete != null)
indexer.partitionDelete(partitionDelete);
if (row != null)
indexer.removeRow(row);
indexer.finish();
}
}
{code}
I'm pretty convinced the latter is clearer, and in the former I count at least
8 unnecessary allocations for the first stream, and 4 for each of the rest. A
few of these allocations are >= 64Kb, and I estimate total allocation per row
on the order of (but probably a little less than) 1Kb. Conversely, the
old-style code performs no allocations besides that of the {{Indexer}}.
> SecondaryIndex API redesign
> ---------------------------
>
> Key: CASSANDRA-9459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9459
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sam Tunnicliffe
> Assignee: Sam Tunnicliffe
> Fix For: 3.0 beta 1
>
>
> For some time now the index subsystem has been a pain point and in large part
> this is due to the way that the APIs and principal classes have grown
> organically over the years. It would be a good idea to conduct a wholesale
> review of the area and see if we can come up with something a bit more
> coherent.
> A few starting points:
> * There's a lot in AbstractPerColumnSecondaryIndex & its subclasses which
> could be pulled up into SecondaryIndexSearcher (note that to an extent, this
> is done in CASSANDRA-8099).
> * SecondayIndexManager is overly complex and several of its functions should
> be simplified/re-examined. The handling of which columns are indexed and
> index selection on both the read and write paths are somewhat dense and
> unintuitive.
> * The SecondaryIndex class hierarchy is rather convoluted and could use some
> serious rework.
> There are a number of outstanding tickets which we should be able to roll
> into this higher level one as subtasks (but I'll defer doing that until
> getting into the details of the redesign):
> * CASSANDRA-7771
> * CASSANDRA-8103
> * CASSANDRA-9041
> * CASSANDRA-4458
> * CASSANDRA-8505
> Whilst they're not hard dependencies, I propose that this be done on top of
> both CASSANDRA-8099 and CASSANDRA-6717. The former largely because the
> storage engine changes may facilitate a friendlier index API, but also
> because of the changes to SIS mentioned above. As for 6717, the changes to
> schema tables there will help facilitate CASSANDRA-7771.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)