[jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign

Benedict (JIRA) Sun, 23 Aug 2015 02:53:45 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14708330#comment-14708330
 ]


Benedict commented on CASSANDRA-9459:
-------------------------------------

A nit question relating to this ticket: In each of the main Transaction 
implementations we seem to be making unnecessary use of streams. These are a 
"hot" code paths on systems using indexes, and I'm not convinced we should be 
incurring the GC burden, especially when clarity isn't meaningfully improved. 
Compare

{code}
            try (OpOrder.Group opGroup = Keyspace.writeOrder.start())
            {
                Index.Indexer[] indexers = Arrays.stream(indexes)
                                                 .map(i -> i.indexerFor(key, 
nowInSec, opGroup, Type.CLEANUP))
                                                 .toArray(Index.Indexer[]::new);

                Arrays.stream(indexers).forEach(Index.Indexer::begin);

                if (partitionDelete != null)
                    Arrays.stream(indexers).forEach(indexer -> 
indexer.partitionDelete(partitionDelete));

                if (row != null)
                    Arrays.stream(indexers).forEach(indexer -> 
indexer.removeRow(row));

                Arrays.stream(indexers).forEach(Index.Indexer::finish);
            }
{code}

with

{code}
            try (OpOrder.Group opGroup = Keyspace.writeOrder.start())
            {
                for (Index index : indexes)
                {
                    Index.Indexer indexer = index.indexerFor(key, nowInSec, 
opGroup, Type.CLEANUP);
                    indexer.begin();
                    if (partitionDelete != null)
                        indexer.partitionDelete(partitionDelete);
                    if (row != null)
                        indexer.removeRow(row);
                    indexer.finish();
                }
            }
{code} 

I'm pretty convinced the latter is clearer, and in the former I count at least 
8 unnecessary allocations for the first stream, and 4 for each of the rest. A 
few of these allocations are >= 64Kb, and I estimate total allocation per row 
on the order of (but probably a little less than) 1Kb. Conversely, the 
old-style code performs no allocations besides that of the {{Indexer}}.

> SecondaryIndex API redesign
> ---------------------------
>
>                 Key: CASSANDRA-9459
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9459
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sam Tunnicliffe
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.0 beta 1
>
>
> For some time now the index subsystem has been a pain point and in large part 
> this is due to the way that the APIs and principal classes have grown 
> organically over the years. It would be a good idea to conduct a wholesale 
> review of the area and see if we can come up with something a bit more 
> coherent.
> A few starting points:
> * There's a lot in AbstractPerColumnSecondaryIndex & its subclasses which 
> could be pulled up into SecondaryIndexSearcher (note that to an extent, this 
> is done in CASSANDRA-8099).
> * SecondayIndexManager is overly complex and several of its functions should 
> be simplified/re-examined. The handling of which columns are indexed and 
> index selection on both the read and write paths are somewhat dense and 
> unintuitive.
> * The SecondaryIndex class hierarchy is rather convoluted and could use some 
> serious rework.
> There are a number of outstanding tickets which we should be able to roll 
> into this higher level one as subtasks (but I'll defer doing that until 
> getting into the details of the redesign):
> * CASSANDRA-7771
> * CASSANDRA-8103
> * CASSANDRA-9041
> * CASSANDRA-4458
> * CASSANDRA-8505
> Whilst they're not hard dependencies, I propose that this be done on top of 
> both CASSANDRA-8099 and CASSANDRA-6717. The former largely because the 
> storage engine changes may facilitate a friendlier index API, but also 
> because of the changes to SIS mentioned above. As for 6717, the changes to 
> schema tables there will help facilitate CASSANDRA-7771.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign

Reply via email to