[jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign

Sylvain Lebresne (JIRA) Fri, 21 Aug 2015 04:20:13 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706556#comment-14706556
 ]


Sylvain Lebresne commented on CASSANDRA-9459:
---------------------------------------------

Alright, follows my comments for the rest of the patch review:
* Would be nice to move the calls of {{SecondaryIndexManager.validate}} out of 
{{UpdateParameters}}/{{CassandraServer}} to a single place, like maybe 
{{Keyspace.apply}} so it's in only one place and we're sure we cannot miss it 
on any path.
* I would remove {{ColumnIndexMetadata}}, inlining the fields directly into 
{{ColumnIndex}} (with proper getters). I think the indirection makes things a 
bit harder to follow (and more verbose) for no benefit I can see (typically we 
store the {{baseCfs}} twice, in both {{ColumnIndex}} and 
{{ColumnIndexMetadata}}. Or {{CassandraIndexSearcher}} holds bothe the 
{{CassandraIndex}} and {{CassandraIndexMetadata}} even though they are 
essentially the same thing). If you really hate removing it (but I _really_ 
think it would be cleaner to do so), I think we should at least rename it, 
cause it currently sounds like it's a specialisation of {{IndexMetadata}} while 
it's not (and more generally it doesn't fit into what we call "Metadata" in 
general).
* Still not a fan of {{ColumnIndexFunctions}}. I think having subclasses of 
{{CassandraIndex}} would be a lot more idiomatic (since this, to me, fits the 
exact definition of what inheritance is about) and hence simpler (since more 
direct/expected). Would also be a tiny bit less verbose since you won't have to 
basically pass the index to most function.
* To have {{CassandraIndexSearcher}} be an {{Index.Searcher}} but 
{{ColumnIndexSearcher}} (and its subclasses) not be one is a bit 
surprising/inconsistent naming wise. And really, having 2 separate class here 
feels more complexity than necessary (after all, {{CassandraIndex}} is our 
"ColumnIndex" implementation so why 2 searcher class?). So I would just merge 
{{ColumnIndexSearcher}} into {{CassandraIndexSearcher}} (having thus 
{{KeysSearcher}} and {{CompositesSearcher}} be sublcasses of 
{{CassandraIndexSearcher}}).
* In {{ColumnFamilyStore.scrubDataDirectories}}, we basically use 
{{!index.isCustom()}} to select index that have a backing CFS. Why not using 
SecondaryIndexManager.getAllIndexStorageTables() instead, that's more 
consistent with how we do it in other places (I'll note you've mostly just kept 
the way it was done before, but could still be nice to clean it imo).

I've also pushed [a 
commit|https://github.com/pcmanus/cassandra/commits/9459-nits] with a few very 
minor nits that would have taken longer to explain that do.


> SecondaryIndex API redesign
> ---------------------------
>
>                 Key: CASSANDRA-9459
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9459
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sam Tunnicliffe
>            Assignee: Sam Tunnicliffe
>             Fix For: 3.0 beta 1
>
>
> For some time now the index subsystem has been a pain point and in large part 
> this is due to the way that the APIs and principal classes have grown 
> organically over the years. It would be a good idea to conduct a wholesale 
> review of the area and see if we can come up with something a bit more 
> coherent.
> A few starting points:
> * There's a lot in AbstractPerColumnSecondaryIndex & its subclasses which 
> could be pulled up into SecondaryIndexSearcher (note that to an extent, this 
> is done in CASSANDRA-8099).
> * SecondayIndexManager is overly complex and several of its functions should 
> be simplified/re-examined. The handling of which columns are indexed and 
> index selection on both the read and write paths are somewhat dense and 
> unintuitive.
> * The SecondaryIndex class hierarchy is rather convoluted and could use some 
> serious rework.
> There are a number of outstanding tickets which we should be able to roll 
> into this higher level one as subtasks (but I'll defer doing that until 
> getting into the details of the redesign):
> * CASSANDRA-7771
> * CASSANDRA-8103
> * CASSANDRA-9041
> * CASSANDRA-4458
> * CASSANDRA-8505
> Whilst they're not hard dependencies, I propose that this be done on top of 
> both CASSANDRA-8099 and CASSANDRA-6717. The former largely because the 
> storage engine changes may facilitate a friendlier index API, but also 
> because of the changes to SIS mentioned above. As for 6717, the changes to 
> schema tables there will help facilitate CASSANDRA-7771.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign

Reply via email to