[
https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14681642#comment-14681642
]
Sergio Bossa commented on CASSANDRA-9459:
-----------------------------------------
The search path looks good as well, except that I think CASSANDRA-8717 is
actually broken, probably because of CASSANDRA-8099, and I think it's worth
discussing here.
According to a previous comment by [~slebresne], the limit should now be
applied *after* post-reconciliation processing, but it _seems_ to me the limit
is actually applied *twice*, and both times in a way that IMHO breaks the
"continuous" range iteration required by Stratio, and generally by any top-k
implementation:
1) It is applied via {{CountingPartitionIterator}} while sending concurrent
range requests in {{RangeCommandIterator#sendNextRequests}}: this means to me
that each range (actually, the concatenation of all concurrently queried ones)
will limit its returned result set, which makes it impossible to implement
top-k queries correctly (unless you can top-k sort on each replica).
2) It is further applied in {{StorageProxy#getRangeSlice}} via another
{{CountingPartitionIterator}}, which will pass a "limited" iterator down to the
{{Index}} post-processor {{BiFunction}}.
That said, my knowledge of CASSANDRA-8099 isn't deep, so I might be missing
something in my analysis.
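To illustrate the concern, here is a minimal, self-contained sketch. It does not use any Cassandra classes: {{Row}}, {{limit}}, {{topKScores}} and the sample data are all hypothetical stand-ins, where {{limit}} plays the role of a counting iterator that truncates in iteration (token) order and {{topKScores}} plays the role of the index post-processor. It only shows that truncating before the top-k step can drop the rows that top-k would have selected.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class TopKLimitSketch {

    // A row identified by its position in token order, with a relevance score.
    static final class Row {
        final int token;
        final int score;

        Row(int token, int score) {
            this.token = token;
            this.score = score;
        }
    }

    // Stand-in for a counting iterator: keeps the first `limit` rows
    // in iteration (token) order and drops the rest.
    static List<Row> limit(List<Row> rows, int limit) {
        return new ArrayList<>(rows.subList(0, Math.min(limit, rows.size())));
    }

    // Stand-in for the index post-processor: returns the k highest
    // scores among the rows it is given, in descending order.
    static List<Integer> topKScores(List<Row> rows, int k) {
        return rows.stream()
                   .map(r -> r.score)
                   .sorted(Comparator.reverseOrder())
                   .limit(k)
                   .collect(Collectors.toList());
    }

    // Made-up data: the best-scoring rows sit late in token order.
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        int[] scores = {10, 20, 5, 90, 80, 15};
        for (int i = 0; i < scores.length; i++) {
            rows.add(new Row(i + 1, scores[i]));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Row> rows = sampleRows();
        int k = 2;
        // The post-processor only ever sees the first k rows in token order.
        System.out.println("limit, then top-k: " + topKScores(limit(rows, k), k));
        // The post-processor sees every row and the limit is applied last.
        System.out.println("top-k over all rows: " + topKScores(rows, k));
    }
}
```

With this sample data the first variant yields the scores 20 and 10, while the second yields 90 and 80, which is the result a top-k query is expected to return.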
I'll now proceed with a last round of review and get back with some final
feedback.
> SecondaryIndex API redesign
> ---------------------------
>
> Key: CASSANDRA-9459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9459
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Sam Tunnicliffe
> Assignee: Sam Tunnicliffe
> Fix For: 3.0 beta 1
>
>
> For some time now the index subsystem has been a pain point and in large part
> this is due to the way that the APIs and principal classes have grown
> organically over the years. It would be a good idea to conduct a wholesale
> review of the area and see if we can come up with something a bit more
> coherent.
> A few starting points:
> * There's a lot in AbstractPerColumnSecondaryIndex & its subclasses which
> could be pulled up into SecondaryIndexSearcher (note that to an extent, this
> is done in CASSANDRA-8099).
> * SecondaryIndexManager is overly complex and several of its functions should
> be simplified/re-examined. The handling of which columns are indexed and
> index selection on both the read and write paths are somewhat dense and
> unintuitive.
> * The SecondaryIndex class hierarchy is rather convoluted and could use some
> serious rework.
> There are a number of outstanding tickets which we should be able to roll
> into this higher level one as subtasks (but I'll defer doing that until
> getting into the details of the redesign):
> * CASSANDRA-7771
> * CASSANDRA-8103
> * CASSANDRA-9041
> * CASSANDRA-4458
> * CASSANDRA-8505
> Whilst they're not hard dependencies, I propose that this be done on top of
> both CASSANDRA-8099 and CASSANDRA-6717. The former largely because the
> storage engine changes may facilitate a friendlier index API, but also
> because of the changes to SIS mentioned above. As for 6717, the changes to
> schema tables there will help facilitate CASSANDRA-7771.