[
https://issues.apache.org/jira/browse/CASSANDRA-8550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tyler Hobbs updated CASSANDRA-8550:
-----------------------------------
Reviewer: Benjamin Lerer
> Internal pagination in CQL3 index queries creating substantial overhead
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-8550
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8550
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Samuel Klock
> Assignee: Tyler Hobbs
> Fix For: 2.0.12, 2.1.3
>
> Attachments: 8550-2.0.txt
>
>
> While benchmarking CQL3 secondary indexes in 2.1.2, we've noticed substantial
> performance degradation as the volume of indexed data increases. In trying
> to figure out what's going on, we found that a major factor contributing to
> this degradation appears to be logic in
> {{o.a.c.db.index.composites.CompositesSearcher}} used to paginate scans of
> index tables. In particular, in the use cases we've explored, this short
> algorithm used to select a page size appears to be the culprit:
> {code:java}
> private int meanColumns =
> Math.max(index.getIndexCfs().getMeanColumns(), 1);
> // We shouldn't fetch only 1 row as this provides buggy paging in
> case the first row doesn't satisfy all clauses
> private int rowsPerQuery = Math.max(Math.min(filter.maxRows(),
> filter.maxColumns() / meanColumns), 2);
> {code}
> In indexes where the cardinality doesn't scale linearly with the volume of
> data indexed, it seems likely that the value of {{meanColumns}} will steadily
> rise in write-heavy workloads. In the cases we've explored,
> {{filter.maxColumns()}} returns a small enough number (related to the lesser
> of the native-protocol page size or the user-specified limit for the query)
> that, after {{meanColumns}} reaches a few thousand, {{rowsPerQuery}} (the
> page size) is consistently set to 2.
> The resulting overhead is severe. In our environment, if we fix
> {{rowsPerQuery}} to some reasonably large constant (e.g., 5,000), queries
> that with the existing logic would require over two minutes to complete can
> run in under ten seconds.
> Using a constant clearly seems like the wrong answer. But the overhead the
> existing algorithm seems to introduce suggests that it isn't the right answer
> either. An intuitive solution might be to use the minimum of
> {{filter.maxRows()}} and {{filter.maxColumns()}} (or 2 if both of those are
> 1), but it's not immediately clear that there aren't safety considerations
> the algorithm is attempting to account for that this strategy does not.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)