Benedict created CASSANDRA-6830:
-----------------------------------
Summary: Changes to SSTable Index file
Key: CASSANDRA-6830
URL: https://issues.apache.org/jira/browse/CASSANDRA-6830
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Benedict
Priority: Minor
Fix For: 3.0
Building on the ideas introduced in CASSANDRA-6709, and _possibly_ obsoleting
them before they are introduced:
Once we have CASSANDRA-6810, we could make the following change to the
(current) index file: instead of producing a sorted DecoratedKey file, we could
generate a near-optimal hash table of murmurhash-of-key -> position in the
data/(6810-)index file. This index might permit multiple locations for each
hash, in which case all locations would need to be checked, but a hash table
could be built that minimises this (whilst also keeping the on-disk
representation compact).
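
A minimal sketch of the kind of table described above, to make the lookup
path concrete. Everything in it is an assumption for illustration: the class
name HashIndexSketch, the fixed SLOTS_PER_PAGE, the in-memory arrays standing
in for on-disk pages, and FNV-1a standing in for the murmur hash. It is not
Cassandra code and makes no attempt at the "near-optimal" construction.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: a paged hash table of hash-of-key -> position.
class HashIndexSketch {
    // Assumed number of (hash, position) slots that fit in one page.
    static final int SLOTS_PER_PAGE = 4;

    final int pageCount;
    final long[] hashes;
    final long[] positions;
    final boolean[] used;

    HashIndexSketch(int pageCount) {
        this.pageCount = pageCount;
        int slots = pageCount * SLOTS_PER_PAGE;
        this.hashes = new long[slots];
        this.positions = new long[slots];
        this.used = new boolean[slots];
    }

    // Stand-in 64-bit hash (FNV-1a); the proposal itself would use murmur.
    static long hash(String key) {
        long h = 0xcbf29ce484222325L;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xffL);
            h *= 0x100000001b3L;
        }
        return h;
    }

    // Each hash maps to exactly one page, so a lookup reads a single page.
    int pageOf(long h) {
        return (int) Long.remainderUnsigned(h, pageCount);
    }

    // Returns false when the target page is full; a real builder would
    // have to resize or rebuild the table rather than give up.
    boolean put(String key, long position) {
        long h = hash(key);
        int base = pageOf(h) * SLOTS_PER_PAGE;
        for (int i = base; i < base + SLOTS_PER_PAGE; i++) {
            if (!used[i]) {
                used[i] = true;
                hashes[i] = h;
                positions[i] = position;
                return true;
            }
        }
        return false;
    }

    // All positions stored under this key's hash. The caller must check
    // each one against the data file, since hashes can collide.
    List<Long> candidates(String key) {
        long h = hash(key);
        int base = pageOf(h) * SLOTS_PER_PAGE;
        List<Long> out = new ArrayList<>();
        for (int i = base; i < base + SLOTS_PER_PAGE; i++) {
            if (used[i] && hashes[i] == h) {
                out.add(positions[i]);
            }
        }
        return out;
    }
}
```

Calling put("a", 100L) and then candidates("a") yields the single position 100;
storing a second position under the same key yields both, which is the
"multiple locations" case where every candidate has to be checked.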
This might then completely obviate the need for a separate key cache, as we
would simply rely on whatever buffer cache we have to map in/out the pages we
need for a query against any index. We should be able to guarantee that we only
ever need to look at one page for any query. Once we bring page caching
in-process, the size of the pages we choose to cache could be configurable,
which would bring behaviour close to that of the current key cache, except more
compact, and also effectively auto-sizing to optimally reduce reads (by
using more buffer cache space when it is helpful, and yielding it to other
reads otherwise).
The obvious disadvantage is that partition key range queries become a little
more expensive, but the (or an) index summary should reduce the problem here,
so that the binary search for a start point can be targeted to a few
(6810-)index pages, or a single one.
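
As a rough illustration of how a summary could narrow a range scan's start
point, the sketch below assumes the summary holds the first token of each
(6810-)index page in ascending order with no duplicates. The class name
IndexSummarySketch, the use of long tokens, and the one-entry-per-page layout
are all assumptions made for the example, not the existing index summary format.

```java
import java.util.Arrays;

// Hypothetical illustration only: map a range's start token to an index page.
class IndexSummarySketch {
    // Assumed layout: firstTokenOfPage[i] is the smallest token on page i,
    // sorted ascending and without duplicates.
    final long[] firstTokenOfPage;

    IndexSummarySketch(long[] firstTokenOfPage) {
        this.firstTokenOfPage = firstTokenOfPage.clone();
    }

    // Returns the page on which a scan starting at startToken should begin:
    // the last page whose first token is <= startToken, or page 0 if the
    // start precedes every sampled token.
    int pageForStart(long startToken) {
        int i = Arrays.binarySearch(firstTokenOfPage, startToken);
        if (i >= 0) {
            return i; // exact hit on a page boundary
        }
        int insertionPoint = -i - 1;
        return Math.max(0, insertionPoint - 1);
    }
}
```

With one summary entry per page the search lands on a single page; a sparser
summary would instead narrow the start point to a small run of pages.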
--
This message was sent by Atlassian JIRA
(v6.2#6252)