Benedict created CASSANDRA-6830:
-----------------------------------

             Summary: Changes to SSTable Index file
                 Key: CASSANDRA-6830
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6830
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Benedict
            Priority: Minor
             Fix For: 3.0


Building on the ideas introduced in CASSANDRA-6709, and _possibly_ obsoleting 
them before they are introduced:

Once we have CASSANDRA-6810, we could make the following change to the 
(current) index file: instead of producing a sorted decoratedkey file, we could 
instead generate a near-optimal hash table of murmurhash-of-key -> position 
in data/(6810-)index file. This index might permit multiple locations for each 
hash, in which case all locations would need to be checked, but a hash table 
could be built that minimises this (whilst also maximising compact 
representation on disk).
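
A minimal sketch of the idea, not the proposed on-disk format: it assumes a flat open-addressing table in which each slot stores (hash, position), and it uses a 64-bit blake2b digest from the standard library as a stand-in for murmurhash. The names key_hash, build_index and candidate_positions are hypothetical. A lookup returns every position whose stored hash matches, so the caller still has to verify the key at each candidate location.

```python
import hashlib
import struct


def key_hash(key: bytes) -> int:
    # Stand-in for murmurhash (assumption): a 64-bit digest from the stdlib.
    return struct.unpack("<Q", hashlib.blake2b(key, digest_size=8).digest())[0]


def build_index(entries, load_factor=0.75):
    """Build a table of (hash, position) slots from (key, position) pairs."""
    entries = list(entries)
    # Always leaves at least one empty slot, so probing terminates.
    n_slots = max(1, int(len(entries) / load_factor) + 1)
    slots = [None] * n_slots
    for key, position in entries:
        h = key_hash(key)
        i = h % n_slots
        while slots[i] is not None:  # linear probing
            i = (i + 1) % n_slots
        slots[i] = (h, position)
    return slots


def candidate_positions(slots, key):
    """Return all positions whose stored hash equals the hash of key."""
    h = key_hash(key)
    n = len(slots)
    i = h % n
    found = []
    for _ in range(n):
        slot = slots[i]
        if slot is None:  # end of the probe run
            break
        if slot[0] == h:
            found.append(slot[1])
        i = (i + 1) % n
    return found
```

Only the hash is stored, not the key, which keeps the table compact at the cost of occasionally having to check more than one location in the data file.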

This might then completely obviate the need for a separate key cache, as we 
would simply rely on whatever buffer cache we have to map in/out the pages we 
need for our query in any index. We should be able to guarantee we only ever 
need to look at one page for any query. Once we bring page caching in-process, 
the size of the pages we actually choose to cache could be made configurable, 
which would bring behaviour close to that of the current key cache, except more 
compact, and also effectively auto-sizing to optimally reduce reads (by 
using more buffer cache space if it is helpful, and yielding it to other reads 
otherwise).

The obvious disadvantage is that partition key range queries become a little 
more expensive, but the (or an) index summary should reduce the problem here, 
so that the binary search for a start point can be targeted to a single 
(6810-)index page, or to a few of them.
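
As a rough illustration of that last point, the sketch below assumes the summary simply samples the first token of each page of a sorted token list; the layout and the names build_summary and start_page are hypothetical. A binary search over the summary then picks the one page at which a range scan should begin.

```python
import bisect


def build_summary(sorted_tokens, page_size):
    """Sample the first token of each page (assumed summary layout)."""
    return [sorted_tokens[i] for i in range(0, len(sorted_tokens), page_size)]


def start_page(summary, range_start):
    """Return the index of the page at which a scan for range_start begins."""
    # Last page whose first token is <= range_start, or page 0 if none is.
    return max(0, bisect.bisect_right(summary, range_start) - 1)
```

If range_start falls between the last token of one page and the first token of the next, the scan starts on the earlier page and runs on into the following one, so at most a couple of pages are touched to find the start point.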



--
This message was sent by Atlassian JIRA
(v6.2#6252)
