[ https://issues.apache.org/jira/browse/CASSANDRA-8931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646225#comment-14646225 ]

Jonathan Ellis commented on CASSANDRA-8931:
-------------------------------------------

Good idea.  This will save a lot of memory.

> IndexSummary (and Index) should store the token, and the minimal key to 
> unambiguously direct a query
> ----------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8931
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8931
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Assignee: Stefania
>              Labels: performance
>
> Since these files are likely sticking around a little longer, it is probably 
> worth optimising them. A relatively simple change to Index and IndexSummary 
> could reduce the amount of space required significantly, reduce the CPU 
> burden of lookup, and hopefully bound the amount of space needed as key size 
> grows. On write, we always store the token before the key (if it differs 
> from the key), then truncate the whole record to the minimum length 
> necessary to answer an inequality search. Since the data file also contains 
> the key, we can confirm that we have the right key once we have looked it 
> up. Since bloom filters (BFs) already reduce unnecessary lookups, ruling the 
> false positives out one step earlier would not save much. 
> An improved follow-up version would be to use a trie of shortest length to 
> answer inequality lookups, as this would also ensure very long keys with 
> common prefixes would not significantly increase the size of the index or 
> summary. This would translate to a trie index for the summary keying into a 
> static trie page for the index.
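
The truncation idea in the description can be sketched roughly as follows. This is only an illustration in Python, not Cassandra code: the names `record`, `shortest_separator`, `build_index` and `lookup` are hypothetical, and the record layout (an 8-byte big-endian token, offset so that byte order matches token order, followed by the key bytes) is an assumption made for the example. Each index entry keeps only the shortest prefix that still sorts strictly after the previous full record, and the full record in the data file is used to confirm a hit.

```python
from bisect import bisect_right


def record(token: int, key: bytes) -> bytes:
    # Assumed layout (not Cassandra's on-disk format): a signed 64-bit token,
    # offset to unsigned so that byte order matches token order, then the key.
    return (token + 2**63).to_bytes(8, "big") + key


def shortest_separator(prev: bytes, curr: bytes) -> bytes:
    """Return the shortest prefix s of curr such that prev < s <= curr."""
    assert prev < curr, "records must be sorted and distinct"
    for i in range(len(curr)):
        # Stop at the first byte where curr diverges from (or outlasts) prev.
        if i >= len(prev) or prev[i] != curr[i]:
            return curr[: i + 1]
    raise AssertionError("unreachable when prev < curr")


def build_index(records):
    """Truncate each sorted record to its shortest separator."""
    separators, prev = [], b""
    for curr in records:
        separators.append(shortest_separator(prev, curr))
        prev = curr
    return separators


def lookup(separators, data, query):
    """Inequality search on the truncated entries, then confirm in the data."""
    i = bisect_right(separators, query) - 1  # last separator <= query
    if i >= 0 and data[i] == query:  # the data file holds the full key
        return i
    return None


# Usage: `data` stands in for the full records held in the data file.
data = sorted(record(t, k) for t, k in
              [(-5, b"alice"), (42, b"bob"), (42, b"bobby"), (10**12, b"carol")])
index = build_index(data)
position = lookup(index, data, record(42, b"bobby"))
```

Because each truncated entry still sorts after the previous full record and no later than its own, a binary search over the entries lands on the only position that could hold the query, and a single comparison against the data confirms or rejects it.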



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
