[ 
https://issues.apache.org/jira/browse/CASSANDRA-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13406487#comment-13406487
 ] 

Jason Rutherglen commented on CASSANDRA-4324:
---------------------------------------------

The benchmark idea is interesting, but it will not account for the fact that 
the FST can store more keys while using less RAM.  With finer key granularity, 
shouldn't a seek to a given value be faster?  Is there an existing benchmark 
framework that would, for example, generate the keys?

In general, the big win with the FST is that it should consume far less RAM.  
That is fairly easy to measure: generate N keys and compare the RAM usage, 
which for the existing IndexSummary includes the object pointers.  
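
As a rough sketch of that measurement (not code from the patch), the Java 
below generates N MD5 keys and estimates the heap they occupy when held as 
plain byte arrays.  The class and method names are hypothetical, and the sorted 
byte[][] is only a stand-in for the object-per-key layout; it is not the real 
IndexSummary.  The same before/after measurement would be repeated with an 
FST built over the same keys.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of Cassandra or the attached patch.
final class KeyMemoryBenchmark {

    /** Returns n MD5 digests of "key-0" .. "key-(n-1)", sorted as unsigned bytes. */
    static byte[][] generateSortedMd5Keys(int n) {
        MessageDigest md5;
        try {
            md5 = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        byte[][] keys = new byte[n][];
        for (int i = 0; i < n; i++) {
            // digest(byte[]) hashes the input and resets the digest for reuse.
            keys[i] = md5.digest(("key-" + i).getBytes(StandardCharsets.UTF_8));
        }
        Arrays.sort(keys, KeyMemoryBenchmark::compareUnsigned);
        return keys;
    }

    /** Lexicographic comparison that treats each byte as unsigned. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int len = Math.min(a.length, b.length);
        for (int i = 0; i < len; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    /** Approximate used heap; System.gc() is only a hint, so treat results as estimates. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        int n = 200_000; // assumed key count, adjust as needed
        long before = usedHeap();
        byte[][] keys = generateSortedMd5Keys(n);
        long after = usedHeap();
        System.out.printf("%d keys, roughly %d bytes per key%n",
                keys.length, (after - before) / n);
    }
}
```

The per-key figure includes the array object headers and the references to 
them, which is the pointer overhead mentioned above.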

This article describes the improvements seen when using the FST on Wikipedia 
data: up to 52% less RAM used, and 22% faster.  We still need to run our own 
benchmarks, though, because a set of MD5 keys differs from a dictionary of 
words.

http://blog.mikemccandless.com/2011/01/finite-state-transducers-part-2.html
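
To illustrate why MD5 keys may behave differently from dictionary words, the 
sketch below compares how many leading characters adjacent sorted keys share.  
This is only a crude proxy (an FST also shares suffixes, and this does not 
measure actual FST size); the class name and the small word list are 
hypothetical and chosen purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of Cassandra or Lucene.
final class PrefixSharing {

    // Illustrative word list only; a real test would use a large dictionary.
    static final String[] WORDS = {
        "index", "indexed", "indexer", "indexes", "indexing",
        "key", "keyed", "keys", "summaries", "summary"
    };

    /** Number of leading characters that a and b have in common. */
    static int sharedPrefix(String a, String b) {
        int len = Math.min(a.length(), b.length());
        int i = 0;
        while (i < len && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    /** Mean shared-prefix length between neighbours after sorting the keys. */
    static double meanAdjacentSharedPrefix(String[] keys) {
        if (keys.length < 2) {
            return 0.0;
        }
        String[] sorted = keys.clone();
        Arrays.sort(sorted);
        long total = 0;
        for (int i = 1; i < sorted.length; i++) {
            total += sharedPrefix(sorted[i - 1], sorted[i]);
        }
        return (double) total / (sorted.length - 1);
    }

    /** Lower-case hex form of the MD5 digest of s. */
    static String md5Hex(String s) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder(32);
            for (byte b : md5.digest(s.getBytes(StandardCharsets.UTF_8))) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Replaces every key with the hex form of its MD5 digest. */
    static String[] hashAll(String[] keys) {
        String[] out = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            out[i] = md5Hex(keys[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.printf("words: %.2f shared chars on average%n",
                meanAdjacentSharedPrefix(WORDS));
        System.out.printf("md5:   %.2f shared chars on average%n",
                meanAdjacentSharedPrefix(hashAll(WORDS)));
    }
}
```

Hashed keys are close to uniformly distributed, so neighbouring keys tend to 
share far fewer leading characters than related words do, which is one reason 
the Wikipedia numbers may not carry over.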


                
> Implement Lucene FST in for key index
> -------------------------------------
>
>                 Key: CASSANDRA-4324
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4324
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jason Rutherglen
>            Assignee: Jason Rutherglen
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: CASSANDRA-4324.patch
>
>
> The Lucene FST data structure offers a compact and fast system for indexing 
> Cassandra keys.  More keys may be loaded, which in turn should make seeks faster.
> * Update the IndexSummary class to make use of the Lucene FST, overriding the 
> serialization mechanism.
> * Alter SSTableReader to make use of the FST seek mechanism

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
