[ 
https://issues.apache.org/jira/browse/CASSANDRA-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284921#comment-13284921
 ] 

Jonathan Ellis commented on CASSANDRA-3762:
-------------------------------------------

Hmm, the size business is a bit tricky.

I agree that having serialize() return the bytes written is cleaner than 
retaining a separate [estimated]serializedSize method.  (We could let 
AutoSavingCache wrap its output in a CountingOutputStream that just records the 
length of what its wrapped stream is given.)  *However*, since we're pretty 
much stuck with an estimate for estimatedTotalBytes anyway, it feels strange to 
count the same data two different ways.
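
A minimal sketch of the counting-wrapper idea, assuming a hand-rolled 
CountingOutputStream and a made-up length-prefixed key format. The class 
CountingSketch and the method serializeKey are hypothetical names for 
illustration, not the actual AutoSavingCache code:

```java
import java.io.DataOutputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper: records how many bytes are handed to the wrapped stream.
class CountingOutputStream extends FilterOutputStream {
    private long count = 0;

    CountingOutputStream(OutputStream out) {
        super(out);
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);
        count += len;
    }

    long getCount() {
        return count;
    }
}

class CountingSketch {
    // Writes a length-prefixed key (an assumed format) and returns the number
    // of bytes written, measured as the change in the counter.
    static long serializeKey(byte[] key, DataOutputStream out,
                             CountingOutputStream counter) throws IOException {
        long before = counter.getCount();
        out.writeInt(key.length);
        out.write(key);
        out.flush();
        return counter.getCount() - before;
    }
}
```

With this shape the serializer never computes its own size; the caller reads 
the length off the wrapper after each write.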

But if we stick with the current design, we're doing *three* get() calls on 
each key, and not back to back either: first we get() everything to estimate 
the total, then we get() each key twice in turn, once to serialize it and once 
to add its size to the progress.

One alternative would be to iterate over an entry set instead of a key set when 
saving, but CLHM (ConcurrentLinkedHashMap) doesn't expose this.
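
To make the lookup cost concrete, here is a rough sketch that counts get() 
calls under the two iteration styles. CountingCache is a hypothetical stand-in 
built on java.util.HashMap, not CLHM, and the method names are invented for 
illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in cache that counts how often get() is called.
class CountingCache {
    private final Map<String, byte[]> map = new HashMap<>();
    long gets = 0;

    void put(String key, byte[] value) { map.put(key, value); }

    byte[] get(String key) {
        gets++;
        return map.get(key);
    }

    Set<String> keySet() { return map.keySet(); }

    Set<Map.Entry<String, byte[]>> entrySet() { return map.entrySet(); }
}

class LookupSketch {
    // Key-set pattern as described above: three lookups per key.
    static long saveViaKeySet(CountingCache cache) {
        long estimatedTotal = 0;
        for (String key : cache.keySet())
            estimatedTotal += cache.get(key).length;   // get #1: estimate total

        long progress = 0;
        for (String key : cache.keySet()) {
            byte[] value = cache.get(key);             // get #2: serialize
            int written = value.length;                // (serialization elided)
            progress += cache.get(key).length;         // get #3: size for progress
        }
        return cache.gets;
    }

    // Entry-set pattern: each value arrives with its key, so no lookups.
    static long saveViaEntrySet(CountingCache cache) {
        long progress = 0;
        for (Map.Entry<String, byte[]> e : cache.entrySet())
            progress += e.getValue().length;
        return cache.gets;
    }
}
```

For a cache of n keys the first method performs 3n lookups and the second 
performs none, which is the saving an entry-set view would offer if the cache 
implementation provided one.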

What if we redefined the problem a bit?  Instead of having 
totalBytes/bytesComplete in CompactionInfo, we could have long total/complete 
and String units.  Then we could just return progress in terms of key count for 
cache saving.
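
A sketch of what the redefined progress record might look like. ProgressInfo 
is a hypothetical stand-in for CompactionInfo; its fields and method names are 
assumptions for illustration only:

```java
// Hypothetical stand-in for CompactionInfo: progress is a pair of longs plus
// a unit label, so each operation can report in whatever unit it can count.
final class ProgressInfo {
    private final long complete;
    private final long total;
    private final String units;

    ProgressInfo(long complete, long total, String units) {
        this.complete = complete;
        this.total = total;
        this.units = units;
    }

    long getComplete() { return complete; }

    long getTotal() { return total; }

    String getUnits() { return units; }

    @Override
    public String toString() {
        return complete + "/" + total + " " + units;
    }
}
```

Cache saving could then report something like new ProgressInfo(250, 1000, 
"keys"), while compaction keeps reporting in "bytes", and neither needs a size 
estimate it cannot compute cheaply.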
                
> AutoSaving KeyCache and System load time improvements.
> ------------------------------------------------------
>
>                 Key: CASSANDRA-3762
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3762
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.2
>            Reporter: Vijay
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.2
>
>         Attachments: 0001-CASSANDRA-3762-v2.patch, 
> 0001-CASSANDRA-3762-v3.patch, 0001-CASSANDRA-3762-v4.patch, 
> 0001-CASSANDRA-3762-v5.patch, 0001-SavedKeyCache-load-time-improvements.patch
>
>
> CASSANDRA-2392 saves the index summary to disk, but when we have a saved 
> cache we still scan through the index to get the data out.
> We might be able to separate this from SSTR.load and let it load the index 
> summary; once all the SSTs are loaded, we might be able to check the 
> bloom filter and do random IO on fewer indexes to populate the KeyCache.


        
