[
https://issues.apache.org/jira/browse/CASSANDRA-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189198#comment-13189198
]
Vijay commented on CASSANDRA-2392:
----------------------------------
Hi Pavel
>> To avoid any seeks in the PRIMARY_INDEX file upon IndexSummary.deserialize I
>> suggest to save key (only BB part) as well as index position on
>> IndexSummary.serialize.
Will do. The initial idea was to save some disk space, as the keys can be
really long in some cases :) and the index seeks were not that bad in my
initial tests, but I will save the key in v2.
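A minimal sketch of what saving the key bytes together with the index position
could look like, so that reading the summary back needs no seek into the
primary index. SummarySketch, Entry and the length-prefixed layout are
assumptions made for illustration; this is not the code in the attached patch
or the real IndexSummary class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the summary; not Cassandra's actual IndexSummary.
final class SummarySketch {
    // One sampled entry: the raw key bytes and its offset in the primary index.
    static final class Entry {
        final ByteBuffer key;
        final long indexPosition;

        Entry(ByteBuffer key, long indexPosition) {
            this.key = key;
            this.indexPosition = indexPosition;
        }
    }

    // Assumed layout: entry count, then (key length, key bytes, index position) per entry.
    static byte[] serialize(List<Entry> entries) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(entries.size());
            for (Entry e : entries) {
                ByteBuffer key = e.key.duplicate(); // leave the caller's buffer position untouched
                byte[] raw = new byte[key.remaining()];
                key.get(raw);
                out.writeInt(raw.length);
                out.write(raw);
                out.writeLong(e.indexPosition);
            }
        }
        return bytes.toByteArray();
    }

    // Rebuilds every entry from the summary bytes alone; the primary index is never read.
    static List<Entry> deserialize(byte[] data) throws IOException {
        List<Entry> entries = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                byte[] raw = new byte[in.readInt()];
                in.readFully(raw);
                entries.add(new Entry(ByteBuffer.wrap(raw), in.readLong()));
            }
        }
        return entries;
    }
}
```

The trade-off is the one mentioned above: each entry costs its key length plus
12 bytes on disk, in exchange for a seek-free load.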
>> I would also suggest to save dataPosition from the primary index into
>> summaries file to avoid adding serialization to SegmentedFile because
>> SegmentedFile serialize(...)/deserialize(...) are not really a
>> serialize/deserialize
I am not sure how saving dataPosition will help, as we only have a summary
entry every 128 keys or more. How would we mark a boundary with it? For
example, suppose each row is 100 MB.
>> can you please explain this chunk of code to me?
The idea is to save the summary when SSTable creation/load completes (as there
isn't any temporary state for them and they fit in memory). If the saved
summary is corrupted, deleted, or missing, we will just recalculate it instead
of depending on it.
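The load-or-recalculate behaviour described above can be sketched roughly as
follows. SummaryCacheSketch, loadOrRebuild and the file layout (an entry count
followed by long positions) are hypothetical names and choices made for
illustration, not the code from the patch; the rebuild step is passed in as a
Supplier standing in for the scan of the primary index.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;

// Hypothetical helper; treats the saved summary purely as a disposable cache.
final class SummaryCacheSketch {
    static long[] loadOrRebuild(Path summaryFile, Supplier<long[]> rebuildFromIndex) {
        if (Files.exists(summaryFile)) {
            try (DataInputStream in = new DataInputStream(Files.newInputStream(summaryFile))) {
                int count = in.readInt();
                // Cheap corruption check: the file size must match the declared entry count.
                if (count < 0 || 4L + 8L * count != Files.size(summaryFile)) {
                    throw new IOException("summary file size does not match entry count");
                }
                long[] positions = new long[count];
                for (int i = 0; i < count; i++) {
                    positions[i] = in.readLong();
                }
                return positions;
            } catch (IOException e) {
                // Corrupted or unreadable: fall through and recalculate.
            }
        }
        long[] rebuilt = rebuildFromIndex.get();
        save(summaryFile, rebuilt);
        return rebuilt;
    }

    static void save(Path summaryFile, long[] positions) {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(summaryFile))) {
            out.writeInt(positions.length);
            for (long p : positions) {
                out.writeLong(p);
            }
        } catch (IOException e) {
            // Best effort: a failed save only means the next load recalculates.
        }
    }
}
```

Because the saved file is never trusted, a failed or partial write costs only
one extra recalculation on the next load.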
> Saving IndexSummaries to disk
> -----------------------------
>
> Key: CASSANDRA-2392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2392
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Goffinet
> Assignee: Vijay
> Priority: Minor
> Fix For: 1.1
>
> Attachments: 0001-re-factor-first-and-last.patch,
> 0001-save-summaries-to-disk.patch, 0002-save-summaries-to-disk.patch
>
>
> For nodes with millions of keys, doing rolling restarts that take over 10
> minutes per node can be painful if you have a 100-node cluster. All of our time
> is spent on doing index summary computations on startup. It would be great if
> we could save those to disk as well. Our indexes are quite large.