[
https://issues.apache.org/jira/browse/CASSANDRA-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190751#comment-13190751
]
Pavel Yaskevich commented on CASSANDRA-2392:
--------------------------------------------
bq. But the main idea is to reduce the code and the checks which we have to do
just to populate the first and last variables. IMO it is better served in Index
Summary, which already has the needed checks. By using maybeAddEntry() and
marking the others private everywhere, we don't need extra checks elsewhere to
populate the fields... first and last in an index is also a summary :)
Correct me if I'm wrong, but as I see in SSTableReader.load(...), the condition
"SSTable.last == IndexSummary.last" is not guaranteed, which means that
IndexSummary.last has different semantics from SSTable.last. As for the
checks - I don't see many of those, and IndexSummary in its current state does
not have anything to do with SSTable's last/first variables, so I don't really
understand which checks you are talking about. If you really want to be pedantic
about the domain of first/last - I agree that they could belong to the summary
of the SSTable, but certainly not to the "index" one :)
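To illustrate why the two "last" values can differ, here is a minimal sketch in Java. It is not Cassandra's code: the class SummarySketch, the method sample() and the interval of 4 are hypothetical, and it assumes only that a summary keeps every N-th key of the sorted index rather than all of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model, not Cassandra's IndexSummary.
final class SummarySketch {

    /** Keeps the keys at positions 0, interval, 2*interval, ... of the sorted key list. */
    static List<String> sample(List<String> sortedKeys, int interval) {
        List<String> sampled = new ArrayList<>();
        for (int i = 0; i < sortedKeys.size(); i += interval) {
            sampled.add(sortedKeys.get(i));
        }
        return sampled;
    }

    public static void main(String[] args) {
        // All keys of an imaginary SSTable, in sorted order.
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e", "f");
        List<String> summary = sample(keys, 4); // assumed sampling interval

        String sstableLast = keys.get(keys.size() - 1);
        String summaryLast = summary.get(summary.size() - 1);

        // The first key is always sampled, but the last key is sampled only
        // when (keys.size() - 1) happens to be a multiple of the interval.
        System.out.println("SSTable last = " + sstableLast);
        System.out.println("summary last = " + summaryLast);
        System.out.println("equal        = " + sstableLast.equals(summaryLast));
    }
}
```

In this toy model the first sampled key always matches the SSTable's first key, while the last sampled key matches the SSTable's last key only by coincidence, which is the semantic difference described above.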
bq. Because we read from the disk to populate the Index Summary? If yes, I can
make sure that both the patches go into the same release.
Because we would end up reading more data (e.g. some of the keys and all index
and data positions would be read twice) from different files - primary_index
and summary.
> Saving IndexSummaries to disk
> -----------------------------
>
> Key: CASSANDRA-2392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2392
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Goffinet
> Assignee: Vijay
> Priority: Minor
> Fix For: 1.1
>
> Attachments: 0001-re-factor-first-and-last.patch,
> 0001-save-summaries-to-disk.patch, 0002-save-summaries-to-disk-v2.patch,
> 0002-save-summaries-to-disk-v3.patch, 0002-save-summaries-to-disk.patch
>
>
> For nodes with millions of keys, doing rolling restarts that take over 10
> minutes per node can be painful if you have a 100-node cluster. All of our time
> is spent on doing index summary computations on startup. It would be great if
> we could save those to disk as well. Our indexes are quite large.