[
https://issues.apache.org/jira/browse/CASSANDRA-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190740#comment-13190740
]
Pavel Yaskevich commented on CASSANDRA-2392:
--------------------------------------------
Here are the last things with v3:
- The {load, save}Summaries methods leak file descriptors because {o,
i}Stream is closed only when the method handles an IOException.
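A minimal sketch of the usual fix, closing the stream in a finally block so it
is released on the success path too. The class name SummaryIo, the save/load
helpers, and the single-int payload are hypothetical stand-ins for
illustration, not the code from the patch:

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class SummaryIo {
    // Hypothetical stand-in for saveSummaries: writes one int to the file.
    static void save(File summaryFile, int value) throws IOException {
        DataOutputStream oStream = null;
        try {
            oStream = new DataOutputStream(new FileOutputStream(summaryFile));
            oStream.writeInt(value);
        } finally {
            // Runs on success and on failure, so the descriptor is always released.
            if (oStream != null)
                oStream.close();
        }
    }

    // Hypothetical stand-in for loadSummaries: reads the int back.
    static int load(File summaryFile) throws IOException {
        DataInputStream iStream = null;
        try {
            iStream = new DataInputStream(new FileInputStream(summaryFile));
            return iStream.readInt();
        } finally {
            if (iStream != null)
                iStream.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("summary", ".db");
        f.deleteOnExit();
        save(f, 42);
        System.out.println(load(f)); // prints 42
    }
}
```

Closing the outer Data{Input, Output}Stream also closes the File stream it
wraps, so only one close call is needed per method.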
Nit:
{code}
+ FileInputStream input = new FileInputStream(inMemoryDataFile);
+ iStream = new DataInputStream(input);
{code}
and
{code}
+ FileOutputStream input = new FileOutputStream(summaryFile);
+ oStream = new DataOutputStream(input);
{code}
can be changed to
{noformat}
{i,o}Stream = new Data{Input, Output}Stream(new File{Input,
Output}Stream(summaryFile));
{noformat}
because the input variable is not really needed.
I don't think that "0001-re-factor-first-and-last" is a good idea: by moving
the first/last variables to IndexSummary you change their semantics. They no
longer indicate the first and last key that the SSTable keeps, but rather the
first/last key covered by the IndexSummary of the individual SSTable. I think
we really should just keep those variables in the old place.
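To illustrate the semantic difference, here is a toy sketch. It assumes the
summary keeps only every Nth key of the index (128 here); the class name
SummarySampling, the sample helper, and the integer keys are hypothetical and
only stand in for the real structures:

```java
import java.util.ArrayList;
import java.util.List;

public class SummarySampling {
    // Keep every `interval`-th key, starting from the first one.
    static List<Integer> sample(List<Integer> sortedKeys, int interval) {
        List<Integer> summary = new ArrayList<Integer>();
        for (int i = 0; i < sortedKeys.size(); i += interval)
            summary.add(sortedKeys.get(i));
        return summary;
    }

    public static void main(String[] args) {
        // A pretend SSTable holding keys 0..999 in sorted order.
        List<Integer> keys = new ArrayList<Integer>();
        for (int k = 0; k < 1000; k++)
            keys.add(k);

        List<Integer> summary = sample(keys, 128);

        // The last key stored in the table and the last sampled key differ.
        System.out.println("SSTable last key: " + keys.get(keys.size() - 1));       // 999
        System.out.println("summary last key: " + summary.get(summary.size() - 1)); // 896
    }
}
```

Under that assumption the first sampled key matches the table's first key, but
the last sampled key generally does not match the table's last key, which is
why the two pairs of values are not interchangeable.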
Also, I'm concerned that CASSANDRA-3762 is marked for 1.2 and this one for 1.1.
If we don't get them into the same release, start-up times could become even
longer than they are right now, which defeats the purpose of this task, because
there is a good chance that the key cache will be enabled on the big
ColumnFamilies.
> Saving IndexSummaries to disk
> -----------------------------
>
> Key: CASSANDRA-2392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2392
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Chris Goffinet
> Assignee: Vijay
> Priority: Minor
> Fix For: 1.1
>
> Attachments: 0001-re-factor-first-and-last.patch,
> 0001-save-summaries-to-disk.patch, 0002-save-summaries-to-disk-v2.patch,
> 0002-save-summaries-to-disk-v3.patch, 0002-save-summaries-to-disk.patch
>
>
> For nodes with millions of keys, doing rolling restarts that take over 10
> minutes per node can be painful if you have a 100-node cluster. All of our time
> is spent on doing index summary computations on startup. It would be great if
> we could save those to disk as well. Our indexes are quite large.