[jira] [Commented] (CASSANDRA-2392) Saving IndexSummaries to disk

Pavel Yaskevich (Commented) (JIRA) Sun, 22 Jan 2012 10:59:05 -0800

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190740#comment-13190740
 ]


Pavel Yaskevich commented on CASSANDRA-2392:
--------------------------------------------

Here are the last remaining issues with v3:

 - The {load,save}Summaries methods leak file descriptors because {i,o}Stream 
is closed only in the IOException handler, never on the successful path.
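
One way to avoid the leak is to close the stream in a finally block so that it 
runs on both the success and the failure path. The following is only a minimal 
sketch, not the patch's code: the class SummaryIo, the save/load methods, and 
the single-int payload are hypothetical stand-ins for the real 
{load,save}Summaries methods.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical stand-in for the summary load/save code discussed above.
public class SummaryIo {
    // Writes one int and always closes the stream, even if writeInt throws.
    public static void save(File summaryFile, int value) throws IOException {
        DataOutputStream oStream = null;
        try {
            oStream = new DataOutputStream(new FileOutputStream(summaryFile));
            oStream.writeInt(value);
        } finally {
            if (oStream != null)
                oStream.close(); // releases the underlying file descriptor
        }
    }

    // Reads one int and always closes the stream, even if readInt throws.
    public static int load(File summaryFile) throws IOException {
        DataInputStream iStream = null;
        try {
            iStream = new DataInputStream(new FileInputStream(summaryFile));
            return iStream.readInt();
        } finally {
            if (iStream != null)
                iStream.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("summary", ".db");
        f.deleteOnExit();
        save(f, 42);
        System.out.println(load(f));
    }
}
```

Note that close() itself can throw an IOException, which would mask an earlier 
exception from the try body; real code may prefer to close quietly in that case.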

 Nit: 

{code}
+            FileInputStream input = new FileInputStream(inMemoryDataFile);
+            iStream = new DataInputStream(input);
{code}
and
{code}
+            FileOutputStream input = new FileOutputStream(summaryFile);
+            oStream = new DataOutputStream(input);
{code}

can be changed to 
{noformat}
{i,o}Stream = new Data{Input,Output}Stream(new File{Input,Output}Stream(summaryFile));
{noformat}
because the input variable is not really needed.

I don't think that "0001-re-factor-first-and-last" is a good idea: by moving 
the first/last variables to IndexSummary you change their semantics, so they 
no longer indicate the first and last keys that the SSTable holds but rather 
the first/last keys covered by the IndexSummary of the individual SSTable. I 
think we really should keep those variables in the old place.
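
To illustrate the difference in semantics, here is a toy sketch. It assumes 
the summary holds only a sampled subset of the SSTable's sorted keys (every 
Nth key); the class SummaryBounds, the sample method, and the interval of 4 
are made up for illustration and are not Cassandra's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: a "summary" that keeps every interval-th key of a sorted key list.
public class SummaryBounds {
    public static List<String> sample(List<String> sortedKeys, int interval) {
        List<String> sampled = new ArrayList<String>();
        for (int i = 0; i < sortedKeys.size(); i += interval)
            sampled.add(sortedKeys.get(i));
        return sampled;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<String>();
        for (char c = 'a'; c <= 'j'; c++)
            keys.add(String.valueOf(c)); // ten sorted keys: a..j

        List<String> summary = sample(keys, 4); // keeps indices 0, 4, 8

        // The last key of the full key list and the last sampled key differ.
        System.out.println("sstable last = " + keys.get(keys.size() - 1));
        System.out.println("summary last = " + summary.get(summary.size() - 1));
    }
}
```

Under that assumption the last key held by the summary ("i") is not the last 
key of the full key list ("j"), which is why the two notions of "last" should 
not be conflated.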

Also, I'm concerned that CASSANDRA-3762 is marked for 1.2 and this one for 
1.1. If we don't get them into the same release, start-up times could become 
even longer than they are now, which defeats the purpose of this task, because 
there is a good chance that the key cache will be enabled on the big 
ColumnFamilies.
                
> Saving IndexSummaries to disk
> -----------------------------
>
>                 Key: CASSANDRA-2392
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2392
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>            Assignee: Vijay
>            Priority: Minor
>             Fix For: 1.1
>
>         Attachments: 0001-re-factor-first-and-last.patch, 
> 0001-save-summaries-to-disk.patch, 0002-save-summaries-to-disk-v2.patch, 
> 0002-save-summaries-to-disk-v3.patch, 0002-save-summaries-to-disk.patch
>
>
> For nodes with millions of keys, doing rolling restarts that take over 10 
> minutes per node can be painful if you have 100 node cluster. All of our time 
> is spent on doing index summary computations on startup. It would be great if 
> we could save those to disk as well. Our indexes are quite large.

