[ https://issues.apache.org/jira/browse/CASSANDRA-208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715300#action_12715300 ]
Jonathan Ellis commented on CASSANDRA-208:
------------------------------------------
Or, how about this: we just stop storing multiple kinds of data in the SSTable
file, and instead store the index entries and the bloom filter in separate
on-disk files.
Gains (vs interleaving):
- simplicity (don't have to skip non-data keys, EOF is really EOF)
- no more hard-coded "keys" in the sstable that will result in very weird-ass
bugs if a client ever uses one
- behaves more like the FS cache expects since each section (index, data,
bloom) is homogeneous and the "if I read block A, I'm more likely to need the
next block" assumption holds more often
- retains goal of loading index on server start w/o seeking
Losses:
- some seeking during flush/compaction to switch between writing data blocks
and index blocks
Although the "no seeks on writes, at all" claim is a cool one, in practice the
number of seeks we'll be doing is still negligible when writes are buffered,
i.e., still a huge win over a traditional B-tree design where every update
requires a seek.
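A minimal sketch of the separate-files layout, to make the trade-off concrete. Everything named here is an assumption for illustration, not Cassandra's actual code or on-disk format: the class name, the file suffixes, the (key, offset) index record, and the single-hash bit set standing in for a real bloom filter. It shows data and index being written through their own buffered streams, and the index being loaded front to back until a plain EOF.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

class SeparateFilesSketch {
    // Hypothetical suffixes and sizes, chosen only for this example.
    static final String DATA = "-Data.db", INDEX = "-Index.db", FILTER = "-Filter.db";
    static final int FILTER_BITS = 1024;

    static Path sibling(Path base, String suffix) {
        return base.resolveSibling(base.getFileName() + suffix);
    }

    static DataOutputStream open(Path base, String suffix) throws IOException {
        // Buffering means most writes never reach the disk one at a time.
        return new DataOutputStream(
                new BufferedOutputStream(Files.newOutputStream(sibling(base, suffix))));
    }

    // Toy stand-in for a bloom filter: one hash function, one bit per key.
    static int bucket(String key) {
        return Math.floorMod(key.hashCode(), FILTER_BITS);
    }

    /** Writes rows in key order; index entries and filter go to their own files. */
    static void write(Path base, SortedMap<String, byte[]> rows) throws IOException {
        BitSet filter = new BitSet(FILTER_BITS);
        long offset = 0;
        try (DataOutputStream data = open(base, DATA);
             DataOutputStream index = open(base, INDEX)) {
            for (Map.Entry<String, byte[]> row : rows.entrySet()) {
                index.writeUTF(row.getKey());   // index file: key + offset only
                index.writeLong(offset);
                data.writeInt(row.getValue().length);   // data file: rows only
                data.write(row.getValue());
                offset += 4 + row.getValue().length;
                filter.set(bucket(row.getKey()));
            }
        }
        Files.write(sibling(base, FILTER), filter.toByteArray());
    }

    /** Reads the whole index sequentially; end of file really is the end. */
    static SortedMap<String, Long> loadIndex(Path base) throws IOException {
        SortedMap<String, Long> index = new TreeMap<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(sibling(base, INDEX))))) {
            while (true) {
                String key;
                try {
                    key = in.readUTF();
                } catch (EOFException eof) {
                    break;  // no special keys to skip, just EOF
                }
                index.put(key, in.readLong());
            }
        }
        return index;
    }

    /** False means the key is definitely absent; true means it may be present. */
    static boolean mightContain(Path base, String key) throws IOException {
        BitSet filter = BitSet.valueOf(Files.readAllBytes(sibling(base, FILTER)));
        return filter.get(bucket(key));
    }

    /** Fetches one row from the data file using an offset taken from the index. */
    static byte[] readRow(Path base, long offset) throws IOException {
        try (RandomAccessFile data = new RandomAccessFile(sibling(base, DATA).toFile(), "r")) {
            data.seek(offset);
            byte[] value = new byte[data.readInt()];
            data.readFully(value);
            return value;
        }
    }
}
```

In this sketch the only extra seeking on the write path happens when the two buffers are flushed to their respective files, which is the cost listed under Losses.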
Thoughts?
> jvm crashes intermittently during compaction
> --------------------------------------------
>
> Key: CASSANDRA-208
> URL: https://issues.apache.org/jira/browse/CASSANDRA-208
> Project: Cassandra
> Issue Type: Bug
> Affects Versions: trunk
> Environment: arch: x86_64
> os: Linux version 2.6.18-92.1.22.el5
> java: nio2-ea-bin-b99-linux-x64-05_feb_2009
> Reporter: Jiansheng Huang
> Assignee: Jonathan Ellis
> Priority: Critical
> Fix For: 0.3
>
>
> jvm crashes intermittently during compaction. Our test data set is not that
> big, less than 10 GB.
> When jvm is about to crash, we see that it consumes a lot of memory
> (exceeding the max heap size).
> The excessive memory usage during compaction is caused by the maintenance of
> blockIndexes_ in SSTable. This blockIndexes_ was introduced only in the
> Apache version.