[
https://issues.apache.org/jira/browse/CASSANDRA-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12872334#action_12872334
]
Stu Hood commented on CASSANDRA-1117:
-------------------------------------
I got to thinking about Jonathan's 2-level binary search idea, and realized
that a multi-level binary search would be handled really well by a tree.
The tree I'm imagining would be a tree of depth K+2 where K is the number of
index/data files (2 in our current situation). The 0th level would be a root.
At each of the K levels after the root, you would have inner nodes representing
the segments of the index/data file at that level. The 1st level would contain
the segments for the smallest file, the 2nd level would contain the segments
for the second smallest, and the Kth would contain the segments for the data
file. The (K+1)th level would contain leaf nodes equivalent to the
contents of the IndexSummary class.
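To make the shape concrete, here is a minimal sketch of the lookup for K = 2. It is not the proposed implementation: the class and method names are hypothetical, keys are plain strings, and it assumes that segments at each level nest by key range and that a leaf holds a sampled key plus a file position, roughly what IndexSummary holds. Each level of the descent is one binary search.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names exist in Cassandra.
final class SegmentTreeSketch {
    /** A node covers keys from firstKey up to the next sibling's firstKey. */
    static final class Node {
        final String firstKey;
        final long position;                            // only meaningful for leaves
        final List<Node> children = new ArrayList<>();  // kept sorted by firstKey

        Node(String firstKey, long position) {
            this.firstKey = firstKey;
            this.position = position;
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    static Node leaf(String key, long position) {
        return new Node(key, position);
    }

    /** Inner node (root or segment); children must be passed in key order. */
    static Node inner(String firstKey, Node... children) {
        Node node = new Node(firstKey, -1L);
        for (Node child : children) {
            node.children.add(child);
        }
        return node;
    }

    /** Binary search: index of the last node whose firstKey <= key, or -1. */
    static int floorIndex(List<Node> nodes, String key) {
        int lo = 0, hi = nodes.size() - 1, found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (nodes.get(mid).firstKey.compareTo(key) <= 0) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found;
    }

    /** One binary search per level; returns the position to scan from, or -1. */
    static long lookup(Node root, String key) {
        Node current = root;
        while (!current.isLeaf()) {
            int i = floorIndex(current.children, key);
            if (i < 0) {
                return -1L;  // key sorts before everything in the tree
            }
            current = current.children.get(i);
        }
        return current.position;
    }

    /** Level 0 root, level 1 index segments, level 2 data segments, level 3 leaves. */
    static Node example() {
        return inner("",
            inner("a",
                inner("a", leaf("a", 0L), leaf("c", 100L)),
                inner("e", leaf("e", 200L), leaf("g", 300L))),
            inner("m",
                inner("m", leaf("m", 400L), leaf("p", 500L))));
    }

    public static void main(String[] args) {
        Node root = example();
        System.out.println(lookup(root, "f"));  // 200
        System.out.println(lookup(root, "z"));  // 500
        System.out.println(lookup(root, "0"));  // -1
    }
}
```

The returned position is only a starting point: as with a sampled index, the caller would still scan forward from there to find the exact key.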
I think I can implement this structure over the weekend if it sounds
worthwhile?
Also, generalizing to multiple levels of indexing means that at some point in
the future, we could write out multiple index files at progressively higher
resolution, giving you a balanced tree on disk. Our INDEX_INTERVAL is intended
to represent the ratio between RAM and disk, so theoretically you should always
have enough memory to summarize the index in memory, but in most cases, a lot
of that memory would be better spent on row cache.
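As a rough illustration of the multi-file idea, the sketch below computes how many entries each progressively coarser index level would hold, assuming every level samples one entry per INDEX_INTERVAL entries of the level beneath it. The class name, the stopping rule (stop once a level fits within one interval), and the interval value of 128 are all assumptions chosen for the example, not something taken from the patches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of Cassandra.
final class IndexLevelsSketch {
    /**
     * Sizes of successive sampled levels, finest first, assuming each level
     * keeps one entry per `interval` entries of the level beneath it.
     */
    static List<Long> levelSizes(long entries, int interval) {
        if (entries < 0 || interval < 2) {
            throw new IllegalArgumentException("entries >= 0 and interval >= 2 required");
        }
        List<Long> sizes = new ArrayList<>();
        long current = entries;
        while (current > interval) {
            current = (current + interval - 1) / interval;  // ceiling division
            sizes.add(current);
        }
        return sizes;
    }

    public static void main(String[] args) {
        // One million index entries with an assumed interval of 128.
        System.out.println(levelSizes(1_000_000L, 128));  // [7813, 62]
    }
}
```

Under these assumptions only the coarsest level (62 entries here) would need to stay in memory, leaving the rest of the budget for row cache.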
> Clean up MMAP support
> ---------------------
>
> Key: CASSANDRA-1117
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1117
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Stu Hood
> Assignee: Gary Dusbabek
> Fix For: 0.7
>
> Attachments: 0001-Use-factory-functions-for-RowIndexedReader.patch,
> 0002-Add-SegmentedFile-to-abstract-opening-FileDataInputs.patch,
> 0003-Replace-mmap-file-abstraction-with-SegmentedFile.patch,
> 0004-Rename-SSTableReaderTest-to-SegmentedFileTest.patch,
> 0005-Remove-filename-munging.patch
>
>
> Awareness of MMAP is currently embedded into the SSTableReader implementation
> and IndexSummary. A good number of bugs experienced recently have been due to
> this lack of separation, so it is ripe for abstraction. Additionally, the
> current implementation does not provide a good method for iterating over the
> segments of a file, which is useful for range queries, and lays more stable
> groundwork for #998.