[jira] Commented: (CASSANDRA-223) time-based slicing does not work correctly w/ "historical" memtables

Jonathan Ellis (JIRA) Mon, 22 Jun 2009 09:06:30 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12722657#action_12722657 ]


Jonathan Ellis commented on CASSANDRA-223:
------------------------------------------

I came to the same conclusion.

One partial answer to the files-to-read problem is to change compaction to 
guarantee log(n) sstable files instead of the current ad-hoc behavior, where n 
is the maximum sstable "generation" number.  (Here "generation" means the 
number of compactions done.)

For each column family (CF), when you flush, you keep compacting until there 
is nothing already at the same generation to compact with.  For example,

flush 1: nothing to merge.  memtable becomes sstable-gen0
flush 2: there is already a sstable-gen0 so you merge.  now you have 
sstable-gen1
flush 3: no gen0, so you store there.  now you have sstable-gen0, sstable-gen1
flush 4: 0 and 1 exist, so you compact (with the new one) to sstable-gen2

etc.
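
The flush sequence above can be simulated in a few lines.  This is only a 
sketch of the idea, not Cassandra code: the `flush` function and the dict 
mapping generation to merged contents are hypothetical names made up for 
illustration, and a "merge" here is just list concatenation.

```python
def flush(sstables, memtable):
    """Flush a memtable, compacting while an sstable of the same generation exists.

    sstables: dict mapping generation number -> contents of that sstable
              (hypothetical in-memory stand-in for on-disk files).
    memtable: list of entries being flushed.
    """
    gen, data = 0, list(memtable)
    # Keep merging upward until the target generation slot is free.
    while gen in sstables:
        data = sstables.pop(gen) + data  # stand-in for a real compaction
        gen += 1
    sstables[gen] = data
    return sstables


sstables = {}
history = []
for n in range(1, 5):
    flush(sstables, [f"flush{n}"])
    # Record which generations exist after each flush.
    history.append(sorted(sstables))
```

After the four flushes, `history` holds the generations present at each step, 
which can be compared against the walkthrough above.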

Generation tracking can be done in the sstable filename.
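
As a rough illustration of what that could look like, the sketch below 
encodes and recovers the generation from a filename.  The 
`<cf>-gen<N>-<seq>-Data.db` layout and both helper names are assumptions 
invented for this example, not the actual sstable naming scheme.

```python
import re

# Hypothetical layout: <column family>-gen<generation>-<sequence>-Data.db
_SSTABLE_NAME = re.compile(r"^(?P<cf>.+)-gen(?P<gen>\d+)-(?P<seq>\d+)-Data\.db$")


def sstable_filename(cf, generation, seq):
    """Build a filename that carries the generation number."""
    return f"{cf}-gen{generation}-{seq}-Data.db"


def parse_generation(filename):
    """Recover the generation number from a filename built as above."""
    match = _SSTABLE_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a generation-tagged sstable name: {filename!r}")
    return int(match.group("gen"))
```

With the generation in the name, a restart only has to list the data 
directory to know which generation slots are occupied.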

> time-based slicing does not work correctly w/ "historical" memtables
> --------------------------------------------------------------------
>
>                 Key: CASSANDRA-223
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-223
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>         Attachments: 223.patch
>
>
> TimeFilter assumes that it is done as soon as it finds a column stamped 
> earlier than what it is filtering on, but when you have a group of 
> "historical" memtables whose columns were written in an arbitrary order this 
> is not a safe assumption.
> It is not even a safe assumption when dealing with a single memtable + 
> sstable pair, as the attached new test shows.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
