[ https://issues.apache.org/jira/browse/CASSANDRA-4667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467839#comment-13467839 ]
Jonathan Ellis commented on CASSANDRA-4667:
-------------------------------------------
Let's recap what we're trying to do here.
When every node is up, there's no reason to write batchlog (BL) data out to
sstables, which in turn incurs costs such as compaction: the BL write and
delete cancel each other out, and BL data is strictly local, so we don't need
to preserve tombstones for repair.
It still looks to me like the simplest way to achieve this is to look at the
rows being written during flush: if a row has both data and a row-level
tombstone, then writing it is effectively an expensive no-op and we can skip
it. The rest of the flush machinery doesn't need to change, including telling
the commitlog that it can start replay after the flush point.
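A rough sketch of that flush-time check, for illustration only. The class
FlushSkipSketch, its Row type, and keysToFlush are hypothetical stand-ins and
not Cassandra's actual memtable or flush code. It assumes a row tombstone
shadows any column whose timestamp is less than or equal to the deletion
timestamp.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins; these are not Cassandra's real classes.
final class FlushSkipSketch {
    /** A memtable row: column timestamps plus an optional row-level tombstone. */
    static final class Row {
        final Map<String, Long> columnTimestamps = new LinkedHashMap<>();
        long rowDeletedAt = Long.MIN_VALUE; // MIN_VALUE means "no row-level tombstone"

        Row put(String column, long timestamp) {
            columnTimestamps.put(column, timestamp);
            return this;
        }

        Row delete(long timestamp) {
            rowDeletedAt = Math.max(rowDeletedAt, timestamp);
            return this;
        }

        /** True when a row tombstone exists and shadows every column in the row. */
        boolean isFullyShadowed() {
            if (rowDeletedAt == Long.MIN_VALUE) return false;
            for (long ts : columnTimestamps.values()) {
                if (ts > rowDeletedAt) return false; // a newer column survives the delete
            }
            return true;
        }
    }

    /** Returns the keys that would still be written to an sstable on flush. */
    static List<String> keysToFlush(Map<String, Row> memtable) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, Row> e : memtable.entrySet()) {
            Row row = e.getValue();
            // Data plus a covering row tombstone: writing it is an expensive no-op.
            // A tombstone with no data is kept, since the data may already be on disk.
            if (!row.columnTimestamps.isEmpty() && row.isFullyShadowed()) continue;
            keys.add(e.getKey());
        }
        return keys;
    }
}
```

Only the choice of which rows to write changes here; everything after the
selection (writing the sstable, notifying the commitlog) would run as before.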
> optimize memtable deletions for batchlog
> ----------------------------------------
>
> Key: CASSANDRA-4667
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4667
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Aleksey Yeschenko
> Assignee: Aleksey Yeschenko
> Attachments: CASSANDRA-4667-v1.1.patch, CASSANDRA-4667-v2.patch
>
>
> Batchlog writes with the same key are never retried. This means that if a
> batchlog row is in the memtable, it can't be in any of the sstables, ever. In
> such cases we don't need to write a tombstone to disk. We can purge the row
> completely from the memtable and only write a tombstone if the row had been
> flushed already (if it's not in the memtable then it must be in one of the
> sstables).
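For comparison, the delete-time approach from the issue description could be
sketched roughly as follows. MemtablePurgeSketch and its methods are
hypothetical names used only for illustration, not taken from either attached
patch. The sketch relies on the stated assumption that a batchlog key is
written once and never retried.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not Cassandra's real memtable.
final class MemtablePurgeSketch {
    static final Object TOMBSTONE = new Object();
    final Map<String, Object> memtable = new HashMap<>();

    void write(String key, Object batch) {
        memtable.put(key, batch);
    }

    /** Deletes a batchlog row; returns true if a tombstone had to be recorded. */
    boolean delete(String key) {
        Object current = memtable.get(key);
        if (current != null && current != TOMBSTONE) {
            // Row is still in the memtable, so (keys are never rewritten)
            // it cannot be in any sstable: drop it without a tombstone.
            memtable.remove(key);
            return false;
        }
        // Row was already flushed: a tombstone is needed to shadow it on disk.
        memtable.put(key, TOMBSTONE);
        return true;
    }

    /** Stand-in for a flush: pretend the contents went to an sstable. */
    void flush() {
        memtable.clear();
    }
}
```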