[jira] [Issue Comment Edited] (CASSANDRA-2735) Timestamp Based Compaction Strategy

Yang Yang (JIRA) Fri, 17 Jun 2011 14:54:01 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051345#comment-13051345
 ]


Yang Yang edited comment on CASSANDRA-2735 at 6/17/11 9:52 PM:
---------------------------------------------------------------

There could be a problem with relying on a forced compaction order to 
make counter expiration work:

If you base the intended order on the max timestamp of each sstable, that 
timestamp is not trustworthy: a single malicious client request can set its 
timestamp far in the future and arbitrarily change the compaction order, thus 
rendering the approach in 2735 useless.
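
To make the concern concrete, here is a minimal sketch of ordering sstables by 
max timestamp. It is a toy model, not Cassandra code: the SSTable class, the 
compaction_order function and the timestamp values are all hypothetical names 
and numbers chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SSTable:
    """Toy stand-in for an sstable (hypothetical, not the Cassandra class)."""
    name: str
    timestamps: List[int]  # client-supplied write timestamps

    @property
    def max_timestamp(self) -> int:
        return max(self.timestamps)


def compaction_order(sstables: List[SSTable]) -> List[str]:
    """Order sstables by their max timestamp, oldest first."""
    return [s.name for s in sorted(sstables, key=lambda s: s.max_timestamp)]


a = SSTable("a", [100, 110, 120])
b = SSTable("b", [200, 210, 220])
c = SSTable("c", [300, 310, 320])

# With honest client timestamps the order follows write time.
honest_order = compaction_order([a, b, c])

# A single write carrying a far-future timestamp lands in the oldest sstable.
a_poisoned = SSTable("a", a.timestamps + [10**9])
poisoned_order = compaction_order([a_poisoned, b, c])

print(honest_order)
print(poisoned_order)
```

In this model one request is enough to move the oldest sstable to the end of 
the order, even though every other write in it is unchanged.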

You can't base the order on the physical sstable flush time either, since 
different nodes flush at different times.

Overall, I think trying to fix the compaction order is the wrong direction 
for attacking this problem. The issue is the changing order between 
*individual* counter adds/deletes (auto-expire is the same as a delete). This 
order can differ from one counter to another, so you have to fix the order 
of the updates within each counter, not the order between *ensembles of 
counters*. Such ensembles do not guarantee any order at all, due to randomness 
in flush time or message delivery (the two have similar effects).

The problem with the current counter+delete implementation is that counters use 
timestamp() to represent their order, but when they are merged they lose their 
*individual order* and retain only a max timestamp(), which supposedly 
represents the order of the ensemble. This is meaningless, because the order 
of the ensemble is different from the true order.
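
The loss of individual order can be sketched with a simplified model. The 
assumptions here are mine and are not taken from the Cassandra source: merging 
two adds sums the deltas and keeps the max timestamp, and a delete beats an add 
only when its timestamp is at least as large. Add, Delete, merge, value and 
replay are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Add:
    delta: int
    ts: int


@dataclass(frozen=True)
class Delete:
    ts: int


Cell = Union[Add, Delete]


def merge(x: Cell, y: Cell) -> Cell:
    """Simplified reconcile: an assumption for illustration only."""
    if isinstance(x, Add) and isinstance(y, Add):
        # Merged adds keep only the max timestamp; individual order is lost.
        return Add(x.delta + y.delta, max(x.ts, y.ts))
    if isinstance(x, Delete) and isinstance(y, Delete):
        return x if x.ts >= y.ts else y
    add, delete = (x, y) if isinstance(x, Add) else (y, x)
    return delete if delete.ts >= add.ts else add


def value(cell: Cell) -> int:
    return cell.delta if isinstance(cell, Add) else 0


def replay(updates: List[Cell]) -> int:
    """Apply every update individually, in timestamp order (the true order)."""
    total = 0
    for u in sorted(updates, key=lambda u: u.ts):
        total = total + u.delta if isinstance(u, Add) else 0
    return total


add_early, delete_mid, add_late = Add(1, 10), Delete(20), Add(1, 30)

true_value = replay([add_early, delete_mid, add_late])

# The two adds meet first (e.g. same sstable), then the delete arrives.
adds_first = value(merge(merge(add_early, add_late), delete_mid))

# The early add meets the delete first, then the late add arrives.
delete_first = value(merge(merge(add_early, delete_mid), add_late))

print(true_value, adds_first, delete_first)
```

Under these assumptions the result depends on which updates happen to be 
merged first: once the two adds are merged, the delete can no longer remove 
only the earlier one.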



      was (Author: yangyangyyy):
    there could be a problem with trying to rely on forcing compaction order to 
make counter expiration work:

if you base the intended order on max timestamp of each sstable, the timestamp 
is not trustworthy, because a single malicious client request can bump up its 
timestamp to the future, and arbitrarily change the order of compaction, thus 
rendering the approach in 2735 useless.

you can't base the order on the physical sstable flush time either, since 
different nodes have different flush times.

overall I think trying to fix the compaction order is not the correct direction 
to attack this problem: the issue here is due to the changing order between 
*individual* counter adds/deletes (auto-expire is same as delete), this order 
can be different between different counters, so you have to fix the order 
between the updates within each counter, not the order between *ensembles of 
counters*. such ensembles of counters do not guarantee any orders at all, due 
to randomness in flushing time, or message delivery (they have similar effects)

  
> Timestamp Based Compaction Strategy
> -----------------------------------
>
>                 Key: CASSANDRA-2735
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2735
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Alan Liang
>            Assignee: Alan Liang
>            Priority: Minor
>              Labels: compaction
>         Attachments: 0004-timestamp-bucketed-compaction-strategy.patch
>
>
> Compaction strategy implementation based on max timestamp ordering of the 
> sstables while satisfying max sstable size, min and max compaction 
> thresholds. It also handles expiration of sstables based on a timestamp.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
