[ 
https://issues.apache.org/jira/browse/CASSANDRA-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226463#comment-14226463
 ] 

Björn Hegerfors commented on CASSANDRA-8360:
--------------------------------------------

OK, sounds fair. That essentially means that we want to treat the "incoming 
window" specially. A question worth asking is what we want the incoming window 
for. Currently it is "keep the last unit of base_time_seconds compacted at all 
times". While it respects min_threshold, a value written early in the window 
will essentially be constantly recompacted once every (min_threshold - 1) 
subsequent sstable flushes. I'm fully aware that this might be a bad idea, or 
rather I wasn't sure if it was the right thing to do. Really, it's completely 
inspired by STCS's min_sstable_size, which seems to do the same thing, i.e. not
respect the logarithmic, tree-like merging for small enough SSTables.
(Reminds me a bit of insertion sort being fastest on small enough arrays.) So
base_time_seconds has the same purpose. A problem is that it might be harder
to set a good default for a time than for a size.
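
To put a number on the recompaction described above, here is a minimal
simulation. It is only a sketch under simplifying assumptions: every flush
lands in the same incoming window, a compaction fires as soon as min_threshold
SSTables are present, and it merges all of them into one. The function name is
made up for illustration and is not part of Cassandra.

```python
def rewrites_of_first_value(flushes, min_threshold):
    """Count how often data from the first flush is rewritten by compaction."""
    sstables_in_window = 0
    rewrites = 0
    for _ in range(flushes):
        sstables_in_window += 1          # a new flush lands in the incoming window
        if sstables_in_window >= min_threshold:
            sstables_in_window = 1       # everything is merged into one SSTable
            rewrites += 1                # so the earliest value is written again
    return rewrites


if __name__ == "__main__":
    for flushes in (4, 7, 10, 13):
        print(flushes, rewrites_of_first_value(flushes, min_threshold=4))
```

After the first compaction, the merged SSTable only needs (min_threshold - 1)
new neighbours to trigger the next one, which is where the "once every
(min_threshold - 1) flushes" comes from.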

Setting min_sstable_size in STCS to 0 has a near-equivalent in DTCS: setting
base_time_seconds to 1. The window sizes will then be powers of min_threshold
(with up to min_threshold windows of each size), starting at 1 second. Even with
this setting, data that is an hour old will be in windows on the order of an
hour long. The only meaningful difference is that SSTables 2 seconds and 10
seconds old will not be in the same window. What I mean by this is that setting
base_time_seconds to 1 is perfectly reasonable; it's just the same as setting
min_sstable_size to 0 or 1 in STCS. I just want to make it clear that
base_time_seconds is not really something that you should set to 1 hour (3600)
just because you want SSTables older than 1 hour to be in nice 1-hour chunks.
If you set it to 900 with min_threshold=4, SSTables older than 1 hour will
still be in perfect 1-hour chunks (because the up to four 900-second chunks are
followed by a 4*900=3600-second chunk).
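
The window layout can be sketched in a few lines. This is a simplified model
of the windowing as I understand it, not the actual Cassandra code: windows
are aligned to multiples of their size, and a window of the next size (a
factor of min_threshold larger) starts once the current position is aligned
with it. The function name and the example value of `now` are made up for
illustration.

```python
def dtcs_windows(now, base_time_seconds, min_threshold, count):
    """Return `count` (start, size) windows in seconds, newest first."""
    size = base_time_seconds
    pos = now // size                    # index of the window containing `now`
    windows = []
    for _ in range(count):
        windows.append((pos * size, size))
        if pos % min_threshold > 0:
            pos -= 1                     # one more window of the same size
        else:
            size *= min_threshold        # step up to the next window size
            pos = pos // min_threshold - 1
    return windows


if __name__ == "__main__":
    # `now` is an arbitrary example timestamp in seconds.
    for start, size in dtcs_windows(1_000_000, 900, 4, 8):
        print(f"start={start:>8}  size={size:>6}s")
```

In this model, base_time_seconds=900 with min_threshold=4 gives at most four
900-second windows before the 3600-second ones begin, and
base_time_seconds=1 gives the same kind of layout starting from 1-second
windows.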

So I guess respecting min_threshold in the 'incoming window' is just as right 
as respecting min_threshold when compacting SSTables smaller than 
min_sstable_size in STCS. Which I believe it does. So there's my roundabout way 
of coming to the same conclusion as you, [~jbellis] :). I just have this 
feeling that the meaning of base_time_seconds isn't well understood.
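
To make the conclusion concrete, here is a rough sketch of the selection rule:
respect min_threshold only in the incoming window, compact any older window
that holds at least 2 SSTables, and cap every compaction at max_threshold. The
function name, the dict-of-lists input and the default of 32 for max_threshold
are assumptions for illustration, not the DTCS implementation.

```python
def pick_buckets(buckets, incoming, min_threshold=4, max_threshold=32):
    """Choose which windows to compact.

    buckets maps a window id to the list of SSTables in that window;
    incoming is the id of the window that new flushes land in.
    """
    picked = {}
    for window, sstables in buckets.items():
        # Only the incoming window waits for min_threshold SSTables.
        needed = min_threshold if window == incoming else 2
        if len(sstables) >= needed:
            picked[window] = sstables[:max_threshold]
    return picked


if __name__ == "__main__":
    buckets = {"incoming": ["a", "b", "c"], "old": ["d", "e"], "older": ["f"]}
    print(pick_buckets(buckets, incoming="incoming"))
```

With min_threshold=4, the three SSTables in the incoming window are left
alone, while the two in the "old" window are picked for compaction.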

> In DTCS, always compact SSTables in the same time window, even if they are 
> fewer than min_threshold
> ---------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8360
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8360
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Björn Hegerfors
>            Priority: Minor
>
> DTCS uses min_threshold to decide how many time windows of the same size need
> to accumulate before merging into a larger window. The age of an SSTable
> is determined as its min timestamp, and it always falls into exactly one of 
> the time windows. If multiple SSTables fall into the same window, DTCS 
> considers compacting them, but if they are fewer than min_threshold, it 
> decides not to do it.
> When do more than 1 but fewer than min_threshold SSTables end up in the same 
> time window (except for the current window), you might ask? In the current 
> state, DTCS can spill some extra SSTables into bigger windows when the 
> previous window wasn't fully compacted, which happens all the time when the 
> latest window stops being the current one. Also, repairs and hints can put 
> new SSTables in old windows.
> I think, and [~jjordan] agreed in a comment on CASSANDRA-6602, that DTCS 
> should ignore min_threshold and compact tables in the same windows regardless 
> of how few they are. I guess max_threshold should still be respected.
> [~jjordan] suggested that this should apply to all windows but the current 
> window, where all the new SSTables end up. That could make sense. I'm not 
> clear on whether compacting many SSTables at once is more cost efficient or 
> not, when it comes to the very newest and smallest SSTables. Maybe compacting 
> as soon as 2 SSTables are seen is fine if the initial window size is small 
> enough? I guess the opposite could be the case too; that the very newest 
> SSTables should be compacted very many at a time?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
