[
https://issues.apache.org/jira/browse/CASSANDRA-10280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900985#comment-14900985
]
Björn Hegerfors edited comment on CASSANDRA-10280 at 9/21/15 5:06 PM:
----------------------------------------------------------------------
Yes, I'm absolutely in favor of expressing this in terms of max window size
instead of max SSTable age. And it's also become more and more clear to me that
rather than never compacting SSTables that are too old, we should just keep
fixed size windows around, so that if SSTables come in there (bootstrap,
repairs), compaction will happen.
I haven't looked at the patch, but is there a clear way to express maximum
window size? If base_time_seconds=1 do you then say something like
max_window_seconds=10? And in that case, will the largest windows be 4 or 16? I
guess only 4 would make sense with that name...
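
To make the 4-versus-16 question concrete, here is a minimal sketch. It assumes
that windows grow by a factor of 4 at each step (matching the 4^5 arithmetic
used elsewhere in this comment) and that the cap is a hard upper bound. The
function name largest_window is hypothetical, and max_window_seconds is only
the option name floated above, not necessarily what the patch uses.

```python
def largest_window(base_time_seconds: int, max_window_seconds: int, factor: int = 4) -> int:
    """Return the largest window size of the form base * factor**k
    that does not exceed max_window_seconds.

    Assumption: windows grow by `factor` each time they are merged,
    and the cap is never exceeded.
    """
    if base_time_seconds <= 0 or factor < 2:
        raise ValueError("base_time_seconds must be positive and factor at least 2")
    if max_window_seconds < base_time_seconds:
        raise ValueError("max_window_seconds must be at least base_time_seconds")

    size = base_time_seconds
    # Keep growing while the next window size still fits under the cap.
    while size * factor <= max_window_seconds:
        size *= factor
    return size


print(largest_window(1, 10))  # 4: the next size, 16, would exceed the cap of 10
```

Under that reading, a cap of 10 gives windows of at most 4, and the cap has to
reach 16 before the next window size is allowed.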
I've suggested before declaring how many times a window will be coalesced. But
that might sound really complicated to users. What I mean is a setting like
"window_coalitions" or "write_amplification" which you can set to 5 in order to
get a maximum window size of 4^5=1024 times the base window. But let's go with
whatever is easiest to understand.
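
For illustration, the arithmetic behind that setting can be sketched as below.
This assumes each coalescing step merges 4 windows into one, as in the 4^5
example; the function name is hypothetical and is not taken from the patch.

```python
def max_window_size(base_time_seconds: int, coalescences: int, factor: int = 4) -> int:
    """Return the largest window size reached after `coalescences` merges.

    Assumption: every merge combines `factor` windows of equal size,
    so the size is multiplied by `factor` each time.
    """
    if base_time_seconds <= 0 or factor < 2 or coalescences < 0:
        raise ValueError("invalid arguments")
    return base_time_seconds * factor ** coalescences


print(max_window_size(1, 5))  # 1024 times the base window
```

With a base window of one hour, for example, 5 coalescences would give a
largest window of 1024 hours, or a little under 43 days.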
EDIT: never mind, I looked at the patch, and it's done exactly how I would have
done it. So +1.
> Make DTCS work well with old data
> ---------------------------------
>
> Key: CASSANDRA-10280
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10280
> Project: Cassandra
> Issue Type: Sub-task
> Components: Core
> Reporter: Marcus Eriksson
> Assignee: Marcus Eriksson
> Fix For: 3.x, 2.1.x, 2.2.x
>
>
> Operational tasks become incredibly expensive if you keep around a long
> timespan of data with DTCS - with default settings and 1 year of data, the
> oldest window covers about 180 days. Bootstrapping a node with vnodes with
> this data layout will force cassandra to compact very many sstables in this
> window.
> We should probably put a cap on how big the biggest windows can get. We could
> probably default this to something sane based on max_sstable_age (ie, say we
> can reasonably handle 1000 sstables per node, then we can calculate how big
> the windows should be to allow that)