[
https://issues.apache.org/jira/browse/CASSANDRA-10306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950373#comment-14950373
]
Antti Nissinen commented on CASSANDRA-10306:
--------------------------------------------
Hi, I am fine with closing. The main thing is that the splitting idea is
progressing :-)
While you are considering the code for splitting SSTables, would you please
also consider the possibility of manually inactivating and deleting /
archiving SSTables to release disk space quickly.
> Splitting SSTables in time, deleting and archiving SSTables
> -----------------------------------------------------------
>
> Key: CASSANDRA-10306
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10306
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Antti Nissinen
>
> This document is a continuation of
> [CASSANDRA-10195|https://issues.apache.org/jira/browse/CASSANDRA-10195] and
> describes the need to be able to split SSTables along time boundaries, as
> also discussed in
> [CASSANDRA-8361|https://issues.apache.org/jira/browse/CASSANDRA-8361]. The
> data model is explained briefly, followed by the practical issues of running
> Cassandra with time series data and the needs for splitting capabilities.
> Data model: (snippet from
> [CASSANDRA-9644|https://issues.apache.org/jira/browse/CASSANDRA-9644])
> The data is time series data. Data is saved so that one row contains a
> certain time span of data for a given metric (20 days in this case). The row
> key contains the start time of the time span and the metric name. The column
> name gives the offset from the beginning of the time span. The column
> timestamp is set to the actual timestamp of the data point, i.e. the
> timestamp from the row key plus the offset. The data model is analogous to
> the KairosDB implementation.
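> A rough CQL sketch of this layout is below; the table and column names are
> illustrative only, not the actual schema (the original is a Thrift-style
> column family as in KairosDB):
> {code:sql}
> -- Illustrative approximation of the KairosDB-style layout: the partition
> -- key carries the metric name and the start of the 20-day time span, and
> -- the clustering column is the offset of the data point into that span.
> CREATE TABLE metrics (
>     metric_name text,
>     span_start  timestamp,  -- start of the 20-day time span
>     offset_ms   bigint,     -- offset from span_start in milliseconds
>     value       double,
>     PRIMARY KEY ((metric_name, span_start), offset_ms)
> );
> {code}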
> In the practical application, data is added in real time to the column
> family. When converting from a legacy system, old data is pre-loaded in
> chronological order by faking the column timestamps before the real-time
> data collection is started. However, there is intermittently a need to
> insert older data into the database as well, either because it has not been
> available in real time or because additional time series are fed in
> afterwards due to unforeseen needs. A sketch of such back-loading is shown
> below.
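> A minimal sketch of back-loading one historical point, assuming the
> illustrative table above; USING TIMESTAMP fakes the write timestamp
> (microseconds since the epoch) to match the data point instead of the wall
> clock:
> {code:sql}
> INSERT INTO metrics (metric_name, span_start, offset_ms, value)
> VALUES ('temp_sensor_1', '2014-01-01', 7200000, 17.2)
> USING TIMESTAMP 1388541600000000;  -- historical write time, not "now"
> {code}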
> Adding old data simultaneously with real-time data will lead to SSTables
> containing data from a time period exceeding the length of the compaction
> window (TWCS and DTCS). Therefore the SSTables do not behave in a
> predictable manner in the compaction process.
> Tombstones mask the data from queries, but releasing the disk space
> requires that the SSTables containing the tombstones be compacted together
> with the SSTables holding the original data. When using TWCS or DTCS and
> writing tombstones with timestamps corresponding to the current time, the
> SSTables containing the original data will not end up being compacted with
> the SSTables holding the tombstones. Even when writing tombstones with
> faked timestamps, the SSTable should be written apart from the ongoing
> real-time data; otherwise the SSTables have to be split (see later).
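> A hedged sketch of writing such a tombstone with a faked timestamp, again
> against the illustrative table above; the tombstone's write time is placed
> just above the original data's write time so it sorts into the same time
> window:
> {code:sql}
> DELETE FROM metrics
> USING TIMESTAMP 1388541600000001  -- just above the data's write timestamp
> WHERE metric_name = 'temp_sensor_1' AND span_start = '2014-01-01';
> {code}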
> TTL is a working method for deleting data from a column family and
> releasing disk space in a predictable manner. However, setting the correct
> TTL is not a trivial task. The required TTL might change, e.g. due to
> legislation, or the customer might want a longer lifetime for the data.
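> For illustration, a TTL can be set per write or as a table default (values
> in seconds; 7776000, i.e. 90 days, is picked arbitrarily here):
> {code:sql}
> INSERT INTO metrics (metric_name, span_start, offset_ms, value)
> VALUES ('temp_sensor_1', '2015-09-01', 123456, 21.5)
> USING TTL 7776000;
>
> -- A table-wide default only affects new writes, which is part of why
> -- choosing the "correct" TTL up front is hard.
> ALTER TABLE metrics WITH default_time_to_live = 7776000;
> {code}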
> The other factor affecting disk space consumption is the variability in the
> rate at which data is fed to the column family. In certain troubleshooting
> cases the sample rate can be increased tenfold for a large portion of the
> collected time series. This leads to rapid consumption of disk space, and
> old data has to be deleted / archived in such a manner that the disk space
> is released quickly and predictably.
> Losing one or more nodes from the cluster without spare hardware will also
> lead to a situation where the data from the lost node has to be replicated
> again onto the remaining nodes. This increases the disk space consumption
> per node and probably requires cleaning some older data out of the active
> column family.
> All of the above issues could of course be handled simply by adding more
> disk space or nodes to the cluster. In a cloud environment that would be a
> feasible option. For an application sitting on real hardware in an isolated
> environment it is not, for practical reasons or due to costs. Getting new
> hardware on site might take a long time, e.g. due to customs regulations.
> In the application domain (time series data collection) the data is not
> modified after being inserted into the column family. There are only read
> operations and deletion / archiving of old data, based on the TTL or on
> operator actions.
> The above reasoning leads to the following conclusions and proposals.
> * TWCS and DTCS (with certain modifications) lead to well-structured sets
> of SSTables, organized along time, that give opportunities to manage the
> available disk capacity on the nodes (a sketch of the corresponding table
> options follows this list). Recovering from repairs also works (compacting
> the flood of small SSTables with the larger ones).
> * Being able to effectively split the SSTables along a given timeline would
> lead to SSTable sets on all nodes that would allow deleting or archiving
> whole SSTables. What would be the mechanism to inactivate SSTables during
> deletion / archiving so that nodes don’t start streaming the “missing” data
> between nodes (repairs)?
> * Being able to split existing SSTables along the multiple time window
> boundaries determined by TWCS would allow inserting older data into the
> column family in such a way that it is eventually compacted, in the desired
> manner, into the correct time window. The original SSTable would be
> streamed into several SSTables according to the time windows, and in the
> end the empty SSTables would be discarded.
> * The splitting action would be a tool executed through the nodetool
> command when needed.
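> For reference, a hedged sketch of the table options behind the first
> bullet, using the option names of the TimeWindowCompactionStrategy
> (one-day windows chosen arbitrarily; treat the exact names and values as
> illustrative):
> {code:sql}
> ALTER TABLE metrics WITH compaction = {
>     'class': 'TimeWindowCompactionStrategy',
>     'compaction_window_unit': 'DAYS',
>     'compaction_window_size': '1'
> };
> {code}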
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)