Jonathan Shook created CASSANDRA-10419:
------------------------------------------
Summary: Make JBOD compaction and flushing more robust
Key: CASSANDRA-10419
URL: https://issues.apache.org/jira/browse/CASSANDRA-10419
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Jonathan Shook
Attachments: timeseries-study-overview-jbods.png
With JBOD and several smaller disks, such as SSDs of 1.2 TB or less, it is
possible to run out of space prematurely. With a sufficient ingestion rate, the
disk selection logic seems to overselect certain JBOD targets. This causes a
premature C* shutdown while a significant amount of space is still free. With
DTCS, for example, it should be possible to utilize over 90% of the available
space with certain settings. However, in the scenario I tested, only about 50%
was utilized before a filesystem-full error (see below). This is likely a
scheduling challenge between high ingest rates and smaller data directories. It
would be good to use an anticipatory model, if possible, to select compaction
targets more carefully according to fill rates.
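As a rough illustration of what "anticipatory" selection could mean, here is a
minimal sketch. It is not Cassandra's actual disk selection code. The names
DataDir, projected_free, pick_target, and the horizon_s parameter are all
hypothetical, and the model simply assumes each directory keeps filling at its
recently observed rate for the duration of the write.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DataDir:
    """Hypothetical view of one JBOD data directory."""
    name: str
    free_bytes: int        # free space right now
    fill_rate_bps: float   # recently observed fill rate, bytes per second


def projected_free(d: DataDir, horizon_s: float) -> float:
    """Free space expected after horizon_s seconds at the current fill rate."""
    return d.free_bytes - d.fill_rate_bps * horizon_s


def pick_target(dirs: Sequence[DataDir], write_bytes: int,
                horizon_s: float) -> Optional[DataDir]:
    """Pick the directory with the most projected free space that can still
    hold write_bytes at the end of the horizon, or None if none can."""
    candidates = [d for d in dirs
                  if projected_free(d, horizon_s) >= write_bytes]
    if not candidates:
        return None
    return max(candidates, key=lambda d: projected_free(d, horizon_s))


if __name__ == "__main__":
    dirs = [DataDir("data1", 500, 10.0), DataDir("data2", 400, 0.0)]
    # data1 has more free space now, but it is filling quickly.
    print(pick_target(dirs, write_bytes=350, horizon_s=20.0))
```

With a horizon of zero this reduces to "most free space now"; a longer horizon
steers large compaction outputs away from directories that are already
absorbing a lot of flush traffic.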
The attached image shows a test with 12 1.2TB JBOD data directories. At the
end, the utilizations are:
59GiB, 83GiB, 83GiB, 97GiB, 330GiB, 589GiB, 604GiB, 630GiB, 697GiB, 1.055TiB,
1.083TiB, 1.092TiB
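The "about 50%" figure can be cross-checked from these per-directory numbers.
The sketch below assumes that each directory has a nominal capacity of 1.2 TB
(decimal), that the last two values are 1.083 TiB and 1.092 TiB, and that
filesystem overhead can be ignored. Under those assumptions the overall
utilization comes out a little under 50%.

```python
GIB = 2**30   # bytes per GiB
TIB = 2**40   # bytes per TiB
TB = 10**12   # bytes per (decimal) TB

# Per-directory usage at the end of the test, as listed above.
used_gib = [59, 83, 83, 97, 330, 589, 604, 630, 697]
used_tib = [1.055, 1.083, 1.092]   # assumption: last two values are TiB

used_bytes = sum(g * GIB for g in used_gib) + sum(t * TIB for t in used_tib)

# Assumption: 12 directories of nominal 1.2 TB each, no filesystem overhead.
capacity_bytes = 12 * 1.2 * TB

utilization = used_bytes / capacity_bytes
print(f"overall utilization: {utilization:.1%}")
```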
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)