Jeremy Hanna commented on CASSANDRA-14499:

I still have a concern about including something in the codebase that shuts 
down node operations automatically, even if it's opt-in.  Because nodes 
normally hold roughly the same amount of data, they are likely to reach the 
quota at around the same time, so cascading failure becomes a fairly likely 
scenario when this is enabled.  That leads me to wonder when this would be 
useful.

One use case where we see this as valuable is QA/perf/test clusters that may 
not have the full monitoring setup but need to be protected from errant clients 
filling up disks to a point where worse things happen.

So is the idea that in those QA/perf/test clusters there is little access to 
the machine, the VM, or the OS, but there *is* access to Cassandra, and this 
feature uses that access to keep an errant client from causing problems (such 
as a full volume) that can only be fixed by getting access to the machine or 
contacting the people who have it?

Is this only useful in QA/perf/test clusters, where cascading failure of the 
cluster isn't the end of the world?

I'm just concerned that while a very mature user is going to use this 
appropriately, others will inadvertently misuse the feature.  If this gets 
into the codebase, I would want to make extra sure that people are aware of 
both the intended use cases and, especially, the risks of cascading failure.  
That said, introducing something that can trigger cascading failure 
*automatically*, for the sake of test environments, seems unwise.

I'm happy to be wrong about the probability of cascading failure or the 
expected use cases, but please help me understand.

> node-level disk quota
> ---------------------
>                 Key: CASSANDRA-14499
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14499
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Jordan West
>            Assignee: Jordan West
>            Priority: Major
>
> Operators should be able to specify, via YAML, the amount of usable disk 
> space on a node as a percentage of the total available or as an absolute 
> value. If both are specified, the absolute value should take precedence. This 
> allows operators to reserve space available to the database for background 
> tasks -- primarily compaction. When a node reaches its quota, gossip should 
> be disabled to prevent it taking further writes (which would increase the 
> amount of data stored), being involved in reads (which are likely to be more 
> inconsistent over time), or participating in repair (which may increase the 
> amount of space used on the machine). The node re-enables gossip when the 
> amount of data it stores is below the quota.   
> The proposed option differs from {{min_free_space_per_drive_in_mb}}, which 
> reserves some amount of space on each drive that is not usable by the 
> database.  
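
For readers following along, here is a rough sketch of the behavior the 
description above proposes: resolving the quota from a percentage or an 
absolute value (absolute wins if both are set) and deciding whether gossip 
should be on. It is only an illustration. The class, field, and method names 
are hypothetical and do not come from the patch or from Cassandra's actual 
configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskQuota:
    """Hypothetical model of the proposed quota; not Cassandra's real config."""
    total_bytes: int                       # total space available on the node
    percent: Optional[float] = None        # quota as a percentage of total_bytes
    absolute_bytes: Optional[int] = None   # quota as an absolute value

    def limit_bytes(self) -> Optional[int]:
        # If both are specified, the absolute value takes precedence.
        if self.absolute_bytes is not None:
            return self.absolute_bytes
        if self.percent is not None:
            return int(self.total_bytes * self.percent / 100)
        return None  # no quota configured

    def gossip_enabled(self, used_bytes: int) -> bool:
        # Gossip is disabled once the node reaches its quota and is
        # re-enabled when the stored data is below the quota again.
        limit = self.limit_bytes()
        return limit is None or used_bytes < limit
```

Note that if every node holds about the same amount of data and uses the same 
quota, `gossip_enabled` would flip to false on all of them at about the same 
time, which is the cascading failure concern raised in the comment.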
