Re: Out of disk handler proposal

2023-06-29 Thread Nick Vatamaniuc
Looks great, and a nice way to use the Erlang diskmon functionality.

Would it make sense to allow writes which delete data, so that users
can recover some space? Technically such a write could be a "deleted"
update with a large body, but in most cases users would not do that and
would actually delete data.

Another interesting thing is what happens at the coordinator level.
Currently the patch behavior (I think?) is that if at least one worker
(node) responds with "insufficient_storage", the whole request stops
and returns a 507. That's probably the simpler approach. Would there
be a case where the disk sizes on nodes are not equal, and some nodes
become full while others have more space? In that case we could
treat "insufficient_storage" errors like maintenance mode or
"rexi_EXIT" errors. However, that might hide the problem at the API
level, as we'd still return 202 for doc updates. (But hopefully
diskmon would emit emergency or critical level logs at least.)

Another, tangentially related error condition is when file systems
are remounted read-only on error. In that case disk writes return
"erofs" errors. We had an issue to improve the behavior in that case,
treating them as a maintenance mode / rexi_EXIT error:
https://github.com/apache/couchdb/issues/4168
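
To make the two coordinator options concrete, here is a rough Python
sketch. It is not the fabric code from the patch: the function names,
the reply strings and the simple majority quorum are all assumptions
made for illustration only.

```python
# Hypothetical model of a coordinator that has collected one reply per
# shard copy. Names and reply values are illustrative, not CouchDB APIs.

def _status_from_ok_count(ok, n):
    """Map the number of successful replies to an HTTP status,
    assuming a simple majority quorum."""
    quorum = n // 2 + 1
    if ok >= quorum:
        return 201
    if ok > 0:
        return 202
    return 500


def strict_coordinator(replies):
    """Option 1: any "insufficient_storage" reply fails the whole
    request with a 507."""
    if "insufficient_storage" in replies:
        return 507
    return _status_from_ok_count(replies.count("ok"), len(replies))


def lenient_coordinator(replies):
    """Option 2: treat "insufficient_storage" like a node in
    maintenance mode, i.e. as a missing reply. Only return 507 when
    no copy at all accepted the write."""
    ok = replies.count("ok")
    if ok == 0 and "insufficient_storage" in replies:
        return 507
    return _status_from_ok_count(ok, len(replies))
```

With one full node out of three, the strict variant returns 507 while
the lenient one still reports success, which is exactly the "hidden
problem" trade-off mentioned above.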

What if the disk does get full (for example, because the internal
replicator catches up or brings in data from other nodes)? Would it
make sense to also handle "enospc" errors similarly to
"insufficient_storage" errors?

In diskmon, is there a way to raise a warning before we hit
"insufficient_storage", and at least emit something in the logs and/or
have a _node/_local/alarms API to let users check if there are any
warnings or alarms set? That would give users a chance to manage their
disk capacity before API requests start failing with 507. I don't
think there is anything in the current API that gives them an
indication of how close they are to filling the disk.
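
As an illustration of the "warn first, fail later" idea, here is a
minimal Python sketch. It only models the threshold logic and is not
how diskmon or the patch works. The function names and the 75%
warning level are hypothetical; the 90% alarm level mirrors the
default suggested in the proposal.

```python
import shutil


def disk_used_percent(path):
    """Return the used percentage of the file system holding `path`."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def classify(used_percent, warn_at=75, alarm_at=90):
    """Classify disk occupancy. `warn_at` is an assumed early-warning
    level; `alarm_at` is where requests would start failing with 507."""
    if used_percent >= alarm_at:
        return "alarm"
    if used_percent >= warn_at:
        return "warning"
    return "ok"
```

A periodic check could log the result of
classify(disk_used_percent(database_dir)) and expose it through an
alarms endpoint, so users see "warning" well before "alarm".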

Cheers,
-Nick

On Thu, Jun 29, 2023 at 10:09 AM Robert Newson wrote:
>
> Hi All,
>
> out of disk handler
>
> I propose to enhance CouchDB to monitor disk occupancy and react 
> automatically as free space becomes scarce. I've written a working prototype 
> at: https://github.com/apache/couchdb/compare/main...out-of-disk-handler
>
> The `diskmon` application is part of Erlang/OTP and I suggest we use that as 
> the base, since it supports all the platforms we support (and a few more).
>
> The patch reacts differently depending on whether it is database_dir or 
> view_index_dir that runs out of space (of course they might both run out of 
> space at the same time in the common case that the same device is used for 
> both), namely;
>
> 1) Clustered database updates are prohibited (a 507 Insufficient Storage 
> error is returned)
> 2) Background indexing is suspended (no new jobs will be started)
> 3) Querying a stale view is prohibited (a 507 Insufficient Storage error is 
> returned)
> 4) Querying an up-to-date view is permitted
>
> The goal being to leave internal replication running (to avoid data loss) and 
> compaction (as the only action that reduces disk occupancy). I can see adding 
> an option to suspend _all_ writing at, say, 99% full, in order to avoid 
> hitting actual end of disk, but have not coded this up in the branch so far.
>
> At the moment these all activate at once, which I think is not how we want to 
> do this.
>
> I suggest that we have configuration options for;
>
> 1) a global toggle to activate the out of disk handler
> 2) a parameter for the used disk percentage of view_index_dir at which we 
> suspend background indexing, defaulting to 80
> 3) a parameter for the used disk percentage of view_index_dir at which we 
> refuse to update stale indexes, defaulting to 90
> 4) a parameter for the used disk percentage of database_dir at which we 
> suspend writes, defaulting to 90.
>
> What do we all think?
>
>
> B.


Out of disk handler proposal

2023-06-29 Thread Robert Newson
Hi All,

out of disk handler

I propose to enhance CouchDB to monitor disk occupancy and react automatically 
as free space becomes scarce. I've written a working prototype at: 
https://github.com/apache/couchdb/compare/main...out-of-disk-handler

The `diskmon` application is part of Erlang/OTP and I suggest we use that as 
the base, since it supports all the platforms we support (and a few more).

The patch reacts differently depending on whether it is database_dir or 
view_index_dir that runs out of space (of course they might both run out of 
space at the same time in the common case that the same device is used for 
both), namely;

1) Clustered database updates are prohibited (a 507 Insufficient Storage error 
is returned)
2) Background indexing is suspended (no new jobs will be started)
3) Querying a stale view is prohibited (a 507 Insufficient Storage error is 
returned)
4) Querying an up-to-date view is permitted

The goal is to leave internal replication (to avoid data loss) and
compaction (the only action that reduces disk occupancy) running. I can see
adding an option to suspend _all_ writing at, say, 99% full, in order to avoid
hitting actual end of disk, but have not coded this up in the branch so far.

At the moment these all activate at once, which I think is not how we want to 
do this.

I suggest that we have configuration options for:

1) a global toggle to activate the out of disk handler
2) a parameter for the used disk percentage of view_index_dir at which we 
suspend background indexing, defaulting to 80
3) a parameter for the used disk percentage of view_index_dir at which we 
refuse to update stale indexes, defaulting to 90
4) a parameter for the used disk percentage of database_dir at which we suspend 
writes, defaulting to 90.
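
The four options above could be modelled roughly as follows. This is a
Python sketch for discussion, not code from the branch: the field
names, the restriction labels, the default for the global toggle and
the use of ">=" at the threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class OutOfDiskConfig:
    # Hypothetical option names; only the 80/90/90 defaults come from
    # the proposal. The toggle's default is an assumption.
    enabled: bool = True
    view_index_dir_background_indexing_threshold: int = 80
    view_index_dir_stale_update_threshold: int = 90
    database_dir_write_threshold: int = 90


def active_restrictions(cfg, database_dir_used, view_index_dir_used):
    """Return the set of restrictions that apply for the given used
    disk percentages of database_dir and view_index_dir."""
    if not cfg.enabled:
        return set()
    restrictions = set()
    if view_index_dir_used >= cfg.view_index_dir_background_indexing_threshold:
        restrictions.add("suspend_background_indexing")
    if view_index_dir_used >= cfg.view_index_dir_stale_update_threshold:
        restrictions.add("refuse_stale_view_updates")
    if database_dir_used >= cfg.database_dir_write_threshold:
        restrictions.add("refuse_database_writes")
    return restrictions
```

Keeping each threshold separate means the restrictions activate one at
a time as the disk fills, rather than all at once.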

What do we all think?


B.