[jira] [Commented] (CASSANDRA-20363) Add option to set a custom FSErrorHandler

Stefan Miklosovic (Jira) Wed, 26 Feb 2025 05:33:15 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-20363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930683#comment-17930683
 ]


Stefan Miklosovic commented on CASSANDRA-20363:
-----------------------------------------------

So far it fails on two unit tests:

https://app.circleci.com/pipelines/github/instaclustr/cassandra/5509/workflows/4bbfd50f-509b-464f-b3b1-ae2d28f8fc73/jobs/348910/tests

The first is DatabaseDescriptorRefTest, which verifies that we do not leak 
anything we don't want.

I was thinking more about that. The Javadoc of DatabaseDescriptorRefTest says:

{code}
/**
 * Verifies that {@link DatabaseDescriptor#clientInitialization()} and a couple of <i>apply</i> methods
 * do not somehow lazily initialize any unwanted part of Cassandra like schema, commit log or start
 * unexpected threads.
 *
 * {@link DatabaseDescriptor#toolInitialization()} is tested via unit tests extending
 * {@link org.apache.cassandra.tools.OfflineToolUtils}.
 */
{code}

If we want to build on top of this, e.g. with a handler that starts a thread 
and does monitoring, then that thread would be started as part of DD 
initialisation, and I am not completely sure we should do that. It would just 
leak threads. I suggest that DatabaseDescriptor only resolves the class name 
to instantiate, and that the FS error handler is actually set outside of the 
DD initialisation methods.
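
A minimal sketch of that split, not the actual patch. All names here 
(FsErrorHandlerSketch, ConfigSketch, StartupSketch) are hypothetical stand-ins 
for FSErrorHandler, DatabaseDescriptor and the daemon startup code, and the 
handler interface is simplified. The config step only resolves and validates 
the class; instantiation happens in a separate step that the startup code 
calls explicitly.

```java
import java.util.concurrent.atomic.AtomicReference;

// Simplified stand-in for the FS error handler interface (assumption).
interface FsErrorHandlerSketch {
    void handleFSError(Throwable t);
}

// Stand-in for the default handler.
class DefaultHandlerSketch implements FsErrorHandlerSketch {
    @Override
    public void handleFSError(Throwable t) { /* apply the disk failure policy */ }
}

// Plays the role of DatabaseDescriptor: it remembers which class to use
// but never creates an instance and never starts a thread.
final class ConfigSketch {
    private static volatile Class<? extends FsErrorHandlerSketch> handlerClass =
            DefaultHandlerSketch.class;

    static void applyHandlerClassName(String className) {
        try {
            // initialize=false: loading the class must not run its static initializers
            Class<?> c = Class.forName(className, false, ConfigSketch.class.getClassLoader());
            handlerClass = c.asSubclass(FsErrorHandlerSketch.class);
        } catch (ClassNotFoundException | ClassCastException e) {
            throw new IllegalArgumentException("Invalid FS error handler class: " + className, e);
        }
    }

    static Class<? extends FsErrorHandlerSketch> getHandlerClass() {
        return handlerClass;
    }
}

// Plays the role of the daemon startup path, outside of DD initialisation.
final class StartupSketch {
    static final AtomicReference<FsErrorHandlerSketch> INSTALLED = new AtomicReference<>();

    static FsErrorHandlerSketch installHandler() {
        try {
            FsErrorHandlerSketch handler =
                    ConfigSketch.getHandlerClass().getDeclaredConstructor().newInstance();
            INSTALLED.set(handler);
            return handler;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate FS error handler", e);
        }
    }
}
```

With this shape, client and tool initialisation only ever touch 
ConfigSketch, so nothing is instantiated unless installHandler() is called.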

We might also expand the FSErrorHandler interface with methods like "start" 
and "stop", and do the patch in such a way that it is already prepared for 
what is coming, instead of trying to retrofit that after we do what this 
ticket is about.
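
A rough illustration of what such lifecycle methods could look like. This is 
an assumption about a possible shape, not the current FSErrorHandler 
interface; LifecycleFsErrorHandler, MonitoringHandlerSketch, the thread name 
and the one-second interval are all made up for the example. The default 
methods are no-ops so existing handlers would not need to change, and the 
monitoring thread only exists between start() and stop().

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical extended interface; handleFSError is simplified.
interface LifecycleFsErrorHandler {
    void handleFSError(Throwable t);

    // No-op defaults keep handlers without background work unchanged.
    default void start() {}
    default void stop() {}
}

final class MonitoringHandlerSketch implements LifecycleFsErrorHandler {
    private ScheduledExecutorService monitor; // null until start() is called

    @Override
    public void handleFSError(Throwable t) { /* e.g. stop gossip and transports */ }

    @Override
    public synchronized void start() {
        if (monitor != null)
            return; // already started
        monitor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread thread = new Thread(r, "fs-error-monitor");
            thread.setDaemon(true);
            return thread;
        });
        monitor.scheduleWithFixedDelay(this::checkDisks, 1, 1, TimeUnit.SECONDS);
    }

    @Override
    public synchronized void stop() {
        if (monitor == null)
            return; // never started or already stopped
        monitor.shutdownNow();
        monitor = null;
    }

    synchronized boolean isRunning() {
        return monitor != null;
    }

    private void checkDisks() { /* placeholder: probe whether the disks are back */ }
}
```

Constructing the handler starts nothing, so it would be safe to create during 
configuration and only call start() from the daemon.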

The second is OutOfSpaceTest; I am not sure what the reason is yet.

> Add option to set a custom FSErrorHandler
> -----------------------------------------
>
>                 Key: CASSANDRA-20363
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20363
>             Project: Apache Cassandra
>          Issue Type: New Feature
>          Components: Legacy/Core
>            Reporter: Tommy Stendahl
>            Assignee: Tommy Stendahl
>            Priority: Normal
>             Fix For: 5.x
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Add a Java property to override the DefaultFSErrorHandler with a custom 
> implementation.
> The use case I am looking at is a customer deployment that uses network 
> disks, and these can go off-line sometimes. I would like to use 
> "disk_failure_policy: stop" but automatically detect when the disk is on-line 
> again and just open gossip and transports so the node comes back UP without 
> triggering a restart of the node.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

