[
https://issues.apache.org/jira/browse/CASSANDRA-20363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17931072#comment-17931072
]
Tommy Stendahl commented on CASSANDRA-20363:
--------------------------------------------
You ask some very good questions. I have tried to keep things as simple as
possible, maybe too simple :)
I did not propose a yaml config since I knew that Brandon was against that, and
a Java property looked simple enough. In my opinion, if you replace a default
implementation with your own like this, you take on a lot of responsibility for
your implementation, and maybe not having a yaml config emphasizes that.
I would not mind adding parameters in the yaml via ParameterizedClass, but I
can do that via a Java property, so it's not critical.
I was thinking of using
ScheduledThreadPoolExecutor.scheduleAtFixedRate(Runnable command,
long initialDelay, long period, TimeUnit unit) to schedule a task that simply
tries to read from the disk and, if successful, opens traffic and gossip and
cancels the scheduling.
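
A minimal sketch of that polling idea, under stated assumptions: the class
DiskRecoveryWatcher, the canRead probe, and the onRecovered callback are
hypothetical names for illustration, not existing Cassandra code. The callback
stands in for whatever reopens gossip and the transports. Only
ScheduledThreadPoolExecutor.scheduleAtFixedRate is taken from the JDK as-is.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of Cassandra): polls a disk probe at a fixed
// rate and runs a recovery callback once, the first time the probe succeeds.
final class DiskRecoveryWatcher {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
    private final BooleanSupplier diskReadable; // assumption: returns true when a read succeeds
    private final Runnable onRecovered;         // assumption: reopens gossip and transports

    DiskRecoveryWatcher(BooleanSupplier diskReadable, Runnable onRecovered) {
        this.diskReadable = diskReadable;
        this.onRecovered = onRecovered;
    }

    // Example probe: try to read one byte from a file on the data disk.
    static boolean canRead(Path probeFile) {
        try (InputStream in = Files.newInputStream(probeFile)) {
            in.read();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    void start(long initialDelay, long period, TimeUnit unit) {
        executor.scheduleAtFixedRate(this::probeOnce, initialDelay, period, unit);
    }

    private void probeOnce() {
        boolean readable;
        try {
            readable = diskReadable.getAsBoolean();
        } catch (RuntimeException e) {
            // An exception escaping the task would suppress all later runs,
            // so treat it as "disk still unavailable" and keep polling.
            readable = false;
        }
        if (readable) {
            try {
                onRecovered.run();
            } finally {
                // With the default executor policy, shutdown() cancels the
                // periodic task, so no further probes are scheduled.
                executor.shutdown();
            }
        }
    }

    // Waits until polling has stopped; returns false on timeout or interrupt.
    boolean awaitStopped(long timeout, TimeUnit unit) {
        try {
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

A custom handler could create one of these when it enters the stop state, for
example with new DiskRecoveryWatcher(() -> DiskRecoveryWatcher.canRead(path),
callback).start(10, 10, TimeUnit.SECONDS), where path and callback are again
placeholders. If the disk never comes back the task keeps polling until the
node is restarted.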
We only use one data disk, so I did not consider multiple disks. If the disk
never becomes available, it would behave the way stop normally works: you have
to restart the node. For us, restarting the node means killing the container,
and a new one will start.
But the main problem I see now is that I don't think this will work, at least
not reliably, since I failed to consider commit_failure_policy. There is no way
to override the commit_failure_policy, so if we enter stop that way it will be a
normal stop and we can't do anything about that. Changing that would require
more work than I feel is reasonable to ask for in this Jira, unless I'm missing
something.
> Add option to set a custom FSErrorHandler
> -----------------------------------------
>
> Key: CASSANDRA-20363
> URL: https://issues.apache.org/jira/browse/CASSANDRA-20363
> Project: Apache Cassandra
> Issue Type: New Feature
> Components: Legacy/Core
> Reporter: Tommy Stendahl
> Assignee: Tommy Stendahl
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Add java property to override the DefaultFSErrorHandler with a custom
> implementation.
> The use case I am looking at is a customer deployment that uses network
> disks, and these can go offline sometimes. I would like to use
> "disk_failure_policy: stop" but automatically detect when the disk is online
> again and just open gossip and transports so the node comes back UP without
> triggering a restart of the node.