[jira] [Updated] (CASSANDRA-15191) stop_paranoid disk failure policy is ignored on CorruptSSTableException after node is up

Stefan Miklosovic (Jira) Thu, 16 Jul 2020 08:24:10 -0700


     [ 
https://issues.apache.org/jira/browse/CASSANDRA-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Stefan Miklosovic updated CASSANDRA-15191:
------------------------------------------
    Description: 
There is a bug when disk_failure_policy is set to stop_paranoid and a 
CorruptSSTableException is thrown after the server is up: the setting is 
ignored. Normally, the node should stop gossip and transport, but instead it 
continues to serve requests and the exception is only logged.

 

This patch unifies the exception handling in JVMStabilityInspector. The code is 
reworked so that this inspector acts as the central place where such 
exceptions are inspected.

 

The core reason the exception is ignored is that the exception thrown in 
AbstractLocalAwareExecutorService is not a CorruptSSTableException but a 
RuntimeException that has the CorruptSSTableException as its cause. Hence it is 
better to handle this in JVMStabilityInspector, which can recursively examine 
the cause chain and act accordingly.
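
To illustrate the wrapping problem described above, here is a minimal Java 
sketch. It is not the code from the patch: the class CauseChainSketch, the 
CorruptDataException stand-in and the hasCause helper are hypothetical names 
chosen for illustration. It only shows why a direct instanceof check on the 
wrapper misses the corruption, while a recursive walk over getCause() finds it.

```java
public class CauseChainSketch {
    // Hypothetical stand-in for CorruptSSTableException (assumption, not the real class).
    static class CorruptDataException extends RuntimeException {
        CorruptDataException(String message) {
            super(message);
        }
    }

    // Recursively examines the throwable and its causes.
    // Returns true if any link in the chain is an instance of the given type.
    static boolean hasCause(Throwable t, Class<? extends Throwable> type) {
        if (t == null) {
            return false;
        }
        return type.isInstance(t) || hasCause(t.getCause(), type);
    }

    public static void main(String[] args) {
        // An executor-style wrapper: the corruption is only visible as the cause.
        Throwable wrapped = new RuntimeException(new CorruptDataException("corrupt sstable"));

        // A direct type check on the wrapper does not see the corruption.
        System.out.println(wrapped instanceof CorruptDataException); // prints "false"

        // Walking the cause chain does.
        System.out.println(hasCause(wrapped, CorruptDataException.class)); // prints "true"
    }
}
```

In the real code base, the decision about what to do once such a cause is found 
(for stop_paranoid: stop gossip and transport) would live in the inspector 
itself; the sketch deliberately leaves that part out.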

Behaviour before:

The stop_paranoid value of disk_failure_policy is ignored when a 
CorruptSSTableException is thrown, e.g. on a regular select statement.

Behaviour after:

Gossip and transport (CQL) are turned off; the JVM stays up for further 
investigation, e.g. via JMX.

  was:
There is a bug when disk_failure_policy is set to stop_paranoid and a 
CorruptSSTableException is thrown after the server is up: the setting is 
ignored. Normally, the node should stop gossip and transport, but instead it 
continues to serve requests and the exception is only logged.

 

This patch unifies the exception handling in JVMStabilityInspector. The code is 
reworked so that this inspector acts as the central place where such 
exceptions are inspected.

 

Behaviour before:

The stop_paranoid value of disk_failure_policy is ignored when a 
CorruptSSTableException is thrown, e.g. on a regular select statement.

Behaviour after:

Gossip and transport (CQL) are turned off; the JVM stays up for further 
investigation, e.g. via JMX.


> stop_paranoid disk failure policy is ignored on CorruptSSTableException after 
> node is up
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15191
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15191
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Config
>            Reporter: Vincent White
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>         Attachments: log.txt
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is a bug when disk_failure_policy is set to stop_paranoid and a 
> CorruptSSTableException is thrown after the server is up: the setting is 
> ignored. Normally, the node should stop gossip and transport, but instead it 
> continues to serve requests and the exception is only logged.
>  
> This patch unifies the exception handling in JVMStabilityInspector. The code 
> is reworked so that this inspector acts as the central place where such 
> exceptions are inspected.
>  
> The core reason the exception is ignored is that the exception thrown in 
> AbstractLocalAwareExecutorService is not a CorruptSSTableException but a 
> RuntimeException that has the CorruptSSTableException as its cause. Hence it 
> is better to handle this in JVMStabilityInspector, which can recursively 
> examine the cause chain and act accordingly.
> Behaviour before:
> The stop_paranoid value of disk_failure_policy is ignored when a 
> CorruptSSTableException is thrown, e.g. on a regular select statement.
> Behaviour after:
> Gossip and transport (CQL) are turned off; the JVM stays up for further 
> investigation, e.g. via JMX.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
