[ 
https://issues.apache.org/jira/browse/CASSANDRA-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162283#comment-17162283
 ] 

David Capwell commented on CASSANDRA-15191:
-------------------------------------------

FYI, the conversation has been happening in Slack: 
https://the-asf.slack.com/archives/CK23JSY2K/p1595280621333400

Updates:

* the tests are flaky; it looks like there is a race condition in the test 
where the flag hasn't been updated yet.  A workaround was added that queries 
multiple times with a 5-second sleep, in the hope of making the tests stable
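
A rough sketch of that kind of retry loop, for readers who have not seen the 
test. This is not the code from the patch: the class name PollUntil, the 
method pollUntil, and the attempt count are hypothetical, and the condition 
stands in for whatever query the test uses to read the flag.

```java
import java.util.function.BooleanSupplier;

final class PollUntil {
    private PollUntil() {}

    /**
     * Checks the condition up to maxAttempts times, sleeping sleepMillis
     * between failed attempts. Returns true as soon as the condition holds,
     * or false if it never held within the allowed attempts.
     */
    static boolean pollUntil(BooleanSupplier condition, int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (condition.getAsBoolean()) {
                return true;
            }
            // Do not sleep after the final attempt; there is nothing left to wait for.
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis);
            }
        }
        return false;
    }
}
```

A test would call something like pollUntil(() -> flagIsSet(), 5, 5000), where 
flagIsSet() is a placeholder for the real check. Note that this only narrows 
the race window; it does not remove the race itself.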

> stop_paranoid disk failure policy is ignored on CorruptSSTableException after 
> node is up
> ----------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15191
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15191
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local/Config
>            Reporter: Vincent White
>            Assignee: Stefan Miklosovic
>            Priority: Normal
>             Fix For: 3.11.x, 4.0-beta
>
>         Attachments: log.txt
>
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> There is a bug when disk_failure_policy is set to stop_paranoid and a 
> CorruptSSTableException is thrown after the server is up: the setting is 
> ignored. Normally, it should stop gossip and transport, but the node just 
> continues to serve requests and the exception is merely logged.
>  
> This patch unifies the exception handling in JVMStabilityInspector, and the 
> code is reworked in such a way that this inspector acts as a central place 
> where such exceptions are inspected. 
>  
> The core reason the exception is ignored is that the exception thrown in 
> AbstractLocalAwareExecutorService is not a CorruptSSTableException but a 
> RuntimeException that has the CorruptSSTableException as its cause. Hence it 
> is better to handle this in JVMStabilityInspector, which can recursively 
> examine the cause chain and act accordingly.
> Behaviour before:
> stop_paranoid of disk_failure_policy is ignored when CorruptSSTableException 
> is thrown, e.g. on a regular select statement
> Behaviour after:
> Gossip and transport (CQL) are turned off; the JVM is still up for further 
> investigation, e.g. via JMX.
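
To illustrate the cause-chain point in the description above: a handler that 
only checks the type of the top-level exception misses a corruption error 
wrapped in a RuntimeException. The sketch below shows one way to walk the 
chain. It is not the JVMStabilityInspector code from the patch; CauseChain, 
findCause, and CorruptDataException (a stand-in for CorruptSSTableException) 
are hypothetical names used only for illustration.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class CauseChain {
    private CauseChain() {}

    /** Hypothetical stand-in for Cassandra's CorruptSSTableException. */
    static final class CorruptDataException extends RuntimeException {
        CorruptDataException(String message) {
            super(message);
        }
    }

    /**
     * Walks t and its causes and returns the first throwable that is an
     * instance of type, or null if there is none. The identity set guards
     * against a cyclic cause chain.
     */
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Throwable cur = t; cur != null && seen.add(cur); cur = cur.getCause()) {
            if (type.isInstance(cur)) {
                return type.cast(cur);
            }
        }
        return null;
    }
}
```

With this, a check such as findCause(e, CorruptDataException.class) != null 
still matches when e is a RuntimeException wrapping the corruption error, 
which is the case the description says was previously ignored.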



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
