[ 
https://issues.apache.org/jira/browse/CASSANDRA-2116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13035111#comment-13035111
 ] 

Chris Goffinet commented on CASSANDRA-2116:
-------------------------------------------

Unfortunately, the best we can get from Java is an IOError. For example, we use 
this patch to detect when our RAID array dies: the OS will cause Java to throw 
an IOError. I think that if data is corrupt, we should err on the side of 
letting the operator decide which mode they want. For us, on any error or any 
corruption of data, we want to take the node out right away.
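
To make the "operator decides the mode" idea concrete, here is a minimal 
sketch. Every name in it (DiskFailureMode, DiskFailureHandler, the IGNORE and 
STOP modes) is hypothetical and is not taken from the attached patch; it only 
illustrates applying a configured mode when an IOError surfaces.

```java
import java.io.IOError;
import java.io.IOException;

// Hypothetical names throughout; not taken from the attached patch.
enum DiskFailureMode { IGNORE, STOP }

final class DiskFailureHandler {
    private final DiskFailureMode mode;
    private volatile boolean serving = true;

    DiskFailureHandler(DiskFailureMode mode) {
        this.mode = mode;
    }

    boolean isServing() {
        return serving;
    }

    /** Applies the operator-chosen mode to an I/O failure surfaced as IOError. */
    void handle(IOError e) {
        if (mode == DiskFailureMode.STOP) {
            // Take the node out of service on the first disk error.
            serving = false;
        }
        // IGNORE: keep serving; a real system would at least log the error here.
    }
}
```

With STOP configured, a single call such as 
handler.handle(new IOError(new IOException("disk gone"))) flips isServing() to 
false, which matches the "take the node out right away" behavior we want.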

We have been testing this in production for a while, and it works really well 
when disks die. We also ran tests that involved removing drives from the system 
while it was serving traffic.

The Read/Write classes follow an idea similar to how the Hadoop code base 
handles this very issue.
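
As a rough illustration of the Read/Write split, here is a sketch rather than 
the patch itself. Only the names FSReadError and FSWriteError come from the 
ticket; the FSError base class, the path field, the constructors, and the 
FSErrorDemo helper are assumptions made for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOError;
import java.io.IOException;

// Sketch only: FSReadError/FSWriteError are named in the ticket;
// the base class, field, and constructors below are assumed.
abstract class FSError extends IOError {
    final File path;

    FSError(Throwable cause, File path) {
        super(cause);
        this.path = path;
    }
}

final class FSReadError extends FSError {
    FSReadError(Throwable cause, File path) {
        super(cause, path);
    }
}

final class FSWriteError extends FSError {
    FSWriteError(Throwable cause, File path) {
        super(cause, path);
    }
}

final class FSErrorDemo {
    /** Reads one byte, converting a low-level IOException into FSReadError. */
    static int readFirstByte(File f) {
        try (FileInputStream in = new FileInputStream(f)) {
            return in.read();
        } catch (IOException e) {
            throw new FSReadError(e, f);
        }
    }
}
```

Because both classes still extend IOError, existing catch blocks keep working, 
while new code can catch FSReadError or FSWriteError specifically and decide 
what to do with the node.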


> Separate out filesystem errors from generic IOErrors
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2116
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2116
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>            Priority: Minor
>             Fix For: 1.0
>
>         Attachments: 
> 0001-Separate-out-filesystem-errors-from-generic-IOErrors.patch
>
>
> We throw IOErrors everywhere in the codebase today. We should separate out 
> filesystem-specific errors (reading, writing) into FSReadError and 
> FSWriteError. This makes it possible, in the next ticket, to allow certain 
> failure modes (kill the server if reads or writes to disk fail).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
