Re: [PATCH] fix write error handling on SR RAID1

2015-07-11 Thread Joel Sing
On Friday 10 July 2015 22:01:43 Karel Gardas wrote:
> On Fri, Jul 10, 2015 at 9:34 PM, Chris Cappuccio ch...@nmedia.net wrote:
> > My first impression, offlining the drive after a single chunk failure may be too aggressive as some errors are a result of issues other than drive failures.

Re: [PATCH] fix write error handling on SR RAID1

2015-07-11 Thread Karel Gardas
On Sat, Jul 11, 2015 at 3:44 PM, Joel Sing j...@sing.id.au wrote:
> Your analysis is incorrect - offlining of chunks is handled via sr_ccb_done(). If lower level I/O indicates an error occurred then the chunk is marked offline, providing that the discipline has redundancy (for example, we do

[PATCH] fix write error handling on SR RAID1

2015-07-10 Thread Karel Gardas
Hello, I think I've found a bug in how the software RAID1 implementation handles write errors. IMHO the code should check whether every write to every chunk succeeds. If one does not, that is an error the code needs to handle. The proposed patch handles such an error by off-lining the problematic drive. The patch
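
For readers following the thread, here is a minimal sketch of the behaviour under discussion: a mirrored write checks the result on every chunk and takes a failing chunk offline only while redundancy remains. It is a toy model, not the softraid code. Every name in it (struct chunk, chunk_write, chunks_online, mirror_write) is hypothetical, and it does not reproduce sr_ccb_done() or the proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from OpenBSD softraid. */
enum chunk_state { CHUNK_ONLINE, CHUNK_OFFLINE };

struct chunk {
	enum chunk_state state;
	bool fail_writes;	/* simulate a drive whose writes fail */
};

/* Simulated lower-level write: returns 0 on success, -1 on error. */
static int
chunk_write(const struct chunk *c)
{
	return c->fail_writes ? -1 : 0;
}

/* Count the chunks that are currently online. */
static size_t
chunks_online(const struct chunk *chunks, size_t n)
{
	size_t i, online = 0;

	for (i = 0; i < n; i++)
		if (chunks[i].state == CHUNK_ONLINE)
			online++;
	return online;
}

/*
 * Mirror one write to every online chunk and check each result.
 * A chunk whose write fails is taken offline only while at least one
 * other chunk is still online. Returns 0 if at least one write
 * succeeded, -1 otherwise.
 */
static int
mirror_write(struct chunk *chunks, size_t n)
{
	size_t i, ok = 0;

	assert(chunks != NULL);

	for (i = 0; i < n; i++) {
		if (chunks[i].state != CHUNK_ONLINE)
			continue;
		if (chunk_write(&chunks[i]) == 0) {
			ok++;
			continue;
		}
		/* Write failed: offline the chunk only if redundancy remains. */
		if (chunks_online(chunks, n) > 1)
			chunks[i].state = CHUNK_OFFLINE;
	}
	return ok > 0 ? 0 : -1;
}
```

With two chunks where one fails, mirror_write() reports success and leaves the failing chunk offline. With a single failing chunk, it reports an error and keeps the chunk online, since there is nothing left to fall back on.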