Re: [PATCH] fix write error handling on SR RAID1

2015-07-11 Thread Karel Gardas
On Sat, Jul 11, 2015 at 3:44 PM, Joel Sing wrote:
> Your analysis is incorrect - offlining of chunks is handled via sr_ccb_done().
> If lower level I/O indicates an error occurred then the chunk is marked offline,
> providing that the discipline has redundancy (for example, we do not offline

Re: [PATCH] fix write error handling on SR RAID1

2015-07-11 Thread Joel Sing
On Friday 10 July 2015 22:01:43 Karel Gardas wrote:
> On Fri, Jul 10, 2015 at 9:34 PM, Chris Cappuccio wrote:
> > My first impression, offlining the drive after a single chunk failure
> > may be too aggressive as some errors are a result of issues other than
> > drive failures.
>
> Indeed, it may

Re: [PATCH] fix write error handling on SR RAID1

2015-07-10 Thread Karel Gardas
On Fri, Jul 10, 2015 at 9:34 PM, Chris Cappuccio wrote:
> My first impression, offlining the drive after a single chunk failure
> may be too aggressive as some errors are a result of issues other than
> drive failures.

Indeed, it may look as too aggressive, but is my analysis written in comment c

Re: [PATCH] fix write error handling on SR RAID1

2015-07-10 Thread Chris Cappuccio
Karel Gardas [gard...@gmail.com] wrote:
> Hello,
>
> I think I've found a bug on software RAID1 implementation of handling
> write errors. IMHO code should check if every write to every chunk
> succeed. If not, then there is an error which it needs to handle.
> Proposed patch handles such error by

[PATCH] fix write error handling on SR RAID1

2015-07-10 Thread Karel Gardas
Hello, I think I've found a bug in how the software RAID1 implementation handles write errors. IMHO the code should check whether every write to every chunk succeeds. If not, then there is an error which it needs to handle. The proposed patch handles such an error by off-lining the problematic drive. The patch compile
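
The behaviour discussed in this thread (check the result of the write on every chunk of the mirror, and take a failing chunk offline only while a good copy remains) can be illustrated with a small standalone sketch. This is not the softraid code and not the proposed patch: `struct mirror`, `mirror_online()` and `mirror_write_done()` are hypothetical names made up for illustration, and the rule "never offline the last good chunk" is an assumption standing in for the redundancy check mentioned in the reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHUNKS 4	/* hypothetical upper bound for this sketch */

enum chunk_state { CHUNK_ONLINE, CHUNK_OFFLINE };

/* Hypothetical model of a mirrored volume: one state per chunk. */
struct mirror {
	size_t nchunks;
	enum chunk_state state[MAX_CHUNKS];
};

/* Count the chunks that are still online. */
static size_t
mirror_online(const struct mirror *m)
{
	size_t i, n = 0;

	assert(m != NULL && m->nchunks <= MAX_CHUNKS);
	for (i = 0; i < m->nchunks; i++)
		if (m->state[i] == CHUNK_ONLINE)
			n++;
	return n;
}

/*
 * Handle completion of one mirrored write.  err[i] is the error code
 * reported for chunk i (0 means success); offline chunks are ignored.
 * Returns true if at least one online chunk holds the new data.
 */
static bool
mirror_write_done(struct mirror *m, const int *err)
{
	size_t i, ok = 0;

	assert(m != NULL && err != NULL && m->nchunks <= MAX_CHUNKS);

	/* First pass: did the write land on any online chunk? */
	for (i = 0; i < m->nchunks; i++)
		if (m->state[i] == CHUNK_ONLINE && err[i] == 0)
			ok++;

	/* No good copy at all: fail the write, keep chunk states as they are. */
	if (ok == 0)
		return false;

	/* Second pass: offline only the chunks whose write failed. */
	for (i = 0; i < m->nchunks; i++)
		if (m->state[i] == CHUNK_ONLINE && err[i] != 0)
			m->state[i] = CHUNK_OFFLINE;

	return true;
}
```

With a two-chunk mirror, a write that fails on the second chunk only leaves the volume degraded but usable; a later write that fails on the one remaining chunk is reported as an error and that chunk stays online. Whether offlining after a single failed write is too aggressive is exactly the point debated in the replies, so the policy in this sketch should be read as one possible choice.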