On Friday 10 July 2015 22:01:43 Karel Gardas wrote:
> On Fri, Jul 10, 2015 at 9:34 PM, Chris Cappuccio <ch...@nmedia.net> wrote:
> > My first impression: offlining the drive after a single chunk failure
> > may be too aggressive, as some errors are the result of issues other
> > than drive failures.
>
> Indeed, it may look too aggressive, but is the analysis written in my
> comment correct? I mean: if there is a write error to one or more
> chunk(s) for whatever reason, and we completely ignore it because at
> least one write succeeded, then the array is in an inconsistent state
> where some drive(s) hold correct data and other drive(s) hold previous
> data. Since reading is done in round-robin fashion, there is a chance
> that you will read the old data in the future. If this is correct,
> then I think it calls for a fix.
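(For readers following the thread, the stale-read scenario Karel describes
can be shown with a minimal standalone C sketch. This is not softraid code;
mirror_write()/mirror_read() and the single-int "chunks" are invented for
illustration. It models a two-chunk mirror where a failed write is silently
ignored and reads rotate round-robin.)

    /*
     * Illustrative sketch only -- not softraid code. Models a write
     * error on one mirror chunk being ignored, followed by
     * round-robin reads.
     */
    #include <stdio.h>

    #define NCHUNKS 2

    static int chunk[NCHUNKS];  /* one "block" per chunk */
    static int rr;              /* round-robin read counter */

    /* Write to all chunks; fail_mask marks chunks whose write is lost. */
    static void
    mirror_write(int value, unsigned fail_mask)
    {
            for (int i = 0; i < NCHUNKS; i++)
                    if (!(fail_mask & (1U << i)))
                            chunk[i] = value;
            /* Error is ignored: the caller is told the write succeeded. */
    }

    /* Read from the next chunk in round-robin order. */
    static int
    mirror_read(void)
    {
            return chunk[rr++ % NCHUNKS];
    }

    int
    main(void)
    {
            mirror_write(1, 0);     /* both chunks now hold 1 */
            mirror_write(2, 0x2);   /* write of 2 fails on chunk 1, ignored */

            for (int i = 0; i < 4; i++)
                    printf("read %d -> %d\n", i, mirror_read());
            /* Prints 2, 1, 2, 1: stale data on every other read. */
            return 0;
    }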
Your analysis is incorrect - offlining of chunks is handled via
sr_ccb_done(). If lower-level I/O indicates that an error occurred, the
chunk is marked offline, provided that the discipline has redundancy
(for example, we do not offline chunks for RAID 0 or CRYPTO, since that
usually just makes things worse). This applies to both read and write
operations.

> If you do not like offlining drive(s) after just one failed read, then
> perhaps the correct fix is to restart the whole work unit and force the
> write again? We could even have some threshold at which we stop and
> consider the problematic block genuinely unwritable. Would something
> like that be a better solution?

We already offline after a single read or write failure occurs - it
would be possible to implement some form of retry algorithm, however at
some point we have to trust the lower layers (VFS, disk controller
driver, disk hardware, etc.).
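To make that completion-path policy concrete, here is a hedged standalone
sketch of the decision as described above. The real logic lives in
sr_ccb_done() in sys/dev/softraid.c; the enum names, struct layout and
ccb_done() below are all invented for illustration, not the actual
softraid interfaces.

    /* Illustrative sketch of the offline-on-error policy, not softraid code. */
    #include <stdio.h>

    enum sr_level { SR_RAID0, SR_RAID1, SR_CRYPTO, SR_RAID5 };
    enum chunk_state { CHUNK_ONLINE, CHUNK_OFFLINE };

    struct chunk {
            enum chunk_state state;
    };

    /* Only redundant disciplines can afford to lose a chunk. */
    static int
    has_redundancy(enum sr_level level)
    {
            return level == SR_RAID1 || level == SR_RAID5;
    }

    /* Called on read or write completion; error != 0 on I/O failure. */
    static void
    ccb_done(enum sr_level level, struct chunk *ch, int error)
    {
            if (error == 0)
                    return;
            if (!has_redundancy(level)) {
                    /* Offlining a RAID 0/CRYPTO chunk just makes things worse. */
                    printf("I/O error, chunk left online\n");
                    return;
            }
            ch->state = CHUNK_OFFLINE;
            printf("I/O error, chunk marked offline\n");
    }

    int
    main(void)
    {
            struct chunk c = { CHUNK_ONLINE };

            ccb_done(SR_CRYPTO, &c, 1);  /* no redundancy: stays online */
            ccb_done(SR_RAID1, &c, 1);   /* redundant: offlined at once */
            return 0;
    }

The design trade-off is the one under discussion in this thread: offlining
on the first failure keeps a redundant array consistent, at the cost of
degrading the array on what may have been a transient error.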