When a node has a single disk and that disk hits EIO, md.nr_disks changes
from 1 to 0. In that case md_do_recover() does not call kick_recover(),
which may lead to problematic behavior. Call leave_cluster() when
md.nr_disks reaches 0.

Reported-by: Hitoshi Mitake <mitake.hito...@lab.ntt.co.jp>
Signed-off-by: Bingpeng Zhu <bingpeng....@alibaba-inc.com>
---
 sheep/md.c |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/sheep/md.c b/sheep/md.c
index 378d1f1..0313b95 100644
--- a/sheep/md.c
+++ b/sheep/md.c
@@ -541,11 +541,14 @@ static void md_do_recover(struct work *work)
 out:
 	sd_rw_unlock(&md.lock);
 
-	if (disk)
-		update_node_disks();
-
-	if (nr > 0)
-		kick_recover();
+	if (disk) {
+		if (nr > 0) {
+			update_node_disks();
+			kick_recover();
+		} else {
+			leave_cluster();
+		}
+	}
 
 	free(mw);
 }
-- 
1.7.1

-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog
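
For reviewers, the branch structure this patch introduces at the end of md_do_recover() can be sketched as a standalone decision function. This is only an illustration: the enum, its values, and decide_post_recover() are hypothetical names that do not exist in sheep/md.c, and the reading of `disk` as "a failed disk was found and handled" is an assumption about the surrounding code.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not part of sheep/md.c. */
enum post_recover_action {
	ACTION_NONE,          /* `disk` was not set: nothing to do */
	ACTION_KICK_RECOVER,  /* update_node_disks() then kick_recover() */
	ACTION_LEAVE_CLUSTER, /* no disk left: leave_cluster() */
};

/*
 * Mirrors the if/else nesting added by the patch.
 * disk_handled: stands in for the `disk` check (assumed meaning)
 * nr_disks:     stands in for `nr`, the number of remaining disks
 */
static enum post_recover_action decide_post_recover(bool disk_handled,
						    int nr_disks)
{
	if (!disk_handled)
		return ACTION_NONE;
	if (nr_disks > 0)
		return ACTION_KICK_RECOVER;
	return ACTION_LEAVE_CLUSTER;
}
```

With a single-disk node that hits EIO, the call would be decide_post_recover(true, 0), which selects the leave-cluster branch; before the patch that case fell through without kicking recovery.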