We've seen a spurious full resync, because a connection breakage raced with drbd_start_resync(, C_SYNC_TARGET), and the resulting state change request intended to start the resync ended up looking like a local invalidate.
Fix: Double check the state inside the lock, and don't even request that state change, if we had connection or IO problems. Signed-off-by: Philipp Reisner <[email protected]> Signed-off-by: Lars Ellenberg <[email protected]> --- drivers/block/drbd/drbd_worker.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c index f41e224..7f51f88 100644 --- a/drivers/block/drbd/drbd_worker.c +++ b/drivers/block/drbd/drbd_worker.c @@ -1653,7 +1653,9 @@ void drbd_start_resync(struct drbd_conf *mdev, enum drbd_conns side) clear_bit(B_RS_H_DONE, &mdev->flags); write_lock_irq(&global_state_lock); - if (!get_ldev_if_state(mdev, D_NEGOTIATING)) { + /* Did some connection breakage or IO error race with us? */ + if (mdev->state.conn < C_CONNECTED + || !get_ldev_if_state(mdev, D_NEGOTIATING)) { write_unlock_irq(&global_state_lock); mutex_unlock(mdev->state_mutex); return; -- 1.7.9.5 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

