We exponentially back off when we encounter connection errors.  Because
the delay is not reset when a connection later succeeds, accumulated
errors eventually make us wait a very long time before even trying to
reconnect.

Fix this by resetting the backoff counter after a successful negotiation/
connection with the remote node.  Fixes ceph issue #2802.
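
For illustration only, the intended behaviour can be sketched in
userspace C as below.  This is a minimal sketch, not the messenger
code: the sketch_* names and the base/max values are hypothetical, and
only the idea of a delay field that doubles on faults and is zeroed on
a successful connect comes from this patch.

```c
/* Hypothetical bounds in arbitrary ticks; not the kernel's values. */
#define SKETCH_BASE_DELAY 1UL
#define SKETCH_MAX_DELAY  64UL

/* Stand-in for the connection state; only the backoff field is modelled. */
struct sketch_conn {
	unsigned long delay;	/* 0 means "no backoff memory" */
};

/* On a connection error: grow the delay and return how long to wait. */
static unsigned long sketch_fault(struct sketch_conn *con)
{
	if (con->delay == 0)
		con->delay = SKETCH_BASE_DELAY;	/* first failure */
	else if (con->delay < SKETCH_MAX_DELAY)
		con->delay *= 2;		/* double until capped */
	return con->delay;
}

/* After a successful negotiation: forget earlier failures. */
static void sketch_connected(struct sketch_conn *con)
{
	con->delay = 0;
}
```

Without the reset in sketch_connected(), a fault that follows a
successful connection would resume from the previously accumulated
delay instead of starting again from the base value.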

Signed-off-by: Sage Weil <[email protected]>
---
 net/ceph/messenger.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 65964c2..28896eb 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -1633,6 +1633,8 @@ static int process_connect(struct ceph_connection *con)
                if (con->in_reply.flags & CEPH_MSG_CONNECT_LOSSY)
                        set_bit(LOSSYTX, &con->flags);
 
+               con->delay = 0;      /* reset backoff memory */
+
                prepare_read_tag(con);
                break;
 
-- 
1.7.9
