From: Ruoyu <lian...@ucweb.com>

I am sure this is a bug: the variable nr_outstanding_reqs is sometimes
not reset to zero after a client cancels a request. While
nr_outstanding_reqs is nonzero, the sheep process can never be killed
successfully, neither with kill <pid> nor with the dog node kill command.

However, I am not sure the bug is fixed completely, because I am not
familiar with the sheepdog networking logic. I added error messages to
every doubtful statement and caught one of them firing, so I now call
clear_client_info at that place. Everything seems fine after the
modification. Could anyone help investigate and fix it properly?

Signed-off-by: Ruoyu <lian...@ucweb.com>
Signed-off-by: Hitoshi Mitake <mitake.hito...@lab.ntt.co.jp>
---
 sheep/request.c | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/sheep/request.c b/sheep/request.c
index 8a71dc2..11a593d 100644
--- a/sheep/request.c
+++ b/sheep/request.c
@@ -710,7 +710,16 @@ main_fn void put_request(struct request *req)
 
 			if (ci->tx_req == NULL)
 				/* There is no request being sent. */
-				conn_tx_on(&ci->conn);
+				if (conn_tx_on(&ci->conn)) {
+					sd_err("switch on sending flag failure, "
+						"connection maybe closed");
+					/*
+					 * should not free_request(req) here
+					 * because it is already in done list
+					 * clear_client_info will free it
+					 */
+					clear_client_info(ci);
+				}
 		}
 	}
 }
@@ -770,7 +779,9 @@ static void rx_main(struct work *work)
 		return;
 	}
 
-	conn_rx_on(&ci->conn);
+	if (conn_rx_on(&ci->conn))
+		sd_err("switch on receiving flag failure, "
+			"connection maybe closed");
 
 	if (is_logging_op(get_sd_op(req->rq.opcode))) {
 		sd_info("req=%p, fd=%d, client=%s:%d, op=%s, data=%s",
@@ -846,7 +857,9 @@ static void tx_main(struct work *work)
 	}
 
 	if (!list_empty(&ci->done_reqs))
-		conn_tx_on(&ci->conn);
+		if (conn_tx_on(&ci->conn))
+			sd_err("switch on sending flag failure, "
+				"connection maybe closed");
 }
 
 static void destroy_client(struct client_info *ci)
@@ -932,8 +945,11 @@ static void client_handler(int fd, int events, void *data)
 		return clear_client_info(ci);
 
 	if (events & EPOLLIN) {
-		if (conn_rx_off(&ci->conn) != 0)
+		if (conn_rx_off(&ci->conn) != 0) {
+			sd_err("switch off receiving flag failure, "
+				"connection maybe closed");
 			return;
+		}
 
 		/*
 		 * Increment refcnt so that the client_info isn't freed while
@@ -946,8 +962,11 @@ static void client_handler(int fd, int events, void *data)
 	}
 
 	if (events & EPOLLOUT) {
-		if (conn_tx_off(&ci->conn) != 0)
+		if (conn_tx_off(&ci->conn) != 0) {
+			sd_err("switch off sending flag failure, "
+				"connection maybe closed");
 			return;
+		}
 
 		assert(ci->tx_req == NULL);
 		ci->tx_req = list_first_entry(&ci->done_reqs, struct request,
-- 
1.8.3.2

-- 
sheepdog mailing list
sheepdog@lists.wpkg.org
http://lists.wpkg.org/mailman/listinfo/sheepdog