When a large message is sent over RDMA and is waiting for the read ack
from the peer, it is moved from rio->write_queue to rio->active_txs.
However, pcs_rdma_next_timeout() does not check the messages in
rio->active_txs, so it does not return a correct timeout to the rpc and
the rpc timer is not started. If the peer then becomes unresponsive, the
message in rio->active_txs can wait for the read ack forever: the
calendar timer cannot kill it because it is now under network I/O. As a
result, the rpc can hang forever without detecting the message stuck in
the underlying RDMA I/O.

Make pcs_rdma_next_timeout() return the next timeout based on the first
message in rio->active_txs, falling back to the first message in
rio->write_queue when rio->active_txs is empty.

Fixes: #VSTOR-105982
https://virtuozzo.atlassian.net/browse/VSTOR-105982

Signed-off-by: Liu Kui <kui....@virtuozzo.com>
---
 fs/fuse/kio/pcs/pcs_rdma_io.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)
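
Note for reviewers: the selection order introduced by this patch can be
modelled in userspace as below. This is only a minimal sketch under
stated assumptions: struct model_msg, struct model_rio and
model_next_timeout() are hypothetical stand-ins, not the kernel
structures, and each list is reduced to a pointer to its first entry
(NULL when the list is empty).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel structures. */
struct model_msg {
	unsigned long start_time;	/* time the message was started */
};

struct model_rio {
	const struct model_msg *first_active_tx;	/* head of active_txs, NULL if empty */
	const struct model_msg *first_queued;		/* head of write_queue, NULL if empty */
	unsigned long send_timeout;
};

/*
 * Return the deadline of the message that should drive the timer:
 * a message waiting for its read ack takes priority over a message
 * that is still queued; 0 means there is nothing to time out.
 */
static unsigned long model_next_timeout(const struct model_rio *rio)
{
	assert(rio != NULL);

	if (rio->first_active_tx)
		return rio->first_active_tx->start_time + rio->send_timeout;

	if (rio->first_queued)
		return rio->first_queued->start_time + rio->send_timeout;

	return 0;
}
```

In this model, a message that has left the write queue and is waiting
for its ack still yields a non-zero deadline, which is the case the old
code missed.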

diff --git a/fs/fuse/kio/pcs/pcs_rdma_io.c b/fs/fuse/kio/pcs/pcs_rdma_io.c
index 2755b13fb8a5..6fa38338ad0c 100644
--- a/fs/fuse/kio/pcs/pcs_rdma_io.c
+++ b/fs/fuse/kio/pcs/pcs_rdma_io.c
@@ -1668,14 +1668,21 @@ static unsigned long pcs_rdma_next_timeout(struct pcs_netio *netio)
        struct pcs_rdmaio *rio = rio_from_netio(netio);
        struct pcs_rpc *ep = netio->parent;
        struct pcs_msg *msg;
+       struct rio_tx *tx;
 
        BUG_ON(!mutex_is_locked(&ep->mutex));
 
-       if (list_empty(&rio->write_queue))
-               return 0;
+       if (!list_empty(&rio->active_txs)) {
+               tx = list_first_entry(&rio->active_txs, struct rio_tx, list);
+               return tx->msg->start_time + rio->send_timeout;
+       }
 
-       msg = list_first_entry(&rio->write_queue, struct pcs_msg, list);
-       return msg->start_time + rio->send_timeout;
+       if (!list_empty(&rio->write_queue)) {
+               msg = list_first_entry(&rio->write_queue, struct pcs_msg, list);
+               return msg->start_time + rio->send_timeout;
+       }
+
+       return 0;
 }
 
 static int pcs_rdma_sync_send(struct pcs_netio *netio, struct pcs_msg *msg)
-- 
2.39.5 (Apple Git-154)
