From: Greg Jumper <[email protected]>

The function rds_tcp_get_peer_sport() should return the peer port of a
socket, even when the socket is not currently connected, so that RDS
can reliably determine the MPRDS "lane" corresponding to the port.

rds_tcp_get_peer_sport() calls kernel_getpeername() to get the port
number; however, when paths between endpoints frequently drop and
reconnect, kernel_getpeername() can return -ENOTCONN, causing
rds_tcp_get_peer_sport() to return an error, and ultimately causing
RDS to use the wrong lane for a port when reconnecting to a peer.

Modify rds_tcp_get_peer_sport() to directly call the socket-specific
get-name function (inet_getname() in this case) that
kernel_getpeername() also calls.  The socket-specific function takes
a "peer" argument, which kernel_getpeername() sets to 1.  When that
argument is greater than 1, the function returns the socket's peer
name even when the socket is not connected, which in turn allows
rds_tcp_get_peer_sport() to return the correct port number.

Signed-off-by: Greg Jumper <[email protected]>
Signed-off-by: Allison Henderson <[email protected]>
---
 net/rds/tcp_listen.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
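
Note (not part of the commit log): the sketch below is a simplified
userspace model of the "peer" argument behaviour described in the commit
message. It is not kernel code. The model_* names and the state
constants are hypothetical and exist only for illustration; the real
check lives in inet_getname() and operates on struct sock.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the socket states relevant here. */
enum model_state {
	MODEL_ESTABLISHED,
	MODEL_SYN_SENT,
	MODEL_CLOSE,
};

/* Hypothetical, minimal stand-in for the fields the check looks at. */
struct model_sock {
	enum model_state state;
	uint16_t dport;	/* host byte order; 0 means no peer port recorded */
};

/*
 * Model of the peer lookup: returns the peer port, or -ENOTCONN.
 * peer == 1 refuses sockets in the CLOSE or SYN_SENT states;
 * peer > 1 only requires that a peer port is still recorded.
 */
static int model_get_peer_port(const struct model_sock *sk, int peer)
{
	int unconnected = sk->state == MODEL_CLOSE ||
			  sk->state == MODEL_SYN_SENT;

	if (sk->dport == 0)
		return -ENOTCONN;
	if (unconnected && peer == 1)
		return -ENOTCONN;
	return sk->dport;
}
```

In this model, a socket in MODEL_CLOSE with a recorded port fails the
lookup with peer set to 1 but succeeds with peer set to 2, which is the
difference the patch relies on.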

diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index f2c4778be0b3..ba283c8ffa64 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -67,7 +67,14 @@ rds_tcp_get_peer_sport(struct socket *sock)
        } saddr;
        int sport;
 
-       if (kernel_getpeername(sock, &saddr.addr) >= 0) {
+       /* Call the socket's getname() function (inet_getname() in this case)
+        * with a final argument greater than 1 to get the peer's port
+        * regardless of whether the socket is currently connected.
+        * With peer=2, the peer port is returned even in the TCP_CLOSE and
+        * TCP_SYN_SENT states seen during reconnection, so -ENOTCONN is
+        * avoided as long as inet_dport still holds the peer port.
+        */
+       if (sock->ops->getname(sock, &saddr.addr, 2) >= 0) {
                switch (saddr.addr.sa_family) {
                case AF_INET:
                        sport = ntohs(saddr.sin.sin_port);
@@ -177,7 +184,7 @@ void rds_tcp_conn_slots_available(struct rds_connection *conn, bool fan_out)
 
        if (fan_out)
                /* Delegate fan-out to a background worker in order
-                * to allow "kernel_getpeername" to acquire a lock
+                * to allow "sock->ops->getname()" to acquire a lock
                 * on the socket.
                 * The socket is already locked in this context
                 * by either "rds_tcp_recv_path" or "tcp_v{4,6}_rcv",
-- 
2.43.0

