I haven't tried this, but you might be able to close a socket by:

1/ decrease /proc/sys/net/ipv4/tcp_keepalive_time
  so that keep-alives get sent sooner.  Maybe set it to 60,
  set tcp_keepalive_intvl to 5, and tcp_keepalive_probes to 3.

2/ create a rule with iptables to drop all messages sent
  on the particular connection.
   iptables -A OUTPUT -p tcp -m multiport --dports ... --sports ... -j DROP


Given the suggested keep-alive settings, you should only have to wait 75
seconds (60 idle, then 3 probes sent 5 seconds apart) after creating the
iptables rule before the connection is broken.
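
The 75-second figure is the idle time plus the unanswered probes.  A
minimal sketch of that arithmetic follows; the function name and the
dictionary are made up for illustration and are not taken from the kernel
or Lustre sources, and like the rest of this suggestion it is untested
against a real socklnd connection:

```python
def keepalive_detection_seconds(idle: int, interval: int, probes: int) -> int:
    """Seconds from the last traffic until TCP gives up on a silent peer.

    idle     -- tcp_keepalive_time: idle seconds before the first probe
    interval -- tcp_keepalive_intvl: seconds between probes
    probes   -- tcp_keepalive_probes: unanswered probes before giving up
    """
    if idle < 0 or interval < 0 or probes < 0:
        raise ValueError("keep-alive settings must be non-negative")
    # The first probe goes out after `idle` seconds; each probe then waits
    # `interval` seconds for an answer before the next one (or the reset).
    return idle + interval * probes


# Values suggested above (hypothetical dict, just for readability).
SUGGESTED = {"idle": 60, "interval": 5, "probes": 3}

if __name__ == "__main__":
    print(keepalive_detection_seconds(**SUGGESTED))
```

With the usual Linux defaults (7200, 75, 9) the same sum comes to over two
hours, which is why the settings need lowering first.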

NeilBrown


On Thu, Feb 20 2020, Degremont, Aurelien wrote:

> Thanks. It feels like the theory is valid.
> Ideally, to confirm it, I would need a way to manually force-close the
> socklnd socket to force the other peer to re-establish it.
> I could not find a way to do that for sockets opened by kernel threads.
>
> On 19/02/2020 23:12, "NeilBrown" <[email protected]> wrote:
>
>     
>     When LNet wants to send a message over a SOCKLND interface,
>     ksocknal_launch_packet() is called.
>     
>     This calls ksocknal_launch_all_connections_locked(), which loops
>     over all "routes" to the "peer" to make sure they all have
>     "connections".
>     If it finds a route without a connection (returned by
>     ksocknal_find_connectable_route_locked()) it calls
>     ksocknal_launch_connection_locked() which adds the connection request to
>     ksnd_connd_routes, and wakes up the connd.  The connd thread will then
>     make the connection.
>     
>     Hope that helps.
>     
>     NeilBrown
>     
>     
>     
>     On Wed, Feb 19 2020, Degremont, Aurelien wrote:
>     
>     > Thanks! That's really interesting.
>     > Do you have a code pointer that could show where the code will
>     > establish this connection if missing?
>     >
>     > On 18/02/2020 23:34, "NeilBrown" <[email protected]> wrote:
>     >
>     >     
>     >     It is not true that:
>     >        LNET will establish connections only if asked for by upper
>     >        layers.
>     >     
>     >     or at least, not in the sense that the upper layers ask for a
>     >     connection.
>     >     Lustre knows nothing about connections.  Even LNet doesn't really
>     >     know about connections.  It is only at the socklnd level that
>     >     connections mean much.
>     >     
>     >     Lustre and LNet are message-passing protocols.
>     >     Lustre asks LNet to send a message to a given peer, and gives some
>     >     details of the sort of reply to expect.
>     >     LNet chooses a route and thus a network interface, and asks the
>     >     LND to send the message.
>     >     The socklnd LND will see if it already has a TCP connection.  If it
>     >     does, it will use it.  If not, it will create one.
>     >     
>     >     So yes : it is exactly:
>     >       possible that the server in this case opens the connection itself
>     >       without waiting for the client to reconnect?
>     >     
>     >     NeilBrown
>     >     
>     >     
>     >     On Tue, Feb 18 2020, Aurelien Degremont wrote:
>     >     
>     >     > Thanks for your reply.
>     >     > I think I have a good enough understanding of LNET itself. My
>     >     > question was more about how LNET is being used by Lustre itself.
>     >     >
>     >     > LNET will establish connections only if asked for by upper
>     >     > layers.
>     >     > When I was talking about client and server, I was talking about
>     >     > how Lustre was using it.
>     >     >
>     >     > As far as I understood, Lustre servers only contact clients when
>     >     > they need to send LDLM callbacks.
>     >     > They do so through the socket already opened by the client
>     >     > (reverse import).
>     >     > What happens if the socket is closed is what I'm not sure about.
>     >     > I thought the server would rather wait for the client to
>     >     > reconnect and, if it does not, more or less evict it.
>     >     > Could it be possible that the server in this case opens the
>     >     > connection itself without waiting for the client to reconnect?
>     >     >
>     >     >
>     >     > Aurélien
>     >     >
>     >     > On 18/02/2020 05:42, "NeilBrown" <[email protected]> wrote:
>     >     >
>     >     >     
>     >     >     LNet is a peer-to-peer protocol; it has no concept of client
>     >     >     and server.
>     >     >     If one host needs to send a message to another but doesn't
>     >     >     already have a connection, it creates a new connection.
>     >     >     I don't yet know enough specifics of the Lustre protocol to be
>     >     >     certain of the circumstances when a Lustre server will need to
>     >     >     initiate a message to a client, but I imagine that recalling a
>     >     >     lock might be one.
>     >     >     
>     >     >     I think you should assume that any LNet node might receive a
>     >     >     connection from any other LNet node (for which they share an
>     >     >     LNet network), and that the connection could come from any
>     >     >     port between 512 and 1023
>     >     >     (LNET_ACCEPTOR_MIN_PORT to LNET_ACCEPTOR_MAX_PORT).
>     >     >     
>     >     >     NeilBrown
>     >     >     
>     >     >     
>     >     >     
>     >     >     On Mon, Feb 17 2020, Degremont, Aurelien wrote:
>     >     >     
>     >     >     > Hi all,
>     >     >     >
>     >     >     > From what I've understood so far, LNET listens on port 988
>     >     >     > by default and peers connect to it using TCP ports 1021-1023
>     >     >     > as source ports.
>     >     >     > At Lustre level, servers listen on 988 and clients connect
>     >     >     > to them using the same source ports 1021-1023.
>     >     >     > So only accepting connections to port 988 on the server side
>     >     >     > sounded pretty safe to me. However, I've sometimes seen
>     >     >     > connections from 1021-1023 to 988, from server hosts to
>     >     >     > client hosts.
>     >     >     > I can't understand what mechanism could trigger these
>     >     >     > connections. Did I miss something?
>     >     >     >
>     >     >     > Thanks
>     >     >     >
>     >     >     > Aurélien
>     >     >     >
>     >     >     > _______________________________________________
>     >     >     > lustre-discuss mailing list
>     >     >     > [email protected]
>     >     >     > http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>     >     >     
>     >     
>     

