If you are evicting a client by NID, then use the "nid:" keyword:
lctl set_param mdt.*.evict_client=nid:10.68.178.25@tcp
Otherwise, evict_client expects the input to be a client UUID (this allows
evicting a single export from a client that mounts the filesystem multiple times).
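
As a rough sketch of the two input forms described above (the helper names
evict_value and evict_cmd are made up for illustration, and the "contains an
@" test for recognizing a NID is my assumption, not something lctl does), a
small wrapper could add the "nid:" prefix when needed and print the command for
review instead of running it:

```shell
#!/bin/sh
# Hypothetical helper: build the value passed to evict_client.
# Assumption: anything containing '@' is a NID (e.g. 10.68.178.25@tcp);
# anything else is treated as a client UUID and passed through unchanged.
evict_value() {
    case "$1" in
        *@*) printf 'nid:%s\n' "$1" ;;
        *)   printf '%s\n' "$1" ;;
    esac
}

# Hypothetical helper: print the lctl command rather than executing it,
# so it can be checked before being run on the MDS.
evict_cmd() {
    printf 'lctl set_param mdt.*.evict_client=%s\n' "$(evict_value "$1")"
}

evict_cmd 10.68.178.25@tcp
```

For the UUID form, the uuid file shown in the per-client exports directory
listing is presumably where the value to pass would come from.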
That said, the client *should* be evicted by the server automatically, so it
isn't clear why this isn't happening. Possibly this is something at the LNet
level (which unfortunately I don't know much about)?
Cheers, Andreas
> On Dec 6, 2023, at 13:23, Huang, Qiulan via lustre-discuss
> <[email protected]> wrote:
>
>
>
> Hello all,
>
>
> We removed some clients two weeks ago but we see the Lustre server is still
> trying to handle the lnet recovery reply to those clients (the error log is
> posted as below). And they are still listed in the exports dir.
>
>
> I tried to evict the clients with the command below, but it failed with the
> error "no exports found":
>
> lctl set_param mdt.*.evict_client=10.68.178.25@tcp
>
>
> Do you know how to clean up the removed, deprecated clients? Any
> suggestions would be greatly appreciated.
>
>
>
> For example:
>
> [root@mds2 ~]# ll /proc/fs/lustre/mdt/data-MDT0000/exports/10.67.178.25@tcp/
> total 0
> -r--r--r-- 1 root root 0 Dec 5 15:41 export
> -r--r--r-- 1 root root 0 Dec 5 15:41 fmd_count
> -r--r--r-- 1 root root 0 Dec 5 15:41 hash
> -rw-r--r-- 1 root root 0 Dec 5 15:41 ldlm_stats
> -r--r--r-- 1 root root 0 Dec 5 15:41 nodemap
> -r--r--r-- 1 root root 0 Dec 5 15:41 open_files
> -r--r--r-- 1 root root 0 Dec 5 15:41 reply_data
> -rw-r--r-- 1 root root 0 Aug 14 10:58 stats
> -r--r--r-- 1 root root 0 Dec 5 15:41 uuid
>
>
>
>
>
> /var/log/messages:Dec 6 12:50:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous
> similar message
> /var/log/messages:Dec 6 13:05:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec 6 13:05:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous
> similar message
> /var/log/messages:Dec 6 13:20:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec 6 13:20:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous
> similar message
> /var/log/messages:Dec 6 13:35:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec 6 13:35:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous
> similar message
> /var/log/messages:Dec 6 13:50:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec 6 13:50:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous
> similar message
> /var/log/messages:Dec 6 14:05:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec 6 14:05:17 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous
> similar message
> /var/log/messages:Dec 6 14:20:16 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.178.25@tcp) recovery failed with -110
> /var/log/messages:Dec 6 14:20:16 mds2 kernel: LNetError:
> 11579:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 1 previous
> similar message
> /var/log/messages:Dec 6 14:30:17 mds2 kernel: LNetError:
> 3806712:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.176.25@tcp) recovery failed with -111
> /var/log/messages:Dec 6 14:30:17 mds2 kernel: LNetError:
> 3806712:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 3 previous
> similar messages
> /var/log/messages:Dec 6 14:47:14 mds2 kernel: LNetError:
> 3812070:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.176.25@tcp) recovery failed with -111
> /var/log/messages:Dec 6 14:47:14 mds2 kernel: LNetError:
> 3812070:0:(lib-move.c:4005:lnet_handle_recovery_reply()) Skipped 8 previous
> similar messages
> /var/log/messages:Dec 6 15:02:14 mds2 kernel: LNetError:
> 3817248:0:(lib-move.c:4005:lnet_handle_recovery_reply()) peer NI
> (10.67.176.25@tcp) recovery failed with -111
>
>
> Regards,
> Qiulan
> _______________________________________________
> lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
--
Andreas Dilger
Lustre Principal Architect
Whamcloud