The message is part of normal o2iblnd connection setup. It just means the two peers are negotiating the max number of fragments that will be supported.
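
As a rough illustration of what that negotiation looks like, here is a toy Python model. It is only a sketch, not the o2iblnd kernel code: the `Peer` class and the `passive_accept` and `negotiate` functions are hypothetical names invented for this example, and the assumption is simply that the accepting side rejects a request above its own limit and reports the value it wants, after which the connecting side retries with that lower value.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Hypothetical stand-in for one LNet peer (not a real Lustre type)."""
    name: str
    max_frags: int  # largest fragment count this peer is willing to use


def passive_accept(passive: Peer, requested: int) -> tuple[bool, int]:
    """Accepting side: refuse requests above our own limit.

    Returns (accepted, value_the_passive_side_wants).
    """
    if requested > passive.max_frags:
        return False, passive.max_frags
    return True, requested


def negotiate(active: Peer, passive: Peer, max_attempts: int = 2) -> int:
    """Connecting side: start with our own limit, lower it if rejected."""
    requested = active.max_frags
    for _ in range(max_attempts):
        accepted, wanted = passive_accept(passive, requested)
        if accepted:
            return requested
        # Modeled on the log line from this thread; printed once per rejection.
        print(f"Can't accept conn from {active.name}: "
              f"max_frags {requested} too large ({wanted} wanted)")
        requested = wanted
    raise ConnectionError("peers could not agree on max_frags")


if __name__ == "__main__":
    newer = Peer("newer-peer", 257)  # assumed to carry the changed default
    older = Peer("older-peer", 256)  # assumed to carry the previous default
    print("agreed max_frags:", negotiate(newer, older))
```

In this model the message is printed once on the first attempt, and the retry then succeeds with the lower value, which is why the log line by itself is not an error.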
It is seen because https://jira.whamcloud.com/browse/LU-15092 changed the default max number of fragments from 256 to 257. If one peer has that patch, but the other doesn’t, then negotiation must occur.

There is a separate but related issue involving map_on_demand. In some Lustre versions, map_on_demand must be set to 1 in order for the aforementioned negotiation to succeed.

Chris Horn

From: lustre-discuss <[email protected]> on behalf of Einar Næss Jensen <[email protected]>
Date: Thursday, September 8, 2022 at 6:12 AM
To: Moreno Lazaro Diego (ID) <[email protected]>, [email protected] <[email protected]>
Subject: Re: [lustre-discuss] max_frags 257 too large

Thanks for the clarification, Diego

We are in for an upgrade next month so we might just wait and see :)

Best Regards
Einar Næss Jensen

________________________________________
From: Moreno Lazaro Diego (ID) <[email protected]>
Sent: Thursday, September 8, 2022 13:07
To: Einar Næss Jensen; [email protected]
Subject: Re: [lustre-discuss] max_frags 257 too large

Hi Einar,

I've seen this in older versions of Lustre-2.12, where map_on_demand=1 on the ko2iblnd module is needed in order to keep compatibility with newer versions. A couple of patches have since removed this requirement for map_on_demand, and it is now also set by default:

https://jira.whamcloud.com/browse/LU-15094 (don't require map_on_demand to negotiate max_frags)
https://jira.whamcloud.com/browse/LU-15186 (set by default map_on_demand)

I know at least DDN includes these patches on their 2.12.6-ddn8 version. I don't see the patches on the latest 2.12.9 on the community version. Maybe you just need to enable map_on_demand and that could solve it.

Regards,
Diego

On 08.09.22, 11:22, "lustre-discuss on behalf of Einar Næss Jensen" <[email protected] on behalf of [email protected]> wrote:

One of our MDS servers has stopped working correctly (we have failed over the MDT to another MDS server), and while we wait for the vendor's response, I see something in our logs which I'm curious about:

LNet: 117894:0:(o2iblnd_cb.c:2631:kiblnd_passive_connect()) Can't accept conn from 10.145.30.168@o2ib (version 12): max_frags 257 too large (256 wanted)

What does it mean? Is it something to be concerned about?

Best Regards
Einar Næss Jensen

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
