Thanks. I do not have mixed IB types within one fabric. With this info I did 
find a few nodes that showed that error, but they do not match the errors I 
see on the server.

By the way, I got the problem resolved on one file system after upgrading to 
Lustre 2.11.

From: Raj [mailto:rajgau...@gmail.com]
Sent: Thursday, June 07, 2018 10:36 AM
To: Hebenstreit, Michael <michael.hebenstr...@intel.com>
Cc: White, Cliff <cliff.wh...@intel.com>; lustre-discuss 
<lustre-discuss@lists.lustre.org>
Subject: Re: [lustre-discuss] server_bulk_callback errors until server reboots

I have seen this error when we had a mix of FDR (using mlx4) and EDR (using 
mlx5) devices in the Lustre network. Each server_bulk_callback error should 
have a corresponding client_bulk_callback error on the client.

http://wiki.lustre.org/Infiniband_Configuration_Howto
On Thu, Jun 7, 2018 at 11:24 AM Hebenstreit, Michael 
<michael.hebenstr...@intel.com> wrote:
No, clients do not show any issues.

-----Original Message-----
From: White, Cliff
Sent: Thursday, June 07, 2018 9:26 AM
To: Hebenstreit, Michael <michael.hebenstr...@intel.com>; lustre-discuss 
<lustre-discuss@lists.lustre.org>
Subject: Re: [lustre-discuss] server_bulk_callback errors until server reboots


On 6/7/18, 7:00 AM, "lustre-discuss on behalf of Hebenstreit, Michael" 
<lustre-discuss-boun...@lists.lustre.org on behalf of 
michael.hebenstr...@intel.com> wrote:

    Hello

    I now have 2 Lustre systems that suddenly show this error: on a single OST 
    the kernel log fills with messages

    [58858.365663] LustreError: 123642:0:(events.c:447:server_bulk_callback()) event type 3, status -61, desc ffff880524f7e000
    [58865.328317] LustreError: 123640:0:(events.c:447:server_bulk_callback()) event type 5, status -61, desc ffff880cab4ec800
    [58865.340792] LustreError: 123641:0:(events.c:447:server_bulk_callback()) event type 5, status -61, desc ffff880524f7c600
    [58865.353167] LustreError: 123640:0:(events.c:447:server_bulk_callback()) event type 3, status -61, desc ffff880cab4ec800
    [58865.365503] LustreError: 123641:0:(events.c:447:server_bulk_callback()) event type 3, status -61, desc ffff880524f7c600

    until the server reboots. Clients are on 2.11/RH7.5, servers are on 
    2.7.19.10/RH7.4. Has anyone experienced this before?
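
For anyone triaging the same flood of messages, here is a minimal sketch (not part of the original report) that tallies server_bulk_callback lines by event type and status, and by descriptor address. It assumes each log entry sits on one line in the format quoted above; the names LINE_RE, summarize and SAMPLE are made up for illustration. For reference, status -61 is errno 61, which is ENODATA on Linux.

```python
import re
from collections import Counter

# Matches the server_bulk_callback lines quoted in this thread.
# Assumption: each kernel log entry is on a single line.
LINE_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+LustreError:\s+(?P<pid>\d+):\d+:"
    r"\(events\.c:\d+:server_bulk_callback\(\)\)\s+"
    r"event type (?P<type>\d+), status (?P<status>-?\d+), "
    r"desc (?P<desc>[0-9a-f]+)"
)


def summarize(lines):
    """Count matching lines per (event type, status) and per descriptor."""
    by_kind = Counter()
    by_desc = Counter()
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # skip unrelated kernel messages
        by_kind[(int(m["type"]), int(m["status"]))] += 1
        by_desc[m["desc"]] += 1
    return by_kind, by_desc


# The five lines from the report, used here as sample input.
SAMPLE = [
    "[58858.365663] LustreError: 123642:0:(events.c:447:server_bulk_callback()) event type 3, status -61, desc ffff880524f7e000",
    "[58865.328317] LustreError: 123640:0:(events.c:447:server_bulk_callback()) event type 5, status -61, desc ffff880cab4ec800",
    "[58865.340792] LustreError: 123641:0:(events.c:447:server_bulk_callback()) event type 5, status -61, desc ffff880524f7c600",
    "[58865.353167] LustreError: 123640:0:(events.c:447:server_bulk_callback()) event type 3, status -61, desc ffff880cab4ec800",
    "[58865.365503] LustreError: 123641:0:(events.c:447:server_bulk_callback()) event type 3, status -61, desc ffff880524f7c600",
]

by_kind, by_desc = summarize(SAMPLE)
for (ev_type, status), count in sorted(by_kind.items()):
    print(f"event type {ev_type}, status {status}: {count} line(s)")
```

In practice the input would be the output of dmesg or the relevant syslog file rather than SAMPLE; the per-descriptor counts show whether a small set of bulk descriptors accounts for most of the messages.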

There should be some corresponding error messages on your clients. Have you 
checked there?
cliffw

    Thanks
    Michael

    ------------------------------------------------------------------------
    Michael Hebenstreit                 Senior Cluster Architect
    Intel Corporation, MS: RR1-105/H14  Core and Visual Compute Group (DCE)
    4100 Sara Road                      Tel.:   +1 505-794-3144
    Rio Rancho, NM 87124
    UNITED STATES                       E-mail: michael.hebenstr...@intel.com



    _______________________________________________
    lustre-discuss mailing list
    lustre-discuss@lists.lustre.org
    http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

