Dear all,

We're having trouble with a Lustre 2.5.3 installation. This is our setup:


   - One server for MGS/MDS/MDT. The MDT is served from a 2 TB RAID-6 backed
   partition (what type of HD?).

   - Two OSS/OST servers in an active/active HA pair with Pacemaker. Both are
   connected to the storage via SAS.

   - One SGI Infinite Storage IS5600 with two RAID-6 backed volume groups.
   Each group has two volumes, and each volume has 15 TB capacity.


The OSSs see the volumes as multipath devices, and each volume has 4
paths. Each volume was created with a GPT partition table and a single
partition.

Volume partitions were then formatted as OSTs with the following command:

# mkfs.lustre --replace --reformat --ost \
    --mkfsoptions="-i 1048576 -E stride=128,stripe_width=1024" \
    --mountfsoptions="errors=remount-ro,extents,mballoc" --fsname=lustre1 \
    --mgsnode=10.149.0.153@o2ib1 --index=0 --servicenode=10.149.0.151@o2ib1 \
    --servicenode=10.149.0.152@o2ib1 \
    /dev/mapper/360080e500029eaec0000012656951fcap1
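
For anyone checking the geometry, the -E values in the mkfs.lustre command can be related to the RAID layout with a little arithmetic. The sketch below is only illustrative: it assumes a 4 KiB filesystem block size, which is not stated in this message, and the function name is made up for the example.

```python
# Illustrative arithmetic only. BLOCK_SIZE is an assumption (4 KiB blocks),
# not something stated in the original message.
BLOCK_SIZE = 4096  # bytes per filesystem block (assumed)


def raid_geometry(stride_blocks, stripe_width_blocks, block_size=BLOCK_SIZE):
    """Derive the RAID chunk size and data-disk count implied by the
    'stride' and 'stripe_width' extended options passed to mkfs."""
    chunk_bytes = stride_blocks * block_size             # chunk written per disk
    data_disks = stripe_width_blocks // stride_blocks    # disks holding data
    full_stripe_bytes = stripe_width_blocks * block_size
    return chunk_bytes, data_disks, full_stripe_bytes


chunk, disks, stripe = raid_geometry(stride_blocks=128, stripe_width_blocks=1024)
print(f"chunk per disk : {chunk // 1024} KiB")
print(f"data disks     : {disks}")
print(f"full stripe    : {stripe // (1024 * 1024)} MiB")
```

Under that block-size assumption, the options describe a 512 KiB chunk on 8 data disks (for example an 8+2 RAID-6 group), which gives a 4 MiB full stripe.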


We tested with bonnie++ on a client, using the command below:

$ ./bonnie++-1.03e/bonnie++ -m lustre1 -d /mnt/lustre -s 128G:1024k -n 0 -f -b -u vhpc


Creating files inside the Lustre mount point works without problems, but
*rewriting* the same files results in the errors below:


Mar 18 17:46:13 oss01 multipathd: 8:128: mark as failed
Mar 18 17:46:13 oss01 multipathd: 360080e500029eaec0000012656951fca: remaining active paths: 3
Mar 18 17:46:13 oss01 kernel: sd 1:0:0:0: [sdi] Unhandled error code
Mar 18 17:46:13 oss01 kernel: sd 1:0:0:0: [sdi] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Mar 18 17:46:13 oss01 kernel: sd 1:0:0:0: [sdi] CDB: Read(10): 28 00 00 06 d8 22 00 20 00 00
Mar 18 17:46:13 oss01 kernel: __ratelimit: 109 callbacks suppressed
Mar 18 17:46:13 oss01 kernel: device-mapper: multipath: Failing path 8:128.
Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] Unhandled error code
Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] CDB: Read(10): 28 00 00 07 18 22 00 18 00 00
Mar 18 17:46:13 oss01 kernel: device-mapper: multipath: Failing path 8:192.
Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] Unhandled error code
Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Mar 18 17:46:13 oss01 kernel: sd 1:0:1:0: [sdm] CDB: Read(10): 28 00 00 06 d8 22 00 20 00 00
Mar 18 17:46:13 oss01 kernel: sd 0:0:1:0: [sde] Unhandled error code
Mar 18 17:46:13 oss01 kernel: sd 0:0:1:0: [sde] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Mar 18 17:46:13 oss01 kernel: sd 0:0:1:0: [sde] CDB: Read(10): 28 00 00 07 18 22 00 18 00 00
Mar 18 17:46:13 oss01 kernel: device-mapper: multipath: Failing path 8:64.
Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] Unhandled error code
Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 07 18 22 00 18 00 00
Mar 18 17:46:13 oss01 kernel: device-mapper: multipath: Failing path 8:0.
Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] Unhandled error code
Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Mar 18 17:46:13 oss01 kernel: sd 0:0:0:0: [sda] CDB: Read(10): 28 00 00 06 d8 22 00 20 00 00
Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec0000012656951fca: sdi - rdac checker reports path is up
Mar 18 17:46:14 oss01 multipathd: 8:128: reinstated
Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec0000012656951fca: remaining active paths: 4
Mar 18 17:46:14 oss01 kernel: sd 1:0:0:0: [sdi] Unhandled error code
Mar 18 17:46:14 oss01 kernel: sd 1:0:0:0: [sdi] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Mar 18 17:46:14 oss01 kernel: sd 1:0:0:0: [sdi] CDB: Read(10): 28 00 00 07 18 22 00 18 00 00
Mar 18 17:46:14 oss01 kernel: device-mapper: multipath: Failing path 8:128.
Mar 18 17:46:14 oss01 kernel: sd 1:0:0:0: [sdi] Unhandled error code
Mar 18 17:46:14 oss01 kernel: sd 1:0:0:0: [sdi] Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK
Mar 18 17:46:14 oss01 kernel: sd 1:0:0:0: [sdi] CDB: Read(10): 28 00 00 06 d8 22 00 20 00 00
Mar 18 17:46:14 oss01 multipathd: 8:128: mark as failed
Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec0000012656951fca: remaining active paths: 3
Mar 18 17:46:14 oss01 multipathd: 8:192: mark as failed
Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec0000012656951fca: remaining active paths: 2
Mar 18 17:46:14 oss01 multipathd: 8:0: mark as failed
Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec0000012656951fca: remaining active paths: 1
Mar 18 17:46:14 oss01 multipathd: 8:64: mark as failed
Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec0000012656951fca: Entering recovery mode: max_retries=30
Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec0000012656951fca: remaining active paths: 0
Mar 18 17:46:14 oss01 multipathd: 360080e500029eaec0000012656951fca: Entering recovery mode: max_retries=30
Mar 18 17:46:19 oss01 multipathd: 360080e500029eaec0000012656951fca: sdi - rdac checker reports path is up
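
To make the kernel messages easier to read, the Read(10) CDB bytes can be decoded into a starting LBA and a transfer length, and the major:minor pairs can be mapped back to sd device names. This is only a minimal sketch: it assumes 512-byte logical sectors (not stated in this message) and the usual Linux numbering of 16 minors per disk under major 8. The helper names are hypothetical.

```python
SECTOR_SIZE = 512  # bytes per logical block (assumption, not from the message)


def decode_read10(cdb_hex):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2-5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7-8: transfer length
    return lba, blocks


def sd_name(major, minor):
    """Map a major:minor pair to an sd device name (sda..sdz only)."""
    if major != 8 or not 0 <= minor < 26 * 16:
        raise ValueError("only major 8, disks sda..sdz are handled here")
    return "sd" + chr(ord("a") + minor // 16)


for cdb in ("28 00 00 06 d8 22 00 20 00 00", "28 00 00 07 18 22 00 18 00 00"):
    lba, blocks = decode_read10(cdb)
    print(f"LBA {lba}: {blocks} blocks, {blocks * SECTOR_SIZE // 1024} KiB")

for minor in (0, 64, 128, 192):
    print(f"8:{minor} -> {sd_name(8, minor)}")
```

With those assumptions, the two failing requests are reads of 4096 KiB and 3072 KiB, and the failed paths 8:0, 8:64, 8:128 and 8:192 correspond to sda, sde, sdi and sdm, which matches the device names in the kernel lines.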


The multipath configuration (/etc/multipath.conf) is below. According to
the vendor (SGI), it is correct.


defaults {

       user_friendly_names no

}


blacklist {

       wwid "*"

}


blacklist_exceptions {

       wwid "360080e500029eaec0000012656951fca"

       wwid "360080e500029eaec0000012956951fcb"

       wwid "360080e500029eaec0000012c56951fcb"

       wwid "360080e500029eaec0000012f56951fcb"

}


devices {

      device {

        vendor                       "SGI"

        product                      "IS.*"

        product_blacklist            "Universal Xport"

        getuid_callout               "/lib/udev/scsi_id --whitelisted --device=/dev/%n"

        prio                         "rdac"

        features                     "2 pg_init_retries 50"

        hardware_handler             "1 rdac"

        path_grouping_policy         "group_by_prio"

        failback                     "immediate"

        rr_weight                    "uniform"

        no_path_retry                30

        retain_attached_hw_handler   "yes"

        detect_prio                  "yes"

        #rr_min_io                   1000

        path_checker                 "rdac"

        #selector                    "round-robin 0"

        #polling_interval            10

      }

}



multipaths {

       multipath {

               wwid "360080e500029eaec0000012656951fca"

       }

       multipath {

               wwid "360080e500029eaec0000012956951fcb"

       }

       multipath {

               wwid "360080e500029eaec0000012c56951fcb"

       }

       multipath {

               wwid "360080e500029eaec0000012f56951fcb"

       }

}
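
As a rough illustration of what no_path_retry 30 means in this configuration: once all paths are down, multipathd keeps I/O queued for that many path-checker intervals before failing it, which is consistent with the max_retries=30 shown in the log. The sketch below assumes the default polling_interval of 5 seconds. That setting is commented out in the configuration, so the effective value on these servers is not confirmed, and the function name is hypothetical.

```python
def queue_window_seconds(no_path_retry, polling_interval=5):
    """Approximate time I/O stays queued once every path has failed.

    polling_interval=5 is an assumed default; the real value on a given
    host may differ.
    """
    return no_path_retry * polling_interval


print(queue_window_seconds(30))                       # assumed default interval
print(queue_window_seconds(30, polling_interval=10))  # the commented-out value
```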


We tried many combinations of OST formatting options, including internal
and external journaling, but the same errors persist.


We repeated the same bonnie++ tests on all volumes of the storage using
only ext4, and all of them completed successfully.


Finally, I ran the Lustre debug daemon with the command below:

# lctl debug_daemon start /tmp/lustre.bin

The message file is attached.
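
In case it helps when reading the attached messages, each entry appears to be a colon-separated header (subsystem mask, debug mask, CPU, seconds.microseconds, stack size, PID, external PID) followed by (file:line:function()) and the message text. The Python sketch below splits one such line and converts the timestamp to UTC. The field names and the regular expression are my own reading of the lines, not an official parser, and continuation lines such as the "req@..." ones are simply skipped.

```python
import re
from datetime import datetime, timezone

# Field layout inferred from the attached lines; names are illustrative.
LINE_RE = re.compile(
    r"^(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):(?P<cpu>[^:]+):"
    r"(?P<sec>\d+)\.(?P<usec>\d{6}):(?P<stack>\d+):(?P<pid>\d+):(?P<extpid>\d+):"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)$"
)


def parse_debug_line(line):
    """Return a dict for one debug log entry, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # continuation line (e.g. "  req@...") or unknown format
    rec = m.groupdict()
    # Build the timestamp from integer parts to avoid float rounding.
    rec["time"] = datetime.fromtimestamp(int(rec["sec"]), tz=timezone.utc).replace(
        microsecond=int(rec["usec"])
    )
    rec["pid"] = int(rec["pid"])
    rec["line"] = int(rec["line"])
    return rec


sample = (
    "00000004:00020000:3.0F:1457985315.603888:0:8345:0:"
    "(osd_handler.c:863:osd_trans_commit_cb()) "
    "transaction @0xffff88105da4c080 commit error: 2"
)
rec = parse_debug_line(sample)
print(rec["time"].isoformat(), rec["file"], rec["func"], rec["msg"])
```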


Regards,

Angelo
00000100:00000400:10.0F:1457984210.254246:0:8603:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff8803389e7400 x1528513666170728/t0(0) o3->166949c7-be76-a4b7-ac5d-628e75fa8217@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1457984215 ref 2 fl Interpret:/0/0 rc 0/0
00000100:00000400:10.0:1457984210.254257:0:8603:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff880d53eb3000 x1528513666170724/t0(0) o3->166949c7-be76-a4b7-ac5d-628e75fa8217@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1457984215 ref 2 fl Interpret:/0/0 rc 0/0
00000100:00000400:10.0:1457984210.254260:0:8603:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff88028bf3b000 x1528513666170720/t0(0) o3->166949c7-be76-a4b7-ac5d-628e75fa8217@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1457984215 ref 2 fl Interpret:/0/0 rc 0/0
00010000:02000400:10.0:1457984371.056972:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984371.056977:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:4.0F:1457984396.058765:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1457984396.058770:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:4.0:1457984421.060573:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1457984421.060577:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984446.062495:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984446.062499:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984471.064628:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984471.064630:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984496.066253:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984496.066258:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984521.068160:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984521.068162:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984546.070092:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984546.070093:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:4.0:1457984571.071977:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1457984571.071982:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984596.073803:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984596.073805:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984621.075722:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984621.075724:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984646.077674:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984646.077676:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984671.079490:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984671.079492:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984696.081500:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984696.081502:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984721.083295:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984721.083300:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984746.085185:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984746.085187:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984771.087116:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984771.087118:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00010000:02000400:10.0:1457984796.088989:0:8609:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1457984796.088990:0:8609:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client 166949c7-be76-a4b7-ac5d-628e75fa8217 (at 10.149.0.153@o2ib1) refused reconnection, still busy with 3 active RPCs
00000100:00000400:1.0F:1457985301.739238:0:21566:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457985294/real 1457985294]  req@ffff880d53eb5800 x1528772254766216/t0(0) o400->MGC10.149.0.153@o2ib1@10.149.0.153@o2ib1:26/25 lens 224/224 e 0 to 1 dl 1457985301 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
00000100:02020000:1.0:1457985301.739254:0:21566:0:(import.c:174:ptlrpc_set_import_discon()) 166-1: MGC10.149.0.153@o2ib1: Connection to MGS (at 10.149.0.153@o2ib1) was lost; in progress operations using this service will fail
00000100:00000400:0.0F:1457985307.739230:0:21556:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457985301/real 1457985301]  req@ffff880d53eb5000 x1528772254766220/t0(0) o250->MGC10.149.0.153@o2ib1@10.149.0.153@o2ib1:26/25 lens 400/544 e 0 to 1 dl 1457985307 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
00000100:00000400:10.0:1457985315.555376:0:8600:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:1100s); client may timeout.  req@ffff88028bf3b000 x1528513666170720/t0(0) o3->166949c7-be76-a4b7-ac5d-628e75fa8217@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1457984215 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:10.0:1457985315.555397:0:8600:0:(watchdog.c:396:lcw_update_time()) Service thread pid 8600 completed after 1700.35s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000100:00000400:4.0:1457985315.555401:0:8402:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:1100s); client may timeout.  req@ffff880d53eb3000 x1528513666170724/t0(0) o3->166949c7-be76-a4b7-ac5d-628e75fa8217@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1457984215 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:4.0:1457985315.555407:0:8402:0:(watchdog.c:396:lcw_update_time()) Service thread pid 8402 completed after 1700.35s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000100:00000400:4.0:1457985315.555457:0:8609:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (6:30s); client may timeout.  req@ffff8801e1cf1850 x1528513666173764/t0(0) o9->lustre1-MDT0000-mdtlov_UUID@10.149.0.153@o2ib1:0/0 lens 224/192 e 0 to 0 dl 1457985285 ref 1 fl Complete:/0/0 rc 0/0
00000100:00000400:10.0:1457985315.555462:0:8601:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:1100s); client may timeout.  req@ffff8803389e7400 x1528513666170728/t0(0) o3->166949c7-be76-a4b7-ac5d-628e75fa8217@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1457984215 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:10.0:1457985315.555468:0:8601:0:(watchdog.c:396:lcw_update_time()) Service thread pid 8601 completed after 1700.34s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000004:00020000:3.0F:1457985315.603888:0:8345:0:(osd_handler.c:863:osd_trans_commit_cb()) transaction @0xffff88105da4c080 commit error: 2
00000004:00020000:3.0:1457985315.604335:0:8345:0:(osd_handler.c:863:osd_trans_commit_cb()) transaction @0xffff881021607380 commit error: 2
00000004:00020000:3.0:1457985315.604336:0:8345:0:(osd_handler.c:863:osd_trans_commit_cb()) transaction @0xffff88105a2f87c0 commit error: 2
00000004:00020000:3.0:1457985315.604339:0:8345:0:(osd_handler.c:863:osd_trans_commit_cb()) transaction @0xffff880c1570a580 commit error: 2
00000020:02000400:5.0F:1457985321.764371:0:10728:0:(obd_mount_server.c:1492:server_put_super()) server umount lustre1-OST0003 complete
00002000:02000400:8.0F:1457986577.417748:0:18648:0:(ofd_fs.c:540:ofd_server_data_init()) lustre1-OST0003: new disk, initializing
40000000:02000000:8.0:1457986577.417786:0:18648:0:(fid_handler.c:519:seq_server_init()) srv-lustre1-OST0003: No data found on store. Initialize space
00010000:02020000:5.0:1457987127.659942:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987177.663752:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987227.667554:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987277.671342:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987352.677025:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987427.682749:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987502.688486:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987577.694097:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987652.699828:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987752.707409:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987852.714957:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457987952.722590:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457988052.730149:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457988152.737713:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457988252.745255:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457988352.752877:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457988402.756613:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00010000:02020000:5.0:1457988452.760419:0:18658:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0001_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00000020:02000400:8.0:1457988506.360653:0:19648:0:(obd_config.c:650:class_cleanup()) Failing over lustre1-OST0003
00000020:02000400:5.0:1457988506.625126:0:19648:0:(obd_mount_server.c:1492:server_put_super()) server umount lustre1-OST0003 complete
00000400:00000100:10.0:1457988510.619671:0:21551:0:(lib-move.c:1493:lnet_parse_put()) Dropping PUT from 12345-10.149.0.153@o2ib1 portal 7 match 1528513666176932 offset 0 length 224: 4
00000400:00000100:10.0:1457988517.620337:0:21552:0:(lib-move.c:1493:lnet_parse_put()) Dropping PUT from 12345-10.149.0.153@o2ib1 portal 28 match 1528513666176940 offset 0 length 400: 4
00002000:02000400:8.0:1457988553.360604:0:19773:0:(ofd_fs.c:540:ofd_server_data_init()) lustre1-OST0003: new disk, initializing
40000000:02000000:8.0:1457988553.360645:0:19773:0:(fid_handler.c:519:seq_server_init()) srv-lustre1-OST0003: No data found on store. Initialize space
00000020:02000000:0.0:1457988553.365699:0:19690:0:(obd_mount_server.c:1799:server_calc_timeout()) lustre1-OST0003: Imperative Recovery enabled, recovery window shrunk from 300-900 down to 150-450
00000100:00000400:4.0:1457988710.363224:0:21567:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457988703/real 1457988703]  req@ffff8806aa440c00 x1528772254767148/t0(0) o400->MGC10.149.0.153@o2ib1@10.149.0.153@o2ib1:26/25 lens 224/224 e 0 to 1 dl 1457988710 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
00000100:00000400:0.0:1457988710.363224:0:21568:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457988703/real 1457988703]  req@ffff8806aa440800 x1528772254767152/t0(0) o400->lustre1-MDT0000-lwp-OST0003@10.149.0.153@o2ib1:12/10 lens 224/224 e 0 to 1 dl 1457988710 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
00000100:02000400:0.0:1457988710.363234:0:21568:0:(import.c:167:ptlrpc_set_import_discon()) lustre1-MDT0000-lwp-OST0003: Connection to lustre1-MDT0000 (at 10.149.0.153@o2ib1) was lost; in progress operations using this service will wait for recovery to complete
00000100:02020000:4.0:1457988710.363237:0:21567:0:(import.c:174:ptlrpc_set_import_discon()) 166-1: MGC10.149.0.153@o2ib1: Connection to MGS (at 10.149.0.153@o2ib1) was lost; in progress operations using this service will fail
00000100:00000400:1.0:1457988716.363228:0:21556:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457988710/real 1457988710]  req@ffff8806aa440000 x1528772254767156/t0(0) o38->lustre1-MDT0000-lwp-OST0003@10.149.0.153@o2ib1:12/10 lens 400/544 e 0 to 1 dl 1457988716 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
00000100:00000400:1.0:1457988716.363238:0:21556:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457988710/real 1457988710]  req@ffff8806aa440400 x1528772254767160/t0(0) o250->MGC10.149.0.153@o2ib1@10.149.0.153@o2ib1:26/25 lens 400/544 e 0 to 1 dl 1457988716 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
00000100:02020000:1.0:1457988735.363701:0:21556:0:(import.c:1346:ptlrpc_import_recovery_state_machine()) 167-0: lustre1-MDT0000-lwp-OST0003: This client was evicted by lustre1-MDT0000; in progress operations using this service will fail.
00000100:02000400:1.0:1457988735.364190:0:21556:0:(import.c:937:ptlrpc_connect_interpret()) Evicted from MGS (at 10.149.0.153@o2ib1) after server handle changed from 0x1d18f5b9b38910aa to 0x1d18f5b9b38913f9
00000100:02000000:9.0F:1457988735.364250:0:19881:0:(import.c:1428:ptlrpc_import_recovery_state_machine()) MGC10.149.0.153@o2ib1: Connection restored to MGS (at 10.149.0.153@o2ib1)
00000100:02000000:8.0:1457988735.364600:0:19880:0:(import.c:1428:ptlrpc_import_recovery_state_machine()) lustre1-MDT0000-lwp-OST0003: Connection restored to lustre1-MDT0000 (at 10.149.0.153@o2ib1)
00000100:00000400:0.0:1457988797.363226:0:21558:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457988785/real 1457988785]  req@ffff88098ad61c00 x1528772254767228/t0(0) o400->MGC10.149.0.153@o2ib1@10.149.0.153@o2ib1:26/25 lens 224/224 e 0 to 1 dl 1457988797 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
00000100:02020000:0.0:1457988797.363240:0:21558:0:(import.c:174:ptlrpc_set_import_discon()) 166-1: MGC10.149.0.153@o2ib1: Connection to MGS (at 10.149.0.153@o2ib1) was lost; in progress operations using this service will fail
00000100:00000400:9.0:1457988797.543241:0:19915:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457988786/real 1457988786]  req@ffff881040e7ac00 x1528772254767236/t0(0) o39->lustre1-MDT0000-lwp-OST0003@10.149.0.153@o2ib1:12/10 lens 224/224 e 0 to 1 dl 1457988797 ref 2 fl Rpc:XN/0/ffffffff rc 0/-1
00000020:00020000:9.0:1457988797.543659:0:19915:0:(obd_mount_server.c:1420:server_put_super()) lustre1-OST0003: failed to disconnect lwp. (rc=-110)
00000020:02000400:9.0:1457988797.544110:0:19915:0:(obd_config.c:650:class_cleanup()) Failing over lustre1-OST0003
00000100:00000400:1.0:1457988808.363235:0:21556:0:(client.c:1918:ptlrpc_expire_one_request()) @@@ Request sent has timed out for slow reply: [sent 1457988797/real 1457988797]  req@ffff881040e7a000 x1528772254767240/t0(0) o250->MGC10.149.0.153@o2ib1@10.149.0.153@o2ib1:26/25 lens 400/544 e 0 to 1 dl 1457988808 ref 1 fl Rpc:XN/0/ffffffff rc 0/-1
00000020:02000400:4.0:1457988808.504959:0:19915:0:(obd_mount_server.c:1492:server_put_super()) server umount lustre1-OST0003 complete
00002000:02000400:8.0:1457990002.537263:0:20665:0:(ofd_fs.c:540:ofd_server_data_init()) lustre1-OST0003: new disk, initializing
40000000:02000000:8.0:1457990002.537308:0:20665:0:(fid_handler.c:519:seq_server_init()) srv-lustre1-OST0003: No data found on store. Initialize space
00000100:02000400:8.0:1457991449.156885:0:20664:0:(pinger.c:659:ping_evictor_main()) lustre1-OST0003: haven't heard from client f43cde69-9184-1005-5153-ebfecdf43e29 (at 10.149.0.153@o2ib1) in 234 seconds. I think it's dead, and I am evicting it. exp ffff881052361c00, cur 1457991449 expire 1457991299 last 1457991215
00000020:02000400:1.0:1458043635.931763:0:14780:0:(obd_config.c:650:class_cleanup()) Failing over lustre1-OST0003
00010000:02020000:10.0:1458043638.110674:0:20723:0:(ldlm_lib.c:804:target_handle_connect()) 137-5: lustre1-OST0003_UUID: not available for connect from 10.149.0.153@o2ib1 (no target). If you are running an HA pair check that the target is mounted on the other server.
00000020:02000400:4.0:1458043639.811167:0:14780:0:(obd_mount_server.c:1492:server_put_super()) server umount lustre1-OST0003 complete
00002000:02000400:8.0:1458045010.929021:0:15615:0:(ofd_fs.c:540:ofd_server_data_init()) lustre1-OST0003: new disk, initializing
40000000:02000000:8.0:1458045010.929063:0:15615:0:(fid_handler.c:519:seq_server_init()) srv-lustre1-OST0003: No data found on store. Initialize space
00000100:02000400:8.0:1458045924.006830:0:15614:0:(pinger.c:659:ping_evictor_main()) lustre1-OST0003: haven't heard from client 8b2420c3-d482-a4c1-cf54-46df66387105 (at 10.149.0.153@o2ib1) in 228 seconds. I think it's dead, and I am evicting it. exp ffff881057d71800, cur 1458045924 expire 1458045774 last 1458045696
00000100:00000400:10.0:1458048398.979221:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff8807632e0800 x1528513670062928/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1458048403 ref 2 fl Interpret:/0/0 rc 0/0
00000100:00000400:10.0:1458048398.979232:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff880f1601a000 x1528513670062924/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1458048403 ref 2 fl Interpret:/0/0 rc 0/0
00000100:00000400:10.0:1458048398.979235:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff8805c1723000 x1528513670062920/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1458048403 ref 2 fl Interpret:/0/0 rc 0/0
00000100:00000400:10.0:1458048398.979237:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff880b9b86fc00 x1528513670062896/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1458048403 ref 2 fl Interpret:/0/0 rc 0/0
00000100:00000400:10.0:1458048398.979239:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff880b0cc0d000 x1528513670062892/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1458048403 ref 2 fl Interpret:/0/0 rc 0/0
00000100:00000400:10.0:1458048398.979242:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff880b0cc0d400 x1528513670062888/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1458048403 ref 2 fl Interpret:/0/0 rc 0/0
00000100:00000400:10.0:1458048398.979244:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/5), not sending early reply
  req@ffff880b0cc0d800 x1528513670062884/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 24 to 0 dl 1458048403 ref 2 fl Interpret:/0/0 rc 0/0
00010000:02000400:10.0:1458048559.890425:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048559.890430:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048584.892357:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048584.892359:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048609.894183:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048609.894188:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048634.896089:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048634.896091:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048659.898197:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048659.898198:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048684.899906:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048684.899911:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048709.901785:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048709.901787:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048734.903663:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048734.903665:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048759.905517:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048759.905519:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048784.907384:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048784.907386:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048809.909249:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048809.909250:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048834.911177:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048834.911182:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048859.913075:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048859.913077:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048884.915002:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048884.915004:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048909.916872:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048909.916874:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048934.918772:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048934.918774:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048959.920665:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048959.920667:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458048984.922567:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458048984.922569:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049009.924406:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049009.924408:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049034.926352:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049034.926354:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049059.928261:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049059.928263:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049084.930136:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049084.930138:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049109.931990:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049109.931995:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049134.933894:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049134.933895:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049159.935796:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049159.935797:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049184.937690:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049184.937691:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049209.939574:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049209.939576:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049234.941460:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049234.941462:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049259.943382:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049259.943384:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049284.945256:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049284.945257:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458049309.947134:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458049309.947136:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00000100:00000400:5.0:1458049310.636722:0:16005:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:907s); client may timeout.  req@ffff880b0cc0d800 x1528513670062884/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1458048403 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:5.0:1458049310.636738:0:16005:0:(watchdog.c:396:lcw_update_time()) Service thread pid 16005 completed after 1507.23s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000100:00000400:10.0:1458049310.636744:0:15992:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:907s); client may timeout.  req@ffff880b0cc0d000 x1528513670062892/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1458048403 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:10.0:1458049310.636754:0:15992:0:(watchdog.c:396:lcw_update_time()) Service thread pid 15992 completed after 1507.23s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000100:00000400:5.0:1458049310.636802:0:15989:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:907s); client may timeout.  req@ffff8807632e0800 x1528513670062928/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1458048403 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:5.0:1458049310.636807:0:15989:0:(watchdog.c:396:lcw_update_time()) Service thread pid 15989 completed after 1507.19s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000100:00000400:10.0:1458049310.636841:0:15993:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:907s); client may timeout.  req@ffff880f1601a000 x1528513670062924/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1458048403 ref 1 fl Complete:/0/0 rc -5/-5
00000100:00000400:4.0:1458049310.636841:0:15995:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:907s); client may timeout.  req@ffff880b9b86fc00 x1528513670062896/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1458048403 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:10.0:1458049310.636848:0:15993:0:(watchdog.c:396:lcw_update_time()) Service thread pid 15993 completed after 1507.19s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000400:02000400:4.0:1458049310.636848:0:15995:0:(watchdog.c:396:lcw_update_time()) Service thread pid 15995 completed after 1507.23s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000100:00000400:4.0:1458049310.636856:0:15991:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:907s); client may timeout.  req@ffff880b0cc0d400 x1528513670062888/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1458048403 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:4.0:1458049310.636860:0:15991:0:(watchdog.c:396:lcw_update_time()) Service thread pid 15991 completed after 1507.23s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00000100:00000400:5.0:1458049310.636861:0:15996:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (600:907s); client may timeout.  req@ffff8805c1723000 x1528513670062920/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 24 to 0 dl 1458048403 ref 1 fl Complete:/0/0 rc -5/-5
00000400:02000400:5.0:1458049310.636865:0:15996:0:(watchdog.c:396:lcw_update_time()) Service thread pid 15996 completed after 1507.20s. This indicates the system was overloaded (too many service threads, or there were not enough hardware resources).
00010000:02000400:10.0:1458049334.949029:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00000100:00000400:5.0:1458050084.949223:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
  req@ffff880965742c00 x1528513670068192/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 0 to 0 dl 1458050089 ref 2 fl Interpret:/2/0 rc 0/0
00000100:00000400:5.0:1458050084.949234:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
  req@ffff8801ebbc1c00 x1528513670068188/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 0 to 0 dl 1458050089 ref 2 fl Interpret:/2/0 rc 0/0
00000100:00000400:5.0:1458050084.949237:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
  req@ffff88079f08e400 x1528513670068184/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 0 to 0 dl 1458050089 ref 2 fl Interpret:/2/0 rc 0/0
00000100:00000400:5.0:1458050084.949239:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
  req@ffff88079f08ec00 x1528513670068180/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 0 to 0 dl 1458050089 ref 2 fl Interpret:/2/0 rc 0/0
00000100:00000400:5.0:1458050084.949242:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
  req@ffff8804bf4bac00 x1528513670068172/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 0 to 0 dl 1458050089 ref 2 fl Interpret:/2/0 rc 0/0
00000100:00000400:5.0:1458050084.949250:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
  req@ffff88079f08e000 x1528513670068176/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 0 to 0 dl 1458050089 ref 2 fl Interpret:/2/0 rc 0/0
00000100:00000400:5.0:1458050084.949253:0:15601:0:(service.c:1347:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (5/-150), not sending early reply
  req@ffff8804bf4ba400 x1528513670068168/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/432 e 0 to 0 dl 1458050089 ref 2 fl Interpret:/2/0 rc 0/0
00010000:02000400:4.0:1458050091.006277:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050091.006283:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050116.008179:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050116.008181:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050141.010001:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050141.010003:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050166.011973:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050166.011975:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050191.013848:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050191.013850:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050216.015791:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050216.015793:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050241.017660:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050241.017661:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458050266.019539:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458050266.019541:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458050291.021480:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458050291.021481:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458050316.023359:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458050316.023361:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458050341.025186:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458050341.025188:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050366.027073:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050366.027075:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050391.028986:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050391.028988:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050416.030868:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050416.030870:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:10.0:1458050441.032791:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:10.0:1458050441.032793:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00010000:02000400:4.0:1458050466.034650:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
00010000:02000400:4.0:1458050466.034656:0:15582:0:(ldlm_lib.c:988:target_handle_connect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) refused reconnection, still busy with 7 active RPCs
00000100:00000400:5.0:1458050481.237177:0:16005:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:392s); client may timeout.  req@ffff8801ebbc1c00 x1528513670068188/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 0 to 0 dl 1458050089 ref 1 fl Complete:/2/0 rc -5/-5
00000100:00000400:10.0:1458050481.237192:0:15992:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:392s); client may timeout.  req@ffff880965742c00 x1528513670068192/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 0 to 0 dl 1458050089 ref 1 fl Complete:/2/0 rc -5/-5
00000100:00000400:5.0:1458050481.237245:0:15989:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:392s); client may timeout.  req@ffff88079f08ec00 x1528513670068180/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 0 to 0 dl 1458050089 ref 1 fl Complete:/2/0 rc -5/-5
00000100:00000400:10.0:1458050481.237287:0:15993:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:392s); client may timeout.  req@ffff8804bf4ba400 x1528513670068168/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 0 to 0 dl 1458050089 ref 1 fl Complete:/2/0 rc -5/-5
00000100:00000400:5.0:1458050481.237293:0:15996:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:392s); client may timeout.  req@ffff88079f08e000 x1528513670068176/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 0 to 0 dl 1458050089 ref 1 fl Complete:/2/0 rc -5/-5
00000100:00000400:4.0:1458050481.237297:0:15995:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:392s); client may timeout.  req@ffff88079f08e400 x1528513670068184/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 0 to 0 dl 1458050089 ref 1 fl Complete:/2/0 rc -5/-5
00000100:00000400:4.0:1458050481.237311:0:15991:0:(service.c:2039:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (755:392s); client may timeout.  req@ffff8804bf4bac00 x1528513670068172/t0(0) o3->d25d68b4-0293-4a1b-bcac-085fc4a02c2c@10.149.0.153@o2ib1:0/0 lens 488/400 e 0 to 0 dl 1458050089 ref 1 fl Complete:/2/0 rc -5/-5
00010000:02000400:10.0:1458050491.036515:0:15582:0:(ldlm_lib.c:707:target_handle_reconnect()) lustre1-OST0003: Client d25d68b4-0293-4a1b-bcac-085fc4a02c2c (at 10.149.0.153@o2ib1) reconnecting
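
For anyone trying to read the dump above: a minimal sketch, in Python, of how these lines could be split into fields and the epoch timestamps converted to readable times. The field names (subsys, mask, cpu, pid, and so on) are my own labels for the colon-separated positions seen in the log, not official Lustre names, and the helper functions are hypothetical.

```python
import re
from datetime import datetime, timezone

# Field names below are illustrative assumptions; only the positions
# are taken from the log lines themselves.
LINE_RE = re.compile(
    r"^(?P<subsys>[0-9a-f]{8}):(?P<mask>[0-9a-f]{8}):(?P<cpu>[\d.]+):"
    r"(?P<ts>\d+\.\d+):\d+:(?P<pid>\d+):\d+:"
    r"\((?P<file>[^:]+):(?P<line>\d+):(?P<func>\w+)\(\)\)\s*(?P<msg>.*)$"
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match
    (for example the indented 'req@...' continuation lines)."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    # Convert the epoch timestamp to a timezone-aware UTC datetime.
    rec["when"] = datetime.fromtimestamp(float(rec["ts"]), tz=timezone.utc)
    return rec


def summarize(lines):
    """Count messages per reporting function, with first and last timestamps."""
    stats = {}
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue
        first, last, count = stats.get(rec["func"], (rec["when"], rec["when"], 0))
        stats[rec["func"]] = (min(first, rec["when"]), max(last, rec["when"]), count + 1)
    return stats


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python parse_log.py < oss01-debug.log
    for func, (first, last, count) in sorted(summarize(sys.stdin).items()):
        print(f"{func}: {count} lines, {first.isoformat()} .. {last.isoformat()}")
```

Run over the dump, this should collapse the long run of "refused reconnection" lines into one summary row per function, which makes the time span of each stall easier to see.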
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
