Hi Lustre community,

I am running a Lustre setup on a small cluster. It has 2 nodes that can be used for IO, with 36 HDDs and 4 SSDs each. All nodes are AArch64 and run Rocky Linux 8.5. I decided on a non-HA configuration on those two nodes using Lustre 2.14.56 + ZFS 2.1.2 (both built from source), with the SSDs for the MDTs and the HDDs for the OSTs. On one of the nodes I dedicated some HDDs to the MGT. So both nodes run an MDS and an OSS, and one of them also runs the MGS.

One of the nodes crashed unexpectedly. After bringing it back up, the filesystem could not be mounted on clients; the mount hung indefinitely. The dmesg output on the IO nodes showed that both MDTs were stuck in recovery.

What I have tried so far:

1. sudo lctl --device <mdt device> abort_recovery
2. The usual restarts, reboots and remounts
3. lfsck
4. tunefs.lustre --writeconf on every mdt/ost
5. (getting frustrated and desperate)
6. reformat mgs
7. lctl clear_conf <all mdts/osts>

I'm not sure how much worse I made the issue. The data on the filesystem was non-critical, but it would take a couple of days to rebuild, so I'd rather recover it.

Right now the situation is as follows:

io01 lctl dl:

[snassyr@io01 ~]$ sudo lctl dl
  0 UP osd-zfs MGS-osd MGS-osd_UUID 4
  1 UP mgs MGS MGS 20
  2 UP mgc MGC10.31.7.61@o2ib 7fc3c479-903d-4bf5-8239-77f6bb25d72f 4
  3 UP osd-zfs storage-MDT0000-osd storage-MDT0000-osd_UUID 9
  4 UP mds MDS MDS_uuid 2
  5 UP lod storage-MDT0000-mdtlov storage-MDT0000-mdtlov_UUID 3
  6 UP mdt storage-MDT0000 storage-MDT0000_UUID 22
  7 UP mdd storage-MDD0000 storage-MDD0000_UUID 3
  8 UP qmt storage-QMT0000 storage-QMT0000_UUID 3
  9 UP osp storage-OST0000-osc-MDT0000 storage-MDT0000-mdtlov_UUID 4
 10 UP osp storage-OST0001-osc-MDT0000 storage-MDT0000-mdtlov_UUID 4
 11 UP osp storage-MDT0001-osp-MDT0000 storage-MDT0000-mdtlov_UUID 4
 12 UP lwp storage-MDT0000-lwp-MDT0000 storage-MDT0000-lwp-MDT0000_UUID 4
 13 UP osd-zfs storage-OST0000-osd storage-OST0000-osd_UUID 4
 14 UP ost OSS OSS_uuid 2
 15 UP obdfilter storage-OST0000 storage-OST0000_UUID 6
 16 UP lwp storage-MDT0000-lwp-OST0000 storage-MDT0000-lwp-OST0000_UUID 4
 17 UP lwp storage-MDT0001-lwp-OST0000 storage-MDT0001-lwp-OST0000_UUID 4

io02 lctl dl:

[snassyr@io02 ~]$ sudo lctl dl
  0 UP osd-zfs storage-MDT0001-osd storage-MDT0001-osd_UUID 8
  1 UP mgc MGC10.31.7.61@o2ib 30fafefb-17d5-4b7e-b37d-f57b8cec706f 4
  2 UP mds MDS MDS_uuid 2
  3 UP lod storage-MDT0001-mdtlov storage-MDT0001-mdtlov_UUID 3
  4 UP mdt storage-MDT0001 storage-MDT0001_UUID 10
  5 UP mdd storage-MDD0001 storage-MDD0001_UUID 3
  6 UP osp storage-MDT0000-osp-MDT0001 storage-MDT0001-mdtlov_UUID 4
  7 UP osp storage-OST0000-osc-MDT0001 storage-MDT0001-mdtlov_UUID 4
  8 UP osp storage-OST0001-osc-MDT0001 storage-MDT0001-mdtlov_UUID 4
  9 UP lwp storage-MDT0000-lwp-MDT0001 storage-MDT0000-lwp-MDT0001_UUID 4
 10 UP osd-zfs storage-OST0001-osd storage-OST0001-osd_UUID 4
 11 UP ost OSS OSS_uuid 2
 12 UP obdfilter storage-OST0001 storage-OST0001_UUID 6
 13 UP lwp storage-MDT0000-lwp-OST0001 storage-MDT0000-lwp-OST0001_UUID 4
 14 UP lwp storage-MDT0001-lwp-OST0001 storage-MDT0001-lwp-OST0001_UUID 4

When mounting the filesystem on a client:

io01 lctl dk:

010000:00080000:44.0F:1651608945.066329:1728:7485:0:(ldlm_lib.c:1363:target_handle_connect()) MGS: connection from 2977c7ca-d420-4590-8587-6dcca277b9ce@10.31.7.2@o2ib t0 exp 0000000080b2f6b3 cur 7766 last 0
00000020:00000080:44.0:1651608945.066340:1904:7485:0:(genops.c:1357:class_connect()) connect: client 2977c7ca-d420-4590-8587-6dcca277b9ce, cookie 0x58c556ebddf199e6
00000020:01000000:44.0:1651608945.066343:2176:7485:0:(lprocfs_status_server.c:513:lprocfs_exp_setup()) using hash 00000000a7721ebc
00000100:00080000:44.0:1651608945.066354:1984:7485:0:(import.c:85:import_set_state_nolock()) 000000004427b505 : changing import state from RECOVER to FULL
20000000:01000000:44.0:1651608945.069929:1712:7485:0:(mgs_nids.c:636:mgs_get_ir_logs()) Reading IR log storage-cliir bufsize 1048576.
20000000:01000000:44.0:1651608945.069933:1920:7485:0:(mgs_nids.c:193:mgs_nidtbl_read()) fsname storage, entry size 32, pages 4064/1/16/255.
20000000:01000000:44.0:1651608945.069934:1920:7485:0:(mgs_nids.c:193:mgs_nidtbl_read()) fsname storage, entry size 32, pages 4032/1/16/255.
20000000:01000000:44.0:1651608945.069935:1920:7485:0:(mgs_nids.c:193:mgs_nidtbl_read()) fsname storage, entry size 32, pages 4000/1/16/255.
20000000:01000000:44.0:1651608945.069936:1920:7485:0:(mgs_nids.c:193:mgs_nidtbl_read()) fsname storage, entry size 32, pages 3968/1/16/255.
20000000:01000000:44.0:1651608945.069937:1920:7485:0:(mgs_nids.c:205:mgs_nidtbl_read()) Read IR logs storage return with 128, version 15
00000040:00080000:44.0:1651608945.070263:1936:7485:0:(llog_osd.c:233:llog_osd_read_header()) not reading header from 0-byte log
00010000:00080000:76.0F:1651608945.070500:1728:107218:0:(ldlm_lib.c:1363:target_handle_connect()) storage-MDT0000: connection from fcf3db3e-1e74-404e-9b5b-c5ddd03f655e@10.31.7.2@o2ib t0 exp 0000000080b2f6b3 cur 7766 last 0
00010000:00080000:76.0:1651608945.073446:1728:107218:0:(ldlm_lib.c:1363:target_handle_connect()) storage-MDT0000: connection from fcf3db3e-1e74-404e-9b5b-c5ddd03f655e@10.31.7.2@o2ib t0 exp 0000000080b2f6b3 cur 7766 last 0

client lctl dk:

00000080:01200004:23.0F:1651608945.064695:0:97748:0:(super25.c:114:lustre_fill_super()) VFS Op: sb 00000000671f73ed
00000020:01000004:23.0:1651608945.064703:0:97748:0:(obd_mount.c:951:lmd_print()) mount data:
00000020:01000004:23.0:1651608945.064704:0:97748:0:(obd_mount.c:953:lmd_print()) profile: storage-client
00000020:01000004:23.0:1651608945.064704:0:97748:0:(obd_mount.c:954:lmd_print()) device: 10.31.7.61@o2ib:/storage
00000020:01000004:23.0:1651608945.064705:0:97748:0:(obd_mount.c:955:lmd_print()) flags: 2
00000080:01000004:23.0:1651608945.064705:0:97748:0:(super25.c:159:lustre_fill_super()) Mounting client storage-client
00000020:01000004:23.0:1651608945.064744:0:97748:0:(obd_mount.c:340:lustre_start_mgc()) Start MGC 'MGC10.31.7.61@o2ib'
00000020:00000080:23.0:1651608945.064747:0:97748:0:(obd_config.c:1356:class_process_config()) processing cmd: cf005
00000020:00000080:23.0:1651608945.064749:0:97748:0:(obd_config.c:1368:class_process_config()) adding mapping from uuid MGC10.31.7.61@o2ib_0 to nid 0x500000a1f073d (10.31.7.61@o2ib)
00000020:01000004:23.0:1651608945.064759:0:97748:0:(obd_mount.c:191:lustre_start_simple()) Starting OBD MGC10.31.7.61@o2ib (typ=mgc)
00000020:00000080:23.0:1651608945.064760:0:97748:0:(obd_config.c:1356:class_process_config()) processing cmd: cf001
00000020:00000080:23.0:1651608945.064771:0:97748:0:(genops.c:415:class_newdev()) Allocate new device MGC10.31.7.61@o2ib (000000009d5385c4)
00000020:00000080:23.0:1651608945.064792:0:97748:0:(obd_config.c:648:class_attach()) OBD: dev 0 attached type mgc with refcount 1
00000020:00000080:23.0:1651608945.064793:0:97748:0:(obd_config.c:1356:class_process_config()) processing cmd: cf003
00010000:00080000:6.0F:1651608945.065940:0:97748:0:(ldlm_lib.c:115:import_set_conn()) imp 00000000d1420d43@MGC10.31.7.61@o2ib: add connection MGC10.31.7.61@o2ib_0 at head
00000040:01000000:6.0:1651608945.065968:0:97748:0:(llog_obd.c:212:llog_setup()) obd MGC10.31.7.61@o2ib ctxt 1 is initialized
10000000:01000000:30.0F:1651608945.066034:0:97767:0:(mgc_request.c:628:mgc_requeue_thread()) Starting requeue thread
00000020:00000080:6.0:1651608945.066039:0:97748:0:(obd_config.c:752:class_setup()) finished setup of obd MGC10.31.7.61@o2ib (uuid 2977c7ca-d420-4590-8587-6dcca277b9ce)
00000020:00000080:6.0:1651608945.066044:0:97748:0:(genops.c:1357:class_connect()) connect: client 2977c7ca-d420-4590-8587-6dcca277b9ce, cookie 0xb433eadff9fbd3f0
00000100:00080000:6.0:1651608945.066048:0:97748:0:(import.c:533:import_select_connection()) MGC10.31.7.61@o2ib: connect to NID 10.31.7.61@o2ib last attempt 0
00000100:00080000:6.0:1651608945.066049:0:97748:0:(import.c:614:import_select_connection()) MGC10.31.7.61@o2ib: import 00000000d1420d43 using connection MGC10.31.7.61@o2ib_0/10.31.7.61@o2ib
00000100:00080000:6.0:1651608945.066061:0:97748:0:(pinger.c:388:ptlrpc_pinger_add_import()) adding pingable import 2977c7ca-d420-4590-8587-6dcca277b9ce->MGS
00000080:01000000:6.0:1651608945.066081:0:97748:0:(llite_lib.c:1252:ll_fill_super()) llite sb uuid: fcf3db3e-1e74-404e-9b5b-c5ddd03f655e
10000000:01000000:6.0:1651608945.066151:0:97748:0:(mgc_request.c:2201:mgc_process_config()) parse_log storage-client from 0
10000000:01000000:6.0:1651608945.066152:0:97748:0:(mgc_request.c:334:config_log_add()) add config log storage-client-ffff910943185800
10000000:01000000:6.0:1651608945.066154:0:97748:0:(mgc_request.c:215:do_config_log_add()) do adding config log storage-sptlrpc-ffff910e650a0000
10000000:01000000:6.0:1651608945.066155:0:97748:0:(mgc_request.c:90:mgc_name2resid()) log storage-sptlrpc to resid 0x656761726f7473/0x0 (storage)
10000000:01000000:6.0:1651608945.066157:0:97748:0:(mgc_request.c:2060:mgc_process_log()) Process log storage-sptlrpc-ffff910e650a0000 from 1
10000000:01000000:6.0:1651608945.066158:0:97748:0:(mgc_request.c:1102:mgc_enqueue()) Enqueue for storage-sptlrpc (res 0x656761726f7473)
00000100:00080000:6.0:1651608945.066178:0:97748:0:(client.c:1659:ptlrpc_send_new_req()) @@@ req waiting for recovery: (FULL != CONNECTING) req@00000000a5867c8c x1731815472708096/t0(0) o101->MGC10.31.7.61@o2ib@10.31.7.61@o2ib:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:WQU/0/ffffffff rc 0/-1 job:''
00000100:00080000:20.0F:1651608945.066200:0:41661:0:(import.c:1073:ptlrpc_connect_interpret()) MGC10.31.7.61@o2ib: connect to target with instance 0
00000100:00080000:20.0:1651608945.066205:0:41661:0:(import.c:933:ptlrpc_connect_set_flags()) MGC10.31.7.61@o2ib: Resetting ns_connect_flags to server flags: 0xa000011001002020
10000000:01000000:20.0:1651608945.066207:0:41661:0:(mgc_request.c:1327:mgc_import_event()) import event 0x808005
00000100:00080000:20.0:1651608945.066209:0:41661:0:(import.c:85:import_set_state_nolock()) 00000000d1420d43 MGS: changing import state from CONNECTING to FULL
10000000:01000000:20.0:1651608945.066211:0:41661:0:(mgc_request.c:1327:mgc_import_event()) import event 0x808004
00000100:00080000:20.0:1651608945.066213:0:41661:0:(pinger.c:207:ptlrpc_pinger_ir_up()) IR up
00000100:00080000:20.0:1651608945.066215:0:41661:0:(recover.c:218:ptlrpc_wake_delayed()) @@@ waking (set 00000000ffe9ef2e): req@00000000a5867c8c x1731815472708096/t0(0) o101->MGC10.31.7.61@o2ib@10.31.7.61@o2ib:26/25 lens 328/344 e 0 to 0 dl 0 ref 2 fl Rpc:WQU/0/ffffffff rc 0/-1 job:''
10000000:01000000:6.0:1651608945.066426:0:97748:0:(mgc_request.c:2132:mgc_process_log()) MGC10.31.7.61@o2ib: configuration from log 'storage-sptlrpc' failed (-2).
10000000:01000000:6.0:1651608945.066428:0:97748:0:(mgc_request.c:215:do_config_log_add()) do adding config log params-ffff910943185800
10000000:01000000:6.0:1651608945.066430:0:97748:0:(mgc_request.c:90:mgc_name2resid()) log params to resid 0x736d61726170/0x3 (params)
10000000:01000000:6.0:1651608945.066430:0:97748:0:(mgc_request.c:215:do_config_log_add()) do adding config log storage-client-ffff910943185800
10000000:01000000:6.0:1651608945.066431:0:97748:0:(mgc_request.c:90:mgc_name2resid()) log storage-client to resid 0x656761726f7473/0x0 (storage)
10000000:01000000:6.0:1651608945.066432:0:97748:0:(mgc_request.c:215:do_config_log_add()) do adding config log storage-cliir-ffff910943185800
10000000:01000000:6.0:1651608945.066433:0:97748:0:(mgc_request.c:90:mgc_name2resid()) log storage-cliir to resid 0x656761726f7473/0x2 (storage)
10000000:01000000:6.0:1651608945.066433:0:97748:0:(mgc_request.c:2060:mgc_process_log()) Process log storage-client-ffff910943185800 from 1
10000000:01000000:6.0:1651608945.066434:0:97748:0:(mgc_request.c:1102:mgc_enqueue()) Enqueue for storage-client (res 0x656761726f7473)
00000020:01000000:4.0F:1651608945.067039:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:00000080:4.0:1651608945.067043:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:4.0:1651608945.067045:0:97769:0:(obd_config.c:1432:class_process_config()) marker 4 (0x1) storage-clilov lov setup
00000020:01000000:4.0:1651608945.067047:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf001, instance name: storage-clilov-ffff910943185800
00000020:00000080:4.0:1651608945.067048:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf001
00000020:00000080:4.0:1651608945.067052:0:97769:0:(genops.c:415:class_newdev()) Allocate new device storage-clilov-ffff910943185800 (00000000e39226db)
00000020:00000080:4.0:1651608945.067074:0:97769:0:(obd_config.c:648:class_attach()) OBD: dev 1 attached type lov with refcount 1
00000020:01000000:4.0:1651608945.067076:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf003, instance name: storage-clilov-ffff910943185800
00000020:00000080:4.0:1651608945.067077:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf003
00000020:00000080:4.0:1651608945.067139:0:97769:0:(obd_config.c:752:class_setup()) finished setup of obd storage-clilov-ffff910943185800 (uuid fcf3db3e-1e74-404e-9b5b-c5ddd03f655e)
00000020:01000000:4.0:1651608945.067140:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:4.0:1651608945.067141:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:4.0:1651608945.067142:0:97769:0:(obd_config.c:1432:class_process_config()) marker 4 (0x2) storage-clilov lov setup
00000020:01000000:4.0:1651608945.067142:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:00000080:4.0:1651608945.067143:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:4.0:1651608945.067144:0:97769:0:(obd_config.c:1432:class_process_config()) marker 5 (0x1) storage-clilmv lmv setup
00000020:01000000:4.0:1651608945.067144:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf001, instance name: storage-clilmv-ffff910943185800
00000020:00000080:4.0:1651608945.067145:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf001
00000020:00000080:4.0:1651608945.067147:0:97769:0:(genops.c:415:class_newdev()) Allocate new device storage-clilmv-ffff910943185800 (00000000c0d1efc4)
00000020:00000080:4.0:1651608945.067166:0:97769:0:(obd_config.c:648:class_attach()) OBD: dev 2 attached type lmv with refcount 1
00000020:01000000:4.0:1651608945.067167:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf003, instance name: storage-clilmv-ffff910943185800
00000020:00000080:4.0:1651608945.067168:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf003
00000020:00000080:4.0:1651608945.067189:0:97769:0:(obd_config.c:752:class_setup()) finished setup of obd storage-clilmv-ffff910943185800 (uuid fcf3db3e-1e74-404e-9b5b-c5ddd03f655e)
00000020:01000000:4.0:1651608945.067190:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:4.0:1651608945.067191:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:4.0:1651608945.067191:0:97769:0:(obd_config.c:1432:class_process_config()) marker 5 (0x2) storage-clilmv lmv setup
00000020:01000000:4.0:1651608945.067192:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:00000080:4.0:1651608945.067192:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:4.0:1651608945.067193:0:97769:0:(obd_config.c:1432:class_process_config()) marker 6 (0x1) storage-MDT0000 add mdc
00000020:00000080:4.0:1651608945.067194:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf005
00000020:00000080:4.0:1651608945.067195:0:97769:0:(obd_config.c:1368:class_process_config()) adding mapping from uuid 10.31.7.61@o2ib to nid 0x500000a1f073d (10.31.7.61@o2ib)
00000020:01000000:4.0:1651608945.067198:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf001, instance name: storage-MDT0000-mdc-ffff910943185800
00000020:00000080:4.0:1651608945.067199:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf001
00000020:00000080:4.0:1651608945.067201:0:97769:0:(genops.c:415:class_newdev()) Allocate new device storage-MDT0000-mdc-ffff910943185800 (0000000049300b8f)
00000020:00000080:4.0:1651608945.067220:0:97769:0:(obd_config.c:648:class_attach()) OBD: dev 3 attached type mdc with refcount 1
00000020:01000000:4.0:1651608945.067221:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf003, instance name: storage-MDT0000-mdc-ffff910943185800
00000020:00000080:4.0:1651608945.067222:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf003
00010000:00080000:4.0:1651608945.067234:0:97769:0:(ldlm_lib.c:115:import_set_conn()) imp 00000000ec1d7cd0@storage-MDT0000-mdc-ffff910943185800: add connection 10.31.7.61@o2ib at head
00000040:01000000:4.0:1651608945.068071:0:97769:0:(llog_obd.c:212:llog_setup()) obd storage-MDT0000-mdc-ffff910943185800 ctxt 13 is initialized
00000020:00000080:5.0F:1651608945.068125:0:97769:0:(obd_config.c:752:class_setup()) finished setup of obd storage-MDT0000-mdc-ffff910943185800 (uuid fcf3db3e-1e74-404e-9b5b-c5ddd03f655e)
00000020:01000000:5.0:1651608945.068128:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf014, instance name: storage-clilmv-ffff910943185800
00000020:00000080:5.0:1651608945.068129:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf014
00800000:01000000:5.0:1651608945.068132:0:97769:0:(lmv_obd.c:386:lmv_add_target()) Target uuid: storage-MDT0000_UUID. index 0
00000020:01000000:5.0:1651608945.068140:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:5.0:1651608945.068141:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.068142:0:97769:0:(obd_config.c:1432:class_process_config()) marker 6 (0x2) storage-MDT0000 add mdc
00000020:01000000:5.0:1651608945.068143:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:00000080:5.0:1651608945.068144:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.068144:0:97769:0:(obd_config.c:1432:class_process_config()) marker 7 (0x1) storage-client mount opts
00000020:00000080:5.0:1651608945.068146:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf007
00000020:00000080:5.0:1651608945.068146:0:97769:0:(obd_config.c:1386:class_process_config()) mountopt: profile storage-client osc storage-clilov mdc storage-clilmv
00000020:01000000:5.0:1651608945.068147:0:97769:0:(obd_config.c:1058:class_add_profile()) Add profile storage-client
00000020:01000000:5.0:1651608945.068149:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:5.0:1651608945.068149:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.068150:0:97769:0:(obd_config.c:1432:class_process_config()) marker 7 (0x2) storage-client mount opts
00000020:01000000:5.0:1651608945.068150:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:01000004:5.0:1651608945.068151:0:97769:0:(obd_mount.c:990:lustre_check_exclusion()) Check exclusion storage-OST0000 (0) in 0 of 10.31.7.61@o2ib:/storage
00000020:00000080:5.0:1651608945.068153:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.068153:0:97769:0:(obd_config.c:1432:class_process_config()) marker 10 (0x1) storage-OST0000 add osc
00000020:00000080:5.0:1651608945.068154:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf005
00000020:00000080:5.0:1651608945.068155:0:97769:0:(obd_config.c:1368:class_process_config()) adding mapping from uuid 10.31.7.61@o2ib to nid 0x500000a1f073d (10.31.7.61@o2ib)
00000020:01000000:5.0:1651608945.068157:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf001, instance name: storage-OST0000-osc-ffff910943185800
00000020:00000080:5.0:1651608945.068158:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf001
00000020:00000080:5.0:1651608945.068165:0:97769:0:(genops.c:415:class_newdev()) Allocate new device storage-OST0000-osc-ffff910943185800 (000000008f5e9249)
00000020:00000080:5.0:1651608945.068184:0:97769:0:(obd_config.c:648:class_attach()) OBD: dev 4 attached type osc with refcount 1
00000020:01000000:5.0:1651608945.068185:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf003, instance name: storage-OST0000-osc-ffff910943185800
00000020:00000080:5.0:1651608945.068186:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf003
00010000:00080000:5.0:1651608945.068200:0:97769:0:(ldlm_lib.c:115:import_set_conn()) imp 00000000027d94f9@storage-OST0000-osc-ffff910943185800: add connection 10.31.7.61@o2ib at head
00000020:00000080:5.0:1651608945.068344:0:97769:0:(obd_config.c:752:class_setup()) finished setup of obd storage-OST0000-osc-ffff910943185800 (uuid fcf3db3e-1e74-404e-9b5b-c5ddd03f655e)
00000020:01000000:5.0:1651608945.068346:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf00d, instance name: storage-clilov-ffff910943185800
00000020:00000080:5.0:1651608945.068346:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf00d
00020000:01000000:5.0:1651608945.068350:0:97769:0:(lov_obd.c:480:lov_add_target()) uuid:storage-OST0000_UUID idx:0 gen:1 active:1
00020000:01000000:5.0:1651608945.068352:0:97769:0:(lov_obd.c:531:lov_add_target()) tgts: 00000000703900ea size: 2
00020000:01000000:5.0:1651608945.068353:0:97769:0:(lov_obd.c:560:lov_add_target()) idx=0 ltd_gen=1 ld_tgt_count=1
00000020:01000000:5.0:1651608945.068356:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:5.0:1651608945.068356:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.068357:0:97769:0:(obd_config.c:1432:class_process_config()) marker 10 (0x2) storage-OST0000 add osc
00000020:01000000:5.0:1651608945.068358:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:01000004:5.0:1651608945.068358:0:97769:0:(obd_mount.c:990:lustre_check_exclusion()) Check exclusion storage-OST0001 (1) in 0 of 10.31.7.61@o2ib:/storage
00000020:00000080:5.0:1651608945.068359:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.068359:0:97769:0:(obd_config.c:1432:class_process_config()) marker 13 (0x1) storage-OST0001 add osc
00000020:00000080:5.0:1651608945.068360:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf005
00000020:00000080:5.0:1651608945.068361:0:97769:0:(obd_config.c:1368:class_process_config()) adding mapping from uuid 10.31.7.62@o2ib to nid 0x500000a1f073e (10.31.7.62@o2ib)
00000020:01000000:5.0:1651608945.068363:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf001, instance name: storage-OST0001-osc-ffff910943185800
00000020:00000080:5.0:1651608945.068363:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf001
00000020:00000080:5.0:1651608945.068365:0:97769:0:(genops.c:415:class_newdev()) Allocate new device storage-OST0001-osc-ffff910943185800 (000000003e523917)
00000020:00000080:5.0:1651608945.068390:0:97769:0:(obd_config.c:648:class_attach()) OBD: dev 5 attached type osc with refcount 1
00000020:01000000:5.0:1651608945.068391:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf003, instance name: storage-OST0001-osc-ffff910943185800
00000020:00000080:5.0:1651608945.068392:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf003
00010000:00080000:5.0:1651608945.068398:0:97769:0:(ldlm_lib.c:115:import_set_conn()) imp 00000000024053c3@storage-OST0001-osc-ffff910943185800: add connection 10.31.7.62@o2ib at head
00000020:00000080:5.0:1651608945.068505:0:97769:0:(obd_config.c:752:class_setup()) finished setup of obd storage-OST0001-osc-ffff910943185800 (uuid fcf3db3e-1e74-404e-9b5b-c5ddd03f655e)
00000020:01000000:5.0:1651608945.068506:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf00d, instance name: storage-clilov-ffff910943185800
00000020:00000080:5.0:1651608945.068507:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf00d
00020000:01000000:5.0:1651608945.068508:0:97769:0:(lov_obd.c:480:lov_add_target()) uuid:storage-OST0001_UUID idx:1 gen:1 active:1
00020000:01000000:5.0:1651608945.068509:0:97769:0:(lov_obd.c:560:lov_add_target()) idx=1 ltd_gen=1 ld_tgt_count=2
00000020:01000000:5.0:1651608945.068511:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:5.0:1651608945.068511:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.068512:0:97769:0:(obd_config.c:1432:class_process_config()) marker 13 (0x2) storage-OST0001 add osc
00000020:01000000:5.0:1651608945.068512:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:00000080:5.0:1651608945.068513:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.068513:0:97769:0:(obd_config.c:1432:class_process_config()) marker 21 (0x1) storage-MDT0001 add mdc
00000020:00000080:5.0:1651608945.068514:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf005
00000020:00000080:5.0:1651608945.068515:0:97769:0:(obd_config.c:1368:class_process_config()) adding mapping from uuid 10.31.7.62@o2ib to nid 0x500000a1f073e (10.31.7.62@o2ib)
00000020:01000000:5.0:1651608945.068516:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf001, instance name: storage-MDT0001-mdc-ffff910943185800
00000020:00000080:5.0:1651608945.068517:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf001
00000020:00000080:5.0:1651608945.068519:0:97769:0:(genops.c:415:class_newdev()) Allocate new device storage-MDT0001-mdc-ffff910943185800 (0000000003506110)
00000020:00000080:5.0:1651608945.068537:0:97769:0:(obd_config.c:648:class_attach()) OBD: dev 6 attached type mdc with refcount 1
00000020:01000000:5.0:1651608945.068538:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf003, instance name: storage-MDT0001-mdc-ffff910943185800
00000020:00000080:5.0:1651608945.068539:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf003
00010000:00080000:5.0:1651608945.068544:0:97769:0:(ldlm_lib.c:115:import_set_conn()) imp 00000000ab8fbc69@storage-MDT0001-mdc-ffff910943185800: add connection 10.31.7.62@o2ib at head
00000040:01000000:5.0:1651608945.069342:0:97769:0:(llog_obd.c:212:llog_setup()) obd storage-MDT0001-mdc-ffff910943185800 ctxt 13 is initialized
00000020:00000080:5.0:1651608945.069380:0:97769:0:(obd_config.c:752:class_setup()) finished setup of obd storage-MDT0001-mdc-ffff910943185800 (uuid fcf3db3e-1e74-404e-9b5b-c5ddd03f655e)
00000020:01000000:5.0:1651608945.069381:0:97769:0:(obd_config.c:1885:class_config_llog_handler()) cmd cf014, instance name: storage-clilmv-ffff910943185800
00000020:00000080:5.0:1651608945.069382:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf014
00800000:01000000:5.0:1651608945.069383:0:97769:0:(lmv_obd.c:386:lmv_add_target()) Target uuid: storage-MDT0001_UUID. index 1
00000020:01000000:5.0:1651608945.069386:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:5.0:1651608945.069387:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.069387:0:97769:0:(obd_config.c:1432:class_process_config()) marker 21 (0x2) storage-MDT0001 add mdc
00000020:01000000:5.0:1651608945.069388:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x0 mark_flg=0x1
00000020:00000080:5.0:1651608945.069389:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.069389:0:97769:0:(obd_config.c:1432:class_process_config()) marker 22 (0x1) storage-client mount opts
00000020:00000080:5.0:1651608945.069390:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf007
00000020:00000080:5.0:1651608945.069390:0:97769:0:(obd_config.c:1386:class_process_config()) mountopt: profile storage-client osc storage-clilov mdc storage-clilmv
00000020:01000000:5.0:1651608945.069391:0:97769:0:(obd_config.c:1058:class_add_profile()) Add profile storage-client
00000020:01000000:5.0:1651608945.069564:0:97769:0:(obd_config.c:1770:class_config_llog_handler()) Marker, inst_flg=0x2 mark_flg=0x2
00000020:00000080:5.0:1651608945.069565:0:97769:0:(obd_config.c:1356:class_process_config()) processing cmd: cf010
00000020:00000080:5.0:1651608945.069565:0:97769:0:(obd_config.c:1432:class_process_config()) marker 22 (0x2) storage-client mount opts
00000040:00080000:5.0:1651608945.069566:0:97769:0:(llog.c:768:llog_process_thread()) stop processing plain 0x4:10:0 index 39 count 39
00000020:01000000:6.0:1651608945.069572:0:97748:0:(obd_config.c:2042:class_config_parse_llog()) Processed log storage-client gen 1-38 (rc=0)
10000000:01000000:6.0:1651608945.069576:0:97748:0:(mgc_request.c:2132:mgc_process_log()) MGC10.31.7.61@o2ib: configuration from log 'storage-client' succeeded (0).
10000000:01000000:6.0:1651608945.069577:0:97748:0:(mgc_request.c:2060:mgc_process_log()) Process log storage-cliir-ffff910943185800 from 1
10000000:01000000:6.0:1651608945.069577:0:97748:0:(mgc_request.c:1102:mgc_enqueue()) Enqueue for storage-cliir (res 0x656761726f7473)
00000020:00000080:6.0:1651608945.069823:0:97748:0:(obd_config.c:1356:class_process_config()) processing cmd: cf00f
00000020:00000080:6.0:1651608945.069851:0:97748:0:(obd_config.c:1356:class_process_config()) processing cmd: cf00f
00000020:00000080:6.0:1651608945.069859:0:97748:0:(obd_config.c:1356:class_process_config()) processing cmd: cf00f
00000020:00000080:6.0:1651608945.069866:0:97748:0:(obd_config.c:1356:class_process_config()) processing cmd: cf00f
10000000:01000000:6.0:1651608945.069893:0:97748:0:(mgc_request.c:2132:mgc_process_log()) MGC10.31.7.61@o2ib: configuration from log 'storage-cliir' succeeded (0).
10000000:01000000:6.0:1651608945.069894:0:97748:0:(mgc_request.c:2060:mgc_process_log()) Process log params-ffff910943185800 from 1
10000000:01000000:6.0:1651608945.069895:0:97748:0:(mgc_request.c:1102:mgc_enqueue()) Enqueue for params (res 0x736d61726170)
00000040:00080000:1.0F:1651608945.070236:0:97775:0:(llog.c:768:llog_process_thread()) stop processing plain 0x2:10:0 index 64768 count 1
00000020:01000000:6.0:1651608945.070243:0:97748:0:(obd_config.c:2042:class_config_parse_llog()) Processed log params gen 1-0 (rc=0)
10000000:01000000:6.0:1651608945.070246:0:97748:0:(mgc_request.c:2132:mgc_process_log()) MGC10.31.7.61@o2ib: configuration from log 'params' succeeded (0).
00000080:01000000:6.0:1651608945.070248:0:97748:0:(llite_lib.c:1312:ll_fill_super()) Found profile storage-client: mdc=storage-clilmv osc=storage-clilov
00000020:00000080:6.0:1651608945.070252:0:97748:0:(genops.c:1357:class_connect()) connect: client fcf3db3e-1e74-404e-9b5b-c5ddd03f655e, cookie 0xb433eadff9fbd43d
00800000:01000000:6.0:1651608945.070256:0:97748:0:(lmv_obd.c:459:lmv_check_connect()) Time to connect fcf3db3e-1e74-404e-9b5b-c5ddd03f655e to storage-clilmv-ffff910943185800
00800000:01000000:6.0:1651608945.070258:0:97748:0:(lmv_obd.c:295:lmv_connect_mdc()) connect to storage-MDT0000-mdc-ffff910943185800(fcf3db3e-1e74-404e-9b5b-c5ddd03f655e) - storage-MDT0000_UUID, fcf3db3e-1e74-404e-9b5b-c5ddd03f655e
00000020:00000080:6.0:1651608945.070260:0:97748:0:(genops.c:1357:class_connect()) connect: client fcf3db3e-1e74-404e-9b5b-c5ddd03f655e, cookie 0xb433eadff9fbd444
00000100:00080000:6.0:1651608945.070261:0:97748:0:(import.c:533:import_select_connection()) storage-MDT0000-mdc-ffff910943185800: connect to NID 10.31.7.61@o2ib last attempt 0
00000100:00080000:6.0:1651608945.070262:0:97748:0:(import.c:614:import_select_connection()) storage-MDT0000-mdc-ffff910943185800: import 00000000ec1d7cd0 using connection 10.31.7.61@o2ib/10.31.7.61@o2ib
00000100:00080000:6.0:1651608945.070268:0:97748:0:(pinger.c:388:ptlrpc_pinger_add_import()) adding pingable import fcf3db3e-1e74-404e-9b5b-c5ddd03f655e->storage-MDT0000_UUID
00800000:01000000:6.0:1651608945.070284:0:97748:0:(lmv_obd.c:356:lmv_connect_mdc()) Connected to storage-MDT0000-mdc-ffff910943185800(fcf3db3e-1e74-404e-9b5b-c5ddd03f655e) successfully (4)
00800000:01000000:6.0:1651608945.070293:0:97748:0:(lmv_obd.c:295:lmv_connect_mdc()) connect to storage-MDT0001-mdc-ffff910943185800(fcf3db3e-1e74-404e-9b5b-c5ddd03f655e) - storage-MDT0001_UUID, fcf3db3e-1e74-404e-9b5b-c5ddd03f655e
00000020:00000080:6.0:1651608945.070295:0:97748:0:(genops.c:1357:class_connect()) connect: client fcf3db3e-1e74-404e-9b5b-c5ddd03f655e, cookie 0xb433eadff9fbd44b
00000100:00080000:23.0:1651608945.070296:0:41675:0:(client.c:1659:ptlrpc_send_new_req()) @@@ req waiting for recovery: (FULL != CONNECTING) req@0000000063b7198e x1731815472708928/t0(0) o41->storage-MDT0000-mdc-ffff910943185800@10.31.7.61@o2ib:12/10 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:WQU/0/ffffffff rc 0/-1 job:''
00000100:00080000:6.0:1651608945.070296:0:97748:0:(import.c:533:import_select_connection()) storage-MDT0001-mdc-ffff910943185800: connect to NID 10.31.7.62@o2ib last attempt 0
00000100:00080000:6.0:1651608945.070297:0:97748:0:(import.c:614:import_select_connection()) storage-MDT0001-mdc-ffff910943185800: import 00000000ab8fbc69 using connection 10.31.7.62@o2ib/10.31.7.62@o2ib
00000100:00080000:6.0:1651608945.070302:0:97748:0:(pinger.c:388:ptlrpc_pinger_add_import()) adding pingable import fcf3db3e-1e74-404e-9b5b-c5ddd03f655e->storage-MDT0001_UUID
00800000:01000000:6.0:1651608945.070311:0:97748:0:(lmv_obd.c:356:lmv_connect_mdc()) Connected to storage-MDT0001-mdc-ffff910943185800(fcf3db3e-1e74-404e-9b5b-c5ddd03f655e) successfully (4)
00000100:00080000:22.0F:1651608945.070319:0:41676:0:(client.c:1659:ptlrpc_send_new_req()) @@@ req waiting for recovery: (FULL != CONNECTING) req@000000000268eff3 x1731815472709056/t0(0) o41->storage-MDT0001-mdc-ffff910943185800@10.31.7.62@o2ib:12/10 lens 224/368 e 0 to 0 dl 0 ref 1 fl Rpc:WQU/0/ffffffff rc 0/-1 job:''
00000100:00080000:6.0:1651608945.070323:0:97748:0:(client.c:1659:ptlrpc_send_new_req()) @@@ req waiting for recovery: (FULL != CONNECTING) req@000000004b401ce8 x1731815472709120/t0(0) o41->storage-MDT0000-mdc-ffff910943185800@10.31.7.61@o2ib:12/10 lens 224/368 e 0 to 0 dl 0 ref 2 fl Rpc:WQU/0/ffffffff rc 0/-1 job:''
00000100:00080000:21.0F:1651608945.070335:0:41661:0:(import.c:85:import_set_state_nolock()) 00000000ec1d7cd0 storage-MDT0000_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:21.0:1651608945.070339:0:41661:0:(import.c:1428:ptlrpc_connect_interpret()) recovery of storage-MDT0000_UUID on 10.31.7.61@o2ib failed (-11)
00000100:00080000:21.0:1651608945.070396:0:41661:0:(import.c:85:import_set_state_nolock()) 00000000ab8fbc69 storage-MDT0001_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:21.0:1651608945.070399:0:41661:0:(import.c:1428:ptlrpc_connect_interpret()) recovery of storage-MDT0001_UUID on 10.31.7.62@o2ib failed (-11)
00000100:00080000:23.0:1651608945.072863:0:97778:0:(import.c:233:ptlrpc_set_import_discon()) mdc: import 00000000ab8fbc69 already not connected (conn 1, was 0): DISCONN
00010000:00080000:23.0:1651608945.072867:0:97778:0:(ldlm_lib.c:98:import_set_conn()) imp 00000000ab8fbc69@storage-MDT0001-mdc-ffff910943185800: found existing conn 10.31.7.62@o2ib, moved to head
00000100:00080000:23.0:1651608945.072869:0:97778:0:(import.c:85:import_set_state_nolock()) 00000000ab8fbc69 storage-MDT0001_UUID: changing import state from DISCONN to CONNECTING
00000100:00080000:23.0:1651608945.072870:0:97778:0:(import.c:533:import_select_connection()) storage-MDT0001-mdc-ffff910943185800: connect to NID 10.31.7.62@o2ib last attempt 0
00000100:00080000:23.0:1651608945.072871:0:97778:0:(import.c:614:import_select_connection()) storage-MDT0001-mdc-ffff910943185800: import 00000000ab8fbc69 using connection 10.31.7.62@o2ib/10.31.7.62@o2ib
00000100:00080000:21.0:1651608945.072937:0:41661:0:(import.c:85:import_set_state_nolock()) 00000000ab8fbc69 storage-MDT0001_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:21.0:1651608945.072938:0:41661:0:(import.c:1428:ptlrpc_connect_interpret()) recovery of storage-MDT0001_UUID on 10.31.7.62@o2ib failed (-11)
00000100:00080000:30.0:1651608945.073197:0:97779:0:(import.c:233:ptlrpc_set_import_discon()) mdc: import 00000000ec1d7cd0 already not connected (conn 1, was 0): DISCONN
00010000:00080000:30.0:1651608945.073206:0:97779:0:(ldlm_lib.c:98:import_set_conn()) imp 00000000ec1d7cd0@storage-MDT0000-mdc-ffff910943185800: found existing conn 10.31.7.61@o2ib, moved to head
00000100:00080000:30.0:1651608945.073209:0:97779:0:(import.c:85:import_set_state_nolock()) 00000000ec1d7cd0 storage-MDT0000_UUID: changing import state from DISCONN to CONNECTING
00000100:00080000:30.0:1651608945.073211:0:97779:0:(import.c:533:import_select_connection()) storage-MDT0000-mdc-ffff910943185800: connect to NID 10.31.7.61@o2ib last attempt 0
00000100:00080000:30.0:1651608945.073214:0:97779:0:(import.c:614:import_select_connection()) storage-MDT0000-mdc-ffff910943185800: import 00000000ec1d7cd0 using connection 10.31.7.61@o2ib/10.31.7.61@o2ib
00000100:00080000:4.0:1651608945.073280:0:41661:0:(import.c:85:import_set_state_nolock()) 00000000ec1d7cd0 storage-MDT0000_UUID: changing import state from CONNECTING to DISCONN
00000100:00080000:4.0:1651608945.073281:0:41661:0:(import.c:1428:ptlrpc_connect_interpret()) recovery of storage-MDT0000_UUID on 10.31.7.61@o2ib failed (-11)

I'm not sure why the client node logs requests waiting for recovery and recovery failing, since no recovery is in progress on the IO nodes. The same happens if I explicitly abort recovery.

Could you help me debug this further?

Best regards,
Stepan
Forschungszentrum Juelich GmbH