Dear List,
I'm trying to set up a Lustre 1.6b5 cluster where every node except the
frontend serves an OST, the frontend serves the MGS+MDT, and every node
(including the frontend) mounts and uses Lustre. The problem is that
some nodes can mount Lustre while others cannot.
My configuration:
OS: Rocks 4.2.1 Cluster (CentOS 4.4) using the stock Lustre
2.6.9-42.EL_lustre.1.5.95smp kernel. The frontend has 2 IPs (real +
private) and every compute node uses a private IP.
Lustre: 1.6b5.
Here are the logs from the frontend (MGS+MDT) and from a failed client
node.
Failed client node:
Lustre: mount data:
Lustre: profile: lustre-client
Lustre: device: [EMAIL PROTECTED]:/lustre
Lustre: flags: 2
LustreError: 22040:0:(client.c:579:ptlrpc_check_status()) @@@ type ==
PTL_RPC_MSG_ERR, err == -107
LustreError: 22040:0:(client.c:579:ptlrpc_check_status()) Skipped 3
previous similar messages
LustreError: 22040:0:(mgc_request.c:964:mgc_process_log()) Can't get cfg
lock: -107
LustreError: 3099:0:(mgc_request.c:493:mgc_blocking_ast()) original
grant failed, won't requeue
LustreError: 22040:0:(mgc_request.c:1014:mgc_process_log())
[EMAIL PROTECTED]: the configuration 'lustre-client' could not be read
(-107) from the MGS.
LustreError: [EMAIL PROTECTED]: The configuration 'lustre-client' could
not be read from the MGS (-107). This may be the result of
communication errors between this node and the MGS, or the MGS may not
be running.
Lustre: 0 UP mgc [EMAIL PROTECTED] f19e61f7-623f-55a2-6332-ea987600d10d 5
Lustre: 1 UP ost OSS OSS_uuid 3
Lustre: 2 UP obdfilter lustre-OST0001 lustre-OST0001_UUID 9
LustreError: 22040:0:(llite_lib.c:909:ll_fill_super()) Unable to process
log: -107
Lustre: client 0000010118688000 umount complete
LustreError: 22040:0:(obd_mount.c:1857:lustre_fill_super()) Unable to
mount (-107)
Frontend:
LustreError: 10490:0:(mgs_handler.c:468:mgs_handle()) lustre_mgs:
operation 101 on unconnected MGS
LustreError: 10490:0:(mgs_handler.c:468:mgs_handle()) Skipped 1 previous
similar message
LustreError: 10490:0:(ldlm_lib.c:1317:target_send_reply_msg()) @@@
processing error (-107)
LustreError: 10490:0:(ldlm_lib.c:1317:target_send_reply_msg()) Skipped 3
previous similar messages
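
The -107 that appears in both logs is a negated Linux errno value. A
minimal sketch for pulling such codes out of a log line and naming them,
assuming the standard Linux errno numbering; the small table and the
helper name decode_errnos are illustrative only and are not part of
Lustre:

```python
import re

# A few Linux errno values (asm-generic numbering). This table is an
# illustrative subset, not a complete list.
LINUX_ERRNO = {
    5: ("EIO", "Input/output error"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Match a negative integer that is not glued to a preceding word
# character or dot, so UUIDs and kernel version strings are skipped.
NEG_INT = re.compile(r"(?<![\w.])-(\d+)\b")


def decode_errnos(line):
    """Return (code, name, description) for each negative code in line."""
    results = []
    for match in NEG_INT.finditer(line):
        code = int(match.group(1))
        name, desc = LINUX_ERRNO.get(code, ("UNKNOWN", ""))
        results.append((-code, name, desc))
    return results


if __name__ == "__main__":
    sample = "LustreError: (obd_mount.c:1857:lustre_fill_super()) Unable to mount (-107)"
    for code, name, desc in decode_errnos(sample):
        print(code, name, desc)
```

Here -107 decodes to ENOTCONN, which matches the "operation 101 on
unconnected MGS" message on the frontend.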
I believe I followed the guide at
https://mail.clusterfs.com/wikis/lustre/MountConf exactly. I suspect
the problem is caused by IP confusion on the frontend, which has two
addresses. However, some compute nodes do successfully mount the Lustre
file system from the frontend.
Regards,
--
-----------------------------------------------------------------------------------
Somsak Sriprayoonsakul
Thai National Grid Center
Software Industry Promotion Agency
Ministry of ICT, Thailand
[EMAIL PROTECTED]
-----------------------------------------------------------------------------------