Hello,

We have a problem in a test setup where clients don't recover after a 
failover of the OSS. Looking at the llog entries on the MGS, I see:

#25 (224)marker  10 (flags=0x01, v1.6.5.1) lustre-OST0001  'add osc' Thu Nov  6 17:56:23 2008-
#26 (080)add_uuid  nid=10.3.0....@o2ib(0x500000a0300e5)  0:  1:10.3.0....@o2ib
#27 (080)add_uuid  nid=192.168.50....@tcp(0x20000c0a83281)  0:  1:10.3.0....@o2ib
#28 (128)attach    0:lustre-OST0001-osc  1:osc  2:lustre-clilov_UUID
#29 (136)setup     0:lustre-OST0001-osc  1:lustre-OST0001_UUID  2:10.3.0....@o2ib
#30 (080)add_uuid  nid=10.3.0....@o2ib(0x500000a0300e5)  0:  1:10.3.0....@o2ib
#31 (080)add_uuid  nid=192.168.50....@tcp(0x20000c0a83281)  0:  1:10.3.0....@o2ib
#32 (104)add_conn  0:lustre-OST0001-osc  1:10.3.0....@o2ib
#33 (128)lov_modify_tgts add 0:lustre-clilov  1:lustre-OST0001_UUID  2:1  3:1
#34 (224)marker  10 (flags=0x02, v1.6.5.1) lustre-OST0001  'add osc' Thu Nov  6 17:56:23 2008-


If I understand this correctly, the client "knows" from these entries 
where to connect in order to access an OST. However, the entries list 
only one of the two OSSes (10.3.0....@o2ib,192.168.50....@tcp). It is 
possible that a mistake was made when the OST was mounted for the first 
time, and that it was mounted on the wrong OSS (the failover node). 
Would this lead to such an issue?
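
To make the check above repeatable, here is a rough Python sketch that 
groups the connection UUIDs by OSC device. It assumes one record per 
line (unwrapped) in the format shown above, and it assumes that "setup" 
names the primary connection while "add_conn" adds further ones such as 
a failover node. The function name connections_by_device is made up for 
this example and is not part of any Lustre tool.

```python
import re
from collections import defaultdict

# One record per line, e.g. "#29 (136)setup  0:dev  1:uuid  2:conn".
RECORD = re.compile(r"^#(\d+)\s+\((\d+)\)(\S+)\s+(.*)$")
# Positional arguments such as "0:lustre-OST0001-osc".
ARG = re.compile(r"(\d+):(\S+)")


def connections_by_device(lines):
    """Map each OSC device to the distinct connection UUIDs recorded for it.

    Assumption: 'setup' carries the connection in argument 2 and
    'add_conn' carries it in argument 1, as in the listing above.
    """
    conns = defaultdict(list)
    for line in lines:
        match = RECORD.match(line.strip())
        if not match:
            continue
        record_type = match.group(3)
        args = {int(k): v for k, v in ARG.findall(match.group(4))}
        if record_type == "setup":
            device, conn = args.get(0), args.get(2)
        elif record_type == "add_conn":
            device, conn = args.get(0), args.get(1)
        else:
            continue
        # Keep first-seen order and skip repeats of the same UUID.
        if device and conn and conn not in conns[device]:
            conns[device].append(conn)
    return dict(conns)


# Records copied from the listing above; the NIDs are elided there too.
SAMPLE = [
    "#28 (128)attach    0:lustre-OST0001-osc  1:osc  2:lustre-clilov_UUID",
    "#29 (136)setup     0:lustre-OST0001-osc  1:lustre-OST0001_UUID  2:10.3.0....@o2ib",
    "#32 (104)add_conn  0:lustre-OST0001-osc  1:10.3.0....@o2ib",
]

if __name__ == "__main__":
    for device, uuids in connections_by_device(SAMPLE).items():
        print(device, uuids)
```

A device that ends up with a single UUID has, under these assumptions, 
no second server recorded for it, which is what the listing above seems 
to show for lustre-OST0001-osc.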

Is this correctable by re-registering the OST with the MDS (doing the 
"first mount" again)? What do I need to do on the MGS and the OST for 
this (tunefs...?)?

Thanks & best regards,
Erich
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
