All,

   Continuing the issues I reported yesterday...  I found that by 
unlinking the lost files I was able to stop the error below from occurring, 
which gives me hope that the systems will stop crashing once all the lost 
files are scrubbed.

LustreError: 7676:0:(sec.c:379:import_sec_validate_get()) import 
ffff880623098800 (NEW) with no sec
LustreError: 7971:0:(sec.c:379:import_sec_validate_get()) import 
ffff880623098800 (NEW) with no sec
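
For reference, the cleanup is roughly the following (just a sketch, not the 
exact script; the mount point and OST UUID here are the ones from my test 
system below, and the unlink pass is run against a list I review first):

client: lfs find --obd testL-OST0003_UUID /testlustre > /tmp/lost_files
client: while read f; do unlink "$f"; done < /tmp/lost_files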

   I do note that the inactivated OST doesn't ever seem to really go away.  
After I removed an OST from my test system I noticed that the MDS still showed 
it...

On a client hooked up to the test system...
client: lfs df
UUID                   1K-blocks        Used   Available Use% Mounted on
testL-MDT0000_UUID    1819458432       10112  1819446272   0% /testlustre[MDT:0]
testL-OST0000_UUID   57914433152       12672 57914418432   0% /testlustre[OST:0]
testL-OST0001_UUID   57914433408       12672 57914418688   0% /testlustre[OST:1]
testL-OST0002_UUID   57914433408       12672 57914418688   0% /testlustre[OST:2]
OST0003             : inactive device
testL-OST0004_UUID   57914436992      144896 57914290048   0% /testlustre[OST:4]
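
(For context, the OST was marked inactive from the MGS with roughly the 
following; I may not have the parameter name exactly right:)

mgs: lctl conf_param testL-OST0003.osc.active=0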

On the MDS it still shows as UP when I run lctl dl:
mds: lctl dl | grep OST0003
 22 UP osp testL-OST0003-osc-MDT0000 testL-MDT0000-mdtlov_UUID 5 

So I stopped the test system, ran lctl dl again (getting no results), and 
restarted it.  Once the system was back up I still saw OST0003 marked as UP 
with lctl dl:
mds: lctl dl | grep OST0003
 11 UP osp testL-OST0003-osc-MDT0000 testL-MDT0000-mdtlov_UUID 5

Why does the MDS still think that this OST is up?
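
Is the right way to check whether the old OST records are still in the MDT 
config log something like the following, run on the MGS (I'm not sure I have 
the log name right)?

mgs: lctl --device MGS llog_print testL-MDT0000 | grep OST0003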

w/r,
Kurt J. Strosahl
System Administrator
Scientific Computing Group, Thomas Jefferson National Accelerator Facility