Yes. This machine can't access the mounted file system, and it caused a kernel panic when we tried to access some files. It also seems to report different and incorrect values when du or df is run on it.
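
To make the difference between the two lctl dl listings quoted below easier to spot, here is a rough sketch of how the outputs could be compared. It is not a Lustre tool: the function names are made up for illustration, the sample data is abridged from the listings, and it assumes each device line has six whitespace-separated columns (index, status, type, name, UUID, and a trailing number that I am treating as a reference count). It also assumes the per-mount suffix on each device name is 16 hex digits, as it is in both listings.

```python
import re

# Per-mount suffix on client device names, e.g. "-ffff8103350d0400".
# Assumption: always 16 lowercase hex digits, as in the listings below.
_SUFFIX = re.compile(r"-[0-9a-f]{16}$")

# Abridged samples from the working client and the broken client.
GOOD = """
  6 UP osc optstr01-OST0003-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
  7 IN osc optstr01-OST0004-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
"""
BAD = """
  6 UP osc optstr01-OST0003-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 5
  7 UP osc optstr01-OST0004-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 4
"""


def parse_device_list(text):
    """Map device name (mount suffix stripped) to (status, trailing number)."""
    devices = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not a six-column device row.
        if len(fields) != 6 or not fields[0].isdigit():
            continue
        _index, status, _type, name, _uuid, refcount = fields
        devices[_SUFFIX.sub("", name)] = (status, int(refcount))
    return devices


def diff_clients(good_text, bad_text):
    """Return (name, good_state, bad_state) for devices whose state differs."""
    good = parse_device_list(good_text)
    bad = parse_device_list(bad_text)
    return [
        (name, good[name], bad[name])
        for name in sorted(good)
        if name in bad and good[name] != bad[name]
    ]


if __name__ == "__main__":
    for name, good_state, bad_state in diff_clients(GOOD, BAD):
        print(name, good_state, bad_state)
```

Saving the output of lctl dl from each client to a file and passing the two file contents to diff_clients should list only the OST0004 device for the clients shown here.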
On 22 November 2012 19:34, Dilger, Andreas <[email protected]> wrote:
> On 11/22/12 10:25 AM, "Mark Field" <[email protected]> wrote:
>
> >Hi,
> >
> >I am currently using lustre 1.8, after a OST failure, I deactivated the
> >OST on the MDS and made the change permanent. If I now run lctl dl on
> >the client nodes all of them except one show the OST as inactive
> >(device 7 in the output below)
> >
> >
> > 0 UP mgc MGC10.214.4.201@o2ib 78b8432f-6331-cae7-8d75-dbaba9708056 5
> > 1 UP lov optstr01-clilov-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 4
> > 2 UP mdc optstr01-MDT0000-mdc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
> > 3 UP osc optstr01-OST0000-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
> > 4 UP osc optstr01-OST0001-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
> > 5 UP osc optstr01-OST0002-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
> > 6 UP osc optstr01-OST0003-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
> > 7 IN osc optstr01-OST0004-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
> > 8 UP osc optstr01-OST0008-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
> > 9 UP osc optstr01-OST0005-osc-ffff8103350d0400 cd18b560-e476-f55d-6df1-edcbd68c361b 5
> >
> >
> >The other client is not working correctly, lctl dl looks like this
> >
> >
> > 0 UP mgc MGC10.214.4.201@o2ib 94226c2b-6914-6a92-5c6b-2a27ebff676e 5
> > 1 UP lov optstr01-clilov-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 4
> > 2 UP mdc optstr01-MDT0000-mdc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 5
> > 3 UP osc optstr01-OST0000-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 5
> > 4 UP osc optstr01-OST0001-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 5
> > 5 UP osc optstr01-OST0002-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 5
> > 6 UP osc optstr01-OST0003-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 5
> > 7 UP osc optstr01-OST0004-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 4
> > 8 UP osc optstr01-OST0008-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 5
> > 9 UP osc optstr01-OST0005-osc-ffff81016d482400 e7a4a072-c0db-aac9-c13f-bd4189986407 5
> >
> >
> >Notice device 7 is 'UP' rather than 'IN' and also the last number on the
> >line is 4 not 5. I tried umount and re-mounting the client, and
> >rebooting, but it always comes back the same. Is there persistent
> >data somewhere on the client that is corrupt in someway and needs to be
> >deleted?
>
> No, there is no persistent data on the clients at all. They get a new
> UUID each time they mount, so the servers can't even tell it is the same
> node from one mount to the next.
>
> Presumably this is causing a visible problem, or you wouldn't have
> mentioned it?
>
> Cheers, Andreas
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
