Hi all,

I recently set up a test installation of Lustre on our cluster (Rocks 4.2.1, CentOS 4.7, Lustre 1.6.7.2). The setup is quite simple: 1 MGS/MDS and 4 OSSs, each with a single target.
I migrated each user's homedir from our raid5-nfs-shared head node to the Lustre mount. I didn't have any problems; everything went fine (I used the automounter for an easy transition). The setup has been running fine for a week.

I have now added a new user. Our user-creation scripts are still the old ones, so I created the user first and tried to migrate the homedir afterwards. The result: the user cannot log in. Bash reports something like "identifier removed", and apparently the user cannot read any file in his home. Strangely, I can read and write all files fine as root. I can revert the migration, and then the data is fine and the user can log in.

On the MDS, I found the following messages in the log:

> LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file
> close ino 30863131: cookie 0xafec72814d4ff48a r...@f5f41400 x1257636/t0
> o35->bb2a441c-fe74-2223-8d98-c2e40170b718@:0/0 lens 296/560 e 0 to 0 dl
> 1246615645 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3583:0:(mds_open.c:1567:mds_close()) @@@ no handle for file
> close ino 34451538: cookie 0xafec728167e99feb r...@f3caee00 x1319105/t0
> o35->98f257a6-4c8a-7f0b-25fc-d02a17efc2a6@:0/0 lens 296/560 e 0 to 0 dl
> 1246615646 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3583:0:(mds_open.c:1567:mds_close()) Skipped 12 previous similar
> messages
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file
> close ino 30862821: cookie 0xafec728146cb237c r...@f35de600 x433424/t0
> o35->3e8da5ff-20e6-19dd-f975-4837dc866...@net_0x200000affffd3_uuid:0/0 lens
> 296/560 e 0 to 0 dl 1246615648 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) Skipped 170 previous
> similar messages
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file
> close ino 30863197: cookie 0xafec728150bfaa7b r...@f6f6aa00 x1207616/t0
> o35->b058f8e1-dc62-9e9d-f480-a38c8fe5f36d@:0/0 lens 296/560 e 0 to 0 dl
> 1246615651 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) Skipped 18 previous similar
> messages
> LustreError: 3583:0:(mds_open.c:1567:mds_close()) @@@ no handle for file
> close ino 34451538: cookie 0xafec728168149d03 r...@f3779c00 x1072035/t0
> o35->afec1fd6-045f-7d1e-50e5-bbd95a94f...@net_0x200000affffc4_uuid:0/0 lens
> 296/560 e 0 to 0 dl 1246615656 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3583:0:(mds_open.c:1567:mds_close()) Skipped 548 previous
> similar messages
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) @@@ no handle for file
> close ino 34451538: cookie 0xafec728168b5e2b6 r...@f7a4662c x318943/t0
> o35->8b9a4c7c-d0ca-5a07-2fed-76b2c29a3953@:0/0 lens 296/560 e 0 to 0 dl
> 1246615664 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3584:0:(mds_open.c:1567:mds_close()) Skipped 174 previous
> similar messages
> LustreError: 5072:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing
> error (-116) r...@f5da4600 x458268/t0
> o35->e99372df-e6db-3adb-0884-d238b1ef8a4e@:0/0 lens 296/560 e 0 to 0 dl
> 1246615672 ref 1 fl Interpret:/0/0 rc -116/0
> LustreError: 5072:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 1291
> previous similar messages
> LustreError: 5126:0:(mds_open.c:1567:mds_close()) @@@ no handle for file
> close ino 30863061: cookie 0xafec72814a9a70e8 r...@f37cbe00 x3264550/t0
> o35->90ff7cc4-8f14-5c0e-90ce-1e7fb80533ce@:0/0 lens 296/560 e 0 to 0 dl
> 1246615680 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 5126:0:(mds_open.c:1567:mds_close()) Skipped 435 previous
> similar messages
> LustreError: 5148:0:(mds_open.c:1567:mds_close()) @@@ no handle for file
> close ino 34524483: cookie 0xafec728168c287d4 r...@f3690c00 x383668/t0
> o35->cc70305b-4ca6-dc64-6f55-97299fc52fd5@:0/0 lens 296/560 e 0 to 0 dl
> 1246615720 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 5148:0:(mds_open.c:1567:mds_close()) Skipped 51 previous similar
> messages
> LustreError: 3546:0:(ldlm_lib.c:1643:target_send_reply_msg()) @@@ processing
> error (-43) r...@f5f46c00 x3532/t0
> o36->25465654-6df2-739c-a30a-e215b53e324e@:0/0 lens 344/296 e 0 to 0 dl
> 1246616320 ref 1 fl Interpret:/0/0 rc 0/0
> LustreError: 3546:0:(ldlm_lib.c:1643:target_send_reply_msg()) Skipped 440
> previous similar messages

The list grows with each attempted access. The inodes are those of the files in the user's homedir. The OSS logs show no errors.

Does anyone have a clue why this happens? And why only with this user? All other users are working fine.

Cheers
Arne

--
Arne Brutschy
Ph.D. Student                    Email arne.brutschy(AT)ulb.ac.be
IRIDIA CP 194/6                  Web   iridia.ulb.ac.be/~abrutschy
Universite' Libre de Bruxelles   Tel   +32 2 650 3168
Avenue Franklin Roosevelt 50     Fax   +32 2 650 2715
1050 Bruxelles, Belgium          (Fax at IRIDIA secretary)

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
