On 8/1/2012 4:08 AM, Lars Schimmer wrote:
> Ok, now more useful, I hope:
> http://tetris.cgv.tugraz.at/afs/work.jedelsbr.fail.txt
>
> Last lines I did:
>
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos create trinculo c work.jedelsbr
> Volume 1702200790 created on partition /vicepc of trinculo
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos addsite trinculo c work.jedelsbr
> Added replication site trinculo /vicepc for volume work.jedelsbr
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos addsite deimos ac work.jedelsbr
> Added replication site deimos /vicepac for volume work.jedelsbr
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos addsite phobos ac work.jedelsbr
> Added replication site phobos /vicepac for volume work.jedelsbr
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos addsite afsgraz.igd.fraunhofer.de a work.jedelsbr
> Added replication site afsgraz.igd.fraunhofer.de /vicepa for volume work.jedelsbr
> schimmer@tetris /afs/.cgv.tugraz.at/work % fs mkmount jedelsbr work.jedelsbr -rw
> fs:'jedelsbr': No such device
> 1 schimmer@tetris /afs/.cgv.tugraz.at/work %
>
> On another machine direct after:
>
> root@larissa /afs/.cgv.tugraz.at/work # cd jedelsbr
> root@larissa /afs/.cgv.tugraz.at/work/jedelsbr # pwd
> /afs/.cgv.tugraz.at/work/jedelsbr
> root@larissa /afs/.cgv.tugraz.at/work/jedelsbr #
>
> root@larissa /afs/.cgv.tugraz.at/work/jedelsbr # fs exa .
> File . (1702200790.1.1) contained in volume 1702200790
> Volume status for vid = 1702200790 named work.jedelsbr
> Current disk quota is 5000
> Current blocks used are 2
> The partition has 576839755 blocks available out of 1003201864

The section that matters:

time 799.027324, pid 1118: Access vp 0xfffffffff9f10400 mode 0xc0 len (0x0, 0x1000)

Receive create mount point 'jedelsbr' request:

time 799.027325, pid 1118: Symlink dir 0xfffffffff9f10400 link jedelsbr
time 799.027325, pid 1118: GetdCache vp 0xfffffffff9f10400 dcache 0x1534aa00 dcache low-version 0xa3, vcache low-version 0xa3
time 799.027326, pid 1118: GetdCache tlen 0x1000 flags 0x1 abyte (0x0, 0x0) Position (0x0, 0x0)
time 801.018189, pid 5556: Access vp 0xfffffffff9f10400 mode 0x100 len (0x0, 0x1000)

RPC sent to file server ...

time 815.092202, pid 1118: Analyze RPC op 9 conn 0x37f40c00 code 0x69 user 0x3ed

The RPC terminates with VNOSERVICE, which implies that the file server
Idle Dead timeout period was triggered (1.6.1-1-debian on trinculo).
Since the RPC request was abandoned due to idle dead, the RPC should not
have been processed and the symlink should not have been created.

VNOSERVICE forces a VLDB check. (This is wrong. The file server is not
reporting that the volume is not present.)

time 817.096295, pid 1118: Analyze RPC op -1 conn 0xffffffffea134340 code 0x0 user 0x0
time 817.096296, pid 1118: Did a CheckVLDB call for fid (f9f10688:-30717.-2043962556.-30718) = 0

and then the CreateLink RPC is repeated.

time 817.096709, pid 1118: Analyze RPC op 9 conn 0x37f40c00 code 0x2f6df10 user 0x3ed

This time the file server reports UAEEXIST, which means that an entry
for 'jedelsbr' already exists. At this point the cache manager needs to
check whether or not 'jedelsbr' exists in its copy of the current
directory. If not, its local copy is out of date and the status should
be cleared.

time 817.096711, pid 1118: Returning code 49733392 from 31

afs_CheckCode(UAEEXIST) translates the error to EEXIST, which is given
to the application.

All of that being said, what we need to figure out is why the file
server is reporting VNOSERVICE (server idle dead error) for an RPC that
was clearly processed by the file server.
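
As a quick sanity check on the numbers in the trace, here is a minimal Python sketch. It only does arithmetic on values copied from the trace lines above; the variable names and the summarize() helper are made up for illustration and are not part of any OpenAFS tool. It shows that the hex code on the second "Analyze RPC" line is the same value as the decimal code returned to the caller, and how long the first RPC took compared with the retry.

```python
from decimal import Decimal

# Codes copied from the "Analyze RPC" and "Returning code" trace lines.
FIRST_RPC_CODE = 0x69          # the message identifies this as VNOSERVICE
SECOND_RPC_CODE = 0x2f6df10    # the "entry already exists" code
RETURNED_CODE = 49733392       # decimal value on the "Returning code" line

# Timestamps in seconds, copied from the trace lines for pid 1118.
T_REQUEST = Decimal("799.027326")       # last GetdCache before the RPC
T_FIRST_REPLY = Decimal("815.092202")   # first Analyze RPC (code 0x69)
T_VLDB_CHECK = Decimal("817.096296")    # CheckVLDB completed
T_SECOND_REPLY = Decimal("817.096709")  # second Analyze RPC


def summarize():
    """Return the derived values as a dict (hypothetical helper)."""
    return {
        "first_code_decimal": FIRST_RPC_CODE,
        "second_code_matches_return": SECOND_RPC_CODE == RETURNED_CODE,
        "first_rpc_seconds": T_FIRST_REPLY - T_REQUEST,
        "retry_rpc_seconds": T_SECOND_REPLY - T_VLDB_CHECK,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

The first RPC comes back about 16 seconds after the request, while the retry after the VLDB check is answered in well under a millisecond.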
