On 8/1/2012 4:08 AM, Lars Schimmer wrote:
> Ok, now more useful, I hope:
> http://tetris.cgv.tugraz.at/afs/work.jedelsbr.fail.txt
> 
> Last lines I did:
> 
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos create trinculo c
> work.jedelsbr
> Volume 1702200790 created on partition /vicepc of trinculo
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos addsite trinculo c
> work.jedelsbr
> Added replication site trinculo /vicepc for volume work.jedelsbr
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos addsite deimos ac
> work.jedelsbr
> Added replication site deimos /vicepac for volume work.jedelsbr
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos addsite phobos ac
> work.jedelsbr
> Added replication site phobos /vicepac for volume work.jedelsbr
> schimmer@tetris /afs/.cgv.tugraz.at/work % vos addsite
> afsgraz.igd.fraunhofer.de a work.jedelsbr
> Added replication site afsgraz.igd.fraunhofer.de /vicepa for volume
> work.jedelsbr
> schimmer@tetris /afs/.cgv.tugraz.at/work % fs mkmount jedelsbr
> work.jedelsbr -rw
> fs:'jedelsbr': No such device
> 1 schimmer@tetris /afs/.cgv.tugraz.at/work %
> 
> On another machine direct after:
> 
> root@larissa /afs/.cgv.tugraz.at/work # cd jedelsbr
> root@larissa /afs/.cgv.tugraz.at/work/jedelsbr # pwd
> /afs/.cgv.tugraz.at/work/jedelsbr
> root@larissa /afs/.cgv.tugraz.at/work/jedelsbr #
> 
> root@larissa /afs/.cgv.tugraz.at/work/jedelsbr # fs exa .
> File . (1702200790.1.1) contained in volume 1702200790
> Volume status for vid = 1702200790 named work.jedelsbr
> Current disk quota is 5000
> Current blocks used are 2
> The partition has 576839755 blocks available out of 1003201864

The section that matters:

time 799.027324, pid 1118: Access vp 0xfffffffff9f10400 mode 0xc0 len
(0x0, 0x1000)

Receive create mount point 'jedelsbr' request:
time 799.027325, pid 1118: Symlink dir 0xfffffffff9f10400 link jedelsbr
time 799.027325, pid 1118: GetdCache vp 0xfffffffff9f10400 dcache
0x1534aa00 dcache low-version 0xa3, vcache low-version 0xa3
time 799.027326, pid 1118: GetdCache tlen 0x1000 flags 0x1 abyte (0x0,
0x0) Position (0x0, 0x0)
time 801.018189, pid 5556: Access vp 0xfffffffff9f10400 mode 0x100 len
(0x0, 0x1000)

RPC sent to file server ...

time 815.092202, pid 1118: Analyze RPC op 9 conn 0x37f40c00 code 0x69
user 0x3ed

The RPC terminates with VNOSERVICE, which implies that the file server
Idle Dead timeout period was triggered (1.6.1-1-debian on trinculo).
Since the RPC request was abandoned due to idle dead, the RPC should not
have been processed and the symlink should not have been created.

VNOSERVICE forces a VLDB check.  (This is wrong.  The file server is
not reporting that the volume is not present.)
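
For anyone reading along, here is a minimal sketch of how the first
Analyze line decodes. It is not OpenAFS code. The VOLUME_ERRORS table is
a hand-copied subset that I am assuming matches the volume error values
in the OpenAFS headers, and the two timestamps are copied from the trace
above. It also shows how long the client waited before the error came
back.

```python
# Assumed subset of the AFS volume error codes (values believed to match
# the OpenAFS headers; verify against your source tree before relying on it).
VOLUME_ERRORS = {
    103: "VNOVOL",
    105: "VNOSERVICE",
    106: "VOFFLINE",
    110: "VBUSY",
    111: "VMOVED",
}


def decode_volume_error(code_hex: str) -> str:
    """Map the hex 'code' field of an Analyze line to a symbolic name."""
    value = int(code_hex, 16)
    return VOLUME_ERRORS.get(value, f"unknown ({value})")


# Timestamps copied from the trace: the Symlink request and the first Analyze.
symlink_sent = 799.027325
first_analyze = 815.092202
elapsed = first_analyze - symlink_sent

print(decode_volume_error("0x69"))
print(f"{elapsed:.3f} seconds between request and error")
```

The gap is roughly 16 seconds, which is the window in which the idle
dead timeout would have had to fire.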

time 817.096295, pid 1118: Analyze RPC op -1 conn 0xffffffffea134340
code 0x0 user 0x0
time 817.096296, pid 1118: Did a CheckVLDB call for fid
(f9f10688:-30717.-2043962556.-30718) = 0

and then the CreateLink RPC is repeated.

time 817.096709, pid 1118: Analyze RPC op 9 conn 0x37f40c00 code
0x2f6df10 user 0x3ed

This time the file server reports UAEEXIST, which means that an entry
for 'jedelsbr' already exists.  At this point the cache manager needs to
check whether or not 'jedelsbr' exists in its copy of the current
directory.  If not, its local copy is out of date and the status should
be cleared.
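
The check described above can be sketched as a toy decision rule. This
is only an illustration of the logic, not the cache manager's code: the
function name, the return strings, and the use of a plain set for the
cached directory contents are all hypothetical.

```python
def on_create_exists(cached_names: set, name: str) -> str:
    """Toy rule for a create that the file server rejects as 'exists'.

    cached_names stands in for the entries in the client's cached copy
    of the parent directory (hypothetical representation).
    """
    if name in cached_names:
        # The cached directory agrees with the server: the error is genuine.
        return "report-exists"
    # The server has the entry but the cached directory does not, so the
    # cached copy is stale and its status should be discarded.
    return "invalidate-directory-status"


# The situation in this trace: the client's copy lacks 'jedelsbr'.
print(on_create_exists({"other"}, "jedelsbr"))
```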

time 817.096711, pid 1118: Returning code 49733392 from 31

afs_CheckCode(UAEEXIST) translates the error to EEXIST, which is
returned to the application.
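
The arithmetic behind that translation can be checked with a short
sketch. It assumes the unified AFS error table starts at 0x2f6df00 and
that its entries follow errno order beginning with EPERM; both are
assumptions on my part, and the real translation in the cache manager
may use a lookup table rather than this subtraction. UAE_BASE and
uae_to_errno are names made up for the example.

```python
import errno

UAE_BASE = 0x2F6DF00  # assumed base of the unified AFS error table


def uae_to_errno(code: int) -> int:
    """Translate a unified AFS error code to a local errno (assumed layout)."""
    offset = code - UAE_BASE
    if offset < 0:
        raise ValueError("not a UAE code")
    # Assumption: entry 0 of the table corresponds to errno 1 (EPERM).
    return offset + 1


code = 0x2F6DF10  # 'code' field of the second Analyze line
# Same value as 'Returning code 49733392' in the trace.
assert code == 49733392

local = uae_to_errno(code)
print(local, errno.errorcode.get(local))
```

Under those assumptions the code sits 16 entries past the base, which
lines up with EEXIST on the local system.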

All of that being said, what we need to figure out is why the file
server is reporting VNOSERVICE (server idle dead error) for an RPC that
was clearly processed by the file server.

