We've been having issues with replicated volume releases that we haven't been able to track down. The problem appears to have started once we moved to OpenAFS (we previously ran Transarc AFS), but we are not entirely sure. Our file servers are Sun machines running Solaris 9 and OpenAFS 1.2.8. The clients are a mix of Sun Solaris and Windows XP using OpenAFS 1.2.8 or later.
Here is a typical transaction sequence...
* Change a single file in the RW volume, where fs1 contains the RW volume and one RO replica and is in the building I am in, and fs2 and fs3 contain the other RO replicas and are in another building.
Then...
C:\>vos release -verbose coe.xpnet.system
coe.xpnet.system
RWrite: 537081842 ROnly: 537081843 Backup: 537081844
number of sites -> 4
server fs1.uncc.edu partition /vicepe RW Site
server fs2.uncc.edu partition /vicepc RO Site
server fs1.uncc.edu partition /vicepe RO Site
server fs3.uncc.edu partition /vicepf RO Site
This is a complete release of the volume 537081842
Recloning RW volume ...
Updating existing ro volume 537081843 on fs2.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs2.uncc.edu.
Could not end transaction on a ro volume: Possible communication failure
Updating existing ro volume 537081843 on fs3.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs3.uncc.edu.
Could not end transaction on a ro volume: Possible communication failure
updating VLDB ... done
Released volume coe.xpnet.system successfully
Then, without doing anything...do it again...
C:\>vos release -verbose coe.xpnet.system
coe.xpnet.system
RWrite: 537081842 ROnly: 537081843 Backup: 537081844
number of sites -> 4
server fs1.uncc.edu partition /vicepe RW Site
server fs2.uncc.edu partition /vicepc RO Site
server fs1.uncc.edu partition /vicepe RO Site
server fs3.uncc.edu partition /vicepf RO Site
This is a complete release of the volume 537081842
Recloning RW volume ...
Updating existing ro volume 537081843 on fs2.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs2.uncc.edu.
Updating existing ro volume 537081843 on fs3.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs3.uncc.edu.
updating VLDB ... done
Released volume coe.xpnet.system successfully
Now, add a single new file to the volume, then...
C:\>vos release -verbose coe.xpnet.system
coe.xpnet.system
RWrite: 537081842 ROnly: 537081843 Backup: 537081844
number of sites -> 4
server fs1.uncc.edu partition /vicepe RW Site
server fs2.uncc.edu partition /vicepc RO Site
server fs1.uncc.edu partition /vicepe RO Site
server fs3.uncc.edu partition /vicepf RO Site
This is a complete release of the volume 537081842
Recloning RW volume ...
Updating existing ro volume 537081843 on fs2.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs2.uncc.edu.
Updating existing ro volume 537081843 on fs3.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs3.uncc.edu.
Could not end transaction on a ro volume: Possible communication failure
updating VLDB ... done
Released volume coe.xpnet.system successfully
Now, delete the file that was just created, then...
C:\>vos release -verbose coe.xpnet.system
coe.xpnet.system
RWrite: 537081842 ROnly: 537081843 Backup: 537081844
number of sites -> 4
server fs1.uncc.edu partition /vicepe RW Site
server fs2.uncc.edu partition /vicepc RO Site
server fs1.uncc.edu partition /vicepe RO Site
server fs3.uncc.edu partition /vicepf RO Site
This is a complete release of the volume 537081842
Recloning RW volume ...
Updating existing ro volume 537081843 on fs2.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs2.uncc.edu.
Updating existing ro volume 537081843 on fs3.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs3.uncc.edu.
updating VLDB ... done
Released volume coe.xpnet.system successfully
C:\>vos release -verbose coe.xpnet.system
coe.xpnet.system
RWrite: 537081842 ROnly: 537081843 Backup: 537081844
number of sites -> 4
server fs1.uncc.edu partition /vicepe RW Site
server fs2.uncc.edu partition /vicepc RO Site
server fs1.uncc.edu partition /vicepe RO Site
server fs3.uncc.edu partition /vicepf RO Site
This is a complete release of the volume 537081842
Recloning RW volume ...
Updating existing ro volume 537081843 on fs2.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs2.uncc.edu.
Updating existing ro volume 537081843 on fs3.uncc.edu ...
Starting ForwardMulti from 537081843 to 537081843 on fs3.uncc.edu.
updating VLDB ... done
Released volume coe.xpnet.system successfully
C:\>
Anybody got any clues? There is no networking problem that we are aware of. The file servers outside our building certainly have full network access.
Does the 'vos release' command on my client work by contacting each read-only server and telling it to update its replica?
We see this problem regularly. Our only remedy, to make sure the volume is actually released, is to run the vos release again, just as in the above sequences.
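The rerun workaround could be automated along these lines. This is only a minimal sketch, not a fix for the underlying problem: it assumes Python 3.7 or later is available on the client, that `vos` is on the PATH, and that the failure always shows up as the "Could not end transaction on a ro volume" line seen in the output above. The function names (`failed_sites`, `run_vos_release`, `release_until_clean`) and the `max_attempts` limit are made up for illustration.

```python
import subprocess

# Text of the failure line as it appears in the transcripts above.
FAILURE_MARKER = "Could not end transaction on a ro volume"


def failed_sites(output):
    """Return the servers whose update was followed by the failure line.

    Assumes the failure line is printed after the "Updating existing ro
    volume ... on <server> ..." line for the site it belongs to.
    """
    failed = []
    current = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Updating existing ro volume"):
            parts = line.split(" on ", 1)
            current = parts[1].split()[0] if len(parts) == 2 else None
        elif FAILURE_MARKER in line:
            failed.append(current if current is not None else "unknown")
    return failed


def run_vos_release(volume):
    """Run 'vos release -verbose <volume>' and return its combined output."""
    result = subprocess.run(
        ["vos", "release", "-verbose", volume],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep error lines in order with the rest
        text=True,
    )
    return result.stdout


def release_until_clean(volume, max_attempts=3, runner=run_vos_release):
    """Repeat the release until no failure line appears.

    Returns the attempt number that came back clean, or None if every
    attempt reported a failure.
    """
    for attempt in range(1, max_attempts + 1):
        output = runner(volume)
        sites = failed_sites(output)
        if not sites:
            return attempt
        print("attempt %d: failure reported for %s" % (attempt, ", ".join(sites)))
    return None


if __name__ == "__main__":
    print(release_until_clean("coe.xpnet.system"))
```

The `runner` argument is there so the parsing and retry logic can be tried against saved output without touching the file servers.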
Thanks for any help,
Rodney
Rodney M. Dyer
Windows Systems Programmer
Mosaic Computing Group
William States Lee College of Engineering
University of North Carolina at Charlotte
Email: [EMAIL PROTECTED]
Web: http://www.coe.uncc.edu/~rmdyer
Phone (704)687-3518
Help Desk Line (704)687-3150
FAX (704)687-2352
Office 267 Smith Building
